This section describes in detail each of the components involved in the Connector Framework.
The main bundle (jar) the framework uses is aspire-connector-framework. This bundle contains all the Stages and components, and also provides interfaces for the specific connector implementations.
In Aspire, every component needs to be referenced from an AppBundle or application.xml file, which describes the job execution flow. The Connector Framework has one common AppBundle, called app-rap-connector.
This AppBundle is automatically loaded when Aspire detects it needs to load a connector. It contains all the PipelineManagers, Pipelines and references to the Connector Framework Components and Stages from the aspire-connector-framework bundle.
The CrawlController is the main entry point for incoming crawl start signals. It also controls the ConnectionPool and manages the NoSQLConnections to Mongo used by the rest of the components. In addition, it handles the distributed crawl start and synchronizes the crawl status with Mongo so that all Aspire servers see the same status.
The loader for the scan queue is an instance of the QueueLoader class. It claims items from the scanQueue collection in Mongo, marks them as in-progress ("P"), and enqueues a job into the ScanPipelineManager for each container that needs to be scanned.
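
The claim, mark, and enqueue cycle described above can be illustrated with a minimal in-memory sketch. This is not Aspire's QueueLoader implementation: the class name ScanQueueSketch, its methods, the batch size parameter, and the "A" (available) status code are all hypothetical, and a plain map stands in for the scanQueue collection. Only the "P" in-progress marker comes from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the claim-and-enqueue loop; not Aspire's QueueLoader. */
class ScanQueueSketch {
    // "P" (in-progress) is taken from the text; "A" (available) is an assumed placeholder.
    static final String AVAILABLE = "A";
    static final String IN_PROGRESS = "P";

    // Container id -> status; stands in for the scanQueue collection in Mongo.
    private final Map<String, String> scanQueue = new LinkedHashMap<>();
    // Ids handed to the scan pipeline; stands in for jobs enqueued into the ScanPipelineManager.
    private final List<String> enqueuedJobs = new ArrayList<>();

    /** Adds a container that still needs to be scanned. */
    synchronized void add(String containerId) {
        scanQueue.put(containerId, AVAILABLE);
    }

    /** Claims up to batchSize available items, marks each "P", and enqueues one job per item. */
    synchronized int claimAndEnqueue(int batchSize) {
        int claimed = 0;
        for (Map.Entry<String, String> entry : scanQueue.entrySet()) {
            if (claimed == batchSize) {
                break;
            }
            if (AVAILABLE.equals(entry.getValue())) {
                entry.setValue(IN_PROGRESS);       // mark as claimed so it is not picked up again
                enqueuedJobs.add(entry.getKey());  // one scan job per claimed container
                claimed++;
            }
        }
        return claimed;
    }

    String statusOf(String containerId) {
        return scanQueue.get(containerId);
    }

    List<String> jobs() {
        return enqueuedJobs;
    }
}
```

In this sketch the synchronized methods keep two threads in one process from claiming the same item. Across several Aspire servers the claim would presumably have to be atomic in the database itself, for example through a single find-and-update operation. That is an assumption about how such a loader could work, not a statement about Aspire's internals.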