

Aspire 3.0 connectors work by implementing an interface called RepositoryAccessProvider, which specifies the minimum set of methods required to access, fetch, and scan a given repository. The Aspire 3.0 Connector Framework is a layer that provides the common control code for full and incremental crawling, distributed processing, group expansion, schedules, and the link between the Aspire Admin user interface and the crawls. It does all of this by calling the RepositoryAccessProvider methods whenever it needs to access the repository.

This separation between the Connector Framework and the connector implementations makes it natural to use a connector implementation outside the Connector Framework, and even outside Aspire altogether.


The RepositoryAccessProvider is responsible for five main tasks:

  1. Initialize the crawl configuration, or SourceInfo, from the user configuration properties
    1. Initial URL, username, passwords, etc.
    2. Method: newSourceInfo(AspireObject properties)
  2. Extract the initial or root crawl items
    1. The discovered items are sent into a ScanListener
    2. Method: processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener)
  3. Populate the metadata of the extracted items
    1. Method: populate(SourceItem item, SourceInfo info, RepositoryConnection conn)
  4. Scan a container item 
    1. The discovered sub-items are sent into a ScanListener
    2. Method: scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener)
  5. Fetch the content stream for an item
    1. An input stream is opened on the content of each item, if available
    2. Method: getFetcher()
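
The order in which a stand-alone caller might drive these five methods can be sketched as below. This is a minimal illustration, not the Aspire API: Info, Item, Listener, Fetcher and Provider are hypothetical, simplified stand-ins for SourceInfo, SourceItem, ScanListener, the fetcher and RepositoryAccessProvider. The return types, the omission of the RepositoryConnection parameter, the use of a plain Map instead of AspireObject, and the in-memory "repository" are all assumptions made so that the example is self-contained.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StandaloneCrawlSketch {

    /** Hypothetical stand-in for SourceInfo: the crawl configuration. */
    public static final class Info {
        final String rootUrl;
        Info(String rootUrl) { this.rootUrl = rootUrl; }
    }

    /** Hypothetical stand-in for SourceItem. */
    public static final class Item {
        final String url;
        String name; // metadata, filled in by populate()
        Item(String url) { this.url = url; }
        boolean isContainer() { return url.endsWith("/"); }
    }

    /** Hypothetical stand-in for ScanListener: receives discovered items. */
    public interface Listener { void addItem(Item item); }

    /** Hypothetical fetcher; returns a String instead of an input stream. */
    public interface Fetcher { String fetch(Item item); }

    /** Simplified stand-in for RepositoryAccessProvider, one method per task. */
    public interface Provider {
        Info newSourceInfo(Map<String, String> properties);
        void processCrawlRoot(Item root, Info info, Listener listener);
        void populate(Item item, Info info);
        void scan(Item item, Info info, Listener listener);
        Fetcher getFetcher();
    }

    /** Toy provider backed by an in-memory tree of paths. */
    public static final class InMemoryProvider implements Provider {
        private final Map<String, List<String>> children = new HashMap<>();

        public InMemoryProvider() {
            children.put("/", Arrays.asList("/docs/", "/readme.txt"));
            children.put("/docs/", Arrays.asList("/docs/a.txt"));
        }

        public Info newSourceInfo(Map<String, String> properties) {
            return new Info(properties.get("url"));
        }

        public void processCrawlRoot(Item root, Info info, Listener listener) {
            listener.addItem(root); // the root is the only initial item here
        }

        public void populate(Item item, Info info) {
            item.name = item.url.substring(item.url.lastIndexOf('/') + 1);
        }

        public void scan(Item item, Info info, Listener listener) {
            List<String> found = children.get(item.url);
            if (found == null) {
                found = Collections.emptyList();
            }
            for (String url : found) {
                listener.addItem(new Item(url));
            }
        }

        public Fetcher getFetcher() {
            return item -> "content of " + item.url;
        }
    }

    /** Drives the five tasks in order and returns "name=content" per leaf item. */
    public static List<String> crawl(Provider provider, Map<String, String> properties) {
        Info info = provider.newSourceInfo(properties);              // task 1
        Deque<Item> queue = new ArrayDeque<>();
        Listener listener = queue::add;                              // discovered items are queued
        provider.processCrawlRoot(new Item(info.rootUrl), info, listener); // task 2
        List<String> fetched = new ArrayList<>();
        while (!queue.isEmpty()) {
            Item item = queue.poll();
            provider.populate(item, info);                           // task 3
            if (item.isContainer()) {
                provider.scan(item, info, listener);                 // task 4
            } else {
                fetched.add(item.name + "=" + provider.getFetcher().fetch(item)); // task 5
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        Map<String, String> properties = Collections.singletonMap("url", "/");
        for (String line : crawl(new InMemoryProvider(), properties)) {
            System.out.println(line);
        }
    }
}
```

In this sketch the listener simply appends to a queue, so the caller decides how discovered items are processed. A real connector would be driven through the actual Aspire types listed above instead.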


Putting it all together is the responsibility of the Connector Framework or, if you are using a connector implementation stand-alone, of you as the developer. Here is a guide on how to do so:

Step 1

Create a Java Maven project and add the following dependencies to its pom.xml file

Pom dependencies
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-connector-services</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-connector-framework</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-services</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>org.osgi</groupId>
   <artifactId>org.osgi.core</artifactId>
   <version>4.2.0</version>
   <scope>provided</scope>
</dependency>
<dependency>
   <groupId>org.osgi</groupId>
   <artifactId>org.osgi.compendium</artifactId>
   <version>4.2.0</version>
   <scope>provided</scope>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>THE-CONNECTOR-TO-USE</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-core</artifactId>
   <version>3.0</version>
</dependency>



For stand-alone crawls with legacy connectors, see:

Connector Scanner Stage Test Harness
