
This separation of the Connector Framework from the connector implementations makes it natural to use a connector implementation outside the Connector Framework, and even outside Aspire altogether.


The RepositoryAccessProvider is responsible for five main tasks:

  1. Initialize the crawl configuration (SourceInfo) from the user configuration properties
    1. Initial URL, username, passwords, etc.
    2. Method: newSourceInfo(AspireObject properties)
  2. Extract the initial or root crawl items
    1. The discovered items are sent into a ScanListener
    2. Method: processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener)
  3. Populate the metadata of the extracted items
    1. Method: populate(SourceItem item, SourceInfo info, RepositoryConnection conn)
  4. Scan a container item 
    1. The discovered sub-items are sent into a ScanListener
    2. Method: scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener)
  5. Fetch the content stream for an item
    1. Opens an input stream on the content of each item, if available
    2. Method: getFetcher()
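
The order in which these tasks are typically called can be sketched as follows. This is only a minimal illustration, not the Aspire API: every type in it (SourceInfo, SourceItem, ScanListener, RepositoryConnection, RepositoryAccessProvider) is a hypothetical stand-in declared inside the example, the properties are passed as a plain Map instead of an AspireObject, the ToyProvider crawls an invented in-memory tree, and how a real RepositoryConnection is obtained is not shown. Only the method names are taken from the list above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StandaloneCrawlSketch {

    // Hypothetical stand-ins for the Aspire types named above (NOT the real classes).
    static class SourceInfo {
        final String initialUrl;
        SourceInfo(String initialUrl) { this.initialUrl = initialUrl; }
    }
    static class SourceItem {
        final String url;
        final boolean container;
        final Map<String, String> metadata = new HashMap<>();
        SourceItem(String url, boolean container) { this.url = url; this.container = container; }
    }
    static class RepositoryConnection { }
    interface ScanListener { void addItem(SourceItem item); }

    // Simplified provider: method names follow the list, signatures are assumptions.
    interface RepositoryAccessProvider {
        SourceInfo newSourceInfo(Map<String, String> properties);
        void processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener);
        void populate(SourceItem item, SourceInfo info, RepositoryConnection conn);
        void scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener);
    }

    // Toy provider over an invented in-memory tree; URLs ending in "/" are containers.
    static class ToyProvider implements RepositoryAccessProvider {
        private final Map<String, List<String>> tree = Map.of(
                "/", List.of("/a/", "/b.txt"),
                "/a/", List.of("/a/c.txt"));

        public SourceInfo newSourceInfo(Map<String, String> properties) {
            return new SourceInfo(properties.get("url"));
        }
        public void processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener) {
            listener.addItem(root); // the root itself is the only initial item here
        }
        public void populate(SourceItem item, SourceInfo info, RepositoryConnection conn) {
            item.metadata.put("name", item.url);
        }
        public void scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener) {
            for (String child : tree.getOrDefault(item.url, List.of())) {
                listener.addItem(new SourceItem(child, child.endsWith("/")));
            }
        }
    }

    // Drives tasks 1 to 4 in order; task 5 (fetching content via getFetcher()) is left out.
    static List<String> crawl(RepositoryAccessProvider rap, Map<String, String> properties) {
        SourceInfo info = rap.newSourceInfo(properties);           // task 1
        RepositoryConnection conn = new RepositoryConnection();    // placeholder connection
        Deque<SourceItem> queue = new ArrayDeque<>();
        ScanListener listener = queue::add;                        // discovered items are queued
        rap.processCrawlRoot(new SourceItem(info.initialUrl, true), info, listener); // task 2

        List<String> visited = new ArrayList<>();
        while (!queue.isEmpty()) {
            SourceItem item = queue.poll();
            rap.populate(item, info, conn);                        // task 3
            visited.add(item.url);
            if (item.container) {
                rap.scan(item, info, conn, listener);              // task 4
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // prints [/, /a/, /b.txt, /a/c.txt]
        System.out.println(crawl(new ToyProvider(), Map.of("url", "/")));
    }
}
```

The ScanListener here simply appends to a queue, so items are processed breadth-first. A real listener could just as well filter, batch, or hand items to worker threads, which is the kind of decision the Connector Framework normally makes for you.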


Putting it all together is the responsibility of the Connector Framework or, if you use a connector implementation stand-alone, of you as the developer. Here is a guide on how to do so:

Step 1

Create a Java Maven project and add the following dependencies to its pom.xml file:

Code Block
languagexml
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-connector-services</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-connector-framework</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-services</artifactId>
   <version>3.0</version>
</dependency>
<dependency>
   <groupId>org.osgi</groupId>
   <artifactId>org.osgi.core</artifactId>
   <version>4.2.0</version>
   <scope>provided</scope>
</dependency>
<dependency>
   <groupId>org.osgi</groupId>
   <artifactId>org.osgi.compendium</artifactId>
   <version>4.2.0</version>
   <scope>provided</scope>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>THE-CONNECTOR-TO-USE</artifactId>
   <version>3.0-SNAPSHOT</version>
</dependency>
<dependency>
   <groupId>com.searchtechnologies.aspire</groupId>
   <artifactId>aspire-core</artifactId>
   <version>3.0</version>
</dependency>




For stand-alone crawls with legacy connectors, see:

...