...
This separation of the Connector Framework from the connector implementations makes it natural to use a connector implementation outside the Connector Framework, and even outside Aspire altogether.
The RepositoryAccessProvider is responsible for five main tasks:
- Initialize the crawl configuration (SourceInfo) from the user configuration properties
  - Initial URL, username, passwords, etc.
  - Method: newSourceInfo(AspireObject properties)
- Extract the initial or root crawl items
  - The discovered items are sent to a ScanListener
  - Method: processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener)
- Populate the metadata of the extracted items
- Scan a container item
  - The discovered sub-items are sent to a ScanListener
  - Method: scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener)
- Fetch the content stream for an item
  - Open an input stream on the content of each item, if available
  - Method: getFetcher()
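As a rough illustration of how the four named methods could be driven in a stand-alone crawl, here is a minimal, self-contained sketch. It does not use the real Aspire classes: `SourceInfo`, `SourceItem`, `RepositoryConnection`, `ScanListener`, `Fetcher` and `RepositoryAccessProvider` are reduced, hypothetical stand-ins defined inside the example. The names `addItem`, `fetch`, `InMemoryProvider`, `crawl` and the `url` property are assumptions, and `newSourceInfo` takes a plain `Map` instead of an `AspireObject`. The metadata population task is left out because the list above does not name a method for it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

public class StandaloneCrawlSketch {

    // Hypothetical, reduced stand-ins for the Aspire types; NOT the real classes.
    static final class SourceInfo {
        final String startUrl;
        SourceInfo(String startUrl) { this.startUrl = startUrl; }
    }
    static final class SourceItem {
        final String id;
        final boolean container;
        SourceItem(String id, boolean container) { this.id = id; this.container = container; }
    }
    static final class RepositoryConnection { }
    interface ScanListener { void addItem(SourceItem item); }   // addItem is an assumed name
    interface Fetcher { String fetch(SourceItem item); }        // fetch is an assumed name

    // Method names follow the task list; parameter types are the stand-ins above.
    interface RepositoryAccessProvider {
        SourceInfo newSourceInfo(Map<String, String> properties);
        void processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener);
        void scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener);
        Fetcher getFetcher();
    }

    // Toy provider backed by an in-memory tree instead of a real repository.
    static final class InMemoryProvider implements RepositoryAccessProvider {
        private final Map<String, List<String>> children = Map.of(
                "root", List.of("folder", "a.txt"),
                "folder", List.of("b.txt"));

        public SourceInfo newSourceInfo(Map<String, String> properties) {
            return new SourceInfo(properties.get("url"));
        }
        public void processCrawlRoot(SourceItem root, SourceInfo info, ScanListener listener) {
            listener.addItem(root);  // report the root as the first discovered item
        }
        public void scan(SourceItem item, SourceInfo info, RepositoryConnection conn, ScanListener listener) {
            for (String child : children.getOrDefault(item.id, List.of())) {
                listener.addItem(new SourceItem(child, children.containsKey(child)));
            }
        }
        public Fetcher getFetcher() {
            return item -> "content of " + item.id;
        }
    }

    // Driver: the part the Connector Framework would otherwise take care of.
    static List<String> crawl(RepositoryAccessProvider provider, Map<String, String> properties) {
        SourceInfo info = provider.newSourceInfo(properties);
        RepositoryConnection conn = new RepositoryConnection();
        Deque<SourceItem> queue = new ArrayDeque<>();
        ScanListener listener = queue::add;  // discovered items are queued for processing
        provider.processCrawlRoot(new SourceItem(info.startUrl, true), info, listener);

        List<String> fetched = new ArrayList<>();
        while (!queue.isEmpty()) {
            SourceItem item = queue.poll();
            if (item.container) {
                provider.scan(item, info, conn, listener);       // containers yield more items
            } else {
                fetched.add(provider.getFetcher().fetch(item));  // leaf items yield content
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        crawl(new InMemoryProvider(), Map.of("url", "root")).forEach(System.out::println);
    }
}
```

The queue-backed listener is only one possible design. It keeps the driver loop simple, because root items and sub-items arrive through the same callback and are processed the same way.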
Putting it all together is the responsibility of the Connector Framework or, when a connector implementation is used stand-alone, of you as its developer. The following guide shows how to do so:
Step 1
Create a Java Maven project and add the following dependencies to its pom.xml file:
<dependency>
  <groupId>com.searchtechnologies.aspire</groupId>
  <artifactId>aspire-connector-services</artifactId>
  <version>3.0</version>
</dependency>
<dependency>
  <groupId>com.searchtechnologies.aspire</groupId>
  <artifactId>aspire-connector-framework</artifactId>
  <version>3.0</version>
</dependency>
<dependency>
  <groupId>com.searchtechnologies.aspire</groupId>
  <artifactId>aspire-services</artifactId>
  <version>3.0</version>
</dependency>
<dependency>
  <groupId>org.osgi</groupId>
  <artifactId>org.osgi.core</artifactId>
  <version>4.2.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.osgi</groupId>
  <artifactId>org.osgi.compendium</artifactId>
  <version>4.2.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>com.searchtechnologies.aspire</groupId>
  <artifactId>THE-CONNECTOR-TO-USE</artifactId>
  <version>3.0-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>com.searchtechnologies.aspire</groupId>
  <artifactId>aspire-core</artifactId>
  <version>3.0</version>
</dependency>
For stand-alone crawls with legacy connectors, see:
...