The SharePoint 2013 Scanner component performs full and incremental scans of a SharePoint 2013 repository. It stores the repository's last SharePoint change token so that the next incremental crawl retrieves only the changes made since the previous crawl. Updated content is then submitted to the configured pipeline as AspireObjects attached to Jobs. In addition to the URL of the changed item, each AspireObject contains metadata extracted from the repository. Updated content is split into three types: add, update, and delete. Each type is published on a different event so that it can be handled by a different Aspire pipeline.
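The change-token and event-routing behaviour described above can be sketched conceptually as follows. This is a minimal Python illustration under stated assumptions, not the Aspire API: `Change`, `FakeRepository`, `incremental_scan`, and the `handlers` mapping are hypothetical names, the URLs are placeholders, and the integer token is a simplification of the real SharePoint change token.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    token: int      # position of this change in the (hypothetical) change log
    action: str     # "add", "update", or "delete"
    url: str
    metadata: dict = field(default_factory=dict)


class FakeRepository:
    """Hypothetical stand-in for a repository that exposes a change log."""

    def __init__(self, changes):
        self._changes = list(changes)

    def changes_since(self, token):
        # Return only the changes recorded after the given token.
        return [c for c in self._changes if c.token > token]


def incremental_scan(repo, last_token, handlers):
    """Publish each change to the handler for its type; return the new token."""
    newest = last_token
    for change in repo.changes_since(last_token):
        job = {"url": change.url, "metadata": change.metadata}
        handlers[change.action](job)   # one handler (pipeline) per change type
        newest = max(newest, change.token)
    return newest


# Three lists stand in for three separate pipelines.
added, updated, deleted = [], [], []
handlers = {"add": added.append, "update": updated.append, "delete": deleted.append}

repo = FakeRepository([
    Change(1, "add", "http://sp.example/sites/a/doc1", {"title": "Doc 1"}),
    Change(2, "update", "http://sp.example/sites/a/doc1", {"title": "Doc 1 v2"}),
    Change(3, "delete", "http://sp.example/sites/a/doc0"),
])

token = incremental_scan(repo, 0, handlers)             # first pass sees all changes
token_again = incremental_scan(repo, token, handlers)   # nothing new since saved token
```

Keeping the token returned by one scan and passing it to the next is what lets the second call skip everything already processed.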

The scanner reacts to an incoming job, which may instruct it to start, stop, pause, resume, or cacheGroups. Typically, the start job contains all the information the scanner requires to perform the crawl. However, the scanner can also be configured with default values in the application.xml file. When pausing or stopping, the scanner waits until all the jobs it has published have completed before completing itself.
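The control flow above can be modelled roughly as a small state machine. The sketch below is illustrative only and does not reflect the Aspire implementation: `ScannerSketch`, its methods, the state names, and the `siteUrl` and `batchSize` properties are hypothetical. It assumes that start-job properties override configured defaults, and it only records `cacheGroups` requests without modelling what they do.

```python
class ScannerSketch:
    """Hypothetical model of the scanner's control actions; not the Aspire API."""

    ACTIONS = {"start", "stop", "pause", "resume", "cacheGroups"}
    TRANSIENT = {"paused": "pausing", "stopped": "stopping"}

    def __init__(self, defaults=None):
        self.defaults = dict(defaults or {})  # stands in for application.xml defaults
        self.settings = {}
        self.state = "idle"
        self.outstanding = 0        # published jobs that have not completed yet
        self.cache_requests = 0
        self._pending = None        # state to enter once outstanding jobs drain

    def handle(self, action, properties=None):
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if action == "start":
            # Assumption: values in the start job override configured defaults.
            self.settings = {**self.defaults, **(properties or {})}
            self.state = "running"
        elif action == "pause":
            self._finish_when_drained("paused")
        elif action == "stop":
            self._finish_when_drained("stopped")
        elif action == "resume":
            if self.state == "paused":
                self.state = "running"
        else:  # cacheGroups: only counted here, behaviour not modelled
            self.cache_requests += 1

    def _finish_when_drained(self, target):
        if self.outstanding == 0:
            self.state = target
        else:
            self.state = self.TRANSIENT[target]
            self._pending = target

    def publish(self):
        if self.state != "running":
            raise RuntimeError("scanner is not running")
        self.outstanding += 1

    def job_completed(self):
        self.outstanding -= 1
        if self.outstanding == 0 and self._pending is not None:
            self.state, self._pending = self._pending, None


scanner = ScannerSketch(defaults={"siteUrl": "http://sp.example", "batchSize": 50})
scanner.handle("start", {"siteUrl": "http://sp.example/sites/a"})
scanner.publish()
scanner.publish()
scanner.handle("stop")
state_while_draining = scanner.state   # jobs still outstanding
scanner.job_completed()
scanner.job_completed()
final_state = scanner.state            # all published jobs have completed
```

The transient `pausing` and `stopping` states make the waiting period explicit: the scanner only reports the requested state once the count of outstanding jobs reaches zero.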

SharePoint 2013 Scanner
Factory Name	com.searchtechnologies.aspire:aspire-sharepoint2013-connector
subType	default
Inputs	AspireObject from a content source submitter holding all the information required for a crawl
Outputs	Jobs from the crawl

Configuration

This section lists all configuration parameters available to configure the SharePoint 2013 Scanner component.

General Scanner Component Configuration

Basic Configuration

Element	Type	Default	Description

Branch Handler Configuration

Element	Type	Default	Description

SharePoint Scanner Configuration

Element	Type	Default	Description
