The "Scanner" section of the Connector contains the configuration elements specific to the scanning and processing of documents. They are described below:

  • Scanner Threads: the maximum number of threads that will scan the repository at any one time.
  • Scan Queue Size: the size of the in-memory queue for items that need to be scanned in the repository. The recommended starting point is a 1-to-1 ratio of queue items to scanned items; from there, evaluate whether a 2-to-1 or even 3-to-1 ratio is needed. Larger queues allow smoother data retrieval from the NoSQL storage.
  • Processing Threads: the maximum number of threads that will process items from the repository at any one time.
  • Processing Queue Size: the size of the in-memory queue for items that need to be processed. The recommended starting point is a 1-to-1 ratio of queue items to processing items; from there, evaluate whether a 2-to-1 or even 3-to-1 ratio is needed. Larger queues allow smoother data retrieval from the NoSQL storage.
  • Queue Claim Time: maximum depth for a file's inner structure. This value is useful for guarding against corrupted files and for blocking Denial of Service attacks.
  • Delete Completed Entries: enable to delete completed queue entries; when disabled, they are only marked as complete.
  • Snapshot Flush Synchronization Time: time to wait for all servers to finish their flushes to the snapshot at the end of each incremental crawl.
  • Check If Errored Items Should Be Deleted: after an incremental crawl, checks "delete" candidates that are part of a scan error.
  • Number of Crawls Before Cleaning Identities From the Cache: number of crawls to execute before removing the oldest identity crawl items.
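
The queue sizing guidance for Scan Queue Size and Processing Queue Size can be sketched as a small calculation. This is only an illustration, not part of the Connector: the function name recommended_queue_size and the items_in_flight figure (the number of items expected to be scanned or processed at a time) are assumptions made for this example.

```python
def recommended_queue_size(items_in_flight: int, ratio: int = 1) -> int:
    """Return a queue size for a queue-items-to-items ratio of 1, 2 or 3.

    Hypothetical helper: items_in_flight is an assumed estimate of how many
    items are scanned (or processed) at a time; it is not a Connector setting.
    """
    if items_in_flight < 1:
        raise ValueError("items_in_flight must be at least 1")
    if ratio not in (1, 2, 3):
        raise ValueError("ratio must be 1, 2 or 3")
    return items_in_flight * ratio


if __name__ == "__main__":
    # Start at 1:1 and only move to 2:1 or 3:1 if evaluation shows a need.
    for ratio in (1, 2, 3):
        print(f"{ratio}:1 ratio -> queue size {recommended_queue_size(500, ratio)}")
```

Starting from the 1-to-1 value and increasing it only after observing the crawl keeps memory use low, since both queues are held in memory.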

