Where are the metrics and statistics for a crawl?

Aspire 5.0 doesn't store any metrics or statistics for a crawl. Instead, each event is logged in a NoSQL database so that users can extract the statistics that best suit their needs. See Logging & Metrics for more information.
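
As a rough illustration of deriving statistics from logged events, the sketch below counts events per action for one crawl. The event shape (the `crawl_id` and `action` fields) and the function name are assumptions made for this example, not the actual Aspire log schema. In practice the events would be read from the NoSQL database rather than from an in-memory list.

```python
from collections import Counter


def crawl_statistics(events, crawl_id):
    """Count logged events per action for a single crawl.

    `events` is an iterable of dicts. The "crawl_id" and "action" keys
    are hypothetical field names chosen for this sketch.
    """
    counts = Counter(
        event["action"] for event in events if event["crawl_id"] == crawl_id
    )
    return dict(counts)


# Hypothetical sample events, standing in for records read from the database.
sample_events = [
    {"crawl_id": "c1", "action": "add"},
    {"crawl_id": "c1", "action": "add"},
    {"crawl_id": "c1", "action": "delete"},
    {"crawl_id": "c2", "action": "add"},
]

stats = crawl_statistics(sample_events, "c1")
```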

Why can't I see all metadata fields in my index?

Every publisher provides a default transformation file that maps only a basic set of fields. Any other fields you want in the index must be added to that mapping.

Why does an incremental crawl last as long as a full crawl?

Some connectors perform incremental crawls based on snapshot entries, which are meant to match exactly the documents that the connector has indexed to the search engine. On an incremental crawl, the connector still scans the whole repository, just as in a full crawl, but indexes only the documents that are new, modified, or deleted. The scan itself therefore takes about as long as a full crawl.
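
The idea of a snapshot-based incremental crawl can be sketched as follows. This is a minimal illustration and not Aspire's implementation: the content-hash signature, the dictionary-based repository and snapshot, and all function names are assumptions. Note that the loop visits every item in the repository, which is why the crawl time resembles that of a full crawl even when few actions result.

```python
import hashlib


def signature(content: str) -> str:
    """Hypothetical change signature: a hash of the item content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def incremental_crawl(repository: dict, snapshot: dict):
    """Scan every item and compare it with the previous snapshot.

    repository: item id -> current content (stands in for the real source)
    snapshot:   item id -> signature recorded by the previous crawl
    Returns (actions, new_snapshot); actions lists only what must be indexed.
    """
    actions = []
    new_snapshot = {}
    # Full scan of the repository, the same work a full crawl would do.
    for item_id, content in repository.items():
        sig = signature(content)
        new_snapshot[item_id] = sig
        if item_id not in snapshot:
            actions.append(("add", item_id))
        elif snapshot[item_id] != sig:
            actions.append(("update", item_id))
    # Anything in the old snapshot that was not seen has been deleted.
    for item_id in snapshot:
        if item_id not in new_snapshot:
            actions.append(("delete", item_id))
    return actions, new_snapshot


previous = {"a": signature("1"), "b": signature("2"), "c": signature("3")}
current_repo = {"a": "1", "b": "2 (edited)", "d": "4"}
actions, new_snapshot = incremental_crawl(current_repo, previous)
```

In this example all three repository items are scanned, but only the changed, new, and removed items produce indexing actions.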

Why do some crawls take longer to start?

In Aspire 5.0, crawl-related components (Connector, Extract Text, Workflow Applications, Publishers, etc.) are loaded on demand. When a worker receives an item to process, it checks whether all the required components are already loaded. If not, it loads them before continuing with the crawl, which can add some time to the crawl process. Components are unloaded after a certain idle time and are loaded again when required.
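
The load-on-demand behavior described above can be sketched as a small cache with an idle timeout. This is only an illustration under assumptions: the `ComponentCache` class, its methods, and the timeout value are hypothetical and do not reflect Aspire's internal API.

```python
import time


class ComponentCache:
    """Loads components on first use and unloads them after an idle period."""

    def __init__(self, loader, idle_timeout, clock=time.monotonic):
        self._loader = loader              # callable: name -> component (slow)
        self._idle_timeout = idle_timeout  # seconds a component may sit unused
        self._clock = clock                # injectable clock, eases testing
        self._loaded = {}                  # name -> (component, last_used)

    def get(self, name):
        """Return the component, loading it first if it is not loaded."""
        entry = self._loaded.get(name)
        if entry is None:
            component = self._loader(name)  # the delay seen at crawl start
        else:
            component = entry[0]
        self._loaded[name] = (component, self._clock())
        return component

    def unload_idle(self):
        """Unload components unused for at least the idle timeout."""
        now = self._clock()
        idle = [
            name
            for name, (_, last_used) in self._loaded.items()
            if now - last_used >= self._idle_timeout
        ]
        for name in idle:
            del self._loaded[name]
        return idle


# Usage with a fake clock and a loader that records each (slow) load.
fake_time = [0.0]
load_log = []


def slow_loader(name):
    load_log.append(name)
    return name.upper()


cache = ComponentCache(slow_loader, idle_timeout=10, clock=lambda: fake_time[0])
cache.get("connector")   # first use: the component is loaded
cache.get("connector")   # already loaded: no extra delay
fake_time[0] = 11.0
unloaded = cache.unload_idle()
cache.get("connector")   # needed again after unloading: loaded a second time
```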
