Each database holds several collections that are used during crawls:
audit
Holds all the actions of each of the items being processed.
This is an example of a document in this collection:
    {
        "_id" : ObjectId("571f94498cd956261c112156"),
        "id" : "C:\\dev-temp\\testdata\\A\\0\\0\\0\\3.txt",
        "crawlStart" : NumberLong(1461687363561),
        "url" : "file://C:/dev-temp/testdata/A/0/0/0/3.txt",
        "type" : "job",
        "action" : "ADD",
        "batch" : null,
        "ts" : NumberLong(1461687366086)
    }
- _id: Automatically generated unique id
- id: Id of the document
- crawlStart: The identification of the crawl that generated this audit entry (the ID is the time the crawl started, as a Unix timestamp in milliseconds)
- url: The URL used for fetching the document
- type: Either job or batch; identifies whether the audit entry corresponds to a single document or to batch metadata
- action: The action recorded for the item (ADD in the example above)
- batch: The id of the batch that processed the document
- ts: The time when this entry was added
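As a rough illustration of how crawlStart and ts relate, the sketch below takes an audit entry like the one above, with the NumberLong values written as plain JavaScript numbers, and reports how long after the crawl start the entry was recorded. The helper name `describeAuditEntry` is hypothetical and not part of Aspire, and the sketch assumes both values are Unix timestamps in milliseconds, as the example values suggest.

```javascript
// Hypothetical helper, not part of Aspire: summarizes one audit entry.
// Assumption: crawlStart and ts are Unix timestamps in milliseconds.
function describeAuditEntry(entry) {
  return {
    id: entry.id,
    action: entry.action,
    // "type" is either "job" (single document) or "batch" (batch metadata).
    isBatchEntry: entry.type === "batch",
    crawlStarted: new Date(entry.crawlStart).toISOString(),
    recorded: new Date(entry.ts).toISOString(),
    msSinceCrawlStart: entry.ts - entry.crawlStart,
  };
}

// The example document from above, with NumberLong values as plain numbers
// and the ObjectId written as a string.
const auditEntry = {
  _id: "571f94498cd956261c112156",
  id: "C:\\dev-temp\\testdata\\A\\0\\0\\0\\3.txt",
  crawlStart: 1461687363561,
  url: "file://C:/dev-temp/testdata/A/0/0/0/3.txt",
  type: "job",
  action: "ADD",
  batch: null,
  ts: 1461687366086,
};

const auditSummary = describeAuditEntry(auditEntry);
console.log(auditSummary.msSinceCrawlStart); // 2525
```

Because crawlStart doubles as the crawl identifier, entries with the same crawlStart value belong to the same crawl.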
errors
Holds all errors that happened during the crawls.
This is an example of a document in this collection:
    {
        "_id" : ObjectId("571f940e8cd956261c112151"),
        "error" : {
            "@time" : NumberLong(1461687310975),
            "@crawlTime" : NumberLong(1461686819675),
            "@cs" : "File_System_Source",
            "@processor" : "File_System_Source-10.10.20.203:50505",
            "@type" : "S",
            "_$" : "Error starting crawl\ncom.searchtechnologies.aspire.services.AspireException: Bad 'exclude' regex pattern: C:\\dev-temp\\testdata\\A ..."
        }
    }
- _id: Automatically generated unique id
- error
  - @time: The time when this error happened
  - @crawlTime: The identification of the crawl that generated this error (the ID is the time the crawl started, as a Unix timestamp in milliseconds)
  - @cs: The name of the content source that generated this error
  - @processor: The name of the server that generated this error
  - @type: The type of error: "S" for scanner error, "D" for document error, "B" for batch error, "F" for content source startup failure, or "U" for unknown
  - _$: The detailed error message
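The @type codes listed above can be turned into readable labels when reviewing errors. The following is a minimal sketch, not Aspire code: `ERROR_TYPES` and `summarizeError` are hypothetical names, and treating an unrecognized code as "unknown" is an assumption made for the example.

```javascript
// Error type codes as listed in the field descriptions above.
const ERROR_TYPES = {
  S: "scanner error",
  D: "document error",
  B: "batch error",
  F: "content source startup failure",
  U: "unknown",
};

// Hypothetical helper: builds a one-line summary of an errors document.
// Assumption: codes outside the list above are reported as "unknown".
function summarizeError(doc) {
  const e = doc.error;
  const kind = ERROR_TYPES[e["@type"]] || ERROR_TYPES.U;
  // The detailed message may span several lines; keep only the first one.
  const firstLine = e["_$"].split("\n")[0];
  return `[${kind}] ${e["@cs"]}: ${firstLine}`;
}

// The example document from above, with NumberLong values as plain numbers
// and the ObjectId written as a string.
const errorDoc = {
  _id: "571f940e8cd956261c112151",
  error: {
    "@time": 1461687310975,
    "@crawlTime": 1461686819675,
    "@cs": "File_System_Source",
    "@processor": "File_System_Source-10.10.20.203:50505",
    "@type": "S",
    "_$": "Error starting crawl\ncom.searchtechnologies.aspire.services.AspireException: Bad 'exclude' regex pattern: C:\\dev-temp\\testdata\\A ...",
  },
};

const errorSummary = summarizeError(errorDoc);
console.log(errorSummary); // [scanner error] File_System_Source: Error starting crawl
```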
hierarchy
Holds the parent hierarchy for all the container items. This is used to generate the item hierarchy in the Populate & Fetch stage.
This is an example of a document in this collection:
    {
        "_id" : "C:\\dev-temp\\testdata\\B\\5\\9\\3",
        "name" : "3",
        "ancestors" : {
            "_id" : "C:\\dev-temp\\testdata\\B\\5\\9",
            "name" : "9",
            "ancestors" : {
                "_id" : "C:\\dev-temp\\testdata\\B\\5",
                "name" : "5",
                "ancestors" : {
                    "_id" : "C:\\dev-temp\\testdata\\B",
                    "name" : "B",
                    "ancestors" : {
                        "_id" : "C:\\dev-temp\\testdata",
                        "name" : "testdata",
                        "ancestors" : null
                    }
                }
            }
        }
    }
- _id: The Id of the container item
- name: The name of the container item
- ancestors: All the ancestors of this item
  - _id: Id of the parent container item
  - name: The name of the parent container item
  - ancestors: All the ancestors of the parent item
    - ...: All the grandparents and beyond
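Since each ancestors value nests the parent's own document, the full chain can be recovered by following ancestors until it is null. The sketch below shows one way to do that on a plain JavaScript object; `ancestorPath` is a hypothetical helper and not how Aspire itself builds the hierarchy in the Populate & Fetch stage.

```javascript
// Hypothetical helper: returns the container names from the root down to
// the given item by following the nested "ancestors" documents.
function ancestorPath(item) {
  const names = [];
  for (let node = item; node !== null && node !== undefined; node = node.ancestors) {
    // Each step moves one level up, so prepend to keep root-first order.
    names.unshift(node.name);
  }
  return names;
}

// The example document from above.
const hierarchyDoc = {
  _id: "C:\\dev-temp\\testdata\\B\\5\\9\\3",
  name: "3",
  ancestors: {
    _id: "C:\\dev-temp\\testdata\\B\\5\\9",
    name: "9",
    ancestors: {
      _id: "C:\\dev-temp\\testdata\\B\\5",
      name: "5",
      ancestors: {
        _id: "C:\\dev-temp\\testdata\\B",
        name: "B",
        ancestors: {
          _id: "C:\\dev-temp\\testdata",
          name: "testdata",
          ancestors: null,
        },
      },
    },
  },
};

const hierarchyPath = ancestorPath(hierarchyDoc);
console.log(hierarchyPath.join("/")); // testdata/B/5/9/3
```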
The database also contains the following collections:
- processQueue
- scanQueue
- snapshots
- statistics
- status
- usersAndGroups