Some of the features of the eRoom connector include:
- Performs incremental crawling (so that only new/updated documents are indexed).
- Fetches LDAP (including Active Directory) access control lists (ACLs) for document-level security (including users and groups).
- Extracts metadata.
- Is search engine independent.
- Runs from any machine with access to the given eRoom site.
- Supports early binding mechanisms and group expansion of nested permissions.
- Filters crawled documents by file name using regex patterns.
- Supports Windows/Linux/macOS file shares.
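To illustrate the file-name filtering feature, here is a minimal Python sketch. It is not the connector's own code: the function name `filter_by_name` and the separate include/exclude pattern lists are assumptions made for the example, and the connector's actual configuration options may differ.

```python
import re


def filter_by_name(file_names, include_patterns=(), exclude_patterns=()):
    """Keep names matching any include pattern and no exclude pattern.

    Hypothetical helper: an empty include list means "include everything".
    """
    includes = [re.compile(p) for p in include_patterns]
    excludes = [re.compile(p) for p in exclude_patterns]
    kept = []
    for name in file_names:
        # Skip names that match none of the include patterns.
        if includes and not any(p.search(name) for p in includes):
            continue
        # Skip names that match at least one exclude pattern.
        if any(p.search(name) for p in excludes):
            continue
        kept.append(name)
    return kept
```

For example, an include pattern of `\.(pdf|docx)$` combined with an exclude pattern of `^plan` would keep `report.pdf` but drop `notes.tmp` and `plan.docx`.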
The eRoom connector retrieves several types of documents. The inclusions and exclusions are listed below.
- Calendar and Events
- Project Plan and Tasks
- Database and Rows
- Inbox (basic information)
- Other Files
- Comments on items
- Emails on Inbox item
- Content of Link addressed pages
Due to restrictions in the eRoom API, the connector has the following limitations:
- Multi-threading (technical limitation)
The connector uses SOAP/XML over HTTP or HTTPS to acquire information about eRoom content. It acquires content by doing the following:
- Recursively traverses all items and documents of an eRoom site and creates a sub-job for each object discovered. Each sub-job contains all available metadata, including ACLs.
- Saves the item states to a MongoDB instance so that later incremental crawls can compare against them and detect added, updated, and deleted items.
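The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the connector's implementation: a plain dictionary stands in for the MongoDB instance, items are nested dictionaries rather than SOAP responses, and the field names (`id`, `name`, `modified`, `children`) and the function names are hypothetical.

```python
import hashlib


def signature(item):
    """Hash the fields assumed to change when an item is modified."""
    payload = f"{item['name']}|{item['modified']}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def walk(item):
    """Yield an item and, recursively, everything beneath it."""
    yield item
    for child in item.get("children", []):
        yield from walk(child)


def incremental_crawl(root, previous_state):
    """Return (sub_jobs, new_state) for one crawl.

    previous_state maps item id -> signature from the last crawl. Here it is
    a plain dict standing in for the stored item states (an assumption).
    """
    new_state = {item["id"]: signature(item) for item in walk(root)}
    sub_jobs = []
    for item_id, sig in new_state.items():
        if item_id not in previous_state:
            sub_jobs.append(("add", item_id))       # never seen before
        elif previous_state[item_id] != sig:
            sub_jobs.append(("update", item_id))    # signature changed
    for item_id in previous_state:
        if item_id not in new_state:
            sub_jobs.append(("delete", item_id))    # gone from the site
    return sub_jobs, new_state
```

On a first crawl, `previous_state` is empty, so every item produces an `add` sub-job. On later crawls, only items whose signature changed, or that appeared or disappeared, produce sub-jobs.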