![](/download/thumbnails/722076238/confl1.PNG?version=1&modificationDate=1568804190000&api=v2)
![](/download/thumbnails/722076238/confl2.PNG?version=1&modificationDate=1568804218000&api=v2)
![](/download/thumbnails/722076238/confl3.PNG?version=1&modificationDate=1568804235000&api=v2)
In the "Connector" tab, specify the connection information to crawl the Confluence Identity.
- Confluence url: URL of the Confluence server, in the form http://{servername}{:port}. Some Confluence installations require "/confluence" after the server name, e.g. http://wiki.local.search/confluence. The connector communicates with Confluence through the REST API. To verify that REST is reachable, append /rest/api/space to the URL and open it in a browser.
- Domain: Domain used to log in to Confluence. Ignored if the environment does not require a domain.
- Username: User with admin privileges and access to all Confluence content (for example, a member of the confluence-admin group). The connector crawls the Confluence instance as this user.
- Password: Password for the user above.
- Use login.action.form: Authenticate with a POST to login.action instead of sending BASIC Authorization headers.
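
The REST check and the BASIC header mentioned above can be sketched as follows. This is a minimal illustration, not the connector's own code: the helper names `rest_check_url` and `basic_auth_header` are hypothetical, and the sketch does not cover how the Domain field is combined with the username.

```python
import base64


def rest_check_url(confluence_url: str) -> str:
    """Append the REST path used for the browser check to the configured URL."""
    # Strip a trailing slash so the result never contains "//rest".
    return confluence_url.rstrip("/") + "/rest/api/space"


def basic_auth_header(username: str, password: str) -> dict:
    """Build a standard HTTP BASIC Authorization header (hypothetical helper)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Example using the sample server name from the field description.
print(rest_check_url("http://wiki.local.search/confluence"))
```

Opening the printed URL in a browser is the same check as the one described for the Confluence url field.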
- ---------------------------------------------------------------------------------------------------------------------------
- Include users: Select to include users in the crawl
- All Users Group(s): The group(s) that contain all Confluence server users. Use comma-delimited values for multiple entries. These groups determine the full set of users to crawl.
- Populate User Email: Confluence provides a prototype API to retrieve a user's email. Check this box if the API should be used to populate the user's email field. The email will be populated in the displayUrl field in the Aspire output object.
- ----------------------------------------------------------------------------------------------------------------------------
- Include groups: Select to include groups in the crawl.
- Group(s) selection: The group(s) to include. Use comma-delimited values for multiple entries. Leave empty to include all groups.
- Ingest Only Groups In ACLs: Select to ingest only groups in ACLs.
- Content Sources: Comma-delimited list of Confluence connector content source IDs whose aclMaps collections are queried to check whether the groups exist in ACLs.
- Publish Scope: Group-related entities that are sent to the connector workflow for processing:
  - Publish Groups Only
  - Publish Members Only
  - Publish Groups And Members
- ------------------------------------------------------------------------------------------------------------------------------
- Page Result Set Limit: The maximum number of records to be retrieved at a time per page through the Confluence REST API.
- ------------------------------------------------------------------------------------------------------------------------------
- Scan excluded items: If selected, the scanner still scans the sub-items of container items that are excluded by a pattern (that is, containers that match an exclude pattern or do not match any include pattern).
- Include patterns: Regex patterns matched against the display URL of the items to include.
- Exclude patterns: Regex patterns matched against the display URL of the items to exclude.
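
The exclusion rule described under "Scan excluded items" can be sketched as below. This is an illustration only: the function name `is_excluded` and the sample URLs are hypothetical, the use of `re.search` (match anywhere in the URL) is an assumption about matching semantics, and so is treating an empty include list as "include everything".

```python
import re


def is_excluded(display_url: str, include_patterns: list, exclude_patterns: list) -> bool:
    """Return True if the item is excluded by the configured patterns."""
    # Excluded because it matches an exclude pattern.
    if any(re.search(p, display_url) for p in exclude_patterns):
        return True
    # Excluded because include patterns exist and none of them match.
    # Assumption: an empty include list means nothing is filtered out.
    if include_patterns and not any(re.search(p, display_url) for p in include_patterns):
        return True
    return False


# Hypothetical display URLs, for illustration only.
print(is_excluded("http://wiki.local.search/confluence/display/HR", [r"/display/ENG"], []))
```

With "Scan excluded items" selected, a container for which this check returns True would still have its sub-items scanned.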
- -------------------------------------------------------------------------------------------------------------------------------
- Connection timeout: Maximum time to wait for the connection, in milliseconds.
- Read timeout: Maximum time to wait for a read, in milliseconds.
- Retry policy: How the delay between retries is calculated:
  - fixed: Always use the same delay (retryDelay).
  - increasing: Multiply retryDelay by the number of times this call has been attempted, up to maxRetryDelay.
  - cumulative: Increase the delay by a factor (retryDelayMultiplier) of retryDelay every time a call is made.
- Retries: Maximum number of retries for a failed document.
- Retry delay: Delay between retries, in milliseconds.
- Maximum retry delay: Upper limit for the retry delay, in milliseconds.
- Retry delay multiplier: Factor (retryDelayMultiplier) used by the cumulative retry policy.
- -------------------------------------------------------------------------------------------------------------------------------
- Ignore scan errors: If selected, scanning errors are logged but do not stop the crawl.
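
The three retry policies listed above can be compared with a small sketch. It is an interpretation, not the connector's implementation: the function name and parameter names are hypothetical, the cumulative policy is assumed to multiply the delay by the multiplier on each attempt, and capping the cumulative delay at the maximum retry delay is also an assumption.

```python
def retry_delay(policy: str, attempt: int, retry_delay_ms: int,
                max_retry_delay_ms: int, multiplier: float = 2.0) -> int:
    """Return the wait in milliseconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if policy == "fixed":
        # Same delay on every retry.
        return retry_delay_ms
    if policy == "increasing":
        # Delay grows linearly with the attempt count, capped at the maximum.
        return min(retry_delay_ms * attempt, max_retry_delay_ms)
    if policy == "cumulative":
        # Assumption: the delay is multiplied by `multiplier` on each attempt
        # and capped at the maximum, which the field description does not state.
        return min(int(retry_delay_ms * multiplier ** (attempt - 1)), max_retry_delay_ms)
    raise ValueError(f"unknown policy: {policy}")


# Delays for the first four retries with a 1000 ms base and a 5000 ms cap.
for name in ("fixed", "increasing", "cumulative"):
    print(name, [retry_delay(name, n, 1000, 5000) for n in range(1, 5)])
```

Under these assumptions, "increasing" grows by a constant step while "cumulative" grows geometrically, so the cumulative policy reaches the maximum delay sooner for multipliers above 1.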
Now that the content source is set up, the crawl can be initiated.
- Click the crawl type option and set it to "Full". The default is "Incremental", which works like a full crawl the first time it runs. After the first crawl, set it to "Incremental" to crawl only the changes made in the repository.
- Click on the Start button.
During the crawl, you can do the following:
- If there are errors, a clickable "Error" flag appears. Click it to open a page with the detailed error message.
If you only want to process content updates from the Confluence Identity (documents which are added, modified, or removed), then click on the "Incremental" button instead of the "Full" button. The Confluence Identity connector will automatically identify only changes which have occurred since the last crawl.
If this is the first time the connector has crawled, the effect of the "Incremental" button depends on the exact method of change discovery. It may crawl everything, as a "Full" crawl does, or it may not crawl anything. Thereafter, the "Incremental" button crawls only updates.