Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "confluence". | "confluence" |
description | Yes | - | No | Name of the confluence object. | "My Confluence" |
properties | Yes | - | No | Configuration object | |
user | Yes | - | No | Username. | "admin" |
password | Yes | - | No | Password or the token in case of Cloud | "adminPassword" |
domain | No | - | No | Domain used to log in to Confluence. If the domain is not required by the environment, it is ignored. | "CompanyDomain" |
userFormsAuth | No | false | No | Use login.action POST action to authenticate instead of using BASIC Authorization headers (only for On-Premise version) | false |
cookieTimeout | No | 3000 | No | Cookie Timeout (in secs) | 2000 |
```json
{
  "type": "confluence",
  "description": "My Confluence Credential",
  "properties": {
    "user": "admin",
    "password": "adminPassword"
  }
}
```
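As a rough illustration of how the credential fields fit together, the following Python sketch assembles the payload shown above. The helper name `build_confluence_credential` is hypothetical and not part of any Aspire API; only the field names and defaults come from the table. It assumes optional fields may simply be omitted when they keep their documented defaults.

```python
import json

# Hypothetical helper: field names and defaults come from the credential table,
# everything else (function name, validation rules) is an illustrative assumption.
def build_confluence_credential(description, user, password, domain=None,
                                user_forms_auth=False, cookie_timeout=3000):
    """Return a credential payload as a dict, omitting unset optional fields."""
    if not description or not user or not password:
        raise ValueError("description, user and password are required")
    properties = {"user": user, "password": password}
    if domain is not None:
        properties["domain"] = domain
    # Only emit optional settings when they differ from the documented defaults.
    if user_forms_auth:
        properties["userFormsAuth"] = True
    if cookie_timeout != 3000:
        properties["cookieTimeout"] = cookie_timeout
    return {"type": "confluence", "description": description, "properties": properties}

payload = build_confluence_credential("My Confluence Credential", "admin", "adminPassword")
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized and sent as the request body by whatever HTTP client is in use.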
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "confluence". | "confluence" |
description | Yes | - | No | Name of the connection object. | "My Confluence Connection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["17f75ce7d0c7", "d42780003b36"] |
credential | Yes | - | No | ID of the credential | "6b235b333a1b" |
properties | Yes | - | No | Configuration object | |
url | Yes | - | No | URL to access the Confluence server, in the form http://{servername}{:port}. In some Confluence installations you must add '/confluence' to the end of the server name, e.g. http://wiki.local.search/confluence. The connector uses the REST API to communicate with Confluence. To verify that the REST API is reachable, append /rest/api/space to the URL and test it in a browser. | "http://confluence.company.com/" |
cloud | No | false | No | Select if your server is in the Cloud | true |
indexContainers | No | true | No | Select if containers (space, page, blog) are to be indexed. Clear to index attachments only. | false |
scanRecursively | No | true | No | Select if subfolders are to be scanned. | false |
scanExcludedItems | No | false | No | Select to have the scanner scan the sub-items of container items that were excluded by a pattern (either because they match an exclude pattern or because they do not match an include pattern). | true |
stopCrawlOnScannerError | No | true | No | If enabled, the crawl stops when a scanner error is thrown (e.g., a space has insufficient permissions or does not exist). Otherwise, an error is logged and the crawl continues. | false |
anonymousAccessAllowed | No | false | No | Select to indicate that anonymous access is allowed in the Confluence instance. To see whether anonymous (or public) access is allowed, check the access settings in your Confluence instance. This setting matters when Aspire creates ACLs: if a Confluence space allows anonymous access, Aspire assigns the ACL "public" to it instead of the other defined space permissions. Not every object automatically receives the "public" ACL, however. Pages that have explicit restrictions retain their ACLs; only pages that inherit their security from a space with anonymous access allowed receive the "public" ACL. | true |
limitItemContentSize | No | false | No | Impose a maximum limit on the size of the page content that can be extracted from Confluence, or on the time it takes to read the content. Pages with content over this size, or which take longer than the timeout to read, will have their content replaced with a configurable string. These pages will still have their metadata extracted. | true |
maxItemContentSize | Yes | 10000 | No | The maximum allowed content size (in kilobytes). | 20000 |
readItemContentTimeout | Yes | 30 | No | The maximum amount of time (in secs) to wait while reading the content bytes. | 20 |
fetchMetadataWhenContentFails | No | false | No | If the REST API call to get the page content fails, fetch the metadata only. | true |
removedContentReplacement | No | ItemContentRemoved | No | A string/token that replaces the content when the content exceeds the maximum allowed size, cannot be read in the allotted time, or the REST content fetch request fails. | "ContentRemoved" |
connectionTimeout | No | 15000 | No | Maximum time to wait (in milliseconds) for the connection | 30000 |
readTimeout | No | 30000 | No | Maximum time to wait for read (in milliseconds) | 40000 |
retries | No | 3 | No | Maximum number of retries for a failed document | 1 |
retryDelay | No | 15000 | No | Retry delay (in milliseconds) | 30000 |
maxRetryDelay | No | 600000 | No | Maximum retry delay (in milliseconds) | 500000 |
retryDelayMultiplier | No | 1.0 | No | Retry delay multiplier | 1.5 |
resultSetLimit | No | 100 | No | The maximum number of records to be retrieved at a time per page through the Confluence REST API. | 200 |
logRestAPI | No | false | No | Select to log REST API request details at the INFO level. | true |
```json
{
  "type": "confluence",
  "description": "Confluence",
  "properties": {
    "url": "https://coreengtest.atlassian.net",
    "cloud": true,
    "indexContainers": true,
    "scanRecursively": true,
    "scanExcludedItems": false,
    "stopCrawlOnScannerError": true,
    "anonymousAccessAllowed": false,
    "limitItemContentSize": false,
    "connectionTimeout": "15000",
    "readTimeout": "30000",
    "retries": "3",
    "retryDelay": "15000",
    "maxRetryDelay": "600000",
    "retryDelayMultiplier": "1.0",
    "resultSetLimit": "100",
    "logRestAPI": false
  }
}
```
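The `url` notes above suggest verifying the REST API by appending /rest/api/space to the server URL. A minimal Python sketch of that check follows; the function names `rest_check_url` and `is_rest_reachable` are hypothetical, and the reachability check assumes the endpoint answers without authentication, which may not hold for your installation.

```python
import urllib.error
import urllib.request

def rest_check_url(base_url: str) -> str:
    """Append the REST verification path to the configured connection URL."""
    return base_url.rstrip("/") + "/rest/api/space"

def is_rest_reachable(base_url: str, timeout_secs: float = 15.0) -> bool:
    """Return True if the verification URL answers with HTTP 200.

    Assumption: no credentials are required; a protected instance would
    answer with an error status and this function would return False.
    """
    try:
        with urllib.request.urlopen(rest_check_url(base_url), timeout=timeout_secs) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

print(rest_check_url("http://wiki.local.search/confluence"))
```

Stripping the trailing slash first keeps the result well-formed for both `http://confluence.company.com/` and `http://wiki.local.search/confluence`.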
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the connection to update | "d442adcab4b0" |
description | No | - | No | Name of the connection object. | "My RDB Connection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["17f75ce7d0c7", "d42780003b36"] |
credential | No | - | No | ID of the credential | "6b235b333a1b" |
properties | No | - | No | Configuration object (see create connection) | |
```json
{
  "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
  "description": "Confluence",
  "properties": {
    "url": "https://coreengtest.atlassian.net",
    "cloud": true,
    "indexContainers": true,
    "scanRecursively": true,
    "scanExcludedItems": false,
    "stopCrawlOnScannerError": true,
    "anonymousAccessAllowed": false,
    "limitItemContentSize": false,
    "connectionTimeout": "15000",
    "readTimeout": "30000",
    "retries": "3",
    "retryDelay": "15000",
    "maxRetryDelay": "600000",
    "retryDelayMultiplier": "1.0",
    "resultSetLimit": "100",
    "logRestAPI": false
  }
}
```
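The retry settings (`retries`, `retryDelay`, `maxRetryDelay`, `retryDelayMultiplier`) interact with each other, so it can help to see the schedule they might produce. The sketch below is an assumption, not the connector's documented behavior: it supposes that each retry waits the previous delay times the multiplier, capped at the maximum delay. The function name `retry_delays` is hypothetical.

```python
def retry_delays(retries=3, retry_delay=15000, max_retry_delay=600000, multiplier=1.0):
    """Return the assumed wait (in milliseconds) before each retry attempt.

    Assumed formula: delay_n = min(retry_delay * multiplier**n, max_retry_delay).
    The connector's real schedule may differ.
    """
    delays = []
    delay = float(retry_delay)
    for _ in range(retries):
        delays.append(int(min(delay, max_retry_delay)))
        delay *= multiplier
    return delays

# Documented defaults: a multiplier of 1.0 keeps the delay constant.
print(retry_delays())
# A multiplier above 1.0 grows the delay until the cap is reached.
print(retry_delays(retries=4, retry_delay=30000, max_retry_delay=500000, multiplier=1.5))
```

Under this assumption, the default multiplier of 1.0 means `maxRetryDelay` never takes effect unless `retryDelay` itself exceeds it.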
To create the Connector object using the REST API, please refer to this page.
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "confluence". | "confluence" |
description | Yes | - | No | Name of the seed object. | "My Confluence Seed" |
connector | Yes | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "e3ca414b0d31" |
connection | Yes | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this seed object. | "6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["17f75ce7d0c7", "d42780003b36"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |
properties | Yes | - | No | Configuration object | |
useKeysForSpacesList | No | true | No | If turned on, all Space Inclusion/Exclusion lists should specify Space Keys. Otherwise, Space Names should be used. | false |
spaces | No | - | Yes | Crawl only these spaces | [ { "space": "PEPO" } ] |
space | Yes | - | No | The key or name of the space to be crawled. | |
spacesFile | No | - | No | Path to the file that contains the space keys or names to be crawled, one space per line. If set, the spaces coming from this file override the space list provided in the Config UI. | "/path/to/file/that/contains/spaces" |
excludedSpaces | No | - | Yes | Do not crawl these spaces | [ { "space": "PEPO" } ] |
space | No | - | No | Key or name of the space to be excluded from crawling. Use the display name of the space. | |
excludedSpacesFile | No | - | No | Path to the file that contains the space keys or names to be excluded from the crawl, one space per line. If set, the spaces coming from this file override the excluded space list provided in the Config UI. | "/path/to/file/that/contains/excluded_spaces" |
excludePersonalSpaces | No | true | No | Exclude personal spaces | false |
excludeArchivedSpaces | No | true | No | Exclude archived spaces | false |
includes | No | [ ] | Yes | Patterns to match against the document URL; if any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
excludes | No | [ ] | Yes | Patterns to match against the document URL; if any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
includeAttachments | No | false | No | Select to include attachments in the crawl | true |
includeComments | No | false | No | Select to include comments in the crawl | true |
```json
{
  "type": "confluence",
  "description": "Confluence",
  "properties": {
    "useKeyForSpacesLists": true,
    "spaces": [ { "space": "PEPO" } ],
    "spacesFile": "",
    "excludedSpaces": [],
    "excludedSpacesFile": "",
    "excludePersonalSpaces": true,
    "excludeArchivedSpaces": true,
    "includes": [],
    "excludes": [],
    "includeAttachments": false,
    "includeComments": false
  }
}
```
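To make the `includes` and `excludes` patterns more concrete, here is a small Python sketch of how such a filter could be evaluated. It is an illustration under stated assumptions: an empty `includes` list is taken to mean "include everything", exclude patterns win over include patterns, and patterns are matched with a regex search rather than a full match. The function name `is_included` is hypothetical, and the connector's own regex engine may behave differently.

```python
import re

def is_included(url, includes, excludes):
    """Decide whether a document URL passes the include/exclude patterns.

    Assumptions: excludes take precedence, and an empty include list
    accepts every URL that is not excluded.
    """
    if any(re.search(pattern, url) for pattern in excludes):
        return False
    if includes and not any(re.search(pattern, url) for pattern in includes):
        return False
    return True

includes = [r".*pdf$", r".*docx$"]
excludes = [r".*png$", r".*jpeg$"]
for url in ("http://wiki/report.pdf", "http://wiki/logo.png", "http://wiki/page"):
    print(url, is_included(url, includes, excludes))
```

With the patterns from the table, a URL ending in `pdf` passes, one ending in `png` is rejected by the exclude list, and a plain page URL is rejected because it matches no include pattern.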
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the seed to update | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
(see the "Create seed" section on this page for other fields) |
```json
{
  "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
  "seed": "test_db",
  "description": "RDB_Test",
  "properties": {
    "useKeyForSpacesLists": true,
    "spaces": [ { "space": "PEPO" } ],
    "spacesFile": "",
    "excludedSpaces": [],
    "excludedSpacesFile": "",
    "excludePersonalSpaces": true,
    "excludeArchivedSpaces": true,
    "includes": [],
    "excludes": [],
    "includeAttachments": false,
    "includeComments": false
  }
}
```
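The seed tables state that a spaces file, when set, overrides the inline `spaces` list and holds one space per line. The following Python sketch shows one way that rule could be applied when preparing a seed. The function name `resolve_spaces` is hypothetical, and trimming whitespace and skipping blank lines are assumptions rather than documented connector behavior.

```python
from pathlib import Path

def resolve_spaces(inline_spaces, spaces_file=""):
    """Return the effective list of space keys or names for a seed.

    inline_spaces uses the documented shape: [{"space": "PEPO"}, ...].
    If spaces_file is set, its lines override the inline list.
    Assumption: blank lines are ignored and surrounding whitespace is trimmed.
    """
    if spaces_file:
        lines = Path(spaces_file).read_text(encoding="utf-8").splitlines()
        return [line.strip() for line in lines if line.strip()]
    return [entry["space"] for entry in inline_spaces]

# With an empty spacesFile, as in the example payload, the inline list is used.
print(resolve_spaces([{"space": "PEPO"}], ""))
```

The same helper could be applied to `excludedSpaces` and `excludedSpacesFile`, since the tables describe the same override rule for both.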