The Confluence Connector can be configured using the REST API. It requires the following entities to be created:

Credential
Connection
Connector
Seed

Below are examples of how to create the Credential, the Connection, and the Seed. For the Connector, please refer to this page.

Create Credential

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "confluence".	"confluence"
description	Yes	-	No	Name of the credential object.	"My Confluence"
properties	Yes	-	No	Configuration object
user	Yes	-	No	Username.	"admin"
password	Yes	-	No	Password, or the token in the case of Cloud	"adminPassword"
domain	No	-	No	Domain used to log in to Confluence. If the domain is not required by the environment, it is ignored.	"CompanyDomain"
userFormsAuth	No	false	No	Use login.action POST action to authenticate instead of using BASIC Authorization headers (only for On-Premise version)	false
cookieTimeout	No	3000	No	Cookie Timeout (in secs)	2000

Example

POST aspire/_api/credentials

{
    "type": "confluence",
    "description": "My Confluence Credential",
    "properties": {
        "user": "admin",
        "password": "adminPassword"
    }
}
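
The request above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the base URL `http://localhost:50505/` and the helper names are assumptions made for illustration, not part of the documented API, and any authentication the Aspire server itself requires is left out.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with the address of your Aspire server.
ASPIRE_BASE_URL = "http://localhost:50505/"


def build_credential_payload(user, password,
                             description="My Confluence Credential",
                             domain=None):
    """Build the body for POST aspire/_api/credentials."""
    properties = {"user": user, "password": password}
    if domain is not None:
        # "domain" is optional according to the field table.
        properties["domain"] = domain
    return {
        "type": "confluence",
        "description": description,
        "properties": properties,
    }


def build_post_request(path, payload, base_url=ASPIRE_BASE_URL):
    """Wrap a payload in a JSON POST request without sending it."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_post_request(
    "aspire/_api/credentials",
    build_credential_payload("admin", "adminPassword"),
)
# Passing the request to urllib.request.urlopen() would send it.
```

Building the request separately from sending it makes it easy to inspect the body before anything reaches the server.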

Create Connection

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "confluence".	"confluence"
description	Yes	-	No	Name of the connection object.	"My Confluence Connection"
throttlePolicy	No	-	No	ID of the throttle policy that applies to this connection object.	"6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this connection will use.	["17f75ce7d0c7", "d42780003b36"]
credential	Yes	-	No	ID of the credential	"6b235b333a1b"
properties	Yes	-	No	Configuration object
url	Yes	-	No	URL to access the Confluence server, in the form http://{servername}{:port}. In some Confluence installations you must add '/confluence' to the end of the server name, e.g., http://wiki.local.search/confluence. The connector uses the REST API to communicate with Confluence. To verify that the REST API is reachable, append /rest/api/space to the end of the URL and open it in a browser.	"http://confluence.company.com/"
cloud	No	false	No	Select if your server is in the Cloud	true
indexContainers	No	true	No	Select if containers (space, page, blog) are to be indexed. Clear to index attachments only.	false
scanRecursively	No	true	No	Select if subfolders are to be scanned.	false
scanExcludedItems	No	false	No	Select so that the scanner will scan sub items of container items excluded by a pattern (because it matches an exclude pattern or because it doesn't match an include pattern).	true
stopCrawlOnScannerError	No	true	No	If enabled, the crawl will stop if a scanner error is thrown (e.g., a space has insufficient permissions or does not exist). Otherwise, an error is logged and the crawl continues.	false
anonymousAccessAllowed	No	false	No	Select to indicate that anonymous access is allowed in the Confluence instance. To check whether anonymous (or public) access is allowed, review the access settings in your Confluence instance. This setting matters when Aspire creates ACLs: if a Confluence space allows anonymous access, Aspire assigns the ACL "public" to it instead of the other defined space permissions. This does not mean that every object automatically gets the ACL "public" when anonymous access is allowed. Pages that have explicit restrictions should retain their ACLs; only pages that inherit their security from a space with anonymous access allowed get the ACL "public".	true

limitItemContentSize	No	false	No	Impose a maximum limit on the size of the page content that can be extracted from Confluence or on the time it takes to read the content. Pages with content over this size, or which take longer than the timeout, will have their content replaced with a configurable string. These pages will still have their metadata extracted.	true
maxItemContentSize	true	10000	No	The maximum allowed content size (in kilobytes).	20000
readItemContentTimeout	true	30	No	The maximum amount of time (in secs) to wait while reading the content bytes.	20
fetchMetadataWhenContentFails	false	false	No	If the REST API call to get the Page content fails, fetch the metadata only.	true
removedContentReplacement	false	ItemContentRemoved	No	A string/token to replace the content when the content exceeds the maximum allowed size, cannot be read in the allotted time, or the REST content fetch request fails.	"ContentRemoved"

connectionTimeout	No	15000	No	Maximum time to wait (in milliseconds) for the connection	30000
readTimeout	No	30000	No	Maximum time to wait for read (in milliseconds)	40000
retries	No	3	No	Maximum number of retries for a failed document	1
retryDelay	No	15000	No	Retry delay (in milliseconds)	30000
maxRetryDelay	No	600000	No	Maximum retry delay (in milliseconds)	500000
retryDelayMultiplier	No	1.0	No	Retry delay multiplier	1.5

resultSetLimit	No	100	No	The maximum number of records to be retrieved at a time per page through the Confluence REST API.	200
logRestAPI	No	false	No	Select to log REST API request details at the INFO level.	true

Example

POST aspire/_api/connections

{
    "type": "confluence",
    "description": "Confluence",
    "properties": {
        "url": "https://coreengtest.atlassian.net",
        "cloud": true,
        "indexContainers": true,
        "scanRecursively": true,
        "scanExcludedItems": false,
        "stopCrawlOnScannerError": true,
        "anonymousAccessAllowed": false,
        "limitItemContentSize": false,
        "connectionTimeout": "15000",
        "readTimeout": "30000",
        "retries": "3",
        "retryDelay": "15000",
        "maxRetryDelay": "600000",
        "retryDelayMultiplier": "1.0",
        "resultSetLimit": "100",
        "logRestAPI": false
    }
}
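
Because most connection properties are optional, a client only needs to send the values it wants to change. The sketch below merges caller overrides into the defaults listed in the field table and rejects property names that the table does not document. The function name is hypothetical, and the numeric values are written as JSON numbers as in the table, whereas the example request sends them as strings; which form the API expects is not confirmed here.

```python
# Defaults copied from the "Create Connection" field table.
CONNECTION_DEFAULTS = {
    "cloud": False,
    "indexContainers": True,
    "scanRecursively": True,
    "scanExcludedItems": False,
    "stopCrawlOnScannerError": True,
    "anonymousAccessAllowed": False,
    "limitItemContentSize": False,
    "connectionTimeout": 15000,
    "readTimeout": 30000,
    "retries": 3,
    "retryDelay": 15000,
    "maxRetryDelay": 600000,
    "retryDelayMultiplier": 1.0,
    "resultSetLimit": 100,
    "logRestAPI": False,
}


def build_connection_payload(url, credential_id, description, **overrides):
    """Build the body for POST aspire/_api/connections (hypothetical helper)."""
    unknown = set(overrides) - set(CONNECTION_DEFAULTS)
    if unknown:
        # Catch typos early instead of sending an undocumented property.
        raise ValueError(f"Unknown connection properties: {sorted(unknown)}")
    properties = {"url": url, **CONNECTION_DEFAULTS, **overrides}
    return {
        "type": "confluence",
        "description": description,
        "credential": credential_id,  # required according to the table
        "properties": properties,
    }


payload = build_connection_payload(
    "https://coreengtest.atlassian.net",
    "6b235b333a1b",
    "Confluence",
    cloud=True,
)
```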

Update Connection

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	ID of the connection to update	"d442adcab4b0"
description	No	-	No	Name of the connection object.	"My Confluence Connection"
throttlePolicy	No	-	No	ID of the throttle policy that applies to this connection object.	"b3a9-6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this connection will use.	["17f75ce7d0c7", "d42780003b36"]
credential	No	-	No	ID of the credential	"6b235b333a1b"
properties	No	-	No	Configuration object
(see create connection)

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0

{
    "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
    "description": "Confluence",
    "properties": {
        "url": "https://coreengtest.atlassian.net",
        "cloud": true,
        "indexContainers": true,
        "scanRecursively": true,
        "scanExcludedItems": false,
        "stopCrawlOnScannerError": true,
        "anonymousAccessAllowed": false,
        "limitItemContentSize": false,
        "connectionTimeout": "15000",
        "readTimeout": "30000",
        "retries": "3",
        "retryDelay": "15000",
        "maxRetryDelay": "600000",
        "retryDelayMultiplier": "1.0",
        "resultSetLimit": "100",
        "logRestAPI": false
    }
}
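
In the update call the connection ID appears twice: in the URL path and in the "id" field of the body. A small helper can keep the two in sync. This is a sketch under assumptions: the base URL and the function name are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with the address of your Aspire server.
ASPIRE_BASE_URL = "http://localhost:50505/"


def build_connection_update(connection_id, changes, base_url=ASPIRE_BASE_URL):
    """Build a PUT request for aspire/_api/connections/{id}."""
    # The same ID is used for the path and for the "id" field of the body.
    body = {"id": connection_id, **changes}
    return urllib.request.Request(
        url=f"{base_url}aspire/_api/connections/{connection_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


update = build_connection_update(
    "89d6632a-a296-426c-adb0-d442adcab4b0",
    {"description": "Confluence", "properties": {"cloud": True}},
)
# Passing the request to urllib.request.urlopen() would send it.
```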

Create Connector

For the creation of the Connector object using the REST API, please refer to this page.

Update Connector

For the update of the Connector object using the REST API, please refer to this page.

Create Seed

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "confluence".	"confluence"
description	Yes	-	No	Name of the seed object.	"My Confluence Seed"
connector	Yes	-	No	The ID of the connector to be used with this seed. The connector type must match the seed type.	"e3ca414b0d31"
connection	Yes	-	No	The ID of the connection to be used with this seed. The connection type must match the seed type.	"e4a663fe9ee6"
workflows	No	[ ]	Yes	The IDs of the workflows that will be executed for the documents crawled.	["5696c3f0bda4"]
throttlePolicy	No	-	No	ID of the throttle policy that applies to this seed object.	"6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this seed will use.	["17f75ce7d0c7", "d42780003b36"]
tags	No	[ ]	Yes	The tags of the seed. These can be used to filter the seed.	["tag1", "tag2"]
properties	Yes	-	No	Configuration object
useKeysForSpacesList	No	true	No	If turned on, all Space Inclusion/Exclusion lists should specify Space Keys. Otherwise, Space Names should be used.	false

spaces	No	-	Yes	Crawl only these spaces	[ { "space": "PEPO" } ]
space	Yes	-	No	The key or name of the space to be crawled.

spacesFile	No	-	No	Path to the file that contains the space keys or names to be crawled, one space per line. If set, the spaces coming from this file override the space list provided in the Config UI.	"/path/to/file/that/contains/spaces"

excludedSpaces	No	-	Yes	Do not crawl these spaces	[ { "space": "PEPO" } ]
space	Yes	-	No	Key or name of the space to be excluded from crawling. Use the display name of the space.

excludedSpacesFile	No	-	No	Path to the file that contains the space keys or names to be excluded from the crawl, one space per line. If set, the spaces coming from this file override the excluded space list provided in the Config UI.	"/path/to/file/that/contains/excluded_spaces"

excludePersonalSpaces	No	true	No	Exclude personal spaces	false
excludeArchivedSpaces	No	true	No	Exclude archived spaces	false
includes	No	[ ]	Yes	Patterns to match against the document URL. If any of them match, the document will be included in the crawl.	[ ".pdf$", ".docx$" ]
excludes	No	[ ]	Yes	Patterns to match against the document URL. If any of them match, the document will be excluded from the crawl.	[ ".png$", ".jpeg$" ]
includeAttachments	No	false	No	Select to include attachments in the crawl	true
includeComments	No	false	No	Select to include comments in the crawl	true
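
The way the includes and excludes patterns combine can be illustrated with a short sketch. It follows the rule described for scanExcludedItems: an item is excluded when it matches an exclude pattern or when it does not match any include pattern. The function name is hypothetical, and treating the patterns as regular expressions searched anywhere in the URL is an assumption; the connector's actual matching may differ in detail.

```python
import re


def is_included(url, includes, excludes):
    """Return True if a document URL passes the include/exclude patterns."""
    # Assumption: any matching exclude pattern removes the document.
    if any(re.search(pattern, url) for pattern in excludes):
        return False
    # Assumption: when include patterns exist, at least one must match.
    if includes and not any(re.search(pattern, url) for pattern in includes):
        return False
    return True


print(is_included("http://wiki/report.pdf", [".pdf$", ".docx$"], []))
print(is_included("http://wiki/logo.png", [], [".png$", ".jpeg$"]))
```

With these inputs the first call reports the PDF as included, and the second reports the PNG as excluded.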

Example

POST aspire/_api/seeds

{
    "seed": "test_db",
    "type": "confluence",
    "description": "Confluence",
    "properties": {
        "useKeyForSpacesLists": true,
        "spaces": [
            {
                "space": "PEPO"
            }
        ],
        "spacesFile": "",
        "excludedSpaces": [],
        "excludedSpacesFile": "",
        "excludePersonalSpaces": true,
        "excludeArchivedSpaces": true,
        "includes": [],
        "excludes": [],
        "includeAttachments": false,
        "includeComments": false
    }
}
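
The spaces and excludedSpaces properties are lists of objects, each with a single "space" field. The sketch below converts plain lists of space keys into that shape and fills in the connector and connection IDs, which the field table lists as required. The helper names are hypothetical, and the IDs are the example values from the table.

```python
def to_space_entries(space_keys):
    """Turn ["PEPO"] into [{"space": "PEPO"}], the shape used by the seed."""
    return [{"space": key} for key in space_keys]


def build_seed_payload(connector_id, connection_id, description,
                       spaces=(), excluded_spaces=(), **extra_properties):
    """Build the body for POST aspire/_api/seeds (hypothetical helper)."""
    properties = {
        "spaces": to_space_entries(spaces),
        "excludedSpaces": to_space_entries(excluded_spaces),
    }
    # Remaining seed properties (includes, excludes, ...) pass through as given.
    properties.update(extra_properties)
    return {
        "type": "confluence",
        "description": description,
        "connector": connector_id,
        "connection": connection_id,
        "properties": properties,
    }


seed = build_seed_payload(
    "e3ca414b0d31",
    "e4a663fe9ee6",
    "My Confluence Seed",
    spaces=["PEPO"],
    includeAttachments=True,
)
```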

Update Seed

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	ID of the seed to update	"2f287669-d163-4e35-ad17-6bbfe9df3778"
(see the "Create seed" section on this page for other fields)

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778

{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "seed": "test_db",
    "description": "RDB_Test",
    "properties": {
        "useKeyForSpacesLists": true,
        "spaces": [
            {
                "space": "PEPO"
            }
        ],
        "spacesFile": "",
        "excludedSpaces": [],
        "excludedSpacesFile": "",
        "excludePersonalSpaces": true,
        "excludeArchivedSpaces": true,
        "includes": [],
        "excludes": [],
        "includeAttachments": false,
        "includeComments": false
    }
}
