Create Credential

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "elasticsearch".	"elasticsearch"
description	Yes	-	No	Name of the credential object.	"ElasticsearchCredential"
properties	Yes	-	No	Configuration object
authentication	Yes	"None"	No	The selected authentication method	"Basic"
username	No	-	No	Only required if "Use Basic Authentication" is selected. The name of the Elasticsearch user to use.	testuser
password	No	-	No	Only required if "Use Basic Authentication" is selected. The password of the Elasticsearch user to use.	Password123
region	No	-	No	Only required if "AWS Signature V4 Authentication" is selected. The Region of the ES service to use.	us-east-1
defaultAWS	No	TRUE	No	Enable this to use the Default AWS Credentials
accessKey	No	-	No	Only required if "Use the Default AWS Credentials" is false. The Access key of the ES service to use
secretKey	No	-	No	Only required if "Use the Default AWS Credentials" is false. The Secret key of the ES service to use

Example

POST aspire/_api/credentials

{
    "type": "elasticsearch",
    "description": "Elasticsearch Credential",
    "properties": {
         "authentication": "Basic",
         "username": "testuser",
         "password": "Password123",
         "region": "us-east-1",
         "defaultAWS": true,
         "accessKey": "xxxxxxxxxxxxxxxxxxxxxxx",
         "secretKey": "xxxxxxxxxxxxxxxxxxxxxxx"
    }
}
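
The conditional requirements in the table above can be checked on the client before the request is sent. The following is a minimal sketch, not part of Aspire: the helper name build_credential_payload is hypothetical, and it only covers the "None" and "Basic" authentication values that appear on this page.

```python
import json


def build_credential_payload(description, authentication="None",
                             username=None, password=None):
    """Build the JSON body for POST aspire/_api/credentials (sketch)."""
    properties = {"authentication": authentication}
    if authentication == "Basic":
        # The table marks username and password as required for Basic auth.
        if not username or not password:
            raise ValueError("Basic authentication needs username and password")
        properties["username"] = username
        properties["password"] = password
    return {
        "type": "elasticsearch",  # the table requires this exact value
        "description": description,
        "properties": properties,
    }


payload = build_credential_payload(
    "Elasticsearch Credential", "Basic", "testuser", "Password123")
print(json.dumps(payload, indent=4))
```

Failing early with a ValueError keeps an incomplete Basic credential from reaching the server at all.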

Update Credential

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	ID of the credential to update.	"2f287669-d163-4e35-ad17-6bbfe9df3778"
description	Yes	-	No	Name of the credential object.	"ElasticsearchCredential"
properties	Yes	-	No	Configuration object
authentication	Yes	"None"	No	The selected authentication method	"Basic"
username	No	-	No	Only required if "Use Basic Authentication" is selected. The name of the Elasticsearch user to use.	testuser
password	No	-	No	Only required if "Use Basic Authentication" is selected. The password of the Elasticsearch user to use.	Password123
region	No	-	No	Only required if "AWS Signature V4 Authentication" is selected. The Region of the ES service to use.	us-east-1
defaultAWS	No	TRUE	No	Enable this to use the Default AWS Credentials
accessKey	No	-	No	Only required if "Use the Default AWS Credentials" is false. The Access key of the ES service to use
secretKey	No	-	No	Only required if "Use the Default AWS Credentials" is false. The Secret key of the ES service to use

Example

PUT aspire/_api/credentials/2f287669-d163-4e35-ad17-6bbfe9df3778

{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "description": "Elasticsearch Credential",
    "properties": {
         "authentication": "Basic",
         "username": "testuser",
         "password": "Password123",
         "region": "us-east-1",
         "defaultAWS": true,
         "accessKey": "xxxxxxxxxxxxxxxxxxxxxxx",
         "secretKey": "xxxxxxxxxxxxxxxxxxxxxxx"
    }
}
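
For readers scripting the update, the request can be prepared with the Python standard library. This is only a sketch under stated assumptions: BASE_URL (host and port) is a placeholder for your own Aspire server, build_update_request is a hypothetical helper, and any authentication your deployment requires for the REST API is not shown.

```python
import json
import urllib.request

# Assumption: host and port of the Aspire server; adjust to your deployment.
BASE_URL = "http://localhost:50505"


def build_update_request(credential_id, description, properties):
    """Prepare (but do not send) a PUT to aspire/_api/credentials/<id>."""
    body = {
        "id": credential_id,
        "description": description,
        "properties": properties,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/aspire/_api/credentials/{credential_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_update_request(
    "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "Elasticsearch Credential",
    {"authentication": "Basic", "username": "testuser",
     "password": "Password123"},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The ID appears both in the URL and in the body, mirroring the example above.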

Create Connection

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "elasticsearch"	"elasticsearch"
description	Yes	-	No	Name of the connection object.	"MyElasticsearchConnection"
credential	No	-	No	The ID of the credential to be used with this connection. The credential type must match the connection type.	"2f287669-d163-4e35-ad17-6bbfe9df3778"
throttlePolicy	No	-	No	ID of the throttle policy that applies to this connection object.	"f5587cee-9116-4011-b3a9-6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this connection will use.	["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
properties	Yes	-	No	Configuration object
hostname	Yes	"localhost"	No	The elastic server hostname	localhost
port	Yes	9200	No	The elastic server port	9200
protocol	No	-	No	The elastic server URL protocol	https
fetchDocuments	No	TRUE	No	Check to fetch the documents content	TRUE
useMGET	No	TRUE	No	Check to use MGET for fetching the documents. If not checked, an individual GET request is executed for each document.	TRUE
waitBeforeFetching	No	FALSE	No	Check to make the fetch process wait for discovery process to be done	FALSE
includeFields	No	-	Yes	The specified fields will be included in the fetch process of the document.	[{"includeField":"field1"}, {"includeField":"field2"}]
includeField	No	-	No	the name of the field to include in the fetch process.	field1
excludeFields	No	-	Yes	The specified fields will be excluded in the fetch process of the document.	[{"excludeField":"field3"}, {"excludeField":"field4"}]
excludeField	No	-	No	the name of the field to exclude in the fetch process.	field3
verifyFinalCount	No	FALSE	No	Check to execute an initial document count query that will be used at the end of the crawl to validate the total of crawled documents.	FALSE
slice	Yes	5	No	The number of slices to use for the queries	5
pageSize	Yes	1000	No	The number of documents to get per request	1000
scrollTime	Yes	5m	No	The time to keep each scroll request active	5m
timeout	Yes	20000	No	The timeout to use for the connections to elastic	20000
retries	Yes	3	No	The number of retries for each slice processing	3
retryWaitTime	Yes	10000	No	The time in milliseconds to wait between each slice retry	10000
retriesConnection	Yes	5	No	The number of retries for each elasticsearch request	5
retryWaitTimeConnection	Yes	60000	No	The time in milliseconds to wait between each elasticsearch request retry	60000
useThrottling	No	FALSE	No	Check to enable connection throttling	FALSE
throttleRateInMillis	No	5000	No	Only required if "Use Throttling" is true. The throttle rate in milliseconds	5000
throttleConnectionRate	No	750	No	Only required if "Use Throttling" is true. The number of connections to allow in the specified throttle rate	750

Example

POST aspire/_api/connections

{
    "type": "elasticsearch",
    "description": "MyElasticsearchConnection",
    "credential": null,
    "properties": {
        "hostname": "localhost",
        "port": 9200,
        "protocol": "https",
        "fetchDocuments": true,
        "useMGET": true,
        "waitBeforeFetching": false,
        "includeFields": [
            {"includeField": "field1"},
            {"includeField": "field2"}
        ],
        "excludeFields": [
            {"excludeField": "field3"},
            {"excludeField": "field4"}
        ],
        "verifyFinalCount": false,
        "slice": 5,
        "pageSize": 1000,
        "scrollTime": "5m",
        "timeout": 20000,
        "retries": 3,
        "retryWaitTime": 10000,
        "retriesConnection": 5,
        "retryWaitTimeConnection": 60000,
        "useThrottling": true,
        "throttleRateInMillis": 5000,
        "throttleConnectionRate": 750
    }
}
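
The includeFields and excludeFields properties take a list of single-key objects rather than a plain list of names, which is easy to get wrong by hand. The sketch below builds the properties object from the defaults listed in the table; connection_properties and CONNECTION_DEFAULTS are hypothetical names introduced for illustration, not part of Aspire.

```python
# Defaults copied from the Create Connection table above.
CONNECTION_DEFAULTS = {
    "hostname": "localhost",
    "port": 9200,
    "fetchDocuments": True,
    "useMGET": True,
    "waitBeforeFetching": False,
    "verifyFinalCount": False,
    "slice": 5,
    "pageSize": 1000,
    "scrollTime": "5m",
    "timeout": 20000,
    "retries": 3,
    "retryWaitTime": 10000,
    "retriesConnection": 5,
    "retryWaitTimeConnection": 60000,
    "useThrottling": False,
}


def connection_properties(include=(), exclude=(), **overrides):
    """Merge overrides into the defaults and wrap the field name lists."""
    props = {**CONNECTION_DEFAULTS, **overrides}
    if include:
        # Each name becomes {"includeField": name}, as in the example above.
        props["includeFields"] = [{"includeField": name} for name in include]
    if exclude:
        props["excludeFields"] = [{"excludeField": name} for name in exclude]
    return props


props = connection_properties(
    include=["field1", "field2"],
    exclude=["field3", "field4"],
    protocol="https",
)
print(props["includeFields"])
```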

Update Connection

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	ID of the connection to update.	"89d6632a-a296-426c-adb0-d442adcab4b0"
description	No	-	No	Name of the connection object.	"MyElasticsearchConnection"
credential	No	-	No	The ID of the credential to be used with this connection. The credential type must match the connection type.	"2f287669-d163-4e35-ad17-6bbfe9df3778"
throttlePolicy	No	-	No	ID of the throttle policy that applies to this connection object.	"f5587cee-9116-4011-b3a9-6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this connection will use.	["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
properties	Yes	-	No	Configuration object
hostname	Yes	"localhost"	No	The elastic server hostname	localhost
port	Yes	9200	No	The elastic server port	9200
protocol	No	-	No	The elastic server URL protocol	https
fetchDocuments	No	TRUE	No	Check to fetch the documents content	TRUE
useMGET	No	TRUE	No	Check to use MGET for fetching the documents. If not checked, an individual GET request is executed for each document.	TRUE
waitBeforeFetching	No	FALSE	No	Check to make the fetch process wait for discovery process to be done	FALSE
includeFields	No	-	Yes	The specified fields will be included in the fetch process of the document.	[{"includeField":"field1"}, {"includeField":"field2"}]
includeField	No	-	No	the name of the field to include in the fetch process.	field1
excludeFields	No	-	Yes	The specified fields will be excluded in the fetch process of the document.	[{"excludeField":"field3"}, {"excludeField":"field4"}]
excludeField	No	-	No	the name of the field to exclude in the fetch process.	field3
verifyFinalCount	No	FALSE	No	Check to execute an initial document count query that will be used at the end of the crawl to validate the total of crawled documents.	FALSE
slice	Yes	5	No	The number of slices to use for the queries	5
pageSize	Yes	1000	No	The number of documents to get per request	1000
scrollTime	Yes	5m	No	The time to keep each scroll request active	5m
timeout	Yes	20000	No	The timeout to use for the connections to elastic	20000
retries	Yes	3	No	The number of retries for each slice processing	3
retryWaitTime	Yes	10000	No	The time in milliseconds to wait between each slice retry	10000
retriesConnection	Yes	5	No	The number of retries for each Elasticsearch request	5
retryWaitTimeConnection	Yes	60000	No	The time in milliseconds to wait between each Elasticsearch request retry	60000
useThrottling	No	FALSE	No	Check to enable connection throttling	FALSE
throttleRateInMillis	No	5000	No	Only required if "Use Throttling" is true. The throttle rate in milliseconds	5000
throttleConnectionRate	No	750	No	Only required if "Use Throttling" is true. The number of connections to allow in the specified throttle rate	750

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0

{
    "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
    "description": "MyElasticsearchConnection",
    "credential": null,
    "properties": {
        "hostname": "localhost",
        "port": 9200,
        "protocol": "https",
        "fetchDocuments": true,
        "useMGET": true,
        "waitBeforeFetching": false,
        "includeFields": [
            {"includeField": "field1"},
            {"includeField": "field2"}
        ],
        "excludeFields": [
            {"excludeField": "field3"},
            {"excludeField": "field4"}
        ],
        "verifyFinalCount": false,
        "slice": 5,
        "pageSize": 1000,
        "scrollTime": "5m",
        "timeout": 20000,
        "retries": 3,
        "retryWaitTime": 10000,
        "retriesConnection": 5,
        "retryWaitTimeConnection": 60000,
        "useThrottling": true,
        "throttleRateInMillis": 5000,
        "throttleConnectionRate": 750  
    }
}
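
To get a feel for the throttling and retry values used in the example, the arithmetic can be worked out directly. This sketch assumes that throttleConnectionRate is the number of connections allowed per window of throttleRateInMillis milliseconds, and that every retry waits the full configured time; both are readings of the table descriptions, not confirmed connector behavior. The function names are hypothetical.

```python
def max_connections_per_second(throttle_rate_in_millis, throttle_connection_rate):
    """Average connection rate implied by the two throttle settings."""
    return throttle_connection_rate * 1000 / throttle_rate_in_millis


def total_retry_wait_millis(retries, retry_wait_time):
    """Upper bound on time spent waiting if every retry is used."""
    return retries * retry_wait_time


# Values from the example: 750 connections per 5000 ms window.
print(max_connections_per_second(5000, 750))   # 150.0
# Slice retries: 3 retries, 10000 ms apart.
print(total_retry_wait_millis(3, 10000))       # 30000
# Request retries: 5 retries, 60000 ms apart.
print(total_retry_wait_millis(5, 60000))       # 300000
```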

Create Connector Instance

To create the Connector object using the REST API, check this page

Update Connector Instance

To update the Connector object using the REST API, check this page

Create Seed

Field	Required	Default	Multiple	Notes	Example
seed	Yes	-	No	The elastic server hostname	localhost
type	Yes	-	No	The value must be "elasticsearch".	"elasticsearch"
description	Yes	-	No	Name of the seed object.	"My Elasticsearch Seed"
connector	Yes	-	No	The ID of the connector to be used with this seed. The connector type must match the seed type.	"82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31"
connection	Yes	-	No	The ID of the connection to be used with this seed. The connection type must match the seed type.	"602d3700-28dd-4a6a-8b51-e4a663fe9ee6"
workflows	No	[ ]	Yes	The IDs of the workflows that will be executed for the documents crawled.	["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
throttlePolicy	No	-	No	ID of the throttle policy that applies to this seed object.	"f5587cee-9116-4011-b3a9-6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this seed will use.	["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
tags	No	[ ]	Yes	The tags of the seed. These can be used to filter the seed	["tag1", "tag2"]
properties	Yes	-	No	Configuration object
indexes	Yes	-	Yes	The list of Elasticsearch indexes to crawl. Multiple indexes and the wildcard "*" are supported.	[{"index":"index1"}]
index	Yes	-	No	The Elasticsearch index to crawl. Index name limitations: 1) Lowercase only. 2) Cannot include \, /, ?, ", <, >, |, (space character), ,, # 3) Cannot start with -, _, + 4) Cannot be . or ..	index1
snapshots	Yes	TRUE	No	Selects the crawl mode: a snapshot-based crawl with support for deletes, or a timestamp-based crawl with better performance but without support for deleted documents.	TRUE
discoveryFields	No	-	Yes	Only required if "Use Snapshots" is true. List of field names to be used to generate the documents' signature.	[{"discoveryField":"last_modified"}]
discoveryField	No	-	No	Only required if "Use Snapshots" is true. Name of the field to be used to generate the documents' signature.	last_modified
discoveryQuery	No	-	No	Only required if "snapshots" is true. The query to run for discovering documents. This query is used for full and incremental crawls.	{ "track_total_hits": true, "slice": { "id": {{sliceNumber}}, "max": {{sliceTotal}} }, "size": {{pageSize}}, "_source": { "includes": ["last_modified"] }, "query": { "match_all": {} } }
timestampField	No	-	No	Only required if "snapshots" is false. The field that contains the timestamp of the document	timestamp
discoveryQueryInc	No	-	No	Only required if "snapshots" is false. The query to run for discovering documents for incremental crawls.	{ "track_total_hits": true, "slice": { "id": {{sliceNumber}}, "max": {{sliceTotal}} }, "size": {{pageSize}}, "_source": { "includes": ["last_modified"] }, "query": { "range" : { "connectorSpecific.timestamp" : { "gt" : {{timestamp}} } } } }
useLimit	No	FALSE	No	Check to limit how many items are selected from the index	FALSE
topLimit	No	-	No	Only required if "useLimit" is true. The number of items to crawl. Because this connector uses slices and scrolls, this number is approximate and the crawl may return slightly more items.	100
makeIdUnique	No	FALSE	No	Check to ensure unique document IDs when crawling multiple indexes; if not checked, ID collisions could happen. Uniqueness is achieved by appending the index name, with a delimiter, to the ID. This option is ignored if only a single index without a wildcard (*) is specified.	FALSE
idDelimiter	No	-	No	Only required if "makeIdUnique" is true. The delimiter that will be used to append the index name to the document ID	_
storeSpecific	No	TRUE	No	Check to keep the elastic connector metadata and to store all the fields of the elastic source as connector specific fields. If not checked, the elastic source will be used as the document metadata in the same format that it was retrieved	TRUE
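
The index name limitations listed for the index field can be checked before a seed is submitted. The following is a minimal sketch of such a check; index_name_problems is a hypothetical helper, and it encodes only the four rules stated in the table (the wildcard "*" is allowed because the indexes field supports it).

```python
# Characters the table lists as not allowed in an index name.
FORBIDDEN_CHARS = set('\\/?"<>| ,#')
FORBIDDEN_PREFIXES = ("-", "_", "+")


def index_name_problems(name):
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    if name != name.lower():
        problems.append("must be lowercase")
    bad = sorted(FORBIDDEN_CHARS.intersection(name))
    if bad:
        problems.append(
            "contains forbidden characters: " + " ".join(repr(c) for c in bad))
    if name.startswith(FORBIDDEN_PREFIXES):
        problems.append("cannot start with -, _ or +")
    if name in (".", ".."):
        problems.append("cannot be . or ..")
    return problems


for candidate in ["index1", "logs-*", "Index1", "_hidden", "my index"]:
    print(candidate, index_name_problems(candidate))
```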

Example

POST aspire/_api/seeds

{
    "type": "elasticsearch",
    "seed": "localhost",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "Elasticsearch_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        "indexes": [
            {"index": "index1"},
            {"index": "index2"}
        ],
        "snapshots": true,
        "discoveryFields":  [
            {"discoveryField": "last_modified"}
        ],
        "discoveryQuery": "{ \"track_total_hits\": true, \"slice\": { \"id\": {{sliceNumber}}, \"max\": {{sliceTotal}}  }, \"size\": {{pageSize}}, \"_source\": { \"includes\": [\"last_modified\"] }, \"query\": { \"match_all\": {} } }",
        "timestampField": "timestamp",
        "discoveryQueryInc": "{ \"track_total_hits\": true, \"slice\": { \"id\": {{sliceNumber}}, \"max\": {{sliceTotal}}  }, \"size\": {{pageSize}}, \"_source\": { \"includes\": [\"last_modified\"] }, \"query\": { \"range\" : { \"connectorSpecific.timestamp\" : { \"gt\" : {{timestamp}} } } } }",
        "useLimit": true,
        "topLimit": 100,
        "makeIdUnique": true,
        "idDelimiter": "_",
        "storeSpecific": true 
    }
}
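
Because discoveryQuery contains {{...}} placeholders, it is not valid JSON on its own and is sent as an escaped string, as in the example above. Writing the escapes by hand is error-prone, so the sketch below lets json.dumps produce them. The final check replaces the placeholders with sample numbers purely to confirm the template is well formed; how and when the connector substitutes them is not shown here and is an assumption.

```python
import json

# Query template as plain text; the {{...}} placeholders stay untouched.
DISCOVERY_QUERY = (
    '{ "track_total_hits": true, '
    '"slice": { "id": {{sliceNumber}}, "max": {{sliceTotal}} }, '
    '"size": {{pageSize}}, '
    '"_source": { "includes": ["last_modified"] }, '
    '"query": { "match_all": {} } }'
)

properties = {
    "indexes": [{"index": "index1"}],
    "snapshots": True,
    "discoveryFields": [{"discoveryField": "last_modified"}],
    "discoveryQuery": DISCOVERY_QUERY,
}

# json.dumps escapes the inner double quotes of the query string.
body = json.dumps({"properties": properties}, indent=4)
print(body)

# Sanity check with sample values: the filled-in template must parse as JSON.
sample = (DISCOVERY_QUERY
          .replace("{{sliceNumber}}", "0")
          .replace("{{sliceTotal}}", "5")
          .replace("{{pageSize}}", "1000"))
parsed = json.loads(sample)
print(parsed["slice"])
```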

Update Seed

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	ID of the seed to update.	"2f287669-d163-4e35-ad17-6bbfe9df3778"
seed	No	-	No	The elastic server hostname	localhost
description	No	-	No	Name of the seed object.	"MyElasticsearchSeed"
connector	No	-	No	The ID of the connector to be used with this seed. The connector type must match the seed type.	"82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31"
connection	No	-	No	The ID of the connection to be used with this seed. The connection type must match the seed type.	"602d3700-28dd-4a6a-8b51-e4a663fe9ee6"
workflows	No	[ ]	Yes	The IDs of the workflows that will be executed for the documents crawled.	["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
workflows.add	No	[ ]	Yes	The IDs of the workflows to add.	["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
workflows.remove	No	[ ]	Yes	The IDs of the workflows to remove.	["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
throttlePolicy	No	-	No	ID of the throttle policy that applies to this seed object.	"f5587cee-9116-4011-b3a9-6b235b333a1b"
routingPolicies	No	[ ]	Yes	The IDs of the routing policies that this seed will use.	["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
routingPolicies.add	No	[ ]	Yes	The IDs of the routingPolicies to add.	["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
routingPolicies.remove	No	[ ]	Yes	The IDs of the routingPolicies to remove.	["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"]
tags	No	[ ]	Yes	The tags of the seed. These can be used to filter the seed	["tag1", "tag3"]
tags.add	No	[ ]	Yes	The tags to add	["tag4"]
tags.remove	No	[ ]	Yes	The tags to remove	["tag2"]
properties	Yes	-	No	Configuration object
indexes	Yes	-	Yes	The list of Elasticsearch indexes to crawl. Multiple indexes and the wildcard "*" are supported.	[{"index":"index1"}]
index	Yes	-	No	The Elasticsearch index to crawl. Index name limitations: 1) Lowercase only. 2) Cannot include \, /, ?, ", <, >, |, (space character), ,, # 3) Cannot start with -, _, + 4) Cannot be . or ..	index1
snapshots	Yes	TRUE	No	Selects the crawl mode: a snapshot-based crawl with support for deletes, or a timestamp-based crawl with better performance but without support for deleted documents.	TRUE
discoveryFields	No	-	Yes	Only required if "Use Snapshots" is true. List of field names to be used to generate the documents' signature.	[{"discoveryField":"last_modified"}]
discoveryField	No	-	No	Only required if "Use Snapshots" is true. Name of the field to be used to generate the documents' signature.	last_modified
discoveryQuery	No	-	No	Only required if "snapshots" is true. The query to run for discovering documents. This query is used for full and incremental crawls.	{ "track_total_hits": true, "slice": { "id": {{sliceNumber}}, "max": {{sliceTotal}} }, "size": {{pageSize}}, "_source": { "includes": ["last_modified"] }, "query": { "match_all": {} } }
timestampField	No	-	No	Only required if "snapshots" is false. The field that contains the timestamp of the document	timestamp
discoveryQueryInc	No	-	No	Only required if "snapshots" is false. The query to run for discovering documents for incremental crawls.	{ "track_total_hits": true, "slice": { "id": {{sliceNumber}}, "max": {{sliceTotal}} }, "size": {{pageSize}}, "_source": { "includes": ["last_modified"] }, "query": { "range" : { "connectorSpecific.timestamp" : { "gt" : {{timestamp}} } } } }
useLimit	No	FALSE	No	Check to limit how many items are selected from the index.	FALSE
topLimit	No	-	No	Only required if "useLimit" is true. The number of items to crawl. Because this connector uses slices and scrolls, this number is approximate and the crawl may return slightly more items.	100
makeIdUnique	No	FALSE	No	Check to ensure unique document IDs when crawling multiple indexes; if not checked, ID collisions could happen. Uniqueness is achieved by appending the index name, with a delimiter, to the ID. This option is ignored if only a single index without a wildcard (*) is specified.	FALSE
idDelimiter	No	-	No	Only required if "makeIdUnique" is true. The delimiter that will be used to append the index name to the document ID.	_
storeSpecific	No	TRUE	No	Check to keep the elastic connector metadata and to store all the fields of the elastic source as connector specific fields. If not checked, the elastic source will be used as the document metadata in the same format that it was retrieved.	TRUE
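
The .add and .remove variants in the table (workflows.add, tags.remove, and so on) change an existing list instead of replacing it. This page does not spell out how the server combines them, so the sketch below only illustrates the intended effect on a local list, assuming removals are applied first, additions second, and duplicates are skipped. apply_list_update is a hypothetical helper, not an Aspire call.

```python
def apply_list_update(current, add=(), remove=()):
    """Return the list that results from removing, then adding, items."""
    removed = set(remove)
    result = [item for item in current if item not in removed]
    for item in add:
        if item not in result:  # skip duplicates (assumption)
            result.append(item)
    return result


tags = ["tag1", "tag2"]
# Effect of sending tags.add ["tag4"] and tags.remove ["tag2"].
print(apply_list_update(tags, add=["tag4"], remove=["tag2"]))
```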

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778

{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "seed": "localhost",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "Elasticsearch_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag", "tag2"],
    "properties": {
        "indexes": [
            {"index": "index1"},
            {"index": "index2"}
        ],
        "snapshots": true,
        "discoveryFields":  [
            {"discoveryField": "last_modified"}
        ],
        "discoveryQuery": "{ \"track_total_hits\": true, \"slice\": { \"id\": {{sliceNumber}}, \"max\": {{sliceTotal}}  }, \"size\": {{pageSize}}, \"_source\": { \"includes\": [\"last_modified\"] }, \"query\": { \"match_all\": {} } }",
        "timestampField": "timestamp",
        "discoveryQueryInc": "{ \"track_total_hits\": true, \"slice\": { \"id\": {{sliceNumber}}, \"max\": {{sliceTotal}}  }, \"size\": {{pageSize}}, \"_source\": { \"includes\": [\"last_modified\"] }, \"query\": { \"range\" : { \"connectorSpecific.timestamp\" : { \"gt\" : {{timestamp}} } } } }",
        "useLimit": true,
        "topLimit": 100,
        "makeIdUnique": true,
        "idDelimiter": "_",
        "storeSpecific": true       
    }
}
