Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "azure-data-lake". | "azure-data-lake" |
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential" |
properties | Yes | - | No | Configuration object | |
authTokenEndpoint | Yes | - | No | Azure authorization token endpoint. | https://login.microsoftonline.com/yourkey/oauth2/token |
appID | Yes | - | No | ID of the registered Azure application. | sampleapplicationid |
appSecret | Yes | - | No | Azure application secret. | xxxxxxxxxxxxxxxxxxxxxxxxxx |
accountFQDN | Yes | - | No | Fully qualified domain name of the account. | yourname.azuredatalakestore.com |
Code Block | ||||
---|---|---|---|---|
| ||||
{
"type": "<Connector Type>",
"description": "<Connector Name> Credential",
"properties": {
"authTokenEndpoint": "https://login.microsoftonline.com/yourkey/oauth2/token",
"appID": "sampleapplicationid",
"appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
"accountFQDN": "yourname.azuredatalakestore.com"
}
} |
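As a minimal sketch, the credential body described in the table above could be assembled and checked in Python before it is sent. The helper name `build_credential_payload` is hypothetical and not part of the product; only the field names and the `azure-data-lake` type value come from the table.

```python
import json

# Property names marked as required in the credential table.
REQUIRED_PROPERTIES = ("authTokenEndpoint", "appID", "appSecret", "accountFQDN")


def build_credential_payload(description, properties):
    """Return a credential creation body (hypothetical helper, for illustration)."""
    # Reject the request early if any required property is missing or empty.
    missing = [name for name in REQUIRED_PROPERTIES if not properties.get(name)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return {
        "type": "azure-data-lake",
        "description": description,
        "properties": {name: properties[name] for name in REQUIRED_PROPERTIES},
    }


payload = build_credential_payload(
    "Azure Data Lake Credential",
    {
        "authTokenEndpoint": "https://login.microsoftonline.com/yourkey/oauth2/token",
        "appID": "sampleapplicationid",
        "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
        "accountFQDN": "yourname.azuredatalakestore.com",
    },
)
print(json.dumps(payload, indent=2))
```

Checking the required properties on the client side is a design choice of this sketch; the service may apply its own validation.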
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | Id of the credential to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential" |
properties | Yes | - | No | Configuration object | |
authTokenEndpoint | Yes | - | No | Azure authorization token endpoint. | https://login.microsoftonline.com/yourkey/oauth2/token |
appID | Yes | - | No | ID of the registered Azure application. | sampleapplicationid |
appSecret | Yes | - | No | Azure application secret. | xxxxxxxxxxxxxxxxxxxxxxxxxx |
accountFQDN | Yes | - | No | Fully qualified domain name of the account. | yourname.azuredatalakestore.com |
Code Block | ||||
---|---|---|---|---|
| ||||
{
"id": "2a5ca234-e328-4d40-bb2a-2df3e550b065",
"description": "<Connector Name> Credential",
"properties": {
"authTokenEndpoint": "https://login.microsoftonline.com/yourkey/oauth2/token",
"appID": "sampleapplicationid",
"appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
"accountFQDN": "yourname.azuredatalakestore.com"
}
} |
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "azure-data-lake". | "azure-data-lake" |
description | Yes | - | No | Name of the connection object. | "My Azure Data Lake Connection" |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
sourceType | Yes | "useRootPath" | No | Source type ("useRootPath", "useSeedsFile", "useSpecificPaths") | "useRootPath" |
seedsFilePath | No | - | No | Only required if sourceType "useSeedsFile" is selected. Seeds file path. | "/path/to/file" |
pathCollectionsToCrawl | No | - | Yes | Only required if sourceType "useSpecificPaths" is selected. List of paths to crawl. | [{"pathCollection": "/path/to/file1"},{"pathCollection": "/path/to/file2"}] |
pathCollection | No | - | No | Only required if sourceType "useSpecificPaths" is selected. Path to crawl. | {"pathCollection": "/path/to/file1"} |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner scans sub-items of container items excluded by a pattern. | FALSE |
includes | No | - | Yes | List of regex URL patterns to include | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL pattern to include | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude | [{"exclude":".*tmp[^/]$"}] |
exclude | No | - | No | Regex URL pattern to exclude | ".*tmp[^/]$" |
Code Block | ||||
---|---|---|---|---|
| ||||
{
"type": "<Connector Type>",
"description": "<Connector Name> Test Connector",
"properties": {
"sourceType": "useSpecificPaths",
"seedsFilePath": "",
"pathCollectionsToCrawl": [
{"pathCollection": "/path/to/file1"},
{"pathCollection": "/path/to/file2"}
],
"indexContainers": true,
"scanRecursively": true,
"scanExcludedItems": false,
"includes": [
{"include": ".*tmp[^/]$"}
],
"excludes": [
{"exclude": ".*tmp[^/]$"}
]
}
} |
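To make the sample pattern `.*tmp[^/]$` easier to read, the following sketch checks it against a few made-up paths using Python's `re` module. It only illustrates what the regular expression itself matches; the function name `matching_paths` and the sample paths are hypothetical, and the connector's own matching rules (regex flavor, full versus partial match, how includes and excludes combine) are not described in this document and may differ.

```python
import re

# Sample pattern taken from the "includes" / "excludes" examples.
SAMPLE_PATTERN = r".*tmp[^/]$"


def matching_paths(pattern, paths):
    """Return the paths matched by the pattern (Python `re` semantics only)."""
    compiled = re.compile(pattern)
    return [path for path in paths if compiled.search(path)]


# Made-up paths, for illustration only.
candidates = ["/data/tmp1", "/data/tmp", "/data/tmp/", "/data/report.csv"]
print(matching_paths(SAMPLE_PATTERN, candidates))  # ['/data/tmp1']
```

The pattern requires `tmp` followed by exactly one character that is not `/` at the end of the string, so `/data/tmp` and `/data/tmp/` do not match.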
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | Id of the connection to update. | "89d6632a-a296-426c-adb0-d442adcab4b0" |
description | No | - | No | Name of the connection object. | "MyConnection" |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner scans sub-items of container items excluded by a pattern. | FALSE |
includes | No | - | Yes | List of regex URL patterns to include | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL pattern to include | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude | [{"exclude":".*tmp[^/]$"}] |
exclude | No | - | No | Regex URL pattern to exclude | ".*tmp[^/]$" |
Code Block | ||
---|---|---|
| ||
{
"id": "89d6632a-a296-426c-adb0-d442adcab4b0",
"description": "<Connector Name> Test Connector",
"properties": {
"sourceType": "useSpecificPaths",
"seedsFilePath": "",
"pathCollectionsToCrawl": [
{"pathCollection": "/path/to/file1"},
{"pathCollection": "/path/to/file2"}
],
"indexContainers": true,
"scanRecursively": true,
"scanExcludedItems": false,
"includes": [
{"include": ".*tmp[^/]$"}
],
"excludes": [
{"exclude": ".*tmp[^/]$"}
]
}
} |
For the creation of the Connector object using the REST API, check this page.
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
seed | Yes | - | No | <seed description> | |
type | Yes | - | No | The value must be azure-data-lake. | azure-data-lake |
description | Yes | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | Yes | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | Yes | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed | ["tag1", "tag2"] |
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed | azure_data_lake_seed |
sourceType | Yes | "useRootPath" | No | Source type ("useRootPath", "useSeedsFile", "useSpecificPaths") | "useRootPath" |
seedsFilePath | No | - | No | Only required if sourceType "useSeedsFile" is selected. Seeds File path. | "/path/to/file" |
pathCollectionsToCrawl | No | - | Yes | Only required if sourceType "useSpecificPaths" is selected. List of paths to crawl. | [{"pathCollection": "/path/to/file1"},{"pathCollection": "/path/to/file2"}] |
pathCollection | No | - | No | Only required if sourceType "useSpecificPaths" is selected. Path to crawl. | {"pathCollection": "/path/to/file1"} |
Code Block | ||||
---|---|---|---|---|
| ||||
{
"type": "<Connector Type>",
"seed": "directory",
"connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
"description": "<connector>_Test_Seed",
"throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
"routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
"connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
"workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
"tags": ["tag1", "tag2"],
"properties": {
"seed": "azure_data_lake_seed",
"sourceType":"useSpecificPaths",
"seedsFilePath":"",
"pathCollectionsToCrawl":[
{"pathCollection": "/path/to/file1"},
{"pathCollection": "/path/to/file2"}
]
}
} |
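The conditional requirements around `sourceType` in the table above can be expressed as a small check. This is a sketch under stated assumptions: the function name `check_source_properties` and its messages are hypothetical, and treating a missing `sourceType` as the documented default `"useRootPath"` is an assumption about how the service behaves.

```python
# Allowed sourceType values, as listed in the seed properties table.
SOURCE_TYPES = ("useRootPath", "useSeedsFile", "useSpecificPaths")


def check_source_properties(properties):
    """Return a list of problems found in a seed "properties" object."""
    problems = []
    # Assumption: an absent sourceType falls back to the documented default.
    source_type = properties.get("sourceType", "useRootPath")
    if source_type not in SOURCE_TYPES:
        problems.append(f"unknown sourceType: {source_type!r}")
    # seedsFilePath is only required when a seeds file is used.
    if source_type == "useSeedsFile" and not properties.get("seedsFilePath"):
        problems.append("seedsFilePath is required for useSeedsFile")
    # pathCollectionsToCrawl is only required for specific paths.
    if source_type == "useSpecificPaths":
        entries = properties.get("pathCollectionsToCrawl") or []
        if not entries:
            problems.append("pathCollectionsToCrawl is required for useSpecificPaths")
        elif any(not isinstance(e, dict) or not e.get("pathCollection") for e in entries):
            problems.append("every entry needs a non-empty pathCollection")
    return problems


example = {
    "seed": "azure_data_lake_seed",
    "sourceType": "useSpecificPaths",
    "seedsFilePath": "",
    "pathCollectionsToCrawl": [
        {"pathCollection": "/path/to/file1"},
        {"pathCollection": "/path/to/file2"},
    ],
}
print(check_source_properties(example))  # []
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.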
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | Id of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
seed | No | - | No | <seed description> | |
description | No | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | No | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | No | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.add | No | [ ] | Yes | The ids of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.remove | No | [ ] | Yes | The ids of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.add | No | [ ] | Yes | The ids of the routingPolicies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.remove | No | [ ] | Yes | The ids of the routingPolicies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed | ["tag1", "tag3"] |
tags.add | No | [ ] | Yes | The tags to add | ["tag4"] |
tags.remove | No | [ ] | Yes | The tags to remove | ["tag2"] |
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed | azure_data_lake_seed |
sourceType | Yes | "useRootPath" | No | Source type ("useRootPath", "useSeedsFile", "useSpecificPaths") | "useRootPath" |
seedsFilePath | No | - | No | Only required if sourceType "useSeedsFile" is selected. Seeds File path. | "/path/to/file" |
pathCollectionsToCrawl | No | - | Yes | Only required if sourceType "useSpecificPaths" is selected. List of paths to crawl. | [{"pathCollection": "/path/to/file1"},{"pathCollection": "/path/to/file2"}] |
pathCollection | No | - | No | Only required if sourceType "useSpecificPaths" is selected. Path to crawl. | {"pathCollection": "/path/to/file1"} |
Code Block | ||||
---|---|---|---|---|
| ||||
{
"id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
"seed": "<seed example>",
"connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
"description": "<connector>_Test_Seed",
"throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
"routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
"connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
"workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
"tags": ["tag", "tag2"],
"properties": {
"seed": "azure_data_lake_seed",
"sourceType":"useSpecificPaths",
"seedsFilePath":"",
"pathCollectionsToCrawl":[
{"pathCollection": "/path/to/file1"},
{"pathCollection": "/path/to/file2"}
]
}
} |
For the update of the Connector object using the REST API, check this page.