Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "azure-data-lake". | "azure-data-lake" |
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential" |
properties | Yes | - | No | Configuration object | |
accountName | Yes | - | No | Storage account name | samplestorageaccountname |
appID | Yes | - | No | Registered Azure application ID | sampleapplicationid |
appSecret | Yes | - | No | Azure application secret | xxxxxxxxxxxxxxxxxxxxxxxxxx |
tenantId | Yes | - | No | Tenant ID | sampletenantid |
Code Block | ||||
---|---|---|---|---|
| ||||
{ "type": "azure-data-lake", "description": "<Connector Name> Credential", "properties": { "accountName": "samplestorageaccountname", "appID": "sampleapplicationid", "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx", "tenantId": "sampletenantid" } } |
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the credential to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential" |
properties | Yes | - | No | Configuration object | |
accountName | Yes | - | No | Storage account name | samplestorageaccountname |
appID | Yes | - | No | Registered Azure application ID | sampleapplicationid |
appSecret | Yes | - | No | Azure application secret | xxxxxxxxxxxxxxxxxxxxxxxxxx |
tenantId | Yes | - | No | Tenant ID | sampletenantid |
Code Block | ||||
---|---|---|---|---|
| ||||
{ "id": "2a5ca234-e328-4d40-bb2a-2df3e550b065", "type": "azure-data-lake", "description": "<Connector Name> Credential", "properties": { "accountName": "samplestorageaccountname", "appID": "sampleapplicationid", "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx", "tenantId": "sampletenantid" } } |
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be azure-data-lake | azure-data-lake |
description | Yes | - | No | Name of the connection object. | "My Azure Data Lake Connection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
credential | Yes | - | No | ID of the credential that applies to this connection object. | "d42e1872-02c8-4a90-a714-44f15577389a" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
scanAllFileSystems | Yes | TRUE | No | Select if all file systems are to be scanned. | TRUE |
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1 |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner will scan sub items of container items excluded by a pattern | FALSE |
includes | No | - | Yes | List of regex URL patterns to include | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL pattern to include | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude | [{"exclude":".*tmp[^/]$"}] |
exclude | No | - | No | Regex URL pattern to exclude | ".*tmp[^/]$" |
Code Block | ||||
---|---|---|---|---|
| ||||
{ "type": "azure-data-lake", "credential": "d42e1872-02c8-4a90-a714-44f15577389a", "throttlePolicy": "", "routingPolicies": ["5c7274ef-429b-46ef-8f73-f010e479a467", "9dee4fba-14f2-4afc-a74d-297bcbbd359a"], "description": "<Connector Name> Test Connector", "properties": { "scanAllFileSystems": false, "fileSystem": "fileSystemName1", "indexContainers": true, "scanRecursively": true, "scanExcludedItems": false, "includes": [ {"include": ".*tmp[^/]$"} ], "excludes": [ {"exclude": ".*tmp[^/]$"} ] } } |
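The table above marks fileSystem as required only when scanAllFileSystems is disabled. A minimal Python sketch of that rule follows. The helper name `build_connection_payload` and its parameters are hypothetical, and leaving fileSystem out when all file systems are scanned is an assumption rather than documented behavior.

```python
import json


def build_connection_payload(description, credential_id, scan_all_file_systems=True,
                             file_system=None, includes=(), excludes=()):
    """Assemble a connection body using the field names from the table above.

    Hypothetical helper: only the JSON keys come from the documentation.
    """
    if not scan_all_file_systems and not file_system:
        # fileSystem is documented as required only when scanAllFileSystems is disabled.
        raise ValueError("fileSystem is required when scanAllFileSystems is false")

    properties = {
        "scanAllFileSystems": scan_all_file_systems,
        "indexContainers": True,
        "scanRecursively": True,
        "scanExcludedItems": False,
        "includes": [{"include": pattern} for pattern in includes],
        "excludes": [{"exclude": pattern} for pattern in excludes],
    }
    if not scan_all_file_systems:
        # Assumption: fileSystem is simply left out when every file system is scanned.
        properties["fileSystem"] = file_system

    return {
        "type": "azure-data-lake",
        "credential": credential_id,
        "description": description,
        "properties": properties,
    }


payload = build_connection_payload(
    "My Azure Data Lake Connection",
    "d42e1872-02c8-4a90-a714-44f15577389a",
    scan_all_file_systems=False,
    file_system="fileSystemName1",
    excludes=[".*tmp[^/]$"],
)
print(json.dumps(payload, indent=2))
```

Validating on the client side like this only catches the one conditional rule stated in the table; the server may apply further checks.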
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the connection to update. | "89d6632a-a296-426c-adb0-d442adcab4b0" |
description | No | - | No | Name of the connection object. | "My Connection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
credential | No | - | No | ID of the credential that applies to this connection object. | "d42e1872-02c8-4a90-a714-44f15577389a" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
scanAllFileSystems | Yes | TRUE | No | Select if all file systems are to be scanned. | TRUE |
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1 |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner will scan sub items of container items excluded by a pattern. | FALSE |
includes | No | - | Yes | List of regex URL patterns to include. | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL pattern to include. | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude. | [{"exclude":".*tmp[^/]$"}] |
exclude | No | - | No | Regex URL pattern to exclude. | ".*tmp[^/]$" |
Code Block | ||
---|---|---|
| ||
{ "id": "89d6632a-a296-426c-adb0-d442adcab4b0", "type": "azure-data-lake", "credential": "d42e1872-02c8-4a90-a714-44f15577389a", "throttlePolicy": "", "routingPolicies": ["5c7274ef-429b-46ef-8f73-f010e479a467", "9dee4fba-14f2-4afc-a74d-297bcbbd359a"], "description": "<Connector Name> Test Connector", "properties": { "scanAllFileSystems": false, "fileSystem": "fileSystemName1", "indexContainers": true, "scanRecursively": true, "scanExcludedItems": false, "includes": [ {"include": ".*tmp[^/]$"} ], "excludes": [ {"exclude": ".*tmp[^/]$"} ] } } |
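It can help to check what the sample pattern ".*tmp[^/]$" selects before using it in includes or excludes. The sketch below uses Python's `re` module as a stand-in. The connector's actual regex engine, and whether it matches the whole URL or only part of it, are not stated here, so whole-URL matching is an assumption. The URLs are invented for illustration.

```python
import re

# Example pattern copied from the includes/excludes rows.
PATTERN = re.compile(r".*tmp[^/]$")

# Hypothetical item URLs, invented for this illustration.
candidates = [
    "https://example.invalid/fs/reports/tmp1",
    "https://example.invalid/fs/reports/tmp/",
    "https://example.invalid/fs/reports/tmp",
    "https://example.invalid/fs/reports/data.csv",
]


def matches(url):
    """Return True when the whole URL satisfies the pattern (assumed semantics)."""
    return PATTERN.fullmatch(url) is not None


matched = [url for url in candidates if matches(url)]
print(matched)
```

Under these assumptions only the first URL matches: the pattern requires "tmp" followed by exactly one character other than "/" at the end of the URL, so neither ".../tmp" nor ".../tmp/" qualifies.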
To create the Connector object using the REST API, check this page
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
seed | Yes | - | No | <seed description> | |
type | Yes | - | No | The value must be azure-data-lake. | azure-data-lake |
description | Yes | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | Yes | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | Yes | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed | ["tag1", "tag2"] |
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed | azure_data_lake_seed |
specificPath | No | - | No | Path to crawl. Not required. If "Scan all Filesystems" was checked in the Connection, this path is ignored. | /sample/path |
Code Block | ||||
---|---|---|---|---|
| ||||
{ "type": "azure-data-lake", "seed": "directory", "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31", "description": "<connector>_Test_Seed", "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b", "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"], "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6", "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"], "tags": ["tag1", "tag2"], "properties": { "seed": "azure_data_lake_seed", "specificPath": "/sample/path" } } |
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
seed | No | - | No | <seed description> | |
description | No | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | No | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | No | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.add | No | [ ] | Yes | The IDs of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.remove | No | [ ] | Yes | The IDs of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.add | No | [ ] | Yes | The IDs of the routing policies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.remove | No | [ ] | Yes | The IDs of the routing policies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag3"] |
tags.add | No | [ ] | Yes | The tags to add. | ["tag4"] |
tags.remove | No | [ ] | Yes | The tags to remove. | ["tag2"] |
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed | azure_data_lake_seed |
specificPath | No | - | No | Path to crawl. Not required. If "Scan all Filesystems" was checked in the Connection, this path is ignored. | /sample/path |
Code Block | ||||
---|---|---|---|---|
| ||||
{ "id": "2f287669-d163-4e35-ad17-6bbfe9df3778", "seed": "<seed example>", "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31", "description": "<connector>_Test_Seed", "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b", "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"], "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6", "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"], "tags": ["tag", "tag2"], "properties": { "seed": "azure_data_lake_seed", "specificPath": "/sample/path" } } |
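The update table also lists tags.add and tags.remove for incremental changes, which the example above does not exercise. The Python sketch below derives those two lists from a current and a desired tag list. The helper names are hypothetical, and sending the dotted names as literal top-level JSON keys is an assumption that should be checked against the API before use.

```python
def diff_for_update(current, desired):
    """Return (to_add, to_remove) such that current + to_add - to_remove == desired."""
    current_set, desired_set = set(current), set(desired)
    to_add = [item for item in desired if item not in current_set]
    to_remove = [item for item in current if item not in desired_set]
    return to_add, to_remove


def build_seed_update(seed_id, properties, current_tags, desired_tags):
    """Hypothetical helper that builds an incremental seed update body."""
    to_add, to_remove = diff_for_update(current_tags, desired_tags)
    # id and properties are marked as required in the update table.
    body = {"id": seed_id, "properties": properties}
    # Assumption: the dotted field names are literal top-level keys.
    if to_add:
        body["tags.add"] = to_add
    if to_remove:
        body["tags.remove"] = to_remove
    return body


update = build_seed_update(
    "2f287669-d163-4e35-ad17-6bbfe9df3778",
    {"seed": "azure_data_lake_seed"},
    current_tags=["tag1", "tag2"],
    desired_tags=["tag1", "tag4"],
)
print(update)
```

The same diff helper would apply to workflows.add / workflows.remove and routingPolicies.add / routingPolicies.remove, since the table describes them with the same add and remove semantics.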