Field | Required | Default | Multiple | Notes | Example
---|---|---|---|---|---
type | Yes | - | No | The value must be "azure-data-lake". | "azure-data-lake"
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential"
properties | Yes | - | No | Configuration object | |
accountName | Yes | - | No | Storage account name | samplestorageaccountname
appID | Yes | - | No | Registered Azure application ID | sampleapplicationid
appSecret | Yes | - | No | Azure application secret | xxxxxxxxxxxxxxxxxxxxxxxxxx
tenantId | Yes | - | No | Tenant ID | sampletenantid
Code Block | ||||
---|---|---|---|---|
| ||||
{ "type": "azure-data-lake", "description": "<Connector Name> Credential", "properties": { "accountName": "samplestorageaccountname", "appID": "sampleapplicationid", "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx", "tenantId": "sampletenantid" } } |
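The required credential fields can be checked locally before a request body is sent. The following is a minimal Python sketch under stated assumptions, not part of the product API: the helper name `build_credential_payload` is hypothetical, and the property names (`accountName`, `appID`, `appSecret`, `tenantId`) are taken from the example above. It only assembles and validates the JSON body; the endpoint and authentication used to submit it are not covered here.

```python
import json

# Keys expected inside "properties", taken from the credential example above.
REQUIRED_PROPERTIES = ("accountName", "appID", "appSecret", "tenantId")


def build_credential_payload(description, properties):
    """Return a JSON-ready body for an azure-data-lake credential (hypothetical helper)."""
    # Treat absent or empty values as missing.
    missing = [key for key in REQUIRED_PROPERTIES if not properties.get(key)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return {
        "type": "azure-data-lake",  # fixed value for this connector type
        "description": description,
        "properties": {key: properties[key] for key in REQUIRED_PROPERTIES},
    }


if __name__ == "__main__":
    payload = build_credential_payload(
        "Azure Data Lake Credential",
        {
            "accountName": "samplestorageaccountname",
            "appID": "sampleapplicationid",
            "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
            "tenantId": "sampletenantid",
        },
    )
    print(json.dumps(payload, indent=2))
```

Failing fast on a missing property keeps an incomplete credential from reaching the server, where the error may be harder to trace.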
Field | Required | Default | Multiple | Notes | Example
---|---|---|---|---|---
id | Yes | - | No | ID of the credential to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778"
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential"
properties | Yes | - | No | Configuration object | |
accountName | Yes | - | No | Storage account name | samplestorageaccountname
appID | Yes | - | No | Registered Azure application ID | sampleapplicationid
appSecret | Yes | - | No | Azure application secret | xxxxxxxxxxxxxxxxxxxxxxxxxx
tenantId | Yes | - | No | Tenant ID | sampletenantid
Code Block | ||||
---|---|---|---|---|
| ||||
{ "id": "2a5ca234-e328-4d40-bb2a-2df3e550b065", "type": "azure-data-lake", "description": "<Connector Name> Credential", "properties": { "accountName": "samplestorageaccountname", "appID": "sampleapplicationid", "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx", "tenantId": "sampletenantid" } } |
Field | Required | Default | Multiple | Notes | Example
---|---|---|---|---|---
type | Yes | - | No | The value must be azure-data-lake. | azure-data-lake
description | Yes | - | No | Name of the connection object. | "My Azure Data Lake Connection"
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b"
credential | Yes | - | No | ID of the credential that applies to this connection object. | "d42e1872-02c8-4a90-a714-44f15577389a"
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
properties | Yes | - | No | Configuration object | |
scanAllFileSystems | Yes | TRUE | No | Select if all file systems are to be scanned. | TRUE
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE
scanExcludedItems | No | FALSE | No | Select so that the scanner will scan sub items of container items excluded by a pattern. | FALSE
includes | No | - | Yes | List of regex URL patterns to include. | [{"include":".*tmp[^/]$"}]
include | No | - | No | Regex URL pattern to include. | ".*tmp[^/]$"
excludes | No | - | Yes | List of regex URL patterns to exclude. | [{"exclude":".*tmp[^/]$"}]
exclude | No | - | No | Regex URL pattern to exclude. | ".*tmp[^/]$"
Code Block | ||||
---|---|---|---|---|
| ||||
{ "type": "azure-data-lake", "credential": "d42e1872-02c8-4a90-a714-44f15577389a", "throttlePolicy": "", "routingPolicies": ["5c7274ef-429b-46ef-8f73-f010e479a467", "9dee4fba-14f2-4afc-a74d-297bcbbd359a"], "description": "<Connector Name> Test Connector", "properties": { "scanAllFileSystems": false, "fileSystem": "fileSystemName1", "indexContainers": true, "scanRecursively": true, "scanExcludedItems": false, "includes": [ {"include": ".*tmp[^/]$"} ], "excludes": [ {"exclude": ".*tmp[^/]$"} ] } } |
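Two rules in the connection properties are easy to get wrong: `fileSystem` is needed only when `scanAllFileSystems` is disabled, and each entry in `includes` or `excludes` is an object keyed by `include` or `exclude` respectively. The Python sketch below checks both locally. It is an illustration, not product code: the function name `validate_connection_properties` is hypothetical, and compiling the patterns with Python's `re` module is only an approximation, since the regex dialect used by the connector is not stated here.

```python
import re


def validate_connection_properties(props):
    """Return a list of problems found in a connection "properties" object."""
    problems = []

    scan_all = props.get("scanAllFileSystems")
    if not isinstance(scan_all, bool):
        problems.append("scanAllFileSystems is required and must be true or false")
    elif not scan_all and not props.get("fileSystem"):
        problems.append("fileSystem is required when scanAllFileSystems is false")

    # includes/excludes are lists of single-key objects that each hold one regex.
    for list_key, item_key in (("includes", "include"), ("excludes", "exclude")):
        for item in props.get(list_key, []):
            pattern = item.get(item_key)
            if pattern is None:
                problems.append(f"{list_key} entries must use the key '{item_key}'")
                continue
            try:
                re.compile(pattern)  # syntax check only (Python dialect, an assumption)
            except re.error as exc:
                problems.append(f"invalid regex in {list_key}: {exc}")
    return problems


if __name__ == "__main__":
    properties = {
        "scanAllFileSystems": False,
        "fileSystem": "fileSystemName1",
        "includes": [{"include": ".*tmp[^/]$"}],
        "excludes": [{"exclude": ".*tmp[^/]$"}],
    }
    print(validate_connection_properties(properties))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.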
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the connection to update. | "89d6632a-a296-426c-adb0-d442adcab4b0"
description | No | - | No | Name of the connection object. | "MyConnection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
credential | No | - | No | ID of the credential that applies to this connection object. | "d42e1872-02c8-4a90-a714-44f15577389a" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
scanAllFileSystems | Yes | TRUE | No | Select if all file systems are to be scanned | TRUE |
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1 |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner will scan sub items of container items excluded by a pattern. | FALSE |
includes | No | - | Yes | List of regex URL patterns to include. | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL patterns to include. | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude. | [{"exclude":".*tmp[^/]$"}]
exclude | No | - | No | Regex URL patterns to exclude | ".*tmp[^/]$" |
Code Block | ||
---|---|---|
| ||
{
"id": "89d6632a-a296-426c-adb0-d442adcab4b0",
"type": "azure-data-lake",
"credential": "d42e1872-02c8-4a90-a714-44f15577389a",
"throttlePolicy": "",
"routingPolicies": ["5c7274ef-429b-46ef-8f73-f010e479a467", "9dee4fba-14f2-4afc-a74d-297bcbbd359a"],
"description": "<Connector Name> Test Connector",
"properties": {
"scanAllFileSystems": false,
"fileSystem": "fileSystemName1",
"indexContainers": true,
"scanRecursively": true,
"scanExcludedItems": false,
"includes": [
{"include": ".*tmp[^/]$"}
],
"excludes": [
{"exclude": ".*tmp[^/]$"}
]
}
} |
To create the Connector object using the REST API, check this page.
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
seed | Yes | - | No | <seed description> | |
type | Yes | - | No | The value must be azure-data-lake. | azure-data-lake |
description | Yes | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | Yes | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | Yes | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6"
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
throttlePolicy | No | - | No | The ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b"
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"]
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed. | azure_data_lake_seed
specificPath | No | - | No | Path to crawl. Optional. If "Scan all Filesystems" was selected in the connection, this path is ignored. | /sample/path
Code Block | ||||
---|---|---|---|---|
| ||||
{ "type": "azure-data-lake", "seed": "directory", "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31", "description": "<connector>_Test_Seed", "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b", "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"], "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6", "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"], "tags": ["tag1", "tag2"], "properties": { "seed": "azure_data_lake_seed", "specificPath": "/sample/path" } } |
Field | Required | Default | Multiple | Notes | Example
---|---|---|---|---|---
id | Yes | - | No | ID of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778"
seed | No | - | No | <seed description> | |
description | No | - | No | Name of the seed object. | "My Azure Data Lake Seed"
connector | No | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31"
connection | No | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6"
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
workflows.add | No | [ ] | Yes | The IDs of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
workflows.remove | No | [ ] | Yes | The IDs of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]
throttlePolicy | No | - | No | The ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b"
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
routingPolicies.add | No | [ ] | Yes | The IDs of the routing policies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
routingPolicies.remove | No | [ ] | Yes | The IDs of the routing policies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"]
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag3"]
tags.add | No | [ ] | Yes | The tags to add. | ["tag4"]
tags.remove | No | [ ] | Yes | The tags to remove. | ["tag2"]
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed. | azure_data_lake_seed
specificPath | No | - | No | Path to crawl. Optional. If "Scan all Filesystems" was selected in the connection, this path is ignored. | /sample/path
Code Block | ||||
---|---|---|---|---|
| ||||
{ "id": "2f287669-d163-4e35-ad17-6bbfe9df3778", "seed": "<seed example>", "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31", "description": "<connector>_Test_Seed", "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b", "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"], "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6", "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"], "tags": ["tag", "tag2"], "properties": { "seed": "azure_data_lake_seed", "specificPath": "/sample/path" } } |
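The `.add` and `.remove` variants of `workflows`, `routingPolicies`, and `tags` suggest incremental list changes on update. The Python sketch below illustrates one plausible reading, which is an assumption rather than documented behavior: a plain field replaces the stored list, `.add` appends values not already present, and `.remove` drops matching values. The function `apply_list_update` is hypothetical and runs locally; it does not call the API.

```python
def apply_list_update(current, update, field):
    """Return the list `field` would hold after `update` (assumed semantics)."""
    # A plain field is assumed to replace the stored list outright.
    result = list(update[field]) if field in update else list(current)

    # "<field>.add" is assumed to append values that are not already present.
    for value in update.get(field + ".add", []):
        if value not in result:
            result.append(value)

    # "<field>.remove" is assumed to drop every matching value.
    to_remove = set(update.get(field + ".remove", []))
    return [value for value in result if value not in to_remove]


if __name__ == "__main__":
    stored_tags = ["tag1", "tag2", "tag3"]
    update = {"tags.add": ["tag4"], "tags.remove": ["tag2"]}
    print(apply_list_update(stored_tags, update, "tags"))
```

If the server applies these fields in a different order, or rejects a request that mixes the plain field with its `.add` or `.remove` variants, the sketch would need to be adjusted to match.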