Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "azure-data-lake". | "azure-data-lake" |
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential" |
properties | Yes | - | No | Configuration object | |
accountName | Yes | - | No | Storage Account name | samplestorageaccountname |
appID | Yes | - | No | ID of the registered Azure application | sampleapplicationid |
appSecret | Yes | - | No | Azure application secret | xxxxxxxxxxxxxxxxxxxxxxxxxx |
tenantId | Yes | - | No | Tenant ID | sampletenantid |
Code Block | ||||
---|---|---|---|---|
| ||||
{
  "type": "azure-data-lake",
  "description": "<Connector Name> Credential",
  "properties": {
    "accountName": "samplestorageaccountname",
    "appID": "sampleapplicationid",
    "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
    "tenantId": "sampletenantid"
  }
}
|
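The credential object above can also be assembled programmatically. The following is a minimal Python sketch, not part of the Aspire API: the helper name `build_credential_payload` and its parameters are hypothetical, and only the field names and the `azure-data-lake` type value come from the table. It builds and checks the JSON body but does not send it anywhere.

```python
import json

# Property names taken from the credential table; all are marked as required.
REQUIRED_PROPERTIES = ("accountName", "appID", "appSecret", "tenantId")


def build_credential_payload(description, account_name, app_id, app_secret, tenant_id):
    """Return a dict shaped like the credential object described in the table.

    Hypothetical helper for illustration; only the field names come from the table.
    """
    payload = {
        "type": "azure-data-lake",  # the table requires this exact value
        "description": description,
        "properties": {
            "accountName": account_name,
            "appID": app_id,
            "appSecret": app_secret,
            "tenantId": tenant_id,
        },
    }
    # Reject empty values early, since every field is listed as required.
    missing = [name for name in REQUIRED_PROPERTIES if not payload["properties"][name]]
    if not description:
        missing.insert(0, "description")
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return payload


if __name__ == "__main__":
    body = build_credential_payload(
        "Azure Data Lake Credential",
        "samplestorageaccountname",
        "sampleapplicationid",
        "xxxxxxxxxxxxxxxxxxxxxxxxxx",
        "sampletenantid",
    )
    print(json.dumps(body, indent=2))
```

Checking for missing values on the client side is a design choice of this sketch; the server may apply its own validation.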
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the credential to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
description | Yes | - | No | Name of the credential object. | "Azure Data Lake Credential" |
properties | Yes | - | No | Configuration object | |
accountName | Yes | - | No | Storage Account name | samplestorageaccountname |
appID | Yes | - | No | ID of the registered Azure application | sampleapplicationid |
appSecret | Yes | - | No | Azure application secret | xxxxxxxxxxxxxxxxxxxxxxxxxx |
tenantId | Yes | - | No | Tenant ID | sampletenantid |
Code Block | ||||
---|---|---|---|---|
| ||||
{
  "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
  "type": "azure-data-lake",
  "description": "<Connector Name> Credential",
  "properties": {
    "accountName": "samplestorageaccountname",
    "appID": "sampleapplicationid",
    "appSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
    "tenantId": "sampletenantid"
  }
}
|
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be azure-data-lake. | azure-data-lake |
description | Yes | - | No | Name of the connection object. | "My Azure Data Lake Connection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
credential | Yes | - | No | ID of the credential that applies to this connection object. | "d42e1872-02c8-4a90-a714-44f15577389a" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
scanAllFileSystems | Yes | TRUE | No | Select if all file systems are to be scanned. | TRUE |
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1 |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner will scan sub items of container items excluded by a pattern. | FALSE |
includes | No | - | Yes | List of regex URL patterns to include. | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL pattern to include. | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude. | [{"exclude":".*tmp[^/]$"}] |
exclude | No | - | No | Regex URL pattern to exclude. | ".*tmp[^/]$" |
Code Block | ||||
---|---|---|---|---|
| ||||
{
  "type": "azure-data-lake",
  "credential": "d42e1872-02c8-4a90-a714-44f15577389a",
  "throttlePolicy": "",
  "routingPolicies": ["5c7274ef-429b-46ef-8f73-f010e479a467", "9dee4fba-14f2-4afc-a74d-297bcbbd359a"],
  "description": "<Connector Name> Test Connector",
  "properties": {
    "scanAllFileSystems": false,
    "fileSystem": "fileSystemName1",
    "indexContainers": true,
    "scanRecursively": true,
    "scanExcludedItems": false,
    "includes": [
      {"include": ".*tmp[^/]$"}
    ],
    "excludes": [
      {"exclude": ".*tmp[^/]$"}
    ]
  }
}
|
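Two rules in the connection table are easy to get wrong: `fileSystem` is only needed when `scanAllFileSystems` is disabled, and each entry of `includes` or `excludes` wraps a single regex under the `include` or `exclude` key. The sketch below checks both before a payload is submitted. The function names are hypothetical. It uses Python's `re` module, whose syntax may differ from the regex engine the connector actually uses, and it assumes patterns are matched from the start of the URL.

```python
import re


def check_connection_properties(props):
    """Return a list of problems found in a connection "properties" object."""
    problems = []
    # scanAllFileSystems defaults to TRUE according to the table.
    scan_all = props.get("scanAllFileSystems", True)
    if not scan_all and not props.get("fileSystem"):
        problems.append('"fileSystem" is required when "scanAllFileSystems" is false')
    # Each list entry is an object holding one regex string under a fixed key.
    for list_key, item_key in (("includes", "include"), ("excludes", "exclude")):
        for entry in props.get(list_key, []):
            pattern = entry.get(item_key)
            if pattern is None:
                problems.append(f'entry in "{list_key}" lacks the "{item_key}" key')
                continue
            try:
                re.compile(pattern)
            except re.error as exc:
                problems.append(f'invalid regex in "{list_key}": {exc}')
    return problems


def matches_any(url, entries, item_key):
    """True if the URL matches at least one pattern (assumption: matched from the start)."""
    return any(re.match(entry[item_key], url) for entry in entries)


if __name__ == "__main__":
    props = {
        "scanAllFileSystems": False,
        "fileSystem": "fileSystemName1",
        "includes": [{"include": ".*tmp[^/]$"}],
        "excludes": [{"exclude": ".*tmp[^/]$"}],
    }
    print(check_connection_properties(props))
    print(matches_any("/fileSystemName1/data/tmp1", props["includes"], "include"))
```

The sample pattern `.*tmp[^/]$` matches URLs that end in `tmp` followed by exactly one character other than `/`, so `/data/tmp1` matches while `/data/tmp/` and `/data/tmp` do not.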
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the connection to update. | "89d6632a-a296-426c-adb0-d442adcab4b0" |
description | No | - | No | Name of the connection object. | "MyConnection" |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
credential | No | - | No | ID of the credential that applies to this connection object. | "d42e1872-02c8-4a90-a714-44f15577389a" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
properties | Yes | - | No | Configuration object | |
scanAllFileSystems | Yes | TRUE | No | Select if all file systems are to be scanned. | TRUE |
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1 |
indexContainers | No | TRUE | No | Select if containers are to be indexed. Clear to index files only. | TRUE |
scanRecursively | No | TRUE | No | Select if subfolders are to be scanned. | TRUE |
scanExcludedItems | No | FALSE | No | Select so that the scanner will scan sub items of container items excluded by a pattern. | FALSE |
includes | No | - | Yes | List of regex URL patterns to include. | [{"include":".*tmp[^/]$"}] |
include | No | - | No | Regex URL pattern to include. | ".*tmp[^/]$" |
excludes | No | - | Yes | List of regex URL patterns to exclude. | [{"exclude":".*tmp[^/]$"}] |
exclude | No | - | No | Regex URL pattern to exclude. | ".*tmp[^/]$" |
Code Block | ||
---|---|---|
| ||
{
  "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
  "type": "azure-data-lake",
  "credential": "d42e1872-02c8-4a90-a714-44f15577389a",
  "throttlePolicy": "",
  "routingPolicies": ["5c7274ef-429b-46ef-8f73-f010e479a467", "9dee4fba-14f2-4afc-a74d-297bcbbd359a"],
  "description": "<Connector Name> Test Connector",
  "properties": {
    "scanAllFileSystems": false,
    "fileSystem": "fileSystemName1",
    "indexContainers": true,
    "scanRecursively": true,
    "scanExcludedItems": false,
    "includes": [
      {"include": ".*tmp[^/]$"}
    ],
    "excludes": [
      {"exclude": ".*tmp[^/]$"}
    ]
  }
}
|
For the creation of the Connector object using the REST API, check this page.
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
seed | Yes | - | No | <seed description> | |
type | Yes | - | No | The value must be azure-data-lake. | azure-data-lake |
description | Yes | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | Yes | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | Yes | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed | azure_data_lake_seed |
specificPath | No | - | No | Path to crawl. Not required. If "Scan all Filesystems" in the Connection was checked, this path will be ignored. | /sample/path |
scanAllFileSystems | Yes | true | No | Select if all file systems are to be scanned. | |
fileSystems | No | - | Yes | Only required if "scanAllFileSystems" is disabled. List of file system names and configurations. | |
fileSystem | No | - | No | Only required if "scanAllFileSystems" is disabled. The name of the file system. | fileSystemName1 |
sourceType | No | "scanAllPaths" | No | Source type ("scanAllPaths", "useSeedsFile", "useSpecificPaths") | "scanAllPaths" |
seedsFilePath | No | - | No | Only required if sourceType "useSeedsFile" is selected. Seeds file path. | "/path/to/file" |
pathCollectionsToCrawl | No | - | Yes | Only required if sourceType "useSpecificPaths" is selected. List of paths to crawl. | [{"pathCollection": "/path/to/file1"},{"pathCollection": "/path/to/file2"}] |
pathCollection | No | - | No | Only required if sourceType "useSpecificPaths" is selected. Path to crawl. | {"pathCollection": "/path/to/file1"} |
POST aspire/_api/seeds
Code Block | ||||
---|---|---|---|---|
| ||||
{
  "type": "azure-data-lake",
  "seed": "directory",
  "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
  "description": "<connector>_Test_Seed",
  "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
  "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
  "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
  "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
  "tags": ["tag1", "tag2"],
  "properties": {
    "seed": "azure_data_lake_seed",
    "specificPath": "/sample/path"
  }
}
|
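The seed object above is submitted with `POST aspire/_api/seeds`. As a rough illustration, the Python sketch below builds that request with the standard library and stops short of sending it. The base URL, the helper name `build_seed_request`, and the idea that the endpoint accepts the seed object directly as the JSON body are assumptions, and authentication is not shown.

```python
import json
import urllib.request

# Hypothetical base URL: replace it with the address of your Aspire server.
ASPIRE_BASE_URL = "http://localhost:50505"


def build_seed_request(seed, base_url=ASPIRE_BASE_URL):
    """Build (but do not send) a POST request for aspire/_api/seeds."""
    data = json.dumps(seed).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/aspire/_api/seeds",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same values as the JSON example above.
seed = {
    "type": "azure-data-lake",
    "seed": "directory",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "<connector>_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": [
        "313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7",
        "b4d2579f-1a0a-4a8b-9fd4-d42780003b36",
    ],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        "seed": "azure_data_lake_seed",
        "specificPath": "/sample/path",
    },
}

request = build_seed_request(seed)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it possible to inspect the URL and body before anything reaches the server.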
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
seed | No | - | No | <seed description> | |
description | No | - | No | Name of the seed object. | "My Azure Data Lake Seed" |
connector | No | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | No | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.add | No | [ ] | Yes | The IDs of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.remove | No | [ ] | Yes | The IDs of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.add | No | [ ] | Yes | The IDs of the routingPolicies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.remove | No | [ ] | Yes | The IDs of the routingPolicies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed | ["tag1", "tag3"] |
tags.add | No | [ ] | Yes | The tags to add | ["tag4"] |
tags.remove | No | [ ] | Yes | The tags to remove | ["tag2"] |
properties | Yes | - | No | Configuration object | |
seed | Yes | - | No | This value must be azure_data_lake_seed | azure_data_lake_seed |
specificPath | No | - | No | Path to crawl. Not required. If "Scan all Filesystems" in the Connection was checked, this path will be ignored. | /sample/path |
Code Block | ||||
---|---|---|---|---|
| ||||
{
  "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
  "seed": "<seed example>",
  "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
  "description": "<connector>_Test_Seed",
  "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
  "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
  "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
  "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
  "tags": ["tag", "tag2"],
  "properties": {
    "seed": "azure_data_lake_seed",
    "specificPath": "/sample/path"
  }
}
|
Code Block | ||||
---|---|---|---|---|
| ||||
{
  "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
  "seed": "<seed example>",
  "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
  "description": "<connector>_Test_Seed",
  "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
  "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
  "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
  "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
  "tags": ["tag", "tag2"],
  "properties": {
    "seed": "azure_data_lake_seed",
    "scanAllFileSystems": false,
    "fileSystems": [
      {
        "fileSystem": "fileSystem1",
        "sourceType": "useSpecificPaths",
        "pathCollectionsToCrawl": [
          {"pathCollection": "/path/to/file1"},
          {"pathCollection": "/path/to/file2"}
        ]
      }
    ]
  }
}
|
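In the example above, each `fileSystems` entry combines a `sourceType` with the field that source type depends on: `seedsFilePath` for "useSeedsFile" and `pathCollectionsToCrawl` for "useSpecificPaths". The Python sketch below checks one entry against those dependencies. The function name is hypothetical, and treating `fileSystem` as mandatory inside each entry is an assumption of this sketch rather than a documented rule.

```python
# Source types listed for the sourceType field.
SOURCE_TYPES = ("scanAllPaths", "useSeedsFile", "useSpecificPaths")


def check_file_system_entry(entry):
    """Return a list of problems found in one "fileSystems" entry."""
    problems = []
    # Assumption: every entry needs the name of the file system it describes.
    if not entry.get("fileSystem"):
        problems.append('"fileSystem" name is missing')
    # "scanAllPaths" is the documented default when sourceType is omitted.
    source_type = entry.get("sourceType", "scanAllPaths")
    if source_type not in SOURCE_TYPES:
        problems.append(f'unknown sourceType "{source_type}"')
    elif source_type == "useSeedsFile":
        if not entry.get("seedsFilePath"):
            problems.append('"seedsFilePath" is required for "useSeedsFile"')
    elif source_type == "useSpecificPaths":
        paths = entry.get("pathCollectionsToCrawl") or []
        if not paths or any("pathCollection" not in item for item in paths):
            problems.append(
                '"pathCollectionsToCrawl" needs at least one '
                '{"pathCollection": ...} item for "useSpecificPaths"'
            )
    return problems


if __name__ == "__main__":
    entry = {
        "fileSystem": "fileSystem1",
        "sourceType": "useSpecificPaths",
        "pathCollectionsToCrawl": [
            {"pathCollection": "/path/to/file1"},
            {"pathCollection": "/path/to/file2"},
        ],
    }
    print(check_file_system_entry(entry))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a seed definition at once.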