The File System Connector can be configured using the REST API. Configuration requires the following entities to be created:

  • Connection
  • Connector
  • Seed

Below are examples of how to create and update the Connection and the Seed. For the Connector, please check this page.

Create Connection


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "filesystem". | "filesystem" |
| description | Yes | - | No | Name of the connection object. | "MyFileSystemConnection" |
| throttlePolicy | No | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The ids of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| properties | Yes | - | No | Configuration object. The fields from url through staticAcl below belong to this object. | |
| url | Yes | - | No | Path of the base directory to crawl. All the seeds will be prefixed with this value to form the full path. | "C:\\Directory" |
| ignoreSymLinks | No | false | No | If enabled, symbolic links will not be processed and links in the root items will cause an error. | true / false |
| stopOnScanError | No | true | No | If enabled, the crawl will stop if there is an error on the scan phase. | true / false |
| indexContainers | No | false | No | Enable to index the directories. | true / false |
| scanRecursively | No | true | No | Enable to scan discovered directories recursively. | true / false |
| include | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
| exclude | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
| scanExcludedItems | No | false | No | Enable to force the scan of excluded directories, so child items within the scope can be found. | true / false |
| staticAcl | No | [ ] | Yes | Static ACL configuration object. The fields name, domain, entity and access below belong to each entry. | |
| name | Yes | - | No | Name of the static ACL. | "group1" |
| domain | No | "" | No | Domain of the static ACL. | "testDomain" |
| entity | No | "user" | No | Entity (user / group) represented by the static ACL. | "user" / "group" |
| access | No | "allow" | No | Access (allow / deny) granted by the ACL. | "allow" / "deny" |
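
The include and exclude fields take regular expressions that are matched against each document URL. As a rough illustration of how such patterns could interact, the sketch below applies them in Python. It is not the connector's implementation: the helper name `is_in_scope`, the use of `re.search`, treating an empty include list as "include everything", and letting exclude win over include are all assumptions made for the example.

```python
import re


def is_in_scope(url, include=(), exclude=()):
    """Hypothetical helper: decide whether a document URL would be crawled.

    Assumptions (not confirmed by the connector documentation):
      - patterns are tested with re.search, so they may match anywhere;
      - an empty include list means every document is a candidate;
      - a matching exclude pattern wins over a matching include pattern.
    """
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    if not include:
        return True
    return any(re.search(pattern, url) for pattern in include)


# Patterns taken from the Example column of the table.
include = [r".*pdf$", r".*docx$"]
exclude = [r".*png$", r".*jpeg$"]

for url in (
    r"C:\Directory\report.pdf",
    r"C:\Directory\logo.png",
    r"C:\Directory\notes.txt",
):
    print(url, is_in_scope(url, include, exclude))
```

Testing patterns this way before saving them in a connection can help catch expressions that match more, or less, than intended.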

Example

POST aspire/_api/connections
{
    "type": "filesystem",
    "description": "FileSystem Test Connector",
    "properties": {
        "url": "C:\\Directory",
        "ignoreSymLinks": true,
        "stopOnScanError": true,
        "indexContainers": true,
        "scanExcludedItems": true,
        "scanRecursively": true,
        "includes": ".*\\.txt",
        "excludes": ".*\\.png",
        "staticAcl": [{
                "name": "test-user",
                "domain": "test-domain",
                "entity": "user",
                "access": "allow"
            }, {
                "name": "test-group",
                "domain": "",
                "entity": "group",
                "access": "deny"
            }
        ]
    }
}
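
For readers who script this call, the following is a minimal Python sketch that assembles the create request with the standard library. The base URL and port in `ASPIRE_BASE` and the helper name `build_connection_request` are assumptions for illustration, not part of the product documentation. The request is only built here, not sent, and any authentication the server requires is left out.

```python
import json
import urllib.request

# Assumption: address of the Aspire server; adjust host and port for your setup.
ASPIRE_BASE = "http://localhost:50505"


def build_connection_request(description, url, properties=None, base=ASPIRE_BASE):
    """Build (but do not send) a POST request that creates a filesystem connection."""
    body = {
        "type": "filesystem",  # the table requires this exact value
        "description": description,
        # "url" is the only required entry inside "properties".
        "properties": {"url": url, **(properties or {})},
    }
    return urllib.request.Request(
        f"{base}/aspire/_api/connections",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_connection_request(
    "FileSystem Test Connector",
    "C:\\Directory",
    {"ignoreSymLinks": True, "scanRecursively": True},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it. The shape of the response is not described on this page, so it is not handled in the sketch.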

Update Connection


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| id | Yes | - | No | Id of the connection to update. | "89d6632a-a296-426c-adb0-d442adcab4b0" |
| type | Yes | - | No | The value must be "filesystem". | "filesystem" |
| description | No | - | No | Name of the connection object. | "MyFileSystemConnection" |
| throttlePolicy | No | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The ids of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| properties | Yes | - | No | Configuration object. The fields from url through staticAcl below belong to this object. | |
| url | Yes | - | No | Path of the base directory to crawl. All the seeds will be prefixed with this value to form the full path. | "C:\\Directory" |
| ignoreSymLinks | No | false | No | If enabled, symbolic links will not be processed and links in the root items will cause an error. | true / false |
| stopOnScanError | No | true | No | If enabled, the crawl will stop if there is an error on the scan phase. | true / false |
| indexContainers | No | false | No | Enable to index the directories. | true / false |
| scanRecursively | No | true | No | Enable to scan discovered items recursively. | true / false |
| include | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
| exclude | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
| scanExcludedItems | No | false | No | Enable to force the scan of excluded directories, so child items within the scope can be found. | true / false |
| staticAcl | No | [ ] | Yes | Static ACL configuration object. The fields name, domain, entity and access below belong to each entry. | |
| name | Yes | - | No | Name of the static ACL. | "group1" |
| domain | No | "" | No | Domain of the static ACL. | "testDomain" |
| entity | No | "user" | No | Entity (user / group) represented by the static ACL. | "user" / "group" |
| access | No | "allow" | No | Access (allow / deny) granted by the ACL. | "allow" / "deny" |

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0
{
    "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
    "type": "filesystem",
    "description": "FileSystem Test Connector",
    "properties": {
        "url": "C:\\Directory",
        "ignoreSymLinks": true,
        "stopOnScanError": true,
        "indexContainers": true,
        "scanRecursively": true,
        "scanExcludedItems": true,
        "includes": ".*\\.txt",
        "excludes": ".*\\.png",
        "staticAcl": [{
                "name": "test-user",
                "domain": "test-domain",
                "entity": "user",
                "access": "allow"
            }, {
                "name": "test-group",
                "domain": "",
                "entity": "group",
                "access": "deny"
            }
        ]
    }
}
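
The update call repeats the connection id in both the request path and the body. A minimal Python sketch of assembling such a request follows. `ASPIRE_BASE`, its port, and the helper name `build_update_request` are assumptions for illustration. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: address of the Aspire server; adjust host and port for your setup.
ASPIRE_BASE = "http://localhost:50505"


def build_update_request(collection, entity_id, fields, base=ASPIRE_BASE):
    """Build (but do not send) a PUT request for an existing entity.

    The id is written into the body and appended to the path, mirroring
    the example request shown on this page.
    """
    body = {"id": entity_id, **fields}
    return urllib.request.Request(
        f"{base}/aspire/_api/{collection}/{entity_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_update_request(
    "connections",
    "89d6632a-a296-426c-adb0-d442adcab4b0",
    {
        "type": "filesystem",
        "description": "FileSystem Test Connector",
        "properties": {"url": "C:\\Directory", "scanRecursively": True},
    },
)
print(req.get_method(), req.full_url)
```

Deriving the path and the body id from the same argument keeps the two values from drifting apart.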

Create Connector


To create the Connector object using the REST API, check this page.

Update Connector


To update the Connector object using the REST API, check this page.

Create Seed


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| seed | Yes | - | No | The subdirectory to crawl. This value will be appended to the url of the connection. | "directory" |
| type | Yes | - | No | The value must be "filesystem". | "filesystem" |
| description | Yes | - | No | Name of the seed object. | "MyFileSystemConnection" |
| connector | Yes | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | Yes | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | No | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | No | - | No | Id of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| tags | No | [ ] | Yes | The tags of the seed. These can be used to filter seeds. | ["tag1", "tag2"] |
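
The seed value is described as being appended to the connection url to form the full path. The short Python sketch below shows one way to preview that combined path for a Windows-style base directory. The helper name `full_crawl_path` is hypothetical, and the use of `PureWindowsPath` joining is an assumption. The connector's own joining rules are not documented on this page.

```python
from pathlib import PureWindowsPath


def full_crawl_path(connection_url, seed):
    """Hypothetical preview of the path crawled for a connection url and seed.

    Assumption: the seed is joined to the url as a child path using
    Windows path rules.
    """
    return str(PureWindowsPath(connection_url) / seed)


# Values from the examples: connection url "C:\\Directory", seed "directory".
print(full_crawl_path("C:\\Directory", "directory"))
```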

Example

POST aspire/_api/seeds
{
    "type": "filesystem",
    "seed": "directory",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        "seed": "directory"
    }
}
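
Because a seed refers to several other objects, it can be useful to check a payload against the Required column of the table before sending it. The following Python sketch does that on the client side. The helper name `check_seed_payload` and the wording of its messages are illustrative assumptions, and the server may apply further validation that is not shown here.

```python
# Fields marked as required in the Create Seed table.
REQUIRED_ON_CREATE = ("seed", "type", "description", "connector", "connection")


def check_seed_payload(payload):
    """Return a list of problems found; an empty list means none were found."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_ON_CREATE
        if name not in payload
    ]
    # A missing type is already reported above, so only flag a wrong value here.
    if payload.get("type", "filesystem") != "filesystem":
        problems.append('type must be "filesystem"')
    return problems


payload = {
    "type": "filesystem",
    "seed": "directory",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
}
print(check_seed_payload(payload))
```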

Update Seed


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| id | Yes | - | No | Id of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
| seed | No | - | No | The subdirectory to crawl. This value will be appended to the url of the connection. | "directory" |
| description | No | - | No | Name of the seed object. | "MyFileSystemConnection" |
| connector | No | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | No | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | No | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.add | No | [ ] | Yes | The ids of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.remove | No | [ ] | Yes | The ids of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | No | - | No | Id of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.add | No | [ ] | Yes | The ids of the routing policies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.remove | No | [ ] | Yes | The ids of the routing policies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
| tags | No | [ ] | Yes | The tags of the seed. These can be used to filter seeds. | ["tag1", "tag3"] |
| tags.add | No | [ ] | Yes | The tags to add. | ["tag4"] |
| tags.remove | No | [ ] | Yes | The tags to remove. | ["tag2"] |
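
The add and remove variants of workflows, routingPolicies and tags suggest incremental changes to an existing list, as opposed to replacing it outright. The Python sketch below models one plausible reading of that behaviour on the client side. The helper name `apply_list_update` is hypothetical, and the order of operations (remove first, then add, without duplicates) is an assumption. The server's actual semantics are not spelled out on this page.

```python
def apply_list_update(current, add=(), remove=()):
    """Hypothetical model of an add/remove update on a list-valued field.

    Assumptions: removals are applied first, additions are appended in
    order, and values already present are not duplicated.
    """
    result = [item for item in current if item not in remove]
    for item in add:
        if item not in result:
            result.append(item)
    return result


# Tags from the create example, updated with the add/remove examples above.
tags = ["tag1", "tag2"]
print(apply_list_update(tags, add=["tag4"], remove=["tag2"]))
```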

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778
{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "type": "filesystem",
    "seed": "directory",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag", "tag2"],
    "properties": {
        "seed": "directory"
    }
}