The File System SMB Connector can be configured using the REST API. It requires the following entities to be created:

  • Connection
  • Connector
  • Credential
  • Seed

Below are examples of how to create the Connection and the Seed. For the Connector, please check this page.


Create Connection


| Field | Optional | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | No | - | No | The value must be "smb". | "smb" |
| description | No | - | No | Name of the connection object. | "smbConnection" |
| hostname | No | - | No | Hostname where the shared directory is located. | "shared.example.com" |
| port | No | 445 | No | Port where the SMB protocol is used. | "445" |
| properties | No | - | No | Configuration object | |
| disableFetch | Yes | false | No | Check to disable the connector fetcher, only metadata will be collected. | true / false |
| verboseSMBJ | Yes | false | No | Check to enable SMBJ logging. (WARNING) Enabling this would decrease performance. | true / false |
| stopOnScanError | Yes | true | No | If enabled, the crawl will stop if there is an error on the scan phase. | true / false |
| indexContainers | Yes | false | No | Enable to index the directories. | true / false |
| scanRecursively | Yes | true | No | Enable to scan discovered directories recursively. | true / false |
| include | Yes | [ ] | Yes | Patterns to match against document URL, if any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
| exclude | Yes | [ ] | Yes | Patterns to match against document URL, if any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
| scanExcludedItems | Yes | false | No | Enable to force the scan of excluded directories, so child items within the scope can be found. | true / false |
| fetchACLs | Yes | true | No | Check to retrieve owner, group and ACL information. | true / false |
| resolveSIDs | Yes | true | No | Check to resolve retrieved SIDs from owner, group and ACL. | true / false |
| addACLSID | Yes | false | No | Check to include SID value on ACL output. | true / false |
| addACLEncodedSID | Yes | false | No | Check to include Encoded SID (Base 32) value on ACL output. | true / false |
| addACLFlags | Yes | false | No | Check to include ACL flags on ACL output. | true / false |
| addACLType | Yes | false | No | Check to include ACL type on ACL output. | true / false |
| addACLAccessMask | Yes | false | No | Check to include ACL access mask on ACL output. | true / false |
| enableDFS | Yes | true | No | | true / false |
| connectionTimeout | No | 6000 | No | | "6000" |
| maxRetries | No | 5 | No | | "5" |
| baseBackoff | No | 500 | No | Base time for the backoff sleeps (in ms). | "500" |
| backoffMultiplier | No | 2.0 | No | Multiplier factor to be used for the backoff time. | "2.0" |
| lastAccessedUpdates | Yes | false | No | Check to restore the last accessed date on the documents processed by the connector. WARNING: Requires a user with permissions for writing. This is not supported by Windows. | true / false |
| staticAcl | Yes | [ ] | Yes | Static ACL configuration object | |
| name | No | - | No | Name of the static ACL. | "group1" |
| domain | Yes | "" | No | Domain of the static ACL. | "testDomain" |
| entity | Yes | "user" | No | Entity (user / group) represented by the static ACL. | "user" / "group" |
| access | Yes | "allow" | No | Access (allow / deny) granted by the ACL. | "allow" / "deny" |

Example


POST aspire/_api/connections
{
    "type": "smb",
    "description": "SMB Test Connector",
    "properties": {
        "host": "192.168.0.80",
        "port": "445",
        "disableFetch": false,
        "verboseSMBJ": false,
        "stopOnScanError": true,
        "indexContainers": true,
        "scanExcludedItems": true,
        "includes": ".*\\.txt",
        "excludes": ".*\\.png",
        "fetchACLs": true,
        "resolveSIDs": true,
        "addACLSID": false,
        "addACLEncodedSID": false,
        "addACLFlags": false,
        "addACLType": false,
        "addACLAccessMask": false,
        "enableDFS": true,
        "connectionTimeout": 60000,
        "maxRetries": 5,
        "baseBackoff": 500,
        "backoffMultiplier": 2.0,
        "lastAccessedUpdates": false,
        "staticAcl": [{
                "name": "test-user",
                "domain": "test-domain",
                "entity": "user",
                "access": "allow"
            }, {
                "name": "test-group",
                "domain": "",
                "entity": "group",
                "access": "deny"
            }
        ]
    }
}
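
For readers who script this call, the following is a minimal sketch of how the create request could be assembled in Python with only the standard library. The base URL `http://localhost:50505`, the `Content-Type` header, and the helper name `build_create_connection_request` are assumptions made for illustration; they are not taken from this page. The sketch only builds the request object, so sending it and adding any authentication your server requires is left to the caller.

```python
import json
import urllib.request

# Assumption: base URL of the Aspire server; adjust host and port to your deployment.
ASPIRE_BASE_URL = "http://localhost:50505"


def build_create_connection_request(host, port="445", description="SMB Test Connector"):
    """Build (but do not send) a POST request that creates an SMB connection.

    Hypothetical helper: only a few of the documented properties are set here.
    """
    payload = {
        "type": "smb",
        "description": description,
        "properties": {
            "host": host,
            "port": port,
            "fetchACLs": True,
            "stopOnScanError": True,
        },
    }
    return urllib.request.Request(
        url=f"{ASPIRE_BASE_URL}/aspire/_api/connections",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_create_connection_request("192.168.0.80")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the payload easy to inspect first.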

Update Connection


| Field | Optional | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | No | - | No | The value must be "smb". | "smb" |
| description | No | - | No | Name of the connection object. | "smbConnection" |
| hostname | No | - | No | Hostname where the shared directory is located. | "shared.example.com" |
| port | No | 445 | No | Port where the SMB protocol is used. | "445" |
| properties | No | - | No | Configuration object | |
| disableFetch | Yes | false | No | Check to disable the connector fetcher, only metadata will be collected. | true / false |
| verboseSMBJ | Yes | false | No | Check to enable SMBJ logging. (WARNING) Enabling this would decrease performance. | true / false |
| stopOnScanError | Yes | true | No | If enabled, the crawl will stop if there is an error on the scan phase. | true / false |
| indexContainers | Yes | false | No | Enable to index the directories. | true / false |
| scanRecursively | Yes | true | No | Enable to scan discovered directories recursively. | true / false |
| include | Yes | [ ] | Yes | Patterns to match against document URL, if any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
| exclude | Yes | [ ] | Yes | Patterns to match against document URL, if any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
| scanExcludedItems | Yes | false | No | Enable to force the scan of excluded directories, so child items within the scope can be found. | true / false |
| fetchACLs | Yes | true | No | Check to retrieve owner, group and ACL information. | true / false |
| resolveSIDs | Yes | true | No | Check to resolve retrieved SIDs from owner, group and ACL. | true / false |
| addACLSID | Yes | false | No | Check to include SID value on ACL output. | true / false |
| addACLEncodedSID | Yes | false | No | Check to include Encoded SID (Base 32) value on ACL output. | true / false |
| addACLFlags | Yes | false | No | Check to include ACL flags on ACL output. | true / false |
| addACLType | Yes | false | No | Check to include ACL type on ACL output. | true / false |
| addACLAccessMask | Yes | false | No | Check to include ACL access mask on ACL output. | true / false |
| enableDFS | Yes | true | No | | true / false |
| connectionTimeout | No | 6000 | No | | "6000" |
| maxRetries | No | 5 | No | | "5" |
| baseBackoff | No | 500 | No | Base time for the backoff sleeps (in ms). | "500" |
| backoffMultiplier | No | 2.0 | No | Multiplier factor to be used for the backoff time. | "2.0" |
| lastAccessedUpdates | Yes | false | No | Check to restore the last accessed date on the documents processed by the connector. WARNING: Requires a user with permissions for writing. This is not supported by Windows. | true / false |
| staticAcl | Yes | [ ] | Yes | Static ACL configuration object | |
| name | No | - | No | Name of the static ACL. | "group1" |
| domain | Yes | "" | No | Domain of the static ACL. | "testDomain" |
| entity | Yes | "user" | No | Entity (user / group) represented by the static ACL. | "user" / "group" |
| access | Yes | "allow" | No | Access (allow / deny) granted by the ACL. | "allow" / "deny" |

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0
{
    "type": "smb",
    "description": "SMB Test Connector",
    "properties": {
        "host": "192.168.0.80",
        "port": "445",
        "disableFetch": false,
        "verboseSMBJ": false,
        "stopOnScanError": true,
        "indexContainers": true,
        "scanExcludedItems": true,
        "includes": ".*\\.txt",
        "excludes": ".*\\.png",
        "fetchACLs": true,
        "resolveSIDs": true,
        "addACLSID": false,
        "addACLEncodedSID": false,
        "addACLFlags": false,
        "addACLType": false,
        "addACLAccessMask": false,
        "enableDFS": true,
        "connectionTimeout": 60000,
        "maxRetries": 5,
        "baseBackoff": 500,
        "backoffMultiplier": 2.0,
        "lastAccessedUpdates": false,
        "staticAcl": [{
                "name": "test-user",
                "domain": "test-domain",
                "entity": "user",
                "access": "allow"
            }, {
                "name": "test-group",
                "domain": "",
                "entity": "group",
                "access": "deny"
            }
        ]
    }
}
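
The retry fields (maxRetries, baseBackoff, backoffMultiplier) are easier to choose with concrete numbers in hand. This page does not state how the connector combines them, so the sketch below assumes the conventional exponential scheme, where the wait before retry n is baseBackoff multiplied by backoffMultiplier to the power n, with no jitter. The function name `backoff_delays_ms` is hypothetical.

```python
def backoff_delays_ms(max_retries=5, base_backoff_ms=500, multiplier=2.0):
    """Return the assumed wait before each retry, in milliseconds.

    Assumption: plain exponential backoff; the connector's real schedule
    may differ.
    """
    return [base_backoff_ms * multiplier ** attempt for attempt in range(max_retries)]


delays = backoff_delays_ms()
print(delays)       # [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
print(sum(delays))  # 15500.0
```

Under that assumption, the default values add up to about 15.5 seconds of waiting across all retries, in addition to the time spent on the connection attempts themselves.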

Create Connector


To create the Connector object using the REST API, check this page.

Update Connector


To update the Connector object using the REST API, check this page.

Create Seed


| Field | Optional | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| seed | No | - | No | Path to the element to be crawled, can be a directory or a file. | "directory" |
| type | No | - | No | The value must be "smb". | "smb" |
| description | No | - | No | Name of the seed object. | "MySMB" |
| seedFile | Yes | false | No | If checked the path will be processed as a file instead of a directory. WARNING: The crawler will only process the seed and then will stop. | true / false |
| connector | No | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | No | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | Yes | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | Yes | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | Yes | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| tags | Yes | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |

Example

POST aspire/_api/seeds
{
    "type": "smb",
    "seed": "myDirectory/levelTwo",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        "seedFile": false
    }
}
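
Before posting a seed, it can help to check locally that every field marked "No" in the Optional column of the Create Seed table is present. The sketch below does only that. The helper name `missing_seed_fields` is hypothetical, and the check is a client-side convenience rather than a description of how the server validates requests.

```python
# Fields whose "Optional" column is "No" in the Create Seed table.
REQUIRED_SEED_FIELDS = ("seed", "type", "description", "connector", "connection")


def missing_seed_fields(payload):
    """Return the required top-level fields that are absent or empty."""
    return [name for name in REQUIRED_SEED_FIELDS if not payload.get(name)]


seed_payload = {
    "type": "smb",
    "seed": "myDirectory/levelTwo",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "properties": {"seedFile": False},
}

print(missing_seed_fields(seed_payload))  # []
```

An empty list means the payload carries all required fields. Any names returned should be filled in before the request is sent.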

Update Seed


| Field | Optional | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| id | No | - | No | Id of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
| seed | Yes | - | No | The subdirectory to crawl. This value will be appended to the url of the connection. | "myDirectory/levelTwo" |
| description | Yes | - | No | Name of the seed object. | "MySMB" |
| seedFile | Yes | false | No | If checked the path will be processed as a file instead of a directory. WARNING: The crawler will only process the seed and then will stop. | true / false |
| connector | Yes | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | Yes | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | Yes | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.add | Yes | [ ] | Yes | The ids of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.remove | Yes | [ ] | Yes | The ids of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | Yes | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | Yes | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.add | Yes | [ ] | Yes | The ids of the routingPolicies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.remove | Yes | [ ] | Yes | The ids of the routingPolicies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
| tags | Yes | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag3"] |
| tags.add | Yes | [ ] | Yes | The tags to add. | ["tag4"] |
| tags.remove | Yes | [ ] | Yes | The tags to remove. | ["tag2"] |

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778
{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "type": "smb",
    "seed": "myDirectory/levelTwo",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag", "tag2"],
    "properties": {
        "seedFile": false
    }
}
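
The update call carries the seed id in two places, the URL path and the request body, so a small helper can keep them in sync. The following is a sketch under stated assumptions: the base URL `http://localhost:50505`, the `Content-Type` header, and the function name `build_update_seed_request` are illustrative and not taken from this page. As written, it builds the request without sending it.

```python
import json
import urllib.request

# Assumption: base URL of the Aspire server; adjust to your deployment.
ASPIRE_BASE_URL = "http://localhost:50505"


def build_update_seed_request(seed_id, changes):
    """Build (but do not send) a PUT request that updates one seed.

    The same id is written into the URL path and the body so the two
    cannot drift apart.
    """
    body = dict(changes, id=seed_id)
    return urllib.request.Request(
        url=f"{ASPIRE_BASE_URL}/aspire/_api/seeds/{seed_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="PUT",
    )


update = build_update_seed_request(
    "2f287669-d163-4e35-ad17-6bbfe9df3778",
    {"type": "smb", "seed": "myDirectory/levelTwo", "properties": {"seedFile": False}},
)
print(update.get_method(), update.full_url)
```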