Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "http listener". | "http listener" |
description | Yes | - | No | Name of the connection object. | "http listener connection" |
properties | Yes | - | No | Configuration object | |
Content | No | false | No | Set this parameter to true if you will be POSTing XML/JSON data to the HTTP Feeder. This XML/JSON data will be set as an input stream attached to the job published by the feeder. | true |
multipart | Yes | "disabled" | No | Enable multi-part form submission, which allows files and other input elements to be uploaded to the HTTP server through HTML forms. | "file" |
filehandler | No | "stream" | No | Specify the type of file handler to use for posted files. The stream handler (default) attaches an InputStream for the file to the job, and subsequent stages can access the data using the Standards.Basic.getContentStream(Job j) method in the package com.accenture.aspire.framework. The file handler uploads the file to the specified directory (see uploadDir below). No input stream is attached to the job for the file handler. See above for more details and restrictions. | "file" |
uploadDir | Yes | - | No | Specify the location where files from multi-part forms will be uploaded when using the file handler. | /upload |
transform | No | false | No | Set to true to use a processor to transform XML using XSLT 2.0 files. | false |
xsltFileName | Yes | - | No | The path of the XSL transform file to be used to format the output XML. Path names are relative to Aspire Home. | /config/xsl/executor.xsl |
saxonProcessor | No | false | No | Set to true to use SAXON processors to transform using XSLT 2.0 files. | true |
outputMime | Yes | - | No | Specifies the MIME type which the HTTP feeder will report back to the HTTP client. Change this to "text/html" if your transform creates HTML which should be shown by a browser. | text/xml |
jobMime | No | false | No | If true, the MIME type which the HTTP feeder reports back to the HTTP client is taken from the job. | true |
headers | No | - | Yes | The configuration of the HTTP headers. | |
maxUploadSize | No | 10 MB | No | Specifies the maximum size of an uploaded file. | 10 MB |
debugOutFile | No | - | No | Specify the location where the XSLT-processed output will be written. This is used for debugging the transforms. | /debug/debug_output.txt |
{
  "id": "8338cc3e-ebfa-43c0-a976-fa6125555754",
  "type": "http listener",
  "description": "http listener connection",
  "properties": {
    "Content": true,
    "multipart": "file",
    "fileHandler": "file",
    "uploadDir": "/config/xsl",
    "transform": true,
    "xsltFileName": "/config/xsl/executor.xsl",
    "saxonProcessor": false,
    "outputMime": "text/xml",
    "jobMime": false,
    "headers": [
      {
        "headerName": "Connection",
        "headerValue": "keep-alive"
      },
      {
        "headerName": "Access-Control-Allow-Origin",
        "headerValue": "*"
      }
    ],
    "maxUploadSize": 1000000,
    "debugOutFile": ""
  }
}
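As a rough illustration of how the Required column could be checked before submitting a connection object, the following Python sketch reports the fields that are missing. The helper name `missing_fields` and the two constant tuples are hypothetical and not part of the product; the field names are taken from the table above, and the check itself is only an assumption about how one might pre-validate a payload.

```python
# Hypothetical pre-submission check; not part of the HTTP listener itself.
# Field names marked "Yes" in the Required column of the table above.
REQUIRED_TOP_LEVEL = ("type", "description", "properties")
REQUIRED_PROPERTIES = ("multipart", "uploadDir", "xsltFileName", "outputMime")


def missing_fields(connection):
    """Return the required field names that are absent from a connection object."""
    missing = [name for name in REQUIRED_TOP_LEVEL if name not in connection]
    properties = connection.get("properties", {})
    missing += [
        "properties." + name
        for name in REQUIRED_PROPERTIES
        if name not in properties
    ]
    return missing


# Trimmed copy of the example connection, keeping only the required fields.
example = {
    "type": "http listener",
    "description": "http listener connection",
    "properties": {
        "multipart": "file",
        "uploadDir": "/config/xsl",
        "xsltFileName": "/config/xsl/executor.xsl",
        "outputMime": "text/xml",
    },
}

print(missing_fields(example))  # []
```

The sketch only tests for the presence of keys. It does not validate values such as the allowed settings for multipart.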
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "smb". | "smb" |
description | Yes | - | No | Name of the connection object. | "smbConnection" |
credential | Yes | - | No | The ID of the credential to be used with this connection. The credential type must match the connection type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
hostname | Yes | - | No | Hostname where the shared directory is located. | "shared.example.com" |
port | Yes | 445 | No | Port where the SMB protocol is used. | "445" |
properties | Yes | - | No | Configuration object | |
disableFetch | No | false | No | Check to disable the connector fetcher; only metadata will be collected. | true / false |
verboseSMBJ | No | false | No | Check to enable SMBJ logging. WARNING: Enabling this decreases performance. | true / false |
stopOnScanError | No | true | No | If enabled, the crawl will stop if there is an error on the scan phase. | true / false |
indexContainers | No | false | No | Enable to index the directories. | true / false |
scanRecursively | No | true | No | Enable to scan discovered directories recursively. | true / false |
include | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
exclude | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
scanExcludedItems | No | false | No | Enable to force the scan of excluded directories, so child items within the scope can be found. | true / false |
fetchACLs | No | true | No | Check to retrieve owner, group and ACL information. | true / false |
resolveSIDs | No | true | No | Check to resolve retrieved SIDs from owner, group and ACL. | true / false |
addACLSID | No | false | No | Check to include SID value on ACL output. | true / false |
addACLEncodedSID | No | false | No | Check to include Encoded SID (Base 32) value on ACL output. | true / false |
addACLFlags | No | false | No | Check to include ACL flags on ACL output. | true / false |
addACLType | No | false | No | Check to include ACL type on ACL output. | true / false |
addACLAccessMask | No | false | No | Check to include ACL access mask on ACL output. | true / false |
enableDFS | No | true | No | Enable Distributed File System (DFS) resolution. | true / false |
connectionTimeout | Yes | 6000 | No | Timeout in milliseconds for each SMB request. | "6000" |
maxRetries | Yes | 5 | No | Maximum retries permitted per document. | "5" |
baseBackoff | Yes | 500 | No | Base time for the back off sleeps (in ms). | "500" |
backoffMultiplier | Yes | 2.0 | No | Multiplier factor to be used for the back off time. | "2.0" |
lastAccessedUpdates | No | false | No | Check to restore the last accessed date on the documents processed by the connector. WARNING: Requires a user with write permissions. This is not supported by Windows. | true / false |
staticAcl | No | [ ] | Yes | Static ACL configuration object | |
name | Yes | - | No | Name of the static ACL. | "group1" |
domain | No | "" | No | Domain of the static ACL. | "testDomain" |
entity | No | "user" | No | Entity (user / group) represented by the static ACL. | "user" / "group" |
access | No | "allow" | No | Access (allow / deny) granted by the ACL. | "allow" / "deny" |
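To make the include and exclude rows more concrete, here is a minimal Python sketch of URL filtering using the example patterns from the table. The function name `is_in_scope` is hypothetical. The precedence (exclude wins over include), the treatment of an empty include list as "include everything", and the use of `re.search` are all assumptions; the table does not specify them, and the connector's own regular expression engine may behave differently.

```python
import re

# Example patterns copied from the include / exclude rows above.
INCLUDE = [".*pdf$", ".*docx$"]
EXCLUDE = [".*png$", ".*jpeg$"]


def is_in_scope(url, include, exclude):
    """Decide whether a document URL would be crawled (assumed semantics)."""
    # Assumption: a match on any exclude pattern removes the document.
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    # Assumption: an empty include list means every remaining document is included.
    if not include:
        return True
    # A match on any include pattern keeps the document.
    return any(re.search(pattern, url) for pattern in include)


print(is_in_scope("smb://shared.example.com/docs/report.pdf", INCLUDE, EXCLUDE))  # True
print(is_in_scope("smb://shared.example.com/docs/logo.png", INCLUDE, EXCLUDE))    # False
```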
{
  "type": "smb",
  "description": "SMB Test Connector",
  "credential": "2a5ca234-e328-4d40-bb2a-2df3e550b065",
  "properties": {
    "host": "192.168.0.80",
    "port": "445",
    "disableFetch": false,
    "verboseSMBJ": false,
    "stopOnScanError": true,
    "indexContainers": true,
    "scanExcludedItems": true,
    "includes": ".*\\.txt",
    "excludes": ".*\\.png",
    "fetchACLs": true,
    "resolveSIDs": true,
    "addACLSID": false,
    "addACLEncodedSID": false,
    "addACLFlags": false,
    "addACLType": false,
    "addACLAccessMask": false,
    "enableDFS": true,
    "connectionTimeout": 60000,
    "maxRetries": 5,
    "baseBackoff": 500,
    "backoffMultiplier": 2.0,
    "lastAccessedUpdates": false,
    "staticAcl": [
      {
        "name": "test-user",
        "domain": "test-domain",
        "entity": "user",
        "access": "allow"
      },
      {
        "name": "test-group",
        "domain": "",
        "entity": "group",
        "access": "deny"
      }
    ]
  }
}
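The maxRetries, baseBackoff and backoffMultiplier settings suggest an exponential back-off between retries. The Python sketch below shows what the wait times could look like with the values used in this section (5 retries, 500 ms base, multiplier 2.0). The formula `baseBackoff * backoffMultiplier ** attempt` and the function name `backoff_delays_ms` are assumptions for illustration; the connector's actual retry schedule is not documented here and may differ (for example, by adding jitter).

```python
def backoff_delays_ms(base_backoff=500, multiplier=2.0, max_retries=5):
    """Return the assumed wait time in milliseconds before each retry."""
    # Assumption: the delay grows geometrically, starting at base_backoff.
    return [base_backoff * multiplier ** attempt for attempt in range(max_retries)]


delays = backoff_delays_ms()
print(delays)       # [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
print(sum(delays))  # 15500.0
```

Under these assumptions, a document that fails every attempt would spend about 15.5 seconds in back-off sleeps, in addition to the time consumed by each request up to connectionTimeout.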
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
seed | Yes | - | No | Path to the element to be crawled. It can be a directory or a file. | "myDirectory/levelTwo" |
type | Yes | - | No | The value must be "smb". | "smb" |
description | Yes | - | No | Name of the seed object. | "MySMB" |
seedFile | No | false | No | If checked, the path will be processed as a file instead of a directory. WARNING: The crawler will only process the seed and then will stop. | true / false |
connector | Yes | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | Yes | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |
{
  "type": "smb",
  "seed": "myDirectory/levelTwo",
  "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
  "description": "FileSystem_Test_Seed",
  "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
  "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
  "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
  "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
  "tags": ["tag1", "tag2"],
  "properties": {
    "seedFile": false
  }
}
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | ID of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
seed | No | - | No | The subdirectory to crawl. This value will be appended to the URL of the connection. | "myDirectory/levelTwo" |
description | No | - | No | Name of the seed object. | "MySMB" |
seedFile | No | false | No | If checked, the path will be processed as a file instead of a directory. WARNING: The crawler will only process the seed and then will stop. | true / false |
connector | No | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
connection | No | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.add | No | [ ] | Yes | The IDs of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
workflows.remove | No | [ ] | Yes | The IDs of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
throttlePolicy | No | - | No | ID of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.add | No | [ ] | Yes | The IDs of the routingPolicies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
routingPolicies.remove | No | [ ] | Yes | The IDs of the routingPolicies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag3"] |
tags.add | No | [ ] | Yes | The tags to add. | ["tag4"] |
tags.remove | No | [ ] | Yes | The tags to remove. | ["tag2"] |
{
  "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
  "type": "smb",
  "seed": "myDirectory/levelTwo",
  "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
  "description": "FileSystem_Test_Seed",
  "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
  "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
  "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
  "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
  "tags": ["tag", "tag2"],
  "properties": {
    "seedFile": false
  }
}
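The update table offers `.add` and `.remove` variants for workflows, routingPolicies and tags, which change a list incrementally instead of replacing it. The Python sketch below illustrates one plausible reading of those fields using the tag examples from the table. The function name `apply_update`, the order of operations (removals first, then additions) and the suppression of duplicates are assumptions; the server-side behavior is not specified in this section.

```python
def apply_update(current, add=(), remove=()):
    """Return a new list after applying .remove and then .add (assumed order)."""
    # Drop every entry listed under ".remove".
    result = [item for item in current if item not in remove]
    # Append entries listed under ".add", skipping ones already present.
    for item in add:
        if item not in result:
            result.append(item)
    return result


# Existing tags on the seed, then "tags.add": ["tag4"] and "tags.remove": ["tag2"].
tags = apply_update(["tag1", "tag2"], add=["tag4"], remove=["tag2"])
print(tags)  # ['tag1', 'tag4']
```

Sending the plain `tags` field, by contrast, would presumably replace the whole list in one step.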
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "smb". | "smb" |
description | Yes | - | No | Name of the credential object. | "smbCredential" |
domain | No | - | No | Domain of the account that will crawl the shared directory. If the user is a local account, leave this blank. | "WORKGROUP" |
username | Yes | - | No | Account user | "admin" |
password | Yes | - | No | Account password | "234dfc22re!?" |
{
  "type": "smb",
  "description": "SMB snapshot",
  "properties": {
    "username": "test",
    "password": "test1",
    "domain": "WORKGROUP"
  }
}
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "smb". | "smb" |
description | Yes | - | No | Name of the credential object. | "smbCredential" |
domain | No | - | No | Domain of the account that will crawl the shared directory. If the user is a local account, leave this blank. | "WORKGROUP" |
username | Yes | - | No | Account user | "admin" |
password | No | - | No | Account password | "234dfc22re!?" |
{
  "type": "smb",
  "description": "SMB snapshot",
  "properties": {
    "username": "test",
    "password": "test1",
    "domain": "WORKGROUP"
  }
}