The HTTP Listener Connector can be configured using the REST API. This requires the following entities to be created:

  • Connection
  • Connector
  • Seed

The sections below show how to create and update the Connection, the Seed, and the Credential. For the general Connector configuration, please check this page.

Create Connection


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "http listener". | "http listener" |
| description | Yes | - | No | Name of the connection object. | "http listener connection" |
| properties | Yes | - | No | Configuration object. | |
| Content | No | false | No | Set this parameter to true if you will be POST-ing XML/JSON data to the HTTP Feeder. This XML/JSON data will be set as an input stream attached to the job published by the feeder. | true |
| multipart | Yes | "disabled" | No | Enable multi-part form submission, which allows files to be uploaded to the HTTP server through HTML forms, along with other input elements. | "file" |
| filehandler | No | "stream" | No | Specify the type of file handler to use for posted files. The stream (default) handler attaches an InputStream for the posted file to the job, and subsequent stages can access the data using the Standards.Basic.getContentStream(Job j) method in the package com.accenture.aspire.framework. The file handler uploads the file to the specified directory (see uploadDir below). No input stream is attached to the job for the file handler. See above for more details and restrictions. | "file" |
| uploadDir | Yes | - | No | Specify the location where files from multi-part forms will be uploaded when using the file handler. | /upload |
| transform | No | false | No | Set to true to transform the XML with a processor using XSLT 2.0 files. | false |
| xsltFileName | Yes | - | No | The path of the XSL transform file used to format the output XML. Path names are relative to Aspire Home. | /config/xsl/executor.xsl |
| saxonProcessor | No | false | No | Set to true to use SAXON processors to transform using XSLT 2.0 files. | true |
| outputMime | Yes | - | No | Specifies the MIME type which the HTTP feeder will report back to the HTTP client. Change this to "text/html" if your transform creates HTML which should be shown by a browser. | text/xml |
| jobMime | No | false | No | Specifies the MIME type which the HTTP feeder will report back to the HTTP client. When true, the MIME type is taken from the Job. | true |
| headers | No | - | Yes | The configuration of the HTTP headers. | |
| maxUploadSize | No | 10 MB | No | Specifies the maximum size of an uploaded file. | 10 MB |
| debugOutFile | No | - | No | Specify the location where the XSLT-processed output will be written. This is used for debugging the transforms. | /debug/debug_output.txt |

Example

POST aspire/_api/connections
{
    "type": "http listener",
    "description": "http listener connection",
    "properties": {
        "Content": true,
        "multipart": "file",
        "fileHandler": "file",
        "uploadDir": "/config/xsl",
        "transform": true,
        "xsltFileName": "/config/xsl/executor.xsl",
        "saxonProcessor": false,
        "outputMime": "text/xml",
        "jobMime": false,
        "headers": [
            {
                "headerName": "Connection",
                "headerValue": "keep-alive"
            },
            {
                "headerName": "Access-Control-Allow-Origin",
                "headerValue": "*"
            }
        ],
        "maxUploadSize": 1000000,
        "debugOutFile": ""
    }
}
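
As a minimal sketch, the create request above could be assembled in Python using only the standard library. The base URL (`http://localhost:50505`) and the helper name `build_create_connection_request` are assumptions made for illustration; the request is only built here, not sent, and any authentication your Aspire instance requires is not shown.

```python
import json
import urllib.request

# Assumption: base URL of the Aspire server; adjust it to your deployment.
ASPIRE_BASE_URL = "http://localhost:50505"


def build_create_connection_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates the connection."""
    payload = {
        "type": "http listener",
        "description": "http listener connection",
        "properties": {
            "Content": True,
            "multipart": "file",
            "fileHandler": "file",
            "uploadDir": "/config/xsl",
            "transform": True,
            "xsltFileName": "/config/xsl/executor.xsl",
            "saxonProcessor": False,
            "outputMime": "text/xml",
            "jobMime": False,
            "headers": [
                {"headerName": "Connection", "headerValue": "keep-alive"},
                {"headerName": "Access-Control-Allow-Origin", "headerValue": "*"},
            ],
            "maxUploadSize": 1000000,
            "debugOutFile": "",
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/aspire/_api/connections",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_connection_request(ASPIRE_BASE_URL)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the JSON body before anything reaches the server.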

Update Connection


FieldRequiredDefaultMultipleNotesExample
| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "smb". | "smb" |
| description | Yes | - | No | Name of the connection object. | "smbConnection" |
| credential | Yes | - | No | The ID of the credential to be used with this connection. The credential type must match the connection type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| hostname | Yes | - | No | Hostname where the shared directory is located. | "shared.example.com" |
| port | Yes | 445 | No | Port where the SMB protocol is used. | "445" |
| properties | Yes | - | No | Configuration object. | |
| disableFetch | No | false | No | Check to disable the connector fetcher; only metadata will be collected. | true / false |
| verboseSMBJ | No | false | No | Check to enable SMBJ logging. WARNING: Enabling this decreases performance. | true / false |
| stopOnScanError | No | true | No | If enabled, the crawl will stop if there is an error in the scan phase. | true / false |
| indexContainers | No | false | No | Enable to index the directories. | true / false |
| scanRecursively | No | true | No | Enable to scan discovered directories recursively. | true / false |
| include | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be included in the crawl. | [ ".*pdf$", ".*docx$" ] |
| exclude | No | [ ] | Yes | Patterns to match against the document URL. If any of them match, the document will be excluded from the crawl. | [ ".*png$", ".*jpeg$" ] |
| scanExcludedItems | No | false | No | Enable to force the scan of excluded directories, so child items within the scope can be found. | true / false |
| fetchACLs | No | true | No | Check to retrieve owner, group and ACL information. | true / false |
| resolveSIDs | No | true | No | Check to resolve retrieved SIDs from owner, group and ACL. | true / false |
| addACLSID | No | false | No | Check to include the SID value in the ACL output. | true / false |
| addACLEncodedSID | No | false | No | Check to include the encoded SID (Base 32) value in the ACL output. | true / false |
| addACLFlags | No | false | No | Check to include ACL flags in the ACL output. | true / false |
| addACLType | No | false | No | Check to include the ACL type in the ACL output. | true / false |
| addACLAccessMask | No | false | No | Check to include the ACL access mask in the ACL output. | true / false |
| enableDFS | No | true | No | Enable Distributed File System (DFS) resolution. | true / false |
| connectionTimeout | Yes | 6000 | No | Timeout in milliseconds for each SMB request. | "6000" |
| maxRetries | Yes | 5 | No | Maximum retries permitted per document. | "5" |
| baseBackoff | Yes | 500 | No | Base time for the back-off sleeps (in ms). | "500" |
| backoffMultiplier | Yes | 2.0 | No | Multiplier factor to be used for the back-off time. | "2.0" |
| lastAccessedUpdates | No | false | No | Check to restore the last accessed date on the documents processed by the connector. WARNING: Requires a user with write permissions. This is not supported by Windows. | true / false |
| staticAcl | No | [ ] | Yes | Static ACL configuration object. | |
| name | Yes | - | No | Name of the static ACL. | "group1" |
| domain | No | "" | No | Domain of the static ACL. | "testDomain" |
| entity | No | "user" | No | Entity (user / group) represented by the static ACL. | "user" / "group" |
| access | No | "allow" | No | Access (allow / deny) granted by the ACL. | "allow" / "deny" |
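
The include and exclude patterns in the table are regular expressions tested against each document URL, with a match on any one pattern being enough. The sketch below illustrates that "any pattern matches" rule in Python. The helper name `matches_any`, the sample URLs, and the use of `re.search` are assumptions for illustration; the connector's actual regex engine, and how it combines include and exclude when both match, are not specified here.

```python
import re
from typing import Iterable


def matches_any(patterns: Iterable[str], url: str) -> bool:
    """Return True if at least one regular expression matches the URL."""
    return any(re.search(pattern, url) for pattern in patterns)


# Pattern lists taken from the Example column of the table.
include = [".*pdf$", ".*docx$"]
exclude = [".*png$", ".*jpeg$"]

# Hypothetical document URLs, used only to exercise the patterns.
sample_urls = [
    "smb://shared.example.com/docs/report.pdf",
    "smb://shared.example.com/img/logo.png",
]

for url in sample_urls:
    print(url, "include:", matches_any(include, url), "exclude:", matches_any(exclude, url))
```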

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0
{
    "type": "smb",
    "description": "SMB Test Connector",
    "credential": "2a5ca234-e328-4d40-bb2a-2df3e550b065",
    "properties": {
        "host": "192.168.0.80",
        "port": "445",
        "disableFetch": false,
        "verboseSMBJ": false,
        "stopOnScanError": true,
        "indexContainers": true,
        "scanExcludedItems": true,
        "includes": ".*\\.txt",
        "excludes": ".*\\.png",
        "fetchACLs": true,
        "resolveSIDs": true,
        "addACLSID": false,
        "addACLEncodedSID": false,
        "addACLFlags": false,
        "addACLType": false,
        "addACLAccessMask": false,
        "enableDFS": true,
        "connectionTimeout": 60000,
        "maxRetries": 5,
        "baseBackoff": 500,
        "backoffMultiplier": 2.0,
        "lastAccessedUpdates": false,
        "staticAcl": [{
                "name": "test-user",
                "domain": "test-domain",
                "entity": "user",
                "access": "allow"
            }, {
                "name": "test-group",
                "domain": "",
                "entity": "group",
                "access": "deny"
            }
        ]
    }
}
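
To get a feel for how `baseBackoff`, `backoffMultiplier`, and `maxRetries` interact, the sketch below computes a retry delay schedule for the values in the example. It assumes a plain exponential back-off (delay = base × multiplier^attempt); the connector's exact formula is not documented on this page, so treat this as an illustration rather than a description of its internals. The function name `backoff_delays_ms` is hypothetical.

```python
from typing import List


def backoff_delays_ms(base_backoff_ms: float, multiplier: float, max_retries: int) -> List[float]:
    """Delay before each retry, assuming plain exponential back-off."""
    return [base_backoff_ms * multiplier ** attempt for attempt in range(max_retries)]


# Values from the example: baseBackoff=500, backoffMultiplier=2.0, maxRetries=5
print(backoff_delays_ms(500, 2.0, 5))
# prints [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
```

Under this assumption, a document that exhausts all five retries would wait 15.5 seconds in total between attempts.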

Create Connector


To create the Connector object using the REST API, check this page.

Update Connector


To update the Connector object using the REST API, check this page.

Create Seed


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| seed | Yes | - | No | Path to the element to be crawled; can be a directory or a file. | "myDirectory/levelTwo" |
| type | Yes | - | No | The value must be "filesystem". | "smb" |
| description | Yes | - | No | Name of the seed object. | "MySMB" |
| seedFile | No | false | No | If checked, the path will be processed as a file instead of a directory. WARNING: The crawler will only process the seed and then will stop. | true / false |
| connector | Yes | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | Yes | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |

Example

POST aspire/_api/seeds
{
    "type": "smb",
    "seed": "myDirectory/levelTwo",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        "seedFile": false
    }
}
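
Before posting a seed, it can help to check locally that every field marked as required in the table is present. The following is a minimal sketch of such a check; the helper name `missing_required_fields` is hypothetical, and the check is purely client-side, so it says nothing about additional validation the server may perform.

```python
import json

# Fields marked "Required: Yes" in the Create Seed table.
REQUIRED_SEED_FIELDS = ("seed", "type", "description", "connector", "connection")


def missing_required_fields(seed: dict) -> list:
    """Return the required field names that are absent or empty in the payload."""
    return [field for field in REQUIRED_SEED_FIELDS if not seed.get(field)]


seed_payload = {
    "type": "smb",
    "seed": "myDirectory/levelTwo",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {"seedFile": False},
}

missing = missing_required_fields(seed_payload)
if missing:
    print("Missing required fields:", missing)
else:
    print(json.dumps(seed_payload, indent=4))
```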

Update Seed


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| id | Yes | - | No | ID of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
| seed | No | - | No | The subdirectory to crawl. This value will be appended to the URL of the connection. | "myDirectory/levelTwo" |
| description | No | - | No | Name of the seed object. | "MySMB" |
| seedFile | No | false | No | If checked, the path will be processed as a file instead of a directory. WARNING: The crawler will only process the seed and then will stop. | true / false |
| connector | No | - | No | The ID of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | No | - | No | The ID of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | No | [ ] | Yes | The IDs of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.add | No | [ ] | Yes | The IDs of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.remove | No | [ ] | Yes | The IDs of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | No | - | No | ID of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The IDs of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.add | No | [ ] | Yes | The IDs of the routing policies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.remove | No | [ ] | Yes | The IDs of the routing policies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
| tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag3"] |
| tags.add | No | [ ] | Yes | The tags to add. | ["tag4"] |
| tags.remove | No | [ ] | Yes | The tags to remove. | ["tag2"] |

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778
{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "type": "smb",
    "seed": "myDirectory/levelTwo",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "FileSystem_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag", "tag2"],
    "properties": {
        "seedFile": false
    }
}
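
The `.add` and `.remove` variants in the table (`workflows.add`, `tags.remove`, and so on) change a list incrementally instead of replacing it. The sketch below simulates the outcome one would expect from such an update on the client side. It is an assumption about the resulting list, not a description of how the server applies the change (ordering and duplicate handling in particular may differ), and `apply_list_update` is a hypothetical helper.

```python
from typing import Iterable, List


def apply_list_update(current: Iterable[str],
                      add: Iterable[str] = (),
                      remove: Iterable[str] = ()) -> List[str]:
    """Simulate an incremental update: drop removed items, then append new ones."""
    to_remove = set(remove)
    result = [item for item in current if item not in to_remove]
    for item in add:
        if item not in result:  # avoid duplicating an entry that is already present
            result.append(item)
    return result


# Tag values from the Example column: start from ["tag1", "tag2"],
# add "tag4" and remove "tag2".
print(apply_list_update(["tag1", "tag2"], add=["tag4"], remove=["tag2"]))
# prints ['tag1', 'tag4']
```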

Create Credential


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "smb". | "smb" |
| description | Yes | - | No | Name of the credential object. | "smbCredential" |
| domain | No | - | No | Domain of the account that will crawl the shared directory. If the user is a local account, leave blank. | "WORKGROUP" |
| username | Yes | - | No | Account user. | "admin" |
| password | Yes | - | No | Account password. | "234dfc22re!?" |

Example

POST aspire/_api/credentials
{
    "type": "smb",
    "description": "SMB snapshot",
    "properties": {
        "username": "test",
        "password": "test1",
        "domain":"WORKGROUP"
    }
}
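
When scripting credential creation, it is usually preferable to keep the password out of the source file. The sketch below builds the same payload while reading the password from an environment variable. The variable name `ASPIRE_CREDENTIAL_PASSWORD` and the helper `build_credential_payload` are assumptions for illustration, and the fallback to the sample value `test1` exists only so that the snippet runs as-is.

```python
import os
from typing import Optional


def build_credential_payload(username: str, password: str,
                             domain: Optional[str] = None) -> dict:
    """Build the credential payload; the optional domain is omitted when blank."""
    properties = {"username": username, "password": password}
    if domain:
        properties["domain"] = domain
    return {"type": "smb", "description": "SMB snapshot", "properties": properties}


# Assumption: the password is supplied via an environment variable.
password = os.environ.get("ASPIRE_CREDENTIAL_PASSWORD", "test1")
payload = build_credential_payload("test", password, domain="WORKGROUP")

# Print only the property names so the password never reaches the console.
print(sorted(payload["properties"]))
```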

Update Credential

| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "smb". | "smb" |
| description | Yes | - | No | Name of the credential object. | "smbCredential" |
| domain | No | - | No | Domain of the account that will crawl the shared directory. If the user is a local account, leave blank. | "WORKGROUP" |
| username | Yes | - | No | Account user. | "admin" |
| password | No | - | No | Account password. | "234dfc22re!?" |

Example

PUT aspire/_api/credentials/2a5ca234-e328-4d40-bb2a-2df3e550b065
{
    "type": "smb",
    "description": "SMB snapshot",
    "properties": {
        "username": "test",
        "password": "test1",
        "domain":"WORKGROUP"
    }
}
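
Because the table marks `password` as optional on update, a client can leave it out of the request body. The sketch below builds such a PUT request with Python's standard library without sending it. The base URL, the helper name `build_update_credential_request`, and the idea that an omitted password leaves the stored one unchanged are assumptions; only the endpoint path and field names come from this page.

```python
import json
import urllib.request
from typing import Optional


def build_update_credential_request(base_url: str, credential_id: str, username: str,
                                    password: Optional[str] = None,
                                    domain: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the PUT request that updates a credential."""
    properties = {"username": username}
    if password is not None:  # optional on update, so only include it when given
        properties["password"] = password
    if domain:
        properties["domain"] = domain
    payload = {"type": "smb", "description": "SMB snapshot", "properties": properties}
    return urllib.request.Request(
        url=f"{base_url}/aspire/_api/credentials/{credential_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_update_credential_request(
    "http://localhost:50505",  # assumption: adjust to your Aspire server
    "2a5ca234-e328-4d40-bb2a-2df3e550b065",
    username="test",
    domain="WORKGROUP",
)
print(request.get_method(), request.full_url)
```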