The Aspider Web Crawler Connector can be configured using the REST API. This requires creating the following entities:

  • Credential
  • Connection
  • Connector
  • Seed

Below are examples of how to create the Credential, the Connection, and the Seed. For the Connector, please check this page.


Create Credential (Common)


| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| type | No | - | No | The value must be "aspider". | "aspider" |
| description | No | - | No | Name of the credential object. | "AspiderCredential" |
| properties | No | - | No | Configuration object. | |
| useSelenium | Yes | false | No | Flag to let Aspider know if it has to set up Selenium. | true / false |
| webDriverImplementation | No | - | No | Browser used by Selenium. Possible values: CHROME, FIREFOX. Note: Only used if useSelenium is set to true. | "CHROME", "FIREFOX" |
| webDriverPath | No | - | No | Path to the Selenium web driver executable. Note: The driver must have execution permission. Note: Only used if useSelenium is set to true. | "lib\\chromedriver.exe" |
| headlessMode | No | - | No | Flag to start the browser in headless mode (no GUI). Note: Only used if useSelenium is set to true. | true / false |
| authMech | No | [] | Yes | Array containing the authentication mechanisms. | [] |
| host | Yes | "" | | | |
| port | Yes | -1 | | | |
| scheme | No | - | No | Scheme to use during the authentication. Possible values: Basic, Digest, NTLM, Negotiate, Forms, Selenium. | "Basic", "Digest", "NTLM", "Negotiate", "Forms", "Selenium" |
| user | No | - | | | |
| password | No | - | | | |
| domain | Yes | "" | | | |
| realm | No | "" | | | |
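As an illustration of how the fields in the table above relate to each other, the following Python sketch checks a credential properties object before it is sent. It assumes that the Selenium-related fields only matter when useSelenium is true, and that scheme, user, and password belong to each entry of the authMech array; that nesting is an assumption, and check_credential_properties is a hypothetical helper name, not part of the Aspire API.

```python
# Values listed in the table above.
ALLOWED_SCHEMES = {"Basic", "Digest", "NTLM", "Negotiate", "Forms", "Selenium"}
ALLOWED_DRIVERS = {"CHROME", "FIREFOX"}


def check_credential_properties(props):
    """Return a list of problems found in a credential "properties" dict.

    Hypothetical helper. Assumes scheme, user and password live inside
    each authMech entry (the nesting is not confirmed by the table).
    """
    problems = []
    # The web driver settings are only consulted when Selenium is enabled.
    if props.get("useSelenium", False):
        if props.get("webDriverImplementation") not in ALLOWED_DRIVERS:
            problems.append("webDriverImplementation must be CHROME or FIREFOX")
        if not props.get("webDriverPath"):
            problems.append("webDriverPath must be set when useSelenium is true")
    # Each authentication mechanism needs a supported scheme and credentials.
    for index, mech in enumerate(props.get("authMech", [])):
        if mech.get("scheme") not in ALLOWED_SCHEMES:
            problems.append(f"authMech[{index}].scheme is not a supported value")
        for required in ("user", "password"):
            if required not in mech:
                problems.append(f"authMech[{index}].{required} is missing")
    return problems
```

An empty list means no problems were detected by these checks; it does not guarantee that the server will accept the object.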

Create Credential - NTLM specific fields

| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| adfs | No | false | No | Flag to indicate if ADFS should be used; only required when scheme is "NTLM". | true / false |

Create Credential - Negotiate specific fields

| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| useDefaultKrb5 | No | true | No | Flag to indicate if Aspider should use the system settings for Kerberos. | true / false |
| kdc | Yes | - | No | Hostname of the key distribution center to get the Kerberos tickets. | "kdc.example.com" |
| verbose | No | false | No | Flag to indicate if the entire negotiation process should be logged. | true / false |

Create Credential - Forms specific fields

| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| loginUrl | Yes | - | No | URL of the login page. | "https://example.com/login" |
| formPath | Yes | - | No | CSS selector for getting the login form. | "#content > form" |
| userField | Yes | - | No | Id of the username field. | "txtUser" |
| passwordField | Yes | - | No | Id of the password field. | "txtPass" |
| adfs | No | false | No | | true / false |
| saml | No | false | No | | true / false |
| retries | Yes | - | No | Number of retries to attempt if the authentication fails. | 5 |
| customField | No | [] | Yes | Array of other fields in the form. | [] |
| name | Yes | - | No | Name of the field in the form. | "myField" |
| value | Yes | - | No | Value of the field in the form. | "myValue" |
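To make the Forms fields more concrete, here is a minimal Python sketch that assembles one authentication mechanism entry for form-based login. It assumes the Forms-specific fields sit next to the common fields (scheme, user, password) in the same authMech entry, and that name and value are the keys of each customField item; both are assumptions drawn from the table layout. forms_auth_mech is a hypothetical helper name.

```python
def forms_auth_mech(user, password, login_url, form_path,
                    user_field, password_field,
                    custom_fields=None, retries=None):
    """Build one authMech entry for the "Forms" scheme (hypothetical helper)."""
    mech = {
        "scheme": "Forms",
        "user": user,
        "password": password,
        "loginUrl": login_url,            # URL of the login page
        "formPath": form_path,            # CSS selector of the login form
        "userField": user_field,          # id of the username input
        "passwordField": password_field,  # id of the password input
        "adfs": False,                    # table default
        "saml": False,                    # table default
        # Extra form inputs, expressed as a list of name/value objects.
        "customField": [
            {"name": name, "value": value}
            for name, value in (custom_fields or {}).items()
        ],
    }
    # retries is optional, so it is only emitted when explicitly given.
    if retries is not None:
        mech["retries"] = retries
    return mech


entry = forms_auth_mech(
    "crawler", "secret", "https://example.com/login", "#content > form",
    "txtUser", "txtPass", custom_fields={"myField": "myValue"}, retries=5,
)
```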

Example

POST aspire/_api/credentials
{
    "type": "<connector>",
    "description": "<connector>Credential",
    "properties": {
		
    }
}
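The request above could be prepared from Python roughly as follows. This is a sketch under stated assumptions: BASE_URL is a hypothetical server address, the type value is taken to be "aspider", no authentication headers are included, and the request object is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:50505"  # hypothetical Aspire address, adjust as needed


def build_create_credential_request(description, properties, base_url=BASE_URL):
    """Prepare (but do not send) a POST to aspire/_api/credentials."""
    body = {
        "type": "aspider",
        "description": description,
        "properties": properties,
    }
    return urllib.request.Request(
        url=f"{base_url}/aspire/_api/credentials",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_credential_request("AspiderCredential", {"useSelenium": False})
```

Passing the prepared object to urllib.request.urlopen would send it; any authentication the server requires has to be added separately.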

Update Credential


FieldOptionalDefaultMultipleNotesExample
idNo-NoId of the credential to update."2f287669-d163-4e35-ad17-6bbfe9df3778"
descriptionNo-NoName of the credential object."<connector>Credential"
propertiesNo-NoConfiguration object






Example

PUT aspire/_api/credentials/2a5ca234-e328-4d40-bb2a-2df3e550b065
{
    "id": "2a5ca234-e328-4d40-bb2a-2df3e550b065",
    "description": "<connector>Credential",
    "properties": {
		
    }
}
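In the update example, the credential id appears both in the URL path and in the body. A small Python sketch that keeps the two in sync is shown below; BASE_URL is a hypothetical address, build_update_credential_request is a hypothetical helper, and the request is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:50505"  # hypothetical Aspire address, adjust as needed


def build_update_credential_request(credential_id, description, properties,
                                    base_url=BASE_URL):
    """Prepare (but do not send) a PUT to aspire/_api/credentials/<id>."""
    # The same id is used for the path and for the "id" field of the body.
    body = {
        "id": credential_id,
        "description": description,
        "properties": properties,
    }
    return urllib.request.Request(
        url=f"{base_url}/aspire/_api/credentials/{credential_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


update = build_update_credential_request(
    "2a5ca234-e328-4d40-bb2a-2df3e550b065", "AspiderCredential", {}
)
```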

Create Connection


| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| type | No | - | No | The value must be "<connector>". | "<connector>" |
| description | No | - | No | Name of the connection object. | "My<connector>Connection" |
| credential | No | - | No | Id of the credential assigned to this object. | "2a5ca234-e328-4d40-bb2a-2df3e550b065" |
| throttlePolicy | Yes | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | Yes | [ ] | Yes | The ids of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| properties | No | - | No | Configuration object. | |

Example

POST aspire/_api/connections
{
    "type": "<connector>",
    "description": "<connector> Test Connector",
    "credential": "2a5ca234-e328-4d40-bb2a-2df3e550b065",
    "properties": {

    }
}
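The connection table marks throttlePolicy and routingPolicies as optional. One way to build the body so that optional fields are left out when not supplied is sketched below in Python; connection_payload is a hypothetical helper name, and omitting unset optional fields (rather than sending null) is an assumption about what the server expects.

```python
def connection_payload(connector_type, description, credential_id, properties,
                       throttle_policy=None, routing_policies=None):
    """Build the JSON body for POST aspire/_api/connections (hypothetical helper)."""
    # Required fields according to the table.
    payload = {
        "type": connector_type,
        "description": description,
        "credential": credential_id,
        "properties": properties,
    }
    # Optional fields are only added when a value was provided.
    if throttle_policy is not None:
        payload["throttlePolicy"] = throttle_policy
    if routing_policies:
        payload["routingPolicies"] = list(routing_policies)
    return payload


minimal = connection_payload(
    "aspider", "MyAspiderConnection",
    "2a5ca234-e328-4d40-bb2a-2df3e550b065", {},
)
```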

Update Connection

| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| id | No | - | No | Id of the connection to update. | "89d6632a-a296-426c-adb0-d442adcab4b0" |
| description | Yes | - | No | Name of the connection object. | "MyAspiderConnection" |
| credential | Yes | - | No | Id of the credential assigned to this object. | "2a5ca234-e328-4d40-bb2a-2df3e550b065" |
| throttlePolicy | Yes | - | No | Id of the throttle policy that applies to this connection object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | Yes | [ ] | Yes | The ids of the routing policies that this connection will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| properties | No | - | No | Configuration object. | |

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0
{
    "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
    "description": "<connector> Test Connector",
    "credential": "2a5ca234-e328-4d40-bb2a-2df3e550b065",
    "properties": {

    }
}

Create Connector Instance


To create the Connector object using the REST API, check this page.

Update Connector Instance


To update the Connector object using the REST API, check this page.

Create Seed


| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| seed | No | - | No | <seed description> | |
| type | No | - | No | The value must be "<connector>". | "<connector>" |
| description | No | - | No | Name of the seed object. | "My<connector>Seed" |
| connector | No | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | No | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | Yes | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | Yes | - | No | Id of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | Yes | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| tags | Yes | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |
| properties | No | - | No | Configuration object. | |

Example

POST aspire/_api/seeds
{
    "type": "<connector>",
    "seed": "directory",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "<connector>_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        
    }
}
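Before posting a seed, it can help to confirm that every field the Create Seed table marks as non-optional is present. The following Python sketch does only that; missing_seed_fields is a hypothetical helper, and it checks presence of keys, not whether the referenced connector and connection exist or match the seed type.

```python
# Fields the Create Seed table lists with Optional = No.
REQUIRED_SEED_FIELDS = (
    "seed", "type", "description", "connector", "connection", "properties",
)


def missing_seed_fields(payload):
    """Return the required seed fields that are absent from the payload."""
    return [name for name in REQUIRED_SEED_FIELDS if name not in payload]


draft = {
    "type": "aspider",
    "seed": "https://example.com/",
    "description": "Aspider_Test_Seed",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "properties": {},
}
```

Here the draft lacks the connection id, so the helper reports that single field.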

Update Seed


| Field | Optional | Default | Multiple | Notes | Example |
| --- | --- | --- | --- | --- | --- |
| id | No | - | No | Id of the seed to update. | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
| seed | Yes | - | No | <seed description> | |
| description | Yes | - | No | Name of the seed object. | "My<connector>Seed" |
| connector | Yes | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31" |
| connection | Yes | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "602d3700-28dd-4a6a-8b51-e4a663fe9ee6" |
| workflows | Yes | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.add | Yes | [ ] | Yes | The ids of the workflows to add. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| workflows.remove | Yes | [ ] | Yes | The ids of the workflows to remove. | ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"] |
| throttlePolicy | Yes | - | No | Id of the throttle policy that applies to this seed object. | "f5587cee-9116-4011-b3a9-6b235b333a1b" |
| routingPolicies | Yes | [ ] | Yes | The ids of the routing policies that this seed will use. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.add | Yes | [ ] | Yes | The ids of the routing policies to add. | ["b4d2579f-1a0a-4a8b-9fd4-d42780003b36"] |
| routingPolicies.remove | Yes | [ ] | Yes | The ids of the routing policies to remove. | ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7"] |
| tags | Yes | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag3"] |
| tags.add | Yes | [ ] | Yes | The tags to add. | ["tag4"] |
| tags.remove | Yes | [ ] | Yes | The tags to remove. | ["tag2"] |
| properties | No | - | No | Configuration object. | |

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778
{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "seed": "<seed example>",
    "connector": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31",
    "description": "<connector>_Test_Seed",
    "throttlePolicy": "6b8b5f23-fc77-47a1-9b58-106577162e7b",
    "routingPolicies": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],
    "connection": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6",
    "workflows": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],
    "tags": ["tag", "tag2"],
    "properties": {
        
    }
}
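The Update Seed table also lists incremental fields such as tags.add and tags.remove, which the example above does not use. The Python sketch below shows one possible way to build such a body. It assumes the dotted names are literal top-level JSON keys, which the table does not confirm, and seed_update_payload is a hypothetical helper name.

```python
def seed_update_payload(seed_id, properties,
                        add_tags=(), remove_tags=(),
                        add_workflows=(), remove_workflows=()):
    """Build a PUT aspire/_api/seeds/<id> body with incremental changes.

    Assumption: "tags.add", "workflows.remove", etc. are literal keys.
    """
    # id and properties are the fields the table marks as non-optional.
    payload = {"id": seed_id, "properties": properties}
    changes = {
        "tags.add": add_tags,
        "tags.remove": remove_tags,
        "workflows.add": add_workflows,
        "workflows.remove": remove_workflows,
    }
    # Only include the incremental keys that actually carry values.
    for key, values in changes.items():
        if values:
            payload[key] = list(values)
    return payload


change = seed_update_payload(
    "2f287669-d163-4e35-ad17-6bbfe9df3778", {},
    add_tags=["tag4"], remove_tags=["tag2"],
)
```

Compared with resending the full tags array, this form would leave tags that are not mentioned untouched, if the server interprets the keys as assumed.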