The Elastic Cache Lookup component can be configured using the Aspire Admin UI. It requires the following entities to be created:

Credential

Connection

Connector
Seed

This component is an application for workflow configuration, and it is used in the "onAddUpdate" Workflow Event.


Create Workflow

On the Aspire Admin UI, go to the connections page.
All existing connections will be listed. Click the new button.
Enter the new workflow description.
Select the “Create” button.
Go to the Workflow Event “onAddUpdate”.
In “Type criteria”, search the Applications options and drag the Elastic Cache Lookup component into the onAddUpdate section.
Enter a new description for this application component.
Elasticsearch Settings:
1. Server URL: The URL of the Elasticsearch server to use.
2. Authentication: The authentication method to use.
  1. Basic: Select this to enable basic user authentication.
    1. Username: The name of the Elasticsearch user to use.
    2. Password: The password of the Elasticsearch user to use.
  2. Amazon Web Services (AWS): Select this to enable AWS Signature V4 authentication.
    1. Region: The Region of the ES service to use, e.g. us-east-1.
    2. Use credentials provider chain: Check this to use the default AWS credentials from the AWS credentials provider chain.
    3. Access Key: The Access key of the ES service to use.
    4. Secret Key: The Secret key of the ES service to use.
3. Index: Name of the index from which the _source content is retrieved.
Connection Settings:
1. Connection pool
  1. Idle connection timeout: Time (in milliseconds) to keep an idle connection open.
  2. Max connections: Maximum number of connections to be opened.
  3. Connections per target: Number of connections opened for the same target.
2. Timeout settings
  1. Connection timeout: Time (in milliseconds) to wait for the connection.
  2. Socket timeout: Time (in milliseconds) to wait for a socket response.
3. Connection throttling:
  1. Throttling settings
    1. Throttling period: Time (in milliseconds) to throttle the connection.
    2. Max connections per period: Number of connections used during the throttling period.
4. Retries:
  1. Maximum retries: Maximum number of retries for a failed document.
  2. Retry delay: Time (in milliseconds) to wait before a retry.
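
As an illustration of how the throttling and retry settings above might interact, here is a minimal Python sketch. It is not the component's actual implementation: the names call_with_retries and PeriodThrottle are hypothetical, and the fixed-window throttling behaviour is an assumption made for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T],
                      max_retries: int,
                      retry_delay_ms: int,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`; after a failure, wait `retry_delay_ms` and retry,
    up to `max_retries` retries (hypothetical helper)."""
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last error
            attempt += 1
            sleep(retry_delay_ms / 1000.0)


class PeriodThrottle:
    """Assumed fixed-window throttle: at most `max_per_period` connections
    are granted within each `period_ms` window."""

    def __init__(self, period_ms: int, max_per_period: int,
                 clock: Callable[[], float] = time.monotonic):
        self._period_s = period_ms / 1000.0
        self._max = max_per_period
        self._clock = clock
        self._window_start = clock()
        self._used = 0

    def try_acquire(self) -> bool:
        now = self._clock()
        if now - self._window_start >= self._period_s:
            # A new throttling period starts: reset the counter.
            self._window_start = now
            self._used = 0
        if self._used < self._max:
            self._used += 1
            return True
        return False
```

With Maximum retries set to 3 and Retry delay set to 500, a request that fails twice and then succeeds would, in this sketch, be attempted three times with two 500 ms pauses in between.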
Cache:
1. Use cache: Check this to cache lookup results in memory.
2. Cache Eviction Policy:
  1. Size
    1. Max number of entries: Max total number of entries to keep in the cache.
  2. Weight
    1. Max total Weight (MB): Specifies the maximum total weight of entries the cache may contain.
  3. Time
    1. Time (min): Remove records that have been idle for the given number of minutes.
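
To make the Size and Time eviction policies more concrete, the following Python sketch shows one way such a cache could behave. It is only an approximation under stated assumptions: the class name LookupCache is hypothetical, least-recently-used eviction is assumed for the Size policy, and the Weight policy is not modelled.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class LookupCache:
    """Hypothetical in-memory cache combining a maximum number of entries
    with removal of entries that have been idle for too long."""

    def __init__(self, max_entries: int, idle_minutes: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max_entries = max_entries
        self._idle_seconds = idle_minutes * 60
        self._clock = clock
        # key -> (value, time of last access), ordered by recency of use
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        item = self._entries.get(key)
        if item is None:
            return None
        value, last_access = item
        now = self._clock()
        if now - last_access > self._idle_seconds:
            # Time policy: the entry has been idle too long, drop it.
            del self._entries[key]
            return None
        self._entries[key] = (value, now)
        self._entries.move_to_end(key)
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (value, self._clock())
        self._entries.move_to_end(key)
        # Size policy: evict the least recently used entries (assumed).
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
```

In this sketch a cache miss returns None, in which case the caller would query Elasticsearch and store the result with put.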
Lookup Fields:
1. Index lookup field: Elastic index field name for the lookup.
2. Source lookup field: Field name from the incoming AspireObject for the lookup. Field availability will be searched first in 'doc' and then in the 'doc.connectorSpecific' section.
3. Uppercase the source lookup field value: Convert the value of the source field to uppercase.
4. Lookup output field: Output fields from the lookup will be placed under this configured object.
5. Debug: Check this to enable debug messages.
6. Hit Size: Maximum number of hits returned by the cache lookup. If -1, all hits will be returned.
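
The lookup field settings above can be pictured with a short Python sketch. It models the incoming AspireObject as a nested dictionary and builds an Elasticsearch term query. Both choices are assumptions made for illustration, since the query type the component actually issues is not documented here, and the function names are hypothetical.

```python
from typing import Any, Optional


def find_source_value(aspire_doc: dict, field: str) -> Optional[Any]:
    """Search for the field in 'doc' first, then in 'doc.connectorSpecific'."""
    doc = aspire_doc.get("doc", {})
    if field in doc:
        return doc[field]
    return doc.get("connectorSpecific", {}).get(field)


def build_lookup_request(aspire_doc: dict,
                         index_field: str,
                         source_field: str,
                         uppercase: bool,
                         hit_size: int) -> Optional[dict]:
    """Build a search body matching `index_field` against the source value.

    Returns None when the source lookup field is not present.
    """
    value = find_source_value(aspire_doc, source_field)
    if value is None:
        return None
    if uppercase:
        value = str(value).upper()
    # Assumption: an exact-match term query on the index lookup field.
    body = {"query": {"term": {index_field: value}}}
    if hit_size >= 0:
        body["size"] = hit_size
    # When hit_size is -1 no size is set here. Elasticsearch then applies
    # its default page size, so returning all hits would require paging,
    # which this sketch does not show.
    return body
```

For example, with Index lookup field "ownerId", Source lookup field "owner", uppercasing enabled, and Hit Size 5, a document whose doc.connectorSpecific.owner is "abc" would produce a term query on ownerId for "ABC" with size 5.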