The Elastic Cache Lookup component can be configured using the Aspire Admin UI. It requires the following entities to be created:

  • Credential
  • Connection
  • Connector
  • Seed

    This component is an application for workflow configuration, and it is used in the "onAddUpdate" Workflow Event.


    Create Workflow


    1. On the Aspire Admin UI, go to the workflows page.
    2. All existing workflows will be listed. Click on the new button.
    3. Enter the new workflow description.
    4. Select the “Create” button.
    5. Go to the Workflow Event “onAddUpdate”.
    6. In “Type criteria”, search the Applications options and drag the Elastic Cache Lookup component, using its drag icon, into the onAddUpdate section.
    7. Enter a new description for this application component.
    8. Elasticsearch Settings:
      1. Server URL: The URL of the Elasticsearch server to use.
      2. Authentication: The authentication method to use.
        1. Basic: Select this to enable basic user authentication.
          1. Username: The name of the Elasticsearch user to use.
          2. Password: The password of the Elasticsearch user to use.
        2. Amazon Web Services (AWS): Select this to enable AWS Signature V4 authentication.
          1. Region: The region of the ES service to use, e.g. us-east-1.
          2. Use credentials provider chain: Check this to use the default AWS credentials provider chain.
          3. Access Key: The access key of the ES service to use.
          4. Secret Key: The secret key of the ES service to use.
      3. Index: Name of the index from which to get the _source content.
    9. Connection Settings:
      1. Connection pool
        1. Idle connection timeout: Time (in milliseconds) to keep an idle connection open.
        2. Max connections: Maximum number of connections to be opened.
        3. Connections per target: Number of connections opened for the same target.
      2. Timeout settings
        1. Connection timeout: Time (in milliseconds) to wait for the connection.
        2. Socket timeout: Time (in milliseconds) to wait for a socket response.
      3. Connection throttling:
        1. Throttling settings
          1. Throttling period: Time (in milliseconds) to throttle the connection.
          2. Max connections per period: Number of connections used during the throttling period.
      4. Retries:
        1. Maximum retries: Maximum number of retries for a failed document.
        2. Retry delay: Time (in milliseconds) to wait before a retry.
    10. Cache:
      1. Use cache: Check this to cache lookup results in memory.
      2. Cache Eviction Policy:
        1. Size
          1. Max number of entries: Max total number of entries to keep in the cache.
        2. Weight
          1. Max total weight (MB): Maximum total weight of the entries the cache may contain.
        3. Time
          1. Time (min): Idle time, in minutes, after which a record is removed from the cache.
    11. Lookup Fields:
      1. Index lookup field: Name of the Elasticsearch index field used for the lookup.
      2. Source lookup field: Name of the field in the incoming AspireObject used for the lookup. The field is searched for first in 'doc' and then in the 'doc.connectorSpecific' section.
      3. Uppercase the source lookup field value: Check this to convert the value of the source field to uppercase.
      4. Lookup output field: Name of the object under which the output fields from the lookup are placed.
      5. Debug: Check this to enable debug messages.
      6. Hit Size: Maximum number of hits returned by the cache lookup. If set to -1, all hits are returned.
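
The way the Cache and Lookup Fields settings work together can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the component's actual implementation: the names LookupCache, lookup and search are hypothetical, the search callable stands in for the Elasticsearch query, and least-recently-used eviction is an assumption for the "Max number of entries" policy.

```python
from collections import OrderedDict


class LookupCache:
    """Hypothetical size-bounded cache (assumes least-recently-used eviction)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        return None

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry


def lookup(doc, search, cache, source_field, output_field,
           uppercase=False, hit_size=-1):
    """Enrich doc with lookup hits; search(value) stands in for Elasticsearch."""
    # Source lookup field: check 'doc' first, then 'doc.connectorSpecific'.
    value = doc.get(source_field)
    if value is None:
        value = doc.get("connectorSpecific", {}).get(source_field)
    if value is None:
        return doc  # nothing to look up

    # Uppercase the source lookup field value, if requested.
    if uppercase:
        value = str(value).upper()

    # Use cache: only query the index when the value is not cached yet.
    hits = cache.get(value)
    if hits is None:
        hits = search(value)
        cache.put(value, hits)

    # Hit Size: -1 keeps all hits, otherwise keep at most hit_size of them.
    doc[output_field] = hits if hit_size < 0 else hits[:hit_size]
    return doc
```

For example, calling lookup with a document whose 'connectorSpecific' section holds the source field, uppercase=True and hit_size=2 would place at most two hits under the output field, and a second document with the same uppercased value would be served from the cache without calling search again.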

