Step 1. Launch Aspire and Open the Content Source Management Page

Launch Aspire (if it's not already running). See:


Step 2. Add or select a Workflow


Step 3. Add the Elasticsearch Publisher to the Workflow

  • From the Event combo, select the event to which you want to add the Elasticsearch Publisher.
  • To add an Elasticsearch Publisher, drag it from the Rules section on the right side of the screen and drop it below the Workflow Event on the left side of the screen. This automatically opens the Elasticsearch Publisher window, where you configure the publisher.

Step 3a. Specify a description for the Publisher

 In the top section of the ElasticSearch Publisher configuration window, specify the description for the publisher.

Step 3b. Specify Server Configuration

 In the Server section of the ElasticSearch Publisher configuration, specify the information related to the server.

  1. ElasticSearch URL: Select how you want to enter the ElasticSearch URL.
    1. Host and port
      • ElasticSearch Host: Enter the ElasticSearch host.
      • ElasticSearch Port: Enter the ElasticSearch port (9200 by default).
    2. Complete URL
      • ElasticSearch URL: Enter the URL for the ElasticSearch bulk index endpoint. It must have the format <protocol>://<host>:<port>/_bulk
  2. ElasticSearch Index: Enter the index to which the jobs are going to be published.
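The two ways of entering the URL end up describing the same endpoint. The following is a minimal sketch, not Aspire code, of how a host and port map to the <protocol>://<host>:<port>/_bulk format and how a complete URL could be checked against it. The function names are hypothetical, and restricting the protocol to http or https is an assumption.

```python
from urllib.parse import urlsplit


def build_bulk_url(host: str, port: int = 9200, protocol: str = "http") -> str:
    """Compose a bulk endpoint URL from the "Host and port" values."""
    return f"{protocol}://{host}:{port}/_bulk"


def is_valid_bulk_url(url: str) -> bool:
    """Check a "Complete URL" value against <protocol>://<host>:<port>/_bulk."""
    parts = urlsplit(url)
    try:
        port = parts.port  # raises ValueError when the port is not a valid number
    except ValueError:
        return False
    return (
        parts.scheme in ("http", "https")  # assumption: only these protocols
        and bool(parts.hostname)
        and port is not None
        and parts.path == "/_bulk"
    )


# Usage: the default port gives the usual local endpoint.
print(build_bulk_url("localhost"))
print(is_valid_bulk_url("http://localhost/_bulk"))  # port is missing
```

A check like this can catch a missing port or a wrong path before the publisher is saved.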

Step 3c. Specify Authentication Configuration

 In the Authentication section of the ElasticSearch Publisher configuration, specify the authentication information.

  1. None: The server requires no authentication
  2. Basic: provide credentials for basic authentication
    1. User: Provide the user for basic authentication.
    2. Password: Provide the password for basic authentication.
  3. Amazon Web Service (AWS): provide the configuration to authenticate using AWS
    1. Region: Specify the AWS region to use.
    2. Use credentials provider chain: enable to specify a credentials provider chain
      1. Access key: provide the access key for authentication with AWS.
      2. Secret key: provide the secret key for authentication with AWS.
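For the Basic option, the user and password are sent using standard HTTP basic authentication. The sketch below shows how such a header is formed in general. It illustrates the HTTP mechanism only, and the function name is hypothetical rather than part of Aspire.

```python
import base64


def basic_auth_header(user: str, password: str) -> dict:
    """Build the HTTP Authorization header for basic authentication."""
    # The credentials are joined with a colon and Base64-encoded.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Usage with placeholder credentials.
headers = basic_auth_header("aspire-user", "change-me")
print(headers["Authorization"].startswith("Basic "))
```

Because Base64 is an encoding and not encryption, basic credentials are normally sent over https.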

Step 3d. Specify Transform Documents

 In the Transform Documents section of the ElasticSearch Publisher configuration, specify the path to the Groovy transformation file.

  1. Groovy Transform: the default value is set to "${component.home}/config/groovy/transform.groovy" for the default JSON transformation file provided with Aspire. To use a custom file, follow the instructions in JSON Transformation.

Step 3e. Specify Pre/Post Processing Options

 In the Pre/Post Processing section of the ElasticSearch Publisher configuration, specify the Pre/Post Processing configuration options.

  1. Clear index on full crawl: select to clear the index on full crawls.
    1. Clear index by: select the approach to clear the index.
      1. Deleting all documents: deletes the documents from the index.
      2. Delete index: deletes the index completely.
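The two approaches differ in what remains afterwards: deleting all documents keeps the index (and its settings and mappings), while deleting the index removes it entirely. As a rough sketch, they could correspond to the following Elasticsearch REST requests. Whether the publisher issues exactly these requests is an assumption, and the mode names are hypothetical.

```python
import json


def clear_index_request(index: str, mode: str):
    """Return (method, path, body) for the chosen clearing approach."""
    if mode == "delete_documents":
        # Remove every document but keep the index itself.
        body = json.dumps({"query": {"match_all": {}}})
        return ("POST", f"/{index}/_delete_by_query", body)
    if mode == "delete_index":
        # Remove the index completely, including its mappings.
        return ("DELETE", f"/{index}", None)
    raise ValueError(f"unknown mode: {mode}")


# Usage with a placeholder index name.
print(clear_index_request("aspire", "delete_index"))
```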

Step 3f. Specify Connection Settings Values

 In the Connection Settings section of the ElasticSearch Publisher configuration, specify the Connection Settings values for the connection to the server.

  1. Connection Pool:
  2. Timeout Settings:
  3. Connection Throttling:
  4. Retries:

Step 3g. Specify Index Dump Configuration

 In the Index Dump section of the ElasticSearch Publisher configuration, specify the Index Dump configuration values.

  1. Max Results per Request: maximum number of documents that the search engine can fetch in a single query.
  2. Page Size: maximum number of documents to fetch per query page.
  3. Id field: the name of the field containing the document ID, relative to the top-level "hits" node in Elasticsearch. This ID is compared against the content source audit logs.
  4. Url field: the name of the field containing the document URL, relative to the top-level "hits" node in Elasticsearch.
  5. Timestamp field: the name of the field containing the document feed timestamp (the index timestamp of every document), relative to the top-level "hits" node in Elasticsearch.
  6. Authentication: the authentication credentials for the ElasticSearch URL. If you use the https protocol, you must provide authentication credentials.
    1. None: use this setting if you do not need to provide credentials.
    2. Basic: use this setting if your ElasticSearch URL requires authorization credentials.
      • Username: enter the username required to access the ElasticSearch URL.
      • Password: enter the password required to access the ElasticSearch URL.
  7. Debug: check if you want to run the publisher in debug mode.
    Note: The Delete by Query feature does not work on Elasticsearch 5.x onward.
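To illustrate what a field name relative to the "hits" node means, the sketch below reads the ID, URL, and timestamp from each entry of a search response. It assumes the field names are dotted paths evaluated against each entry under hits.hits (for example "_source.url"); that notation, the helper names, and the sample field "indexed_at" are assumptions for illustration, not documented Aspire behavior.

```python
def resolve(hit: dict, path: str):
    """Follow a dotted path such as "_source.url" inside one hit."""
    value = hit
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # the path does not exist in this hit
        value = value[key]
    return value


def summarize_hits(response: dict, id_field: str, url_field: str, timestamp_field: str):
    """Collect id, url, and timestamp for every hit in a search response."""
    return [
        {
            "id": resolve(hit, id_field),
            "url": resolve(hit, url_field),
            "timestamp": resolve(hit, timestamp_field),
        }
        for hit in response.get("hits", {}).get("hits", [])
    ]


# A trimmed-down search response with one document.
sample_response = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_id": "doc-1",
                "_source": {
                    "url": "http://example.com/a",
                    "indexed_at": "2020-01-01T00:00:00Z",
                },
            }
        ],
    }
}

rows = summarize_hits(sample_response, "_id", "_source.url", "_source.indexed_at")
print(rows)
```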

Step 3h. Specify Debug Configuration

 In the Debug section of the ElasticSearch Publisher configuration, specify the Debug flag.

  1. Debug: Check to enable debug mode to show debug messages from the publisher.

Step 3i. Add the Publisher

  1. Click on the Add button.

After you click Add, Aspire takes a moment to download the necessary components (the JAR files) from the Maven repository and load them. Once that's done, the publisher appears in the Workflow Tree.