

Step 1. Launch Aspire and Open the Content Source Management Page.

Launch Aspire (if it's not already running).



Step 2. Add or select a Workflow.

  • Add a new workflow or open an existing workflow.
  • For this step, please refer to the Workflow Introduction.



Step 3. Specify a description of the Publisher.

 In the top section of the Azure Search Publisher configuration window, specify the description of the publisher.


Step 4. Add the Azure Search Publisher to the Workflow.

  • From the Event combo, select the event to which you want to add the Azure Search Publisher.
  • To add an Azure Search Publisher, drag it from the Rules section on the right side of the screen and drop it below the Workflow Event on the left side of the screen. This automatically opens the Azure Search Publisher configuration window.


Step 5. Specify Server Configuration.


 In the Server section of the Azure Search Publisher configuration, specify the information related to the server.

  1. Enter the name of the service endpoint to use.
  2. Azure Index: Enter the Azure Search index where the jobs will be stored.

Step 6. Specify Authentication Configuration.

 In the Authentication section of the Azure Search Publisher configuration, specify the authentication information.

  1. Azure API Version: Enter the version of the Azure Search REST API to use.
  2. Azure API Key: Enter the Azure Search API key used to connect to the REST API.
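The server and authentication values map naturally onto an Azure Search "index documents" REST request. The following is only a sketch of how such a request could be assembled; it is not the publisher's actual implementation. It assumes the service endpoint is entered as a full URL, and the function name, service name, index name, and API version shown are hypothetical placeholders.

```python
import json


def build_index_request(endpoint, index, api_version, api_key, documents):
    """Return (url, headers, body) for an Azure Search bulk indexing request."""
    # Documents are posted to the "docs/index" collection of the target index.
    url = f"{endpoint.rstrip('/')}/indexes/{index}/docs/index?api-version={api_version}"
    # Azure Search authenticates REST calls with an "api-key" header.
    headers = {"Content-Type": "application/json", "api-key": api_key}
    # Each document is wrapped in an action; "upload" inserts or replaces it.
    actions = [{"@search.action": "upload", **doc} for doc in documents]
    body = json.dumps({"value": actions})
    return url, headers, body


# Hypothetical values, for illustration only.
url, headers, body = build_index_request(
    endpoint="https://my-service.search.windows.net",
    index="aspire-docs",
    api_version="2020-06-30",
    api_key="<your-api-key>",
    documents=[{"id": "1", "title": "Hello"}],
)
print(url)
```

The sketch only builds the request; sending it is left to whichever HTTP client you prefer.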

Step 7. Specify Transform Documents.


 In the Transform Documents section of the Azure Search Publisher configuration, you can choose between specifying a Local Transform File or picking a previously uploaded Resources Transform File:

  1. Local Transform File: The default value is "${component.home}/config/groovy/aspireToAzureSearchBulk.groovy", the default JSON Groovy transformation file provided with Aspire. To use a custom file, follow the instructions in JSON Transformation.
  2. Resources Transform File: Pick the appropriate file that was previously uploaded using Aspire's "Resources" feature.

Step 8. Specify Pre- / Post-Processing Options.

 In the Pre- / Post-Processing section of the Azure Search Publisher configuration, specify the Pre- / Post-Processing configuration options.

  1. Create and Clear Index on Full Crawl: Check to re-create the Azure Search index on each full crawl. If "no" is selected, the publisher expects the index to already exist.
  2. Azure Search Index File: Choose between specifying a Local Index Definition File or picking a previously uploaded Resources Index File:
    1. Local Index Definition File: The default value is "${component.home}/config/json/azureSearchIndex.json", the JSON index mappings definition to use. You can enter a file path, or use a file uploaded through the Resources manager section.
    2. Resources Index File: Pick the appropriate file that was previously uploaded using Aspire's "Resources" feature.
  3. Index deletion wait time: The maximum time (in milliseconds) to wait between the deletion/creation of the index and the publishing process.
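For orientation, an Azure Search index definition is a JSON document with a name and a list of typed fields, exactly one of which is the key. The contents of the default azureSearchIndex.json are not shown on this page, so the index name and fields below are hypothetical; the sketch only illustrates the general shape and a simple sanity check you might run before uploading a custom file.

```python
import json

# Hypothetical minimal index definition; not the file shipped with Aspire.
index_definition = {
    "name": "aspire-docs",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
        {"name": "title", "type": "Edm.String", "searchable": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
    ],
}


def key_fields(definition):
    """Return the names of all fields marked as the document key."""
    return [field["name"] for field in definition["fields"] if field.get("key")]


# Azure Search requires exactly one key field per index.
assert len(key_fields(index_definition)) == 1

# Serialize to the JSON text that would be stored in the definition file.
print(json.dumps(index_definition, indent=2))
```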


Step 9. Specify Connection Settings Values.


 In the Connection Settings section of the Azure Search Publisher configuration, specify the values for the connection to the server.

  1. Connection Pool: Connection pool settings.
    1. Idle Connection Timeout: Maximum time (in milliseconds) to keep an idle connection open.
    2. Max Connections: Maximum number of connections to be opened.
    3. Connections per Target: Maximum number of connections opened for the same target.
  2. Timeout Settings: Connection pool timeout settings.
    1. Connection Timeout: Maximum time (in milliseconds) to wait for the connection.
    2. Socket Timeout: Maximum time (in milliseconds) to wait for a socket response.
  3. Connection Throttling: Enable to specify Throttling Settings.
    1. Throttling Period: Time period (in milliseconds) to throttle the connection.
    2. Max Connections per Period: Maximum number of connections used during the Throttling Period.
  4. Retries: Retry settings.
    1. Maximum Retries: Maximum number of retries for a failed document.
    2. Retry Delay: Time period (in milliseconds) to wait before a retry.
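To illustrate how Maximum Retries and Retry Delay interact, here is a minimal sketch of a retry loop. It assumes one initial attempt followed by up to Maximum Retries further attempts with a fixed delay between them; the publisher's real retry policy is not documented on this page, and the function and parameter names are hypothetical.

```python
import time


def publish_with_retries(send, document, max_retries=3, retry_delay_ms=1000, sleep=time.sleep):
    """Call send(document), retrying failed calls up to max_retries times."""
    retries_used = 0
    while True:
        try:
            return send(document)
        except Exception:
            # Give up once the retry budget is exhausted.
            if retries_used >= max_retries:
                raise
            retries_used += 1
            # Wait for the configured delay (milliseconds) before retrying.
            sleep(retry_delay_ms / 1000.0)


# Usage: a sender that fails twice before succeeding.
calls = []


def flaky_send(doc):
    calls.append(doc)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = publish_with_retries(
    flaky_send, {"id": "1"}, max_retries=3, retry_delay_ms=500, sleep=delays.append
)
print(result, len(calls), delays)
```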


Step 10. Specify Batching Configuration.

 In the Batching section of the Azure Search Publisher configuration, specify the batching configuration values.

  1. Scanner Job Batch Size: Maximum size of the batches that will be created.
  2. Simultaneous Batches: Number of batches that will be processed simultaneously.
  3. Batch Timeout: Period (in milliseconds) after which a batch of documents will be closed and executed.
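The interplay between batch size and batch timeout can be sketched as follows: a batch is closed as soon as it is full, or once it has been open longer than the timeout. This is an illustrative model only, with hypothetical class and method names; it does not model Simultaneous Batches and is not the publisher's actual code.

```python
import time


class Batcher:
    """Collects documents and closes a batch when it is full or too old."""

    def __init__(self, max_size, timeout_ms, clock=time.monotonic):
        self.max_size = max_size
        self.timeout_s = timeout_ms / 1000.0
        self.clock = clock
        self.items = []
        self.opened_at = None

    def add(self, doc):
        """Add a document; return the closed batch if one is due, else None."""
        if not self.items:
            # The first document opens a new batch and starts its timer.
            self.opened_at = self.clock()
        self.items.append(doc)
        return self.poll()

    def poll(self):
        """Return the current batch if it is full or has timed out, else None."""
        if not self.items:
            return None
        full = len(self.items) >= self.max_size
        expired = self.clock() - self.opened_at >= self.timeout_s
        if full or expired:
            batch, self.items = self.items, []
            return batch
        return None


# Usage with a fake clock so the timeout can be shown without waiting.
now = [0.0]
batcher = Batcher(max_size=3, timeout_ms=1000, clock=lambda: now[0])
closed = [batcher.add(n) for n in (1, 2, 3)]  # third add fills the batch
batcher.add(4)                                # opens a new batch
now[0] = 1.5                                  # advance past the timeout
timed_out = batcher.poll()
print(closed, timed_out)
```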


Step 11. Specify Debug Configuration.

 In the Debug section of the Azure Search Publisher configuration, specify the Debug flag.

  1. Debug: Check to enable debug mode, which shows debug messages from the publisher.


Step 12. Click the Add button.

 Once you click the Add button, the Azure Search Publisher settings will be saved.