Step 1. Launch Aspire and Open the Content Source Management Page.

Launch Aspire (if it's not already running). See:

Step 2. Add or select a Workflow.

  • Add a new workflow or open an existing workflow.
  • For this step, please refer to the Workflow Introduction.


Step 3. Add the Azure Blob Publisher to the Workflow.

  • From the Event combo, select the event to which you want to add the Azure Blob Publisher.
  • To add an Azure Blob Publisher, drag the Azure Blob Publisher from the Rules section on the right side of the screen and drop it below the Workflow Event on the left side of the screen. This automatically opens the Azure Blob Publisher window, where you configure the publisher.


Step 3a. Specify a description for the publisher.

In the top section of the Azure Blob Publisher configuration window, specify the description for the publisher.

Step 3b. Specify Connection Configuration.

In the connection section, specify the connection information needed to publish to Azure Blob storage.

  • Storage Connection String: This is the connection string for the Azure Blob storage service. It consists of four parts:
    • Default Endpoints Protocol, with two possible values: HTTP or HTTPS. For example: “DefaultEndpointsProtocol=http;”
    • Account Name, which is the name of your Microsoft Azure storage account. For example: “AccountName=myAccount;”
    • Account Key, which is the key associated with your Azure storage account. For example: “AccountKey=myKey;”
    • BlobEndpoint, which indicates the URL for the blob storage repository. For example: “BlobEndpoint=http://mystorageaccount.blob.core.windows.net”
  • Blob Container Name: Enter the name of the container inside the Azure Blob storage where you want to publish your results.
  • Clean container before full crawl: Mark this option if you intend to clean the blob container before a full crawl.
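
As an illustration of how the four parts fit together, the following minimal Python sketch joins them into one semicolon-separated string and splits such a string back into its parts. The function names are hypothetical and are not part of Aspire or of any Azure library; the account name, key, and endpoint are the placeholder values from the examples above.

```python
def build_connection_string(protocol, account_name, account_key, blob_endpoint):
    """Join the four parts into one semicolon-separated connection string."""
    if protocol not in ("http", "https"):
        raise ValueError("protocol must be 'http' or 'https'")
    parts = [
        ("DefaultEndpointsProtocol", protocol),
        ("AccountName", account_name),
        ("AccountKey", account_key),
        ("BlobEndpoint", blob_endpoint),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


def parse_connection_string(connection_string):
    """Split a connection string back into a dict of its parts."""
    result = {}
    for segment in connection_string.split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only, because account keys may themselves end in '='.
        key, _, value = segment.partition("=")
        result[key] = value
    return result


conn = build_connection_string(
    "http", "myAccount", "myKey", "http://mystorageaccount.blob.core.windows.net"
)
print(conn)
```

Parsing the string before saving the publisher configuration is a quick way to confirm that all four keys are present and spelled as expected.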

Step 3c. Specify Binary Objects Configuration (Optional).

In the Binary Objects section, specify whether and how binary objects are uploaded to the configured blob container.

  • Upload binary objects: Mark this option if you would like to upload the binary objects together with the JSON objects. For the Binary Objects to be successfully uploaded, you need to disable the extract text option in the Connector settings.
    • Use only one extension: Mark this option if you want all the binary objects to be uploaded using the same file extension. The file extension to be used is specified in the next field, "Unique binary file extensions". If the option is not marked, the uploaded files keep their original file extensions.
    • Unique binary file extensions: If the option "Use only one extension" is marked, this is the file extension used for all uploaded binary objects.
    • File extension exclude list: This is a comma-separated list of the file extensions you want to exclude from the binary upload. Add to this list the extensions of the files that you don't want to be uploaded.
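
The interaction between these options can be sketched in Python as follows. This is only an illustration of the behavior described above, not the publisher's actual code: the function name and parameters are hypothetical, and the sketch assumes that the exclude list is matched case-insensitively against the original file extension.

```python
import os


def resolve_upload_name(file_name, exclude_list="", use_one_extension=False,
                        unique_extension=""):
    """Return the name a binary object would be uploaded under, or None if excluded."""
    # Normalize the comma-separated exclude list: trim spaces, drop dots, lowercase.
    excluded = {
        ext.strip().lstrip(".").lower()
        for ext in exclude_list.split(",")
        if ext.strip()
    }
    stem, ext = os.path.splitext(file_name)
    ext = ext.lstrip(".")
    # Assumption: exclusion is decided on the original extension.
    if ext.lower() in excluded:
        return None
    # "Use only one extension": replace the original extension with the unique one.
    if use_one_extension and unique_extension:
        return f"{stem}.{unique_extension.lstrip('.')}"
    # Otherwise the file keeps its original extension.
    return file_name


print(resolve_upload_name("report.pdf", "exe, tmp"))          # report.pdf
print(resolve_upload_name("setup.EXE", "exe, tmp"))           # None
print(resolve_upload_name("report.pdf", "", True, "bin"))     # report.bin
```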

Step 3d. Specify Transform Documents Configuration (Optional).

In the Transform Documents section, you can choose between specifying a Local Transform File or picking from a previously uploaded Resource Transform File:

  1. Local Transform File: The default value is "${component.home}/config/groovy/aspireToBlob.groovy", the default Groovy transformation file provided with Aspire. This script transforms the data from Aspire as it is posted to Azure Blob storage. To use a custom file, follow the instructions in JSON Transformation.

  2. Resources Transform File: pick the appropriate file that was previously uploaded by using Aspire's “Resources” feature.



Step 3e. Specify Metadata Configuration (Optional).

In the Metadata section, specify the metadata configuration.

  • Soft Delete: Mark this option to perform a soft delete (the object is marked as deleted) rather than physically deleting the object from Azure Blob.
    • Delete Flag Name: Enter the name of the flag used to mark the document as deleted in Azure Blob. The flag will be stored as part of the blob metadata.
  • Add Blob Metadata: Mark this option to add metadata to the Azure Blob.
    • Metadata Name: Enter the name of the blob metadata field. If a name specified as a metadata field matches the delete flag name used for soft deletes, the value will be overwritten by the soft delete handler. This is a required value.
    • Metadata Value: Enter the field name or path (e.g., connectorSpecific/Author) in the Aspire document or the transformed document from which to take the value. This is a required value.
    • Default Value: Enter the default value to be used when the metadata value does not exist.
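
The following Python sketch illustrates how these settings combine. It is a simplified model of the behavior described above, not the publisher's implementation: the function names are hypothetical, the document is modeled as nested dictionaries, and the sketch assumes that the soft delete handler writes the flag with the value "true" only when a document is deleted.

```python
def lookup(document, path):
    """Follow a slash-separated path through nested dicts; return None if missing."""
    current = document
    for key in path.split("/"):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def build_blob_metadata(document, mappings, soft_delete=False,
                        delete_flag_name="deleted", is_deleted=False):
    """Build blob metadata from (Metadata Name, Metadata Value path, Default Value) rows."""
    metadata = {}
    for name, path, default in mappings:
        value = lookup(document, path)
        if value is None:
            value = default  # fall back to the Default Value
        if value is not None:
            metadata[name] = str(value)
    # The soft delete flag wins over a metadata field that has the same name.
    if soft_delete and is_deleted:
        metadata[delete_flag_name] = "true"  # assumed flag value
    return metadata


doc = {"connectorSpecific": {"Author": "Ana"}}
rows = [
    ("author", "connectorSpecific/Author", "unknown"),
    ("department", "connectorSpecific/Department", "none"),
]
print(build_blob_metadata(doc, rows))
```

Because the delete flag overwrites any metadata field with the same name, choosing a delete flag name that is not also used as a Metadata Name avoids losing a mapped value.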

Step 3f. Specify Debug Configuration (Optional).

In the Debug section of the Azure Blob Publisher configuration, specify the Debug flag.

  • Debug: Check to enable debug mode to show debug messages from the publisher.


Step 3g. Click on the Add button.

Once you click the Add button, the Azure Blob Publisher settings will be saved.
