Step 1. Open the Aspire Admin UI

Browse to the Aspire Admin UI. It is typically located at http://localhost:50505.
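If the page does not load, it can help to confirm that something is answering at that address before troubleshooting further. The following is a minimal sketch, assuming the default URL given above; the function name admin_ui_reachable is hypothetical and not part of Aspire.

```python
import urllib.error
import urllib.request

# Default location from the step above; adjust host and port for your installation.
ADMIN_UI_URL = "http://localhost:50505"


def admin_ui_reachable(url: str = ADMIN_UI_URL, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (for example 401 or 403),
        # which still means it is up and listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, name resolution failure, or timeout.
        return False
```

Calling admin_ui_reachable() with no arguments checks the default address; pass a different URL if your installation uses another host or port.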

Step 2. Select the Connector Instances option from the left-hand menu

The "Connector Instances" option, identified by a "connector" image, is located on the left side of the application, between the "Connections" and "Policies" options. Click on it to navigate to the "Connector Instances" page.

Step 3. Specify Connector Description and Type

Once on the "Connector Instances" page, click on the "+New" option to create a new Connector, or select an existing one to modify it.

  • Description: specify a description for the Connector. Keep it concise and meaningful.
  • Type: select "Azure Identity" as the type for the Connector.

Step 4. Specify Connector General configuration

Once the type has been selected, you will be presented with the "General" section of the "Connector Instances" page. Here you need to enter the following information for the Connector:

  • Debug: enables/disables debug messages for the system.
  • Debug Workflow: enables/disables job logging.
  • Pipeline Statistics: enables/disables pipeline jobs statistics for the debug console.
  • Source Info Cache Size: number of "SourceInfo" objects kept in memory per seed.
  • Storage Maps Cache Size: number of map objects kept in memory per seed.
  • Storage Sets Cache Size: number of set objects kept in memory per seed.
  • Identity Cache Size: number of identities kept in memory per seed.

Step 5. Specify Text Extraction configuration

In the "Text Extraction" section of the "Connector Instances" page, configure the following options for the Connector:

  • Enable Text Extraction
    • Override default settings
      • Maximum Size
      • Timeout
      • Nesting Max Depth
      • HTML Output
      • Apache Tika Configuration Path
      • Override PDFBox properties
        • Enable "Autospace"
        • Enable "SupressDuplicateOverlappingText"
        • Enable "ExtractAnnotationText"
        • Enable "SortByPosition"
        • Enable "ExtractAcroFormContent"
        • Enable "ExtractInlineImages"
        • Enable "ExtractUniqueInlineImagesOnly"
      • Non-Text Document Filtering
        • Open data stream for non-text documents
        • Identify By
        • Non-text document extensions
      • Metadata Mapping
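
The "Apache Tika Configuration Path" option points to a Tika configuration file. As a rough illustration of what such a file could contain, the sketch below generates a Tika XML configuration that sets the PDF parser flags. It assumes the PDFBox options listed above correspond to the Apache Tika PDFParser parameters of the same name (Tika spells one of them "suppressDuplicateOverlappingText"); how Aspire applies these settings is not confirmed here. The function name build_tika_config and the values in PDFBOX_OPTIONS are hypothetical and are not recommended defaults.

```python
import xml.etree.ElementTree as ET

PDF_PARSER = "org.apache.tika.parser.pdf.PDFParser"

# Assumed mapping from the UI options above to Tika PDFParser parameter names.
# The true/false values are illustrative only.
PDFBOX_OPTIONS = {
    "enableAutoSpace": True,
    "suppressDuplicateOverlappingText": False,
    "extractAnnotationText": True,
    "sortByPosition": False,
    "extractAcroFormContent": True,
    "extractInlineImages": False,
    "extractUniqueInlineImagesOnly": False,
}


def build_tika_config(options: dict[str, bool]) -> str:
    """Build a Tika configuration (XML text) that sets boolean PDFParser params."""
    properties = ET.Element("properties")
    parsers = ET.SubElement(properties, "parsers")
    # Keep the default parsers for every format except PDF.
    default = ET.SubElement(
        parsers, "parser", {"class": "org.apache.tika.parser.DefaultParser"}
    )
    ET.SubElement(default, "parser-exclude", {"class": PDF_PARSER})
    # Register the PDF parser separately so its parameters can be set.
    pdf = ET.SubElement(parsers, "parser", {"class": PDF_PARSER})
    params = ET.SubElement(pdf, "params")
    for name, enabled in options.items():
        param = ET.SubElement(params, "param", {"name": name, "type": "bool"})
        param.text = "true" if enabled else "false"
    ET.indent(properties)  # pretty-print; requires Python 3.9+
    return ET.tostring(properties, encoding="unicode")
```

Writing the result of build_tika_config(PDFBOX_OPTIONS) to a file gives a path that could be entered in the "Apache Tika Configuration Path" field, subject to the assumptions above.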

Step 6. Specify a Throttling Policy (Optional)

In the "Policies" section of the "Connector Instances" page, you have the option to specify a previously defined throttling policy for this Connector: just select the desired policy from the list of available policies.

Step 7. Save the Connector

Click on the "Complete" button to save the new Connector. When updating an existing Connector, the button reads "Save" instead of "Complete".

