Step 1. Launch Aspire and open the Content Source Management Page


Launch Aspire (if it's not already running). See:


Step 2. Add a new Content Source


  • For this step, follow the Configuration Tutorial of the connector of your choice; refer to the Connector list.


Step 3. Add a new Archive Extractor to the Workflow


To add an Archive Extractor, drag the Archive Extractor rule from the Applications Workflow Library and drop it onto the On Add Update Workflow Tree. This automatically opens the Archive Extractor window for configuration.

Step 3a. Specify Archive Information

In the Archive Extractor window, specify the desired options:

  1. General Configuration
    1. Index Containers
    2. Scan Recursively
    3. Add Parent Info
    4. Send Delete By Query first
    5. Index Archive file job
    6. Batch Size
    7. Batch Timeout
    8. Debug
  2. Discovery Archive Method
    1. Auto Identify (Select Supported Types)
    2. Regex
  3. Extract Text
    1. Extract Text Timeout
    2. Max Extract Size
    3. Disable extraction
  4. Routing
    1. Workflow for Add/Update jobs
    2. Workflow for delete jobs
    3. Workflow for error jobs
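
To make the Scan Recursively, Regex, and Add Parent Info options more concrete, the following is a minimal Python sketch of the general idea: list the entries of an archive, detect nested archives by a file-name pattern, and record each entry's parent. It is not Aspire's implementation. The function names, the `.zip`/`.jar` pattern, and the path format are all hypothetical and chosen only for illustration.

```python
import io
import re
import zipfile

# Hypothetical pattern: treat entries ending in .zip or .jar as nested archives.
ARCHIVE_PATTERN = re.compile(r".*\.(zip|jar)$", re.IGNORECASE)


def list_archive_entries(data, parent="", recursive=True, pattern=ARCHIVE_PATTERN):
    """Return (path, parent) pairs for every file entry in a ZIP given as bytes.

    When `recursive` is true, entries whose names match `pattern` are opened
    and their contents are listed too, with the nested archive as their parent.
    """
    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no content of their own
            path = f"{parent}/{info.filename}" if parent else info.filename
            found.append((path, parent))
            if recursive and pattern.match(info.filename):
                nested = archive.read(info)  # bytes of the nested archive
                found.extend(list_archive_entries(nested, path, recursive, pattern))
    return found


def make_zip(files):
    """Build an in-memory ZIP from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    inner = make_zip({"notes.txt": b"hello"})
    outer = make_zip({"readme.txt": b"top", "inner.zip": inner})
    for path, parent in list_archive_entries(outer):
        print(path, "<-", parent or "(root)")
```

With `recursive=False` the sketch reports only the top-level entries, which is roughly what leaving recursive scanning off would imply in this simplified model.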

     

    Step 3b. Share the rule into a new Library

    Once you save the component, share it in a library (this is required). 




    Step 3c. Copy the shared rule

    Add the shared rule to the Delete pipeline from the shared library (this is required).




    Info

    To extract the content of the files inside the archive file, disable Extract Text in the connector configuration and configure it in the Archive Extractor component instead; the application requires this in order to work. To extract text from the other jobs in the crawl, add a separate Extract Text rule (you can share it in the same library used before).


    You can use a rule such as Extract Text.

    Once you've clicked the Add button, it will take a moment for Aspire to download all of the necessary components (the JAR files) from the Maven repository and load them into Aspire. Once that's done, the component will appear in the Workflow Tree.

    Info

    For details on using the Workflow section, please refer to Workflow introduction.