Step 1. Launch Aspire and open the Content Source Management Page

Launch Aspire (if it's not already running).

Step 2. Add a new Content Source

  • For this step, follow the steps in the Configuration Tutorial of the connector of your choice. For more information, see Connector list.

Step 3. Add a new Tesseract OCR application to the Workflow

To add a Tesseract OCR application, drag the Tesseract OCR rule from the Workflow Library and drop it into the Workflow Tree where you want to add it. This automatically opens the Tesseract OCR window for configuring the application.

Step 3a. Specify Application Information

In the Tesseract OCR window, specify the information to set up the application.

  1. Tesseract executable file path:
    1. Path to the Tesseract executable file.
    2. Note: On Windows, the program name alone can be entered here if Tesseract is on the system path.
  2. Page segmentation mode:
    1. Page segmentation mode to be used during OCR execution.
    2. See link for more information.
  3. Languages to detect:
    1. Languages used for the OCR execution.
    2. The order of the languages affects the output. See here.
    3. Note: Before using a language, its training data must be installed.
  4. Process timeout:
    1. Time in milliseconds to wait before killing the Tesseract process.
  5. Accept patterns:
    1. Regular expressions matched against the document URL to decide whether the document will be processed by the application.
  6. Debug:
    1. Enable debug messages.
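
To make the settings above more concrete, the following Python sketch shows how they could map onto a plain Tesseract command-line call. It is an illustration only and not how Aspire implements the application internally. The function names (build_tesseract_command, accepts, run_ocr) and the sample file name and URL are hypothetical, and the use of a partial (search-style) regex match for accept patterns is an assumption.

```python
import re
import subprocess


def build_tesseract_command(executable, image_path, languages, psm):
    """Build the argument list for one OCR run that prints text to stdout."""
    return [
        executable,                  # "Tesseract executable file path"
        image_path,                  # image to recognize
        "stdout",                    # send recognized text to standard output
        "-l", "+".join(languages),   # "Languages to detect", order preserved
        "--psm", str(psm),           # "Page segmentation mode"
    ]


def accepts(url, accept_patterns):
    """Return True if the URL matches at least one accept pattern.

    Assumption: a partial match anywhere in the URL is enough.
    """
    return any(re.search(pattern, url) for pattern in accept_patterns)


def run_ocr(command, timeout_ms):
    """Run Tesseract and return its text, or None if the timeout expires."""
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_ms / 1000.0,  # "Process timeout" is in ms; subprocess expects seconds
            check=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this
        return None
    return result.stdout


# Hypothetical values, for illustration only
command = build_tesseract_command("tesseract", "scan.png", ["eng", "spa"], 3)
print(command)
print(accepts("file:///docs/scan.png", [r"\.png$", r"\.tiff?$"]))
```

The languages are joined with "+" in the configured order, which is why the order can affect the output. To check which language training data is installed before configuring the application, run tesseract --list-langs from a command prompt.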

After you click the Add button, it takes a moment for Aspire to download the necessary components (the JAR files) from the Maven repository and load them into Aspire. Once that's done, the application will appear in the Workflow Tree.

For details on using the Workflow section, please refer to Workflow introduction.