Page History

Versions Compared

Old Version 1

Changes made by Freddy Morera

Saved on Dec 06, 2016

compared with

New Version Current

Changes made by user-1b188

Saved on Oct 03, 2018

Step 2. Add a new Content Source

For this step, follow the steps from the Configuration Tutorial for the connector of your choice. For more information, see the Connector list.

Note: Make sure the "Disable Text extraction" box on Advanced Properties is checked.

from the of the application windows

- On Windows, you can enter just the program name here if tesseract is on the PATH.
Page segmentation mode:
- Page segmentation mode to be used during OCR execution.
  See link for more information.
Languages to detect:
- Languages used for the OCR execution.
- The order of the languages affects the output. See here.
  Note: Before using a language, the language training data must be installed.
Process timeout:
- Time in milliseconds to wait before killing the tesseract process.
Accept patterns:
- Regular expression matched against the document URL to decide whether the document will be processed by the application.
Debug:
- Enable debug messages.
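
To make the settings above more concrete, the following Python sketch shows roughly how they could map onto a tesseract command-line call. It is an illustration only, not the application's actual implementation. The function names (`build_tesseract_command`, `should_process`, `run_ocr`) are hypothetical. Two behaviors are assumptions: an empty accept-pattern list is treated as "process everything", and patterns are matched anywhere in the URL. The `--psm` and `-l` flags are standard tesseract options (older tesseract releases spell the first one `-psm`).

```python
import re
import subprocess


def build_tesseract_command(tesseract_path, image_path, psm, languages):
    """Assemble a tesseract command line from the configured settings."""
    return [
        tesseract_path,              # just "tesseract" works when it is on the PATH
        image_path,
        "stdout",                    # write the recognized text to standard output
        "--psm", str(psm),           # page segmentation mode
        "-l", "+".join(languages),   # languages, joined in the configured order
    ]


def should_process(url, accept_patterns):
    """Decide whether a document URL should be sent to OCR.

    Assumption: an empty pattern list accepts every document, and a
    pattern may match anywhere in the URL.
    """
    if not accept_patterns:
        return True
    return any(re.search(pattern, url) for pattern in accept_patterns)


def run_ocr(command, timeout_ms):
    """Run tesseract and return its text output.

    subprocess.run kills the process and raises TimeoutExpired when the
    timeout (given here in milliseconds) is exceeded.
    """
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_ms / 1000,   # subprocess expects seconds
        check=True,
    )
    return result.stdout
```

For example, `build_tesseract_command("tesseract", "scan.png", 3, ["eng", "spa"])` produces a command that asks tesseract for English first and Spanish second. Each language's training data must be installed before that command can succeed.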

Once you've clicked on the Add button, it Once please refer to