FAQs

Is any other component needed before the Tesseract OCR component?

Yes. To configure the component properly, we recommend using a normalize MIME type component before it.


Is any preprocessing required before the Tesseract OCR component?

Currently, multipage TIFF files are not supported, so you need to split each multipage TIFF into single-page files before the OCR process.
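One way to do the split is sketched below in Python. This is only an illustration, not part of the component: it assumes the third-party Pillow package is installed, and the function names and the `_pageNNN` naming scheme are hypothetical choices made for this example.

```python
from pathlib import Path


def page_path(source, index, out_dir):
    """Build the output name for one page, e.g. scan.tiff -> scan_page001.tiff."""
    source = Path(source)
    return Path(out_dir) / f"{source.stem}_page{index:03d}{source.suffix}"


def split_multipage_tiff(source, out_dir):
    """Write every page of a multipage TIFF to its own file; return the new paths."""
    # Pillow is a third-party dependency (pip install Pillow). It is imported
    # here so the naming helper above stays usable without it.
    from PIL import Image, ImageSequence

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with Image.open(source) as image:
        # ImageSequence.Iterator moves through the frames (pages) one by one.
        for index, frame in enumerate(ImageSequence.Iterator(image), start=1):
            target = page_path(source, index, out_dir)
            # Without save_all=True, save() writes only the current frame.
            frame.save(target)
            written.append(target)
    return written
```

Calling `split_multipage_tiff("scan.tiff", "pages")` would then produce one file per page in the `pages` folder, and each of those files can be sent to the OCR process separately.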

Can I have multiple Tesseract OCR versions installed?

We recommend installing only one version, since installations occasionally do not complete properly, as the Troubleshooting example below shows.
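To get a rough idea of whether more than one installation is present, you can look for Tesseract executables on the PATH. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and installations that are not on the PATH will not be detected.

```python
import os
from pathlib import Path


def find_tesseract_executables(path_value=None):
    """Return every tesseract executable found in the PATH folders, in PATH order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # The executable carries an .exe suffix on Windows only.
    name = "tesseract.exe" if os.name == "nt" else "tesseract"
    found = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        candidate = Path(entry) / name
        if candidate.is_file() and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    executables = find_tesseract_executables()
    for executable in executables:
        print(executable)
    if len(executables) > 1:
        print("More than one Tesseract executable is on the PATH.")
```

If the script lists more than one path, the first entry is the one that a plain `tesseract` command would normally resolve to.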

Troubleshooting

Problem

You might get a NullPointerException (NPE) during the OCR process. If you enable the debug option, you'll find the real cause:

Error opening data file <path to Tesseract>/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Solution

This can happen if you have multiple Tesseract installations. There are two possible causes, each with its own solution.

Either the language you are trying to use is not installed properly:

Method 1 (Recommended)

Clean installation:

  • Uninstall all the Tesseract programs on your machine
  • Restart your machine
  • Reinstall version 5.0.2 and verify that you selected English or whichever other language you want to use
  • Restart again
  • Verify that TESSDATA_PREFIX is set to the tessdata folder in your Tesseract installation
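After the reinstall, the language check can also be done from a script. The sketch below is an assumption-laden illustration: it relies on the `tesseract --list-langs` command-line option and assumes the output begins with a header line starting with "List of available languages", which may differ between versions. The function names are hypothetical.

```python
import shutil
import subprocess


def parse_languages(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Assumption: the only non-language line is the header described above.
    return [line for line in lines
            if not line.startswith("List of available languages")]


def installed_languages(executable="tesseract"):
    """Ask the Tesseract found on the PATH which languages it can load."""
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} was not found on the PATH")
    result = subprocess.run(
        [executable, "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    return parse_languages(result.stdout)


if __name__ == "__main__":
    languages = installed_languages()
    print("eng installed:", "eng" in languages)
```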

Or the TESSDATA_PREFIX variable might not be defined correctly:

Method 2

Set the TESSDATA_PREFIX environment variable properly:

  • If your installation completed properly, you should have a tessdata folder inside your Tesseract installation. Verify that it contains the languages you need, in this case "eng".

  • Set the variable to this folder.
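The checks in Method 2 can be sketched in Python as follows. This is a minimal example using only the standard library. The function name and its `env` parameter are hypothetical, and it assumes, as the error message above suggests, that TESSDATA_PREFIX should point directly at the tessdata folder.

```python
import os
from pathlib import Path


def check_tessdata(lang="eng", env=None):
    """Return a list of problems with TESSDATA_PREFIX; an empty list means OK."""
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return ["TESSDATA_PREFIX is not set"]
    tessdata = Path(prefix)
    if not tessdata.is_dir():
        return [f"TESSDATA_PREFIX points to a missing folder: {tessdata}"]
    # Each installed language has a <lang>.traineddata file in this folder.
    if not (tessdata / f"{lang}.traineddata").is_file():
        return [f"{lang}.traineddata was not found in {tessdata}"]
    return []


if __name__ == "__main__":
    problems = check_tessdata("eng")
    for problem in problems or ["TESSDATA_PREFIX looks fine"]:
        print(problem)
```

Run the script in the same environment (user account and shell or service) that runs the OCR process, because environment variables can differ between them.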