The Tesseract OCR application performs optical character recognition (OCR) on an image file using Tesseract, an open-source OCR tool.
Features
Some features of the Tesseract OCR application include:
- Multiple language support.
- Several page segmentation modes.
- Multiple color scales and formats for image creation.
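As a rough illustration of how language selection and page segmentation modes map onto Tesseract itself, the sketch below builds a call to the `tesseract` command-line tool from Python. It is not how the Aspire application invokes Tesseract internally; the helper names `build_tesseract_command` and `run_ocr` are hypothetical, and the sketch assumes Tesseract 4 or later (which uses the `--psm` flag and modes 0 through 13) is installed and on the `PATH`.

```python
import subprocess
from typing import List, Sequence


def build_tesseract_command(
    image_path: str,
    output_base: str,
    languages: Sequence[str] = ("eng",),
    psm: int = 3,
) -> List[str]:
    """Build an argument list for the `tesseract` command-line tool.

    Hypothetical helper for illustration only.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Assumption: Tesseract 4+ accepts page segmentation modes 0..13.
    if not 0 <= psm <= 13:
        raise ValueError("page segmentation mode must be between 0 and 13")
    return [
        "tesseract",
        image_path,
        output_base,
        # Several languages are joined with '+', in the order given.
        "-l", "+".join(languages),
        "--psm", str(psm),
    ]


def run_ocr(
    image_path: str,
    output_base: str,
    languages: Sequence[str] = ("eng",),
    psm: int = 3,
) -> None:
    """Run Tesseract; the text is written to `<output_base>.txt`."""
    command = build_tesseract_command(image_path, output_base, languages, psm)
    subprocess.run(command, check=True)
```

For instance, `build_tesseract_command("page.png", "out", ["fra", "eng"], 6)` produces a command that requests French first, then English, and treats the page as a single uniform block of text.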
Limitations
Because Tesseract is a third-party tool that must be set up separately from Aspire, the application has the following limitations, per the API:
- It must be installed separately.
- Each Tesseract feature must be properly installed before it is used, for example OCR for other languages such as French or Spanish.
- When performing OCR on a file, the order in which the languages are provided affects the output.
- See the documentation here.
- For a list of all available languages, see here.
- Multipage TIFF files are not currently supported.
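Since a language must be installed before it can be used, it may help to check the local Tesseract installation first. The following is a minimal sketch, not part of the Aspire application: the function names are hypothetical, and it assumes that `tesseract --list-langs` prints a header line beginning with "List of" followed by one language code per line on standard output (the exact header wording varies between Tesseract versions).

```python
import subprocess
from typing import List, Sequence


def parse_language_list(listing: str) -> List[str]:
    """Extract language codes from the output of `tesseract --list-langs`.

    Assumption: the first line is a header starting with "List of".
    """
    lines = [line.strip() for line in listing.splitlines()]
    return [line for line in lines if line and not line.startswith("List of")]


def missing_languages(
    requested: Sequence[str], installed: Sequence[str]
) -> List[str]:
    """Return the requested codes that are not installed, in request order."""
    available = set(installed)
    return [code for code in requested if code not in available]


def installed_languages() -> List[str]:
    """Ask the local Tesseract installation which languages it has."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_language_list(result.stdout)
```

A caller could run `missing_languages(["fra", "spa"], installed_languages())` before starting a job and report any codes that come back, rather than letting the OCR step fail later.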
Future Development Plan
- Add multipage TIFF support.
Is there anything we should add? Please let us know.