In a nutshell
- After the images or the pages of a PDF file have been loaded into the OCR engine, additional image processing has to be applied. It is very important to send the best possible image quality to the core technology.
- ABBYY OCR technology and products offer many built-in options:
  - Image rotation, de-skewing, and straightening of text lines
  - Image cleaning (scan dirt, ISO noise from cameras)
  - Splitting of double pages
  - Correction of geometrical distortions
  - Image cropping
  - Color filtering, e.g. for stamps
  - Adaptive binarization to separate printed text from its background
- More details and information can be found in the articles below.
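
To illustrate what adaptive binarization means in practice, here is a minimal Python sketch. It is not ABBYY code and does not use any ABBYY API; it assumes a simple local-mean rule (in the style of Bradley-Roth thresholding) applied to a grayscale image stored as a list of rows. The function names `integral_image` and `adaptive_binarize` and the parameters `window` and `t` are hypothetical and chosen only for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255 (assumed input format)


def integral_image(img: Image) -> List[List[int]]:
    """Return a summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def adaptive_binarize(img: Image, window: int = 15, t: float = 0.15) -> Image:
    """Mark a pixel as ink (0) if it is more than t darker than its local mean.

    The local mean is taken over a window of roughly window x window pixels,
    clipped at the image borders. All other pixels become background (255).
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    half = window // 2
    out: Image = []
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        row = []
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            count = (y1 - y0) * (x1 - x0)
            # Sum of the window, read from the summed-area table in O(1).
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            row.append(0 if img[y][x] * count < total * (1.0 - t) else 255)
        out.append(row)
    return out
```

Because each pixel is compared with its own neighborhood rather than with one global threshold, text on an unevenly lit or shaded background can still be separated: a dark-gray character on a bright area and a black character on a gray area are both classified as ink, while the gray background itself stays white. Note that a rule this simple also tends to mark the dark side of sharp brightness edges as ink, so production engines use considerably more refined methods.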
Code Samples related to Imaging
Knowledge Base Tips
- Important: Recognition quality and recognition speed are directly related; see the section on OCR: Speed & Quality.