In a nutshell

  • After the images or pages of a PDF file are loaded into the OCR engine, additional image processing has to be applied. It is very important to send the best possible image quality to the core technology.
  • ABBYY OCR technology and products offer many built-in options:
    • Image rotation, de-skewing and straightening text lines
    • Image cleaning (scan dirt, ISO noise from cameras)
    • Splitting of double pages
    • Correction of geometrical distortions
    • Image cropping
    • Color filtering, e.g. stamps
    • Adaptive binarization to separate printed text from its background
  • More details and information can be found in the articles below.
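
To make the idea of adaptive binarization from the list above more concrete, here is a minimal sketch in plain Python. It is not ABBYY's implementation, which this article does not describe. It is a generic local-mean threshold, and the function name `adaptive_binarize` and the parameters `window` and `offset` are hypothetical names chosen for illustration. Each pixel is compared with the average brightness of its own neighborhood rather than with one fixed value for the whole page.

```python
def adaptive_binarize(gray, window=15, offset=10):
    """Binarize a grayscale image with a local-mean threshold.

    gray   : list of rows, each a list of ints in 0..255 (hypothetical input format)
    window : side length of the square neighborhood, in pixels
    offset : how much darker than the local mean a pixel must be to count as ink
    Returns a list of rows containing 0 (ink) or 255 (background).
    """
    height = len(gray)
    width = len(gray[0]) if height else 0

    # Integral image with an extra zero row and column, so the sum of any
    # rectangle can be read with four lookups.
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    radius = window // 2
    result = []
    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        out_row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            local_mean = total / area
            # A pixel is ink only if it is clearly darker than its surroundings.
            out_row.append(0 if gray[y][x] < local_mean - offset else 255)
        result.append(out_row)
    return result


def make_page(background, ink, size=5):
    """Build a small test image: uniform background with one ink pixel in the center."""
    page = [[background] * size for _ in range(size)]
    page[size // 2][size // 2] = ink
    return page


bright_page = adaptive_binarize(make_page(200, 50), window=3)
dim_page = adaptive_binarize(make_page(90, 20), window=3)
```

With the same parameters, the single ink pixel is isolated on both the bright and the dim page. A fixed global threshold such as 128 would classify the whole dim page as ink. One known limitation of this simple variant is that background pixels next to a sharp brightness step can be misclassified, so treat it as an illustration only.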

Related Articles

Image Preprocessing for OCR - illustration
