Parallel PDF Processing


Single/Multi-page Document Input

When text is read from images with optical character recognition (OCR), multi-page documents are usually represented as a series of single-page files. When PDF files are OCRed, however, a single document very commonly contains multiple pages.

Single-image documents can be processed right away, while the pages of a multi-page PDF have to be separated first. When PDFs with images are produced by a scanner or a multifunction printer (MFP), the separation is simple and fast because it only requires reading the individual images stored in an “image-only PDF”. When PDFs are created by code or a printer driver, the PDF source code has to be “rendered” to create a proper pixel-based representation.
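The distinction between the two separation paths can be sketched in a few lines of Python. This is an illustration only, not FineReader Engine code: the PdfPage record, its fields and the separation_strategy function are hypothetical names, and the rule used to detect an image-only page (exactly one embedded image and no other drawing content) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class PdfPage:
    """Hypothetical summary of one PDF page (not an engine data type)."""
    number: int
    embedded_images: int    # raster images stored directly on the page
    has_drawing_ops: bool   # text or vector content that needs rasterizing


def separation_strategy(page: PdfPage) -> str:
    """Pick the cheap path for image-only pages, rendering otherwise."""
    # Assumption: a scanned page holds exactly one image and nothing else.
    if page.embedded_images == 1 and not page.has_drawing_ops:
        return "extract"   # read the stored image as-is
    return "render"        # rasterize the page description into pixels


def plan_separation(pages: Iterable[PdfPage]) -> List[Tuple[int, str]]:
    """Return (page number, strategy) pairs in document order."""
    return [(page.number, separation_strategy(page)) for page in pages]
```

For a two-page document in which page 1 is a scan and page 2 comes from a printer driver, plan_separation would choose "extract" for the first page and "render" for the second.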

OCR Up-Scale for PDF Processing

  • Modern computers usually have several CPU cores, so all available cores should be used when processing a large number of (multi-page) PDFs. Processing document pages in parallel increases overall speed and throughput.
  • Depending on the processing scenario, all stages of PDF processing (import, recognition and export) can be performed in parallel.
    • Parallel import is available from FineReader Engine 11 Release 5.
      Parallel import helps to save up to 35% of the overall processing time. 1)
    • Parallel recognition was introduced in FineReader Engine 9 and extended in version 10.
    • Parallel export to PDF is available from FineReader Engine 11 Release 3.
1) The percentage of processing time saved depends on the number of CPU cores, the size of the document (number of pages) and the complexity of the document.
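The idea of page-level parallelism can be illustrated with a minimal Python sketch. It does not use the FineReader Engine API: import_page, recognize_page and process_document are hypothetical placeholders, and the standard-library thread pool stands in for whatever worker mechanism the engine uses internally. In pure Python, threads only speed up work that releases the interpreter lock (for example native OCR code or I/O), so the sketch shows the structure rather than measured gains.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Optional


def import_page(page_id: int) -> dict:
    # Placeholder for the import stage: open or render one page.
    return {"page": page_id, "image": f"raster-{page_id}"}


def recognize_page(page: dict) -> dict:
    # Placeholder for the recognition stage: OCR one page image.
    return {"page": page["page"], "text": f"text of page {page['page']}"}


def process_page(page_id: int) -> dict:
    # Each page passes through import and recognition independently,
    # which is what makes page-level parallelism possible.
    return recognize_page(import_page(page_id))


def process_document(page_ids: Iterable[int],
                     workers: Optional[int] = None) -> List[dict]:
    # Default to one worker per CPU core reported by the OS.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so pages stay in
        # document order and can be exported sequentially afterwards.
        return list(pool.map(process_page, page_ids))
```

Calling process_document(range(1, 101)) would distribute 100 pages across the available workers. The export stage is left sequential here for simplicity.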