Hyperthreading (HT) & OCR Scalability

Language: EN
Product-Line: FlexiCapture Engine, FineReader Engine
Version: 10, 11
Type: Knowledge Base & Support
Category: Recognition, OCR: Speed & Quality
KB-Type: Tips & How to Information

CPUs in all kinds of devices now have more than one processing unit (CPU core), so they can do more work in parallel. Modern operating systems distribute the running processes, tasks and threads across multiple CPU cores, so the system executes more jobs in parallel. Intel and other vendors found that even “virtual” CPU cores can increase throughput.

Hyper-threading in a Nutshell

Hyper-threading (HT) is a technology for CPU cores to improve parallelization of computations (doing multiple tasks at once) performed on PC microprocessors. It first appeared in February 2002 on Xeon server processors and in November 2002 on Pentium 4 desktop CPUs. Later, Intel included this technology in Itanium, Atom, and Core 'i' Series CPUs, among others. Source and more: Hyper-threading

When a CPU has one physical core running OCR or text analytics, the Task Manager shows one CPU core with a high load:

1 core physical CPU with full load - Windows Task Manager screenshot
… this reflects the real usage of the CPU

When a CPU has one physical core with hyper-threading, the Task Manager shows 2 logical CPU cores, but each core seems to have only 50% load:

HT CPU with "2" cores with full OCR load - but only 50% displayed load - Windows Task Manager screenshot
… the math is correct, but it does not reflect reality ;-)

  • Technically, an HT CPU can work more efficiently, especially when multiple applications are running, for example Windows, Outlook, Office and browsers. Most of these “normal” applications/processes do not need much CPU power, which is why a hyper-threading CPU can switch between the different tasks more efficiently.
  • Result
    • From a user's point of view, the system feels/reacts faster than a single-core CPU without HT.
    • Benchmarks also show a speed improvement, which is why hyperthreading is supported by many CPU types.

High Load Processes and HT

BUT when running optical character recognition (OCR) the situation is different, because OCR is very CPU-intensive.

  • 1 OCR process can use almost 100% of the capacity of a physical CPU core.
    ⇒ So the second “logical” HT core cannot “deliver” another 80-90%, as it could when running several 'less CPU-hungry' applications in parallel!
  • Important: A simple (arithmetic) average gives a wrong impression.
    • When all the physically provided processing power is used by one or multiple (OCR) processes, the Task Manager shows a load of almost 100%.
    • If you now double the number of displayed cores by enabling hyperthreading, the same load of almost 100% appears as only 50%, because the load is averaged over all logical cores! Please keep in mind that this does not reflect reality.
  • Hyperthreading is a little bit like putting a spoiler on a sports car: in certain driving situations it might improve things, but the engine will not have more horsepower to go much faster.
    • Hyperthreading CPUs therefore influence the performance of computers when running 'standard' applications like Outlook, browsers, Office, etc. Here the user can often experience almost double the performance.
    • Hyperthreading CPUs do not have such a strong effect on speed when it comes to CPU-intensive tasks like OCR processing. Here the gain is at most 20-30% 1).
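
The averaging effect described above can be reproduced with a few lines of arithmetic. The following Python sketch is an illustration only: the function names are hypothetical, the base throughput of 100 pages per minute is a made-up number, and the model assumes that one CPU-hungry process saturates exactly one physical core while the load display averages over all logical cores.

```python
def logical_core_count(physical_cores: int, hyperthreading: bool) -> int:
    """Number of cores the operating system displays (2 per physical core with HT)."""
    return physical_cores * 2 if hyperthreading else physical_cores


def displayed_average_load(saturated_cores: int, physical_cores: int,
                           hyperthreading: bool) -> float:
    """Arithmetic average load in percent over all displayed (logical) cores."""
    logical = logical_core_count(physical_cores, hyperthreading)
    return 100.0 * saturated_cores / logical


def physical_load(saturated_cores: int, physical_cores: int) -> float:
    """Share of the physical cores that is actually busy, in percent."""
    return 100.0 * saturated_cores / physical_cores


def throughput_with_ht(base_pages_per_minute: float, ht_gain: float) -> float:
    """Throughput after applying an assumed HT gain (0.2 means +20%)."""
    return base_pages_per_minute * (1.0 + ht_gain)


if __name__ == "__main__":
    # One OCR process on a CPU with one physical core.
    print(displayed_average_load(1, 1, hyperthreading=False))  # 100.0
    print(displayed_average_load(1, 1, hyperthreading=True))   # 50.0
    print(physical_load(1, 1))                                  # 100.0

    # Hypothetical base of 100 pages per minute with the 20-30% range from the text.
    for gain in (0.20, 0.30):
        print(gain, round(throughput_with_ht(100.0, gain), 1))
```

The displayed 50% and the physical 100% describe the same situation; only the divisor changes.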

Hyperthreading in ABBYY SDKs

ABBYY FineReader Engine and FlexiCapture Engine come with code samples that show how to use multiple CPU cores. Related articles are listed here: OCR: Speed & Quality.

FineReader Engine Processing Pool

A simple test made with FineReader Engine 11 Release 5 on a laptop (2012) with a quad-core i7-3720QM, 2.6 GHz, Windows 7, 16 GB RAM, 64 bit 2).

Threads/processes running in the background    1    2    3    4    5    6    7    8
Throughput (pages per minute)                 11   22   26   32   37   37   36   29

Results

  • More OCR processes increase the throughput, as expected.
  • The maximum page throughput is reached when the number of processes is “number of physical CPU cores + 1”.
    Here it is: 4 physical cores + 1 additional process = 5
  • If the machine has more cores, you might see a further increase in pages when starting even 2-3 more processes, but the final result also depends on the document size and the OCR scenario you perform.
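
The figures from the test above can be checked with a short calculation. This Python sketch uses only the measured values from the table; the helper names are hypothetical and chosen for illustration.

```python
# Measured throughput (pages per minute) per number of OCR processes,
# copied from the test table above.
THROUGHPUT_PPM = {1: 11, 2: 22, 3: 26, 4: 32, 5: 37, 6: 37, 7: 36, 8: 29}
PHYSICAL_CORES = 4


def speedup(processes: int) -> float:
    """Throughput relative to a single OCR process."""
    return THROUGHPUT_PPM[processes] / THROUGHPUT_PPM[1]


def best_process_count() -> int:
    """Smallest number of processes that reaches the maximum throughput."""
    best = max(THROUGHPUT_PPM.values())
    return min(n for n, ppm in THROUGHPUT_PPM.items() if ppm == best)


def gain_over_physical_percent(processes: int) -> float:
    """Throughput change in percent compared to one process per physical core."""
    base = THROUGHPUT_PPM[PHYSICAL_CORES]
    return 100.0 * (THROUGHPUT_PPM[processes] - base) / base


if __name__ == "__main__":
    print(best_process_count())            # 5, i.e. PHYSICAL_CORES + 1
    print(gain_over_physical_percent(5))   # 15.625
    print(gain_over_physical_percent(8))   # -9.375
    for n in sorted(THROUGHPUT_PPM):
        print(n, round(speedup(n), 2))
```

In this measurement the fifth process adds about 15.6% over four processes, while eight processes fall about 9.4% below the four-process result.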

Screenshots for 1, 4, 5 and 8 OCR Processes

Code-sample screenshot: 1 OCR process
Code-sample screenshot: 4 OCR processes
Code-sample screenshot: 5 OCR processes
Code-sample screenshot: 8 OCR processes

If you have FineReader Engine installed, you can easily reproduce the test with the sample: FineReader Engines Pool - Multithreading Sample. The Multi-Core OCR Processing Sample lets you test multi-core scalability without the process pool.

FlexiCapture Engine Processing Pool

The same effect of hyperthreading can be seen in the FlexiCapture Engine code sample: FlexiCapture Processors Pool

Code-sample screenshot: FlexiCapture Engine, 1 to 8 OCR processes on a 4-core HT CPU
FlexiCapture Engine 11 Release 1 on a laptop (2012) with a quad-core i7-3720QM, 2.6 GHz, Windows 7, 16 GB RAM, 64 bit; standard sample files

Results

  • Using more threads/processes than physical cores can increase the performance.
  • In the test configuration, 8 processes handle approx. 32% more pages than 4 processes (one per physical core).
  • So HT CPUs do deliver more throughput, but in a real production environment it can be useful not to assign all HT cores to recognition/data extraction, especially when other components such as a database are running on the same server.
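
The advice to keep some HT cores free can be expressed as a small sizing rule. The Python sketch below is only an illustration under stated assumptions: `ocr_worker_count` and its parameter `reserved_for_other_services` are hypothetical names, not part of any ABBYY SDK, and how many cores to reserve is something you would have to determine by measuring your own setup. Note that `os.cpu_count()` reports logical cores, so on an HT machine it already includes the “virtual” ones.

```python
import os


def ocr_worker_count(logical_cores: int, reserved_for_other_services: int = 0) -> int:
    """Number of OCR processes to start, leaving some logical cores free.

    At least one worker is always returned so that processing can run.
    """
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    if reserved_for_other_services < 0:
        raise ValueError("reserved_for_other_services must not be negative")
    return max(1, logical_cores - reserved_for_other_services)


if __name__ == "__main__":
    # os.cpu_count() returns the number of logical CPUs, or None if unknown.
    logical = os.cpu_count() or 1
    # Assumption for this example: 2 logical cores stay free for a database.
    print(ocr_worker_count(logical, reserved_for_other_services=2))
```

For the 4-core HT test machine (8 logical cores) with 2 cores reserved, this rule would start 6 OCR processes.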


1) … depending on the CPU type, the documents processed and the task
2) Absolute numbers might differ on other machines; the purpose here is only to show the influence of a higher number of processes on a hyperthreading CPU.