OCR Voting API

Language:
EN
Product-Line:
FineReader Engine
Version:
9.x, 10, 11, 12
Type:
Technology & Features
Category:
Recognition, OCR: Speed & Quality, Verification & User Interaction
  • The term “voting” in OCR is used when developers combine multiple OCR engines in a solution. When the engines produce different recognition variants for a character or word, the developer selects the best variant by voting between them.
  • Since 2006, ABBYY FineReader Engine has offered a dedicated Voting API, which provides access to the different recognition hypotheses for a character or word, together with their weight values.
  • Developers can use the FineReader Engine Voting API to check recognition results against their own databases and algorithms, and to correct the text. For example, a developer can build words from letters or check all generated hypotheses.
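
The general idea of voting between engines can be sketched in a few lines of Python. This is only an illustration of the concept, not FineReader Engine code: the function name `vote`, the (text, weight) tuple layout, and the sample weights are all assumptions made for this example.

```python
from collections import defaultdict


def vote(variants):
    """Return the variant text with the highest summed weight.

    `variants` is a list of (text, weight) pairs, one pair per engine.
    This layout is hypothetical and not part of any real engine API.
    """
    totals = defaultdict(float)
    for text, weight in variants:
        totals[text] += weight
    # max() returns the first key with the largest total,
    # so ties are resolved in favor of the variant seen first.
    return max(totals, key=totals.get)


# Three hypothetical engines read the same word image.
engine_outputs = [("cat", 0.80), ("eat", 0.90), ("cat", 0.75)]
best = vote(engine_outputs)
print(best)
```

Summing the weights lets two engines that agree outvote a single engine that is more confident on its own. Other schemes, such as a plain majority count, are equally possible.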

The hypotheses are exposed through the following objects:

  • WordRecognitionVariant
    This object represents a single hypothesis for a word. It contains the text of the hypothesis, the type of model, the average stroke width, and information on whether the hypothesis was found in the dictionary.
  • CharacterRecognitionVariant
    This object represents a single hypothesis for a character. It contains the character confidence, the probability that the character is written in a serif font, and information on whether the character is superscript or subscript.
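
As an illustration of building words from letters, the following Python sketch combines per-character hypotheses into candidate words and keeps only those found in a custom word list. It does not use the Engine's objects: the name `build_words`, the (character, confidence) tuples, the averaging of confidences, and the sample lexicon are assumptions made for this example.

```python
from itertools import product


def build_words(positions, lexicon):
    """Combine character hypotheses into words found in `lexicon`.

    `positions` holds one list of (char, confidence) pairs per character
    position. Both the layout and the scoring are hypothetical.
    """
    candidates = []
    for combo in product(*positions):
        word = "".join(ch for ch, _ in combo)
        if word in lexicon:
            # Score a word by the average confidence of its characters.
            score = sum(conf for _, conf in combo) / len(combo)
            candidates.append((word, score))
    # Best-scoring candidate first.
    return sorted(candidates, key=lambda c: c[1], reverse=True)


positions = [
    [("c", 90), ("e", 70)],
    [("a", 95), ("o", 60)],
    [("t", 92), ("l", 55)],
]
lexicon = {"cat", "eat", "cot"}
words = build_words(positions, lexicon)
for word, score in words:
    print(word, round(score, 1))
```

The number of combinations grows quickly with the word length, so a real solution would probably limit the hypotheses considered per position.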

Character Example

During layout analysis, the coordinates of the text areas, lines, and individual characters are detected. After character separation, each character is recognized using several text recognition technologies (algorithms and classifiers).

The recognition confidence of a single character image is a numerical estimate of the probability that the image does in fact represent this character. For example, an image of the letter “e” may be recognized

  • as the letter “e” with a confidence of 95,
  • as the letter “c” with a confidence of 85,
  • as the letter “o” with a confidence of 65, etc.

The hypothesis with the highest confidence rating is usually selected as the recognition result, but the choice also depends on the context (i.e., the word to which the character belongs) and on the results of a differential comparison.

If the word with the “e” hypothesis is not a dictionary word while the word with the “c” hypothesis is a dictionary word, the latter will be selected as the recognition result, even though its confidence rating will still be 85. The rest of the recognition variants can be obtained as hypotheses.
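
The dictionary-based choice described above can be sketched as follows. This is a simplified model, not the Engine's actual algorithm: the `CharHypothesis` class, the `pick_character` function, and the sample word are assumptions made for this example, and the differential comparison step is not modeled.

```python
from dataclasses import dataclass


@dataclass
class CharHypothesis:
    """Hypothetical stand-in for one character hypothesis."""
    char: str
    confidence: int


def pick_character(prefix, hypotheses, suffix, dictionary):
    """Prefer the most confident hypothesis that yields a dictionary word."""
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    for hyp in ranked:
        if prefix + hyp.char + suffix in dictionary:
            return hyp
    # No hypothesis forms a dictionary word: fall back to raw confidence.
    return ranked[0]


hypotheses = [
    CharHypothesis("e", 95),
    CharHypothesis("c", 85),
    CharHypothesis("o", 65),
]
# The uncertain character is the second letter of "s?ale".
chosen = pick_character("s", hypotheses, "ale", {"scale", "stale"})
print(chosen.char, chosen.confidence)
```

The selected hypothesis keeps its original confidence of 85, and the remaining hypotheses stay available in `hypotheses` for further checks.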

Important Note: The Voting API is only available for OCR, not for recognizing hand-printed text.
