Recognition-related questions

FineReader Engine
9.x, 10, 11
Frequently Asked Questions

What recognition language is used by default?

English is the default recognition language. To recognize text in a different language, call the SetPredefinedTextLanguage method of the RecognizerParams object.

How can I improve the quality of recognition on blocks which contain different types of text?

If a block contains text of different types, ABBYY FineReader Engine still treats the whole block as text of a single type. To improve the quality of OCR, draw a separate block for each type of text.

More details can be found in the documentation chapter: How autodetection works.

How is the type of text detected in blocks for which the TextType property of the RecognizerParams object is set to TT_ToBeDetected?

More details can be found in the documentation chapter: How autodetection works.

Why are italic fonts and superscript/subscript not recognized by autodetection?

If the PossibleTextTypes property of the RecognizerParams object contains any combination of TT_MATRIX, TT_TYPEWRITER, TT_OCR_A, and TT_OCR_B, italic fonts and superscript/subscript will not be recognized, regardless of the values of the ProhibitItalic, ProhibitSubscript and ProhibitSuperscript properties of the RecognizerParams object.

More details can be found in the documentation chapter: How autodetection works.

Do hieroglyphic characters have extended recognition attributes?

Yes, hieroglyphic characters have extended recognition attributes.

More details can be found in the documentation chapters: Recognizing Hieroglyphic Languages, ExtendedRecAttributes, CharParams.

What is the difference between the CharConfidence and the IsSuspicious properties?

The CharConfidence property of the ExtendedRecAttributes, PlainText, and CharacterRecognitionVariant objects is a read-only property of type long which stores the confidence of a character. Its value is in the range from 0 to 100; the special value 255 means that the confidence is undefined. The property is an estimate of the recognition confidence of a character, expressed as a percentage. The greater the value, the greater the confidence. Character confidence can be undefined, for example, for characters which were added during text editing.
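The value convention described above can be sketched in a few lines of Python. This is only an illustration of how a caller might interpret the raw number; the function name interpret_confidence and the constant UNDEFINED_CONFIDENCE are hypothetical and are not part of the FineReader Engine API.

```python
from typing import Optional

# Sentinel described in the text: 255 means "confidence is undefined".
UNDEFINED_CONFIDENCE = 255


def interpret_confidence(raw: int) -> Optional[int]:
    """Return the confidence as a percentage, or None when it is undefined.

    Hypothetical helper: `raw` stands for a value read from a
    CharConfidence property.
    """
    if raw == UNDEFINED_CONFIDENCE:
        return None
    if 0 <= raw <= 100:
        return raw
    # Anything else falls outside the documented range.
    raise ValueError(f"unexpected CharConfidence value: {raw}")
```

Treating 255 as a separate case before any comparison avoids mistaking an undefined confidence for a very high one.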

The recognition confidence of a character image is a numerical estimate of how closely the image matches an “ideal” image of the character, whose recognition confidence would be 100%. When recognizing a character, the program provides several recognition variants which are ranked by their confidence values. For example, an image of the letter “e” may be recognized:

  • as the letter “e” with a confidence of 95%,
  • as the letter “c” with a confidence of 85%,
  • as the letter “o” with a confidence of 65%, etc.

The confidence values of all the recognition variants of a character need not add up to 100%. The hypothesis with the highest confidence rating is normally selected as the recognition result, but the choice also depends on the context (i.e. the word to which the character belongs) and on the results of a differential comparison. For example, if the word with the “e” hypothesis is not a dictionary word while the word with the “c” hypothesis is a dictionary word, the “c” hypothesis will be selected as the recognition result, and its confidence rating will be 85%. The rest of the recognition variants can be obtained as hypotheses.
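The role of the dictionary in the example above can be illustrated with a toy model in Python. It is not the algorithm used by ABBYY FineReader Engine, which also takes differential comparison and other factors into account; the function choose_variant, its parameters, and the sample word are assumptions made only for this sketch.

```python
from typing import Dict, Set, Tuple


def choose_variant(prefix: str,
                   variants: Dict[str, int],
                   suffix: str,
                   dictionary: Set[str]) -> Tuple[str, int]:
    """Pick a recognition variant for one character position.

    Toy rule (an assumption, not the engine's real logic): prefer the
    most confident variant that turns the word into a dictionary word;
    if none does, fall back to the most confident variant overall.
    """
    if not variants:
        raise ValueError("at least one recognition variant is required")
    # Rank variants from the highest to the lowest confidence.
    ranked = sorted(variants.items(), key=lambda kv: kv[1], reverse=True)
    for char, confidence in ranked:
        if (prefix + char + suffix).lower() in dictionary:
            return char, confidence
    return ranked[0]


# Variants from the example: "e" 95%, "c" 85%, "o" 65%.
variants = {"e": 95, "c": 85, "o": 65}
# "areh" is not a dictionary word, "arch" is, so "c" wins with 85.
print(choose_variant("ar", variants, "h", {"arch"}))  # ('c', 85)
```

Note that the selected variant keeps its own confidence value (85 here), which is why a recognized character may carry a lower confidence than one of its rejected hypotheses.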

The IsSuspicious property of the CharParams object is a Boolean property. When it is set to TRUE, the character was recognized unreliably. Its value is determined by an algorithm which takes into account a number of parameters, such as the recognition confidence of the character, nearby characters and their recognition confidence, hypotheses and their recognition confidence, the geometric parameters of the character, the context (i.e. the word to which the character belongs), etc.
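Because the two properties carry different information, an application may want to consult both when deciding which characters to send for manual verification. The following Python sketch shows one possible policy under stated assumptions: the RecognizedChar record, the needs_review function, and the threshold of 50 are hypothetical and are not taken from the FineReader Engine API or documentation.

```python
from dataclasses import dataclass

# 255 means that the confidence is undefined (see the CharConfidence description).
UNDEFINED_CONFIDENCE = 255


@dataclass
class RecognizedChar:
    """Hypothetical record holding values copied from the engine's objects."""
    char: str
    confidence: int      # 0..100, or 255 when undefined
    is_suspicious: bool  # the engine's own reliability verdict


def needs_review(c: RecognizedChar, min_confidence: int = 50) -> bool:
    """Flag a character for manual verification.

    Assumed policy: trust the IsSuspicious verdict first, then apply an
    arbitrary confidence threshold; characters with undefined confidence
    (for example, added during editing) are not flagged.
    """
    if c.is_suspicious:
        return True
    if c.confidence == UNDEFINED_CONFIDENCE:
        return False
    return c.confidence < min_confidence


chars = [
    RecognizedChar("a", 98, False),
    RecognizedChar("c", 85, True),    # confident, but flagged from context
    RecognizedChar("x", 255, False),  # typed in by an editor
    RecognizedChar("l", 40, False),   # not flagged, but low confidence
]
to_review = [c.char for c in chars if needs_review(c)]
print(to_review)  # ['c', 'l']
```

The second character shows why the properties are not interchangeable: a character can have a fairly high confidence and still be marked as suspicious.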
