OCR Recognition Languages

FlexiCapture Engine, FineReader Engine, Cloud OCR SDK
9.x, 10, 11
Technology & Features
Recognition, Languages & OCR, OCR: Speed & Quality
  • ABBYY OCR technology can process more than 200 OCR languages.
  • There are different types of languages:
    • Natural languages, like English, Russian or German
    • Artificial languages: Esperanto, Interlingua, Ido, Occidental
    • Programming and formal languages: Basic, C/C++, COBOL, Fortran, Java, Pascal, simple chemical formulas
  • Languages contain special language units/data types, e.g.:
    • addresses
    • date and time
    • human names, etc.
    • For some natural languages: City, village, settlement (English, United Kingdom); Currency in words (English, United States), etc.
  • Chinese (PRC and Taiwan), Japanese, Korean and Korean/Hangul
  • Thai
  • Hebrew
  • Arabic

The languages mentioned above are so-called predefined languages. In addition to them, it is possible to define your own language and use it for recognition.
The screenshot below shows the “Language Editor” implementation of FineReader 10, which is shipped with FineReader Engine 10.

Structure of a Recognition Language

Every recognition language has the following properties:

  • Name
  • Set of allowed characters:
    • alphabet
    • list of prefixes
    • list of suffixes
    • alphabet for subscripts
    • alphabet for superscripts
    • list of ignored characters
  • Dictionary (optional): a language can have one, but recognition also works without one.
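
The property list above can be pictured as a small data structure. The following is only an illustrative sketch in Python: the class `RecognitionLanguage`, its field names, and the `allows` and `in_dictionary` helpers are hypothetical and are not part of the FineReader Engine API. It assumes that ignored characters are simply skipped when a word is checked against the alphabet.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class RecognitionLanguage:
    """Hypothetical model of a recognition language (not the ABBYY API)."""
    name: str
    alphabet: Set[str]
    # Stored for completeness; this sketch does not interpret them.
    prefixes: List[str] = field(default_factory=list)
    suffixes: List[str] = field(default_factory=list)
    subscript_alphabet: Set[str] = field(default_factory=set)
    superscript_alphabet: Set[str] = field(default_factory=set)
    ignored_characters: Set[str] = field(default_factory=set)
    # The dictionary is optional: None means "no dictionary attached".
    dictionary: Optional[Set[str]] = None

    def allows(self, word: str) -> bool:
        """True if every non-ignored character belongs to the alphabet."""
        core = [ch for ch in word if ch not in self.ignored_characters]
        return all(ch in self.alphabet for ch in core)

    def in_dictionary(self, word: str) -> Optional[bool]:
        """Dictionary lookup; returns None when there is no dictionary."""
        if self.dictionary is None:
            return None
        return word in self.dictionary


# A user-defined language for part numbers, without a dictionary.
part_numbers = RecognitionLanguage(
    name="PartNumbers",
    alphabet=set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"),
    ignored_characters={"-"},
)

print(part_numbers.allows("PN-4711"))        # the hyphen is ignored
print(part_numbers.allows("pn_4711"))        # lowercase and "_" are not allowed
print(part_numbers.in_dictionary("PN4711"))  # no dictionary attached
```

The example language has no dictionary, which mirrors the point that a dictionary is optional: the character-level check still works without one.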

Language Auto-Detection

  • ABBYY technologies can detect the language of a document automatically.
  • The product chooses the best-matching language from a predefined group of languages.
  • This group can be set/edited by the user/developer.
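
To illustrate the idea of picking the best match from an editable group, here is a toy sketch in Python. It is not ABBYY's detection algorithm: the function `detect_language`, the `GROUP` dictionary, and the alphabet-coverage score are all assumptions made for this example. Each candidate is scored by the share of letters in the text that belong to its alphabet.

```python
from typing import Dict, Optional, Set


def detect_language(text: str, candidates: Dict[str, Set[str]]) -> Optional[str]:
    """Return the candidate whose alphabet covers the most letters of the text.

    Toy heuristic for illustration only. Ties go to the candidate that
    appears first in the group; returns None if there is nothing to score.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters or not candidates:
        return None

    def coverage(alphabet: Set[str]) -> float:
        return sum(ch in alphabet for ch in letters) / len(letters)

    return max(candidates, key=lambda name: coverage(candidates[name]))


LATIN = set("abcdefghijklmnopqrstuvwxyz")

# The group of candidate languages; the caller can set or edit it.
GROUP: Dict[str, Set[str]] = {
    "English": LATIN,
    "German": LATIN | set("äöüß"),
    "Russian": set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя"),
}

print(detect_language("Grüße aus Köln", GROUP))
print(detect_language("Привет, мир", GROUP))

# Restricting the group changes what can be chosen.
restricted = {name: GROUP[name] for name in ("English", "Russian")}
print(detect_language("Grüße aus Köln", restricted))
```

With the full group, the German sample is matched to "German" and the Cyrillic sample to "Russian". With the restricted group, the German sample falls back to "English", the closest language still available.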

FineReader Engine 11 adds new capabilities for working with multi-language documents. For details, see: OCR Language Auto-Detection
