Recognition with Pattern Training

Language: EN
Product-Line: FlexiCapture Engine, FineReader Engine
Version: 9.x, 10, 11
Type: Technology & Features
Category: Recognition, OCR: Speed & Quality, Verification & User Interaction
  • ABBYY OCR technology uses font-independent recognition (more details on the omni-font approach...).
  • Occasionally this approach does not deliver the expected results, and the reasons vary widely. In some of these cases, recognition with training can improve recognition quality.

When is Recognition with Pattern Training useful?

Recognition with a custom-trained pattern can be used for:

  • texts set in decorative fonts
  • texts containing unusual characters (e.g. mathematical symbols)
  • long documents (more than a hundred pages) of low print quality

Note: It is recommended to use Recognition with Training only if one of the above applies. In other cases, the slight increase in recognition quality will be outweighed by considerably longer processing times.
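
The rule of thumb above can be written down as a small decision helper. This is only an illustrative sketch: `DocumentProfile` and `should_use_pattern_training` are hypothetical names invented for this example and are not part of any ABBYY API. The 100-page threshold is taken from the list above.

```python
from dataclasses import dataclass

# "more than a hundred pages", as stated in the list above
LONG_DOCUMENT_PAGES = 100


@dataclass
class DocumentProfile:
    """Hypothetical description of a recognition job (not an ABBYY type)."""
    page_count: int
    has_decorative_fonts: bool = False
    has_unusual_characters: bool = False
    low_print_quality: bool = False


def should_use_pattern_training(doc: DocumentProfile) -> bool:
    """Return True only if one of the three cases listed above applies."""
    long_and_poor = doc.low_print_quality and doc.page_count > LONG_DOCUMENT_PAGES
    return doc.has_decorative_fonts or doc.has_unusual_characters or long_and_poor
```

For any other job the helper returns False, which reflects the trade-off in the note: the small quality gain would not justify the longer processing time.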

What is Recognition with Pattern Training?

ABBYY can read texts set in practically any font regardless of print quality, so no prior training is normally required before recognition. Nevertheless, in the specific cases listed above, ABBYY products let you create (train) a user pattern, which is then used during subsequent recognition.

Pattern training works as follows. One or two pages are recognized in training mode and, subsequently, a pattern is created. The pattern is then used as a source of additional information to aid recognition of the remaining text.

A pattern is a set of pairs, each consisting of a character image and the character it represents, created during pattern training.

Sometimes two or even three characters are “glued” together (see Wikipedia:Ligature), so that it is not possible to enclose each character in an individual frame to separate them.
If this is the case (i.e. you cannot move the frame so that it contains only one whole character and no parts of other characters), you can train the whole inseparable character combination. Character combinations frequently found glued together include ff, fi, and fl (see picture above). Such combinations are referred to as ligatures.
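
To make the idea of "image and character" pairs concrete, here is a minimal conceptual sketch. It is not how ABBYY stores or matches patterns internally: `UserPattern`, `train`, `lookup` and the Hamming-distance matching are assumptions made purely for illustration, and the glyphs are toy bitmaps. Note that a ligature is simply one image paired with a multi-character string.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Toy bitmap: rows of '#' (ink) and '.' (background). Illustrative only.
Glyph = Tuple[str, ...]


def hamming(a: Glyph, b: Glyph) -> int:
    """Count differing pixels; frames of different size never match."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        return 10**9
    return sum(p != q for x, y in zip(a, b) for p, q in zip(x, y))


@dataclass
class UserPattern:
    """Hypothetical pattern: a list of (character image, text) pairs."""
    pairs: List[Tuple[Glyph, str]] = field(default_factory=list)

    def train(self, image: Glyph, text: str) -> None:
        # 'text' may hold several characters when the image is a ligature.
        self.pairs.append((image, text))

    def lookup(self, image: Glyph, max_distance: int = 1) -> Optional[str]:
        """Return the text of the closest trained image, or None."""
        best_text, best_dist = None, max_distance + 1
        for trained_image, text in self.pairs:
            dist = hamming(trained_image, image)
            if dist < best_dist:
                best_text, best_dist = text, dist
        return best_text


pattern = UserPattern()
pattern.train(("#.", "#."), "l")                # a single character
pattern.train(("##.#", "#..#", "#..#"), "fi")   # an inseparable ligature
print(pattern.lookup(("##.#", "#..#", "#.##")))  # slightly noisy "fi" image
```

In this sketch the noisy image differs from the trained ligature by one pixel, so the lookup returns the two-character string as a single unit instead of trying to split the frame.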

Note:

  • Patterns are only useful when the documents have:
    • the same font,
    • the same font size, and
    • the same resolution as the document used to create the user pattern.
  • Pattern training is not supported for hieroglyphic or Asian languages.
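
The three conditions in the note can be checked before reusing a pattern on a new batch. The sketch below assumes you record the font, font size and scan resolution yourself; `PrintProfile` and `pattern_mismatches` are hypothetical names, not part of any ABBYY product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PrintProfile:
    """Hypothetical record of the properties a pattern depends on."""
    font: str
    font_size_pt: float
    resolution_dpi: int


def pattern_mismatches(trained_on: PrintProfile, target: PrintProfile) -> List[str]:
    """List the properties that differ; an empty list means the pattern may help."""
    fields = ("font", "font_size_pt", "resolution_dpi")
    return [name for name in fields
            if getattr(trained_on, name) != getattr(target, name)]
```

If the returned list is not empty, the note above suggests that the existing pattern is unlikely to be useful and a new one would have to be trained on the new material.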
