
NLP/Semantic Technologies

The volume of enterprise information is growing dramatically, and unstructured data accounts for 80-90% of it. Such content is often untagged, organized according to personal preferences, and written in the author's own language and terminology, which makes it hard to find and manage.

The result of many years of intensive R&D, ABBYY language technologies allow a better understanding of unstructured document content and offer robust solutions to many long-standing language processing problems of the information age.

Applications include:

  • Comprehensive text analysis: information monitoring, sentiment analysis, controlling access to confidential information, and summarizing and annotating documents
  • Efficient handling of text documents: classification, filtering, text comparison


Linguistic Analysis Results

Below is the Compreno linguistic analysis of the short sentence “John bought a car from Mark”.


Animated Gif: showing the linguistic facts for each word found by Compreno

Entities and Fact Results

The same sentence was processed in ABBYY InfoExtractor, with the following results:


  • The sentence was recognized as describing a “PurchaseAndSale” fact
  • The object of the transaction was detected
  • The two persons involved were identified
  • The information is provided as RDF/XML, so that it can be used in other semantic applications

Result of the fact and relationship extraction detected by ABBYY InfoExtractor
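
Because the result is delivered as RDF/XML, a downstream application can read it with any standard XML or RDF tooling. The following is a minimal sketch in Python of how such output might be consumed. The actual InfoExtractor schema is not reproduced on this page, so the sample document, the `http://example.org/facts#` namespace, and the property names `buyer`, `seller`, and `object` are all hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# Standard RDF namespace (defined by the W3C).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical vocabulary: NOT the real InfoExtractor schema.
EX_NS = "http://example.org/facts#"

# Hypothetical RDF/XML for "John bought a car from Mark".
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/facts#">
  <ex:PurchaseAndSale rdf:about="http://example.org/doc1#fact1">
    <ex:buyer rdf:resource="http://example.org/doc1#John"/>
    <ex:seller rdf:resource="http://example.org/doc1#Mark"/>
    <ex:object rdf:resource="http://example.org/doc1#car"/>
  </ex:PurchaseAndSale>
</rdf:RDF>
"""


def extract_facts(rdf_xml):
    """Return one dict per PurchaseAndSale node, mapping role name to entity."""
    root = ET.fromstring(rdf_xml)
    facts = []
    for node in root.findall(f"{{{EX_NS}}}PurchaseAndSale"):
        fact = {"type": "PurchaseAndSale"}
        for child in node:
            # Strip the "{namespace}" prefix ElementTree puts on tag names.
            role = child.tag.split("}", 1)[1]
            target = child.get(f"{{{RDF_NS}}}resource", "")
            # Keep only the fragment after '#' as a short entity label.
            fact[role] = target.rsplit("#", 1)[-1]
        facts.append(fact)
    return facts


if __name__ == "__main__":
    for fact in extract_facts(SAMPLE):
        print(fact)
```

This sketch uses only the standard library and treats the RDF/XML as plain XML, which is enough for a flat structure like the one assumed here. For real output, a dedicated RDF parser would be a safer choice, since RDF/XML allows several equivalent serializations of the same graph.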

Semantic analysis of a sentence
