ScanDifFinder SDK

Many tools on the market compare digital documents, but comparing scanned and OCRed documents requires specially tuned algorithms and technologies to deliver good results.

What is ScanDifFinder

ABBYY ScanDifFinder SDK is currently a toolkit prototype for software developers who have to compare scanned and OCRed documents.

Features:

  • Compare scanned paper documents with digital versions of the originals
  • Automatically check for real differences in the text of two documents to make sure that they are identical for legal or contractual reasons
  • Minimize the hassle of identifying false positives caused by “fake” changes: minor format changes generated by OCR, as opposed to meaningful content differences

ScanDifFinder SDK lets you create applications featuring automated document comparison based on a unique, patented technology, which eliminates the need to manually compare a paper document with its digital version. Normally this check has to be done by eye to verify document integrity.

Usage Scenarios for ScanDifFinder

ScanDifFinder can help save hours when comparing documents such as contracts.
For example, when contractors receive a signed paper version of an agreement, it’s important to carefully compare that document to the original digital file it was printed from in order to verify that no unauthorized changes have been made. With ScanDifFinder, this task can be done in minutes or even seconds, saving time, ensuring accuracy and eliminating error-prone manual processes.

Why ScanDifFinder is unique

ScanDifFinder filters out meaningless and “fake” changes and shows only “real” changes. It isn’t necessary or even desirable to show all the differences between two documents, because most of them are formatting changes rather than actual content modifications. This distinction matters: the total number of changes in a document (including meaningless and “fake” changes) may be ten times larger than the number of “real” changes. Therefore, ScanDifFinder SDK shows only meaningful changes, such as changed numerals or added sentences, as demonstrated in the example below.

Example:

  • A 16-page document with 16 actual changes, made in the digital version, then printed and signed
  • A Microsoft Word comparison of the text OCRed with ABBYY FineReader Professional shows 150 changes to review
    Note: these changes are not only OCR errors ;-) but also differences in the layout or text formatting of the reconstructed document
  • ABBYY ScanDifFinder SDK shows only 18 changes.
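
To make the idea of filtering out “fake” changes more concrete, here is a minimal sketch in Python. It is not ABBYY's patented technology and does not use the ScanDifFinder SDK API; it only illustrates the general principle with the standard difflib module. The function names and the list of equivalent characters are assumptions made for this illustration.

```python
import difflib

# Typographic variants that OCR or re-layout may swap (assumed, illustrative list):
# curly quotes become straight quotes, en/em dashes become hyphens.
_EQUIVALENTS = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
    "\u2013": "-", "\u2014": "-",
})


def tokens(text: str) -> list[str]:
    """Split on any whitespace so line breaks and spacing never count as changes."""
    return text.translate(_EQUIVALENTS).split()


def content_changes(original: str, scanned: str) -> list[tuple[str, str, str]]:
    """Return (kind, old_words, new_words) for every region where the words differ."""
    a, b = tokens(original), tokens(scanned)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    digital = "The fee is 1500 EUR.\nPayment is due in 30 days."
    ocr_text = "The  fee is 1800 EUR. Payment\nis due in 30 days."
    # Spacing and line-break differences are ignored; only the changed numeral remains.
    print(content_changes(digital, ocr_text))  # [('replace', '1500', '1800')]
```

A real comparison of scanned documents also has to cope with OCR misrecognitions, which this word-level sketch would still report as changes.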

More about ScanDifFinder

  • ABBYY ScanDifFinder is currently not an officially sold product.
    Consider it a technology preview coming out of the “ABBYY technology melting pot.”
  • If you think that this kind of tuned comparison technology for scanned/OCRed paper documents could be valuable for your application, please contact ABBYY Europe.