ALTO XML Export

Language: EN
Product-Line: FineReader Engine, Cloud OCR SDK
Version: 10, 11
Type: Technology & Features
Category: Export

About the ALTO Format

  • ALTO = Analyzed Layout and Text Object
  • ALTO is an XML schema that defines technical metadata for describing the layout and content of physical text resources, such as pages of a book or a newspaper.
  • It most commonly serves as an extension schema used within the Metadata Encoding and Transmission Schema (METS) administrative metadata section.
  • An ALTO instance can also exist as a standalone document, independent of METS.
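
To make the idea of "layout and content" more concrete, the following Python sketch reads the recognized words and their coordinates from a small ALTO document. The sample XML is hand-written for illustration and is not output from an ABBYY product; the function names `local_name` and `extract_words` are hypothetical helpers, and the element and attribute names (`String`, `CONTENT`, `HPOS`, `VPOS`, `WIDTH`, `HEIGHT`) are assumed from the public ALTO schema.

```python
import xml.etree.ElementTree as ET

# Hand-written, minimal ALTO-like sample for illustration only. It is not
# output from an ABBYY product and omits most of what a real file contains.
SAMPLE_ALTO = """<?xml version="1.0" encoding="UTF-8"?>
<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" WIDTH="2480" HEIGHT="3508">
      <PrintSpace HPOS="100" VPOS="120" WIDTH="2280" HEIGHT="3268">
        <TextBlock ID="TB1" HPOS="100" VPOS="120" WIDTH="900" HEIGHT="60">
          <TextLine HPOS="100" VPOS="120" WIDTH="900" HEIGHT="60">
            <String CONTENT="ALTO" HPOS="100" VPOS="120" WIDTH="200" HEIGHT="60"/>
            <SP HPOS="300" VPOS="120" WIDTH="20"/>
            <String CONTENT="sample" HPOS="320" VPOS="120" WIDTH="300" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix so the lookup does not depend on the ALTO version."""
    return tag.rsplit("}", 1)[-1]


def extract_words(alto_xml):
    """Return one (text, hpos, vpos, width, height) tuple per String element."""
    root = ET.fromstring(alto_xml.encode("utf-8"))
    words = []
    for elem in root.iter():
        if local_name(elem.tag) == "String":
            words.append((
                elem.get("CONTENT", ""),
                float(elem.get("HPOS", "0")),
                float(elem.get("VPOS", "0")),
                float(elem.get("WIDTH", "0")),
                float(elem.get("HEIGHT", "0")),
            ))
    return words


if __name__ == "__main__":
    for text, hpos, vpos, width, height in extract_words(SAMPLE_ALTO):
        print(f"{text!r} at ({hpos}, {vpos}), size {width} x {height}")
```

Matching on the local element name rather than a fixed namespace is a deliberate simplification, since the namespace URI differs between ALTO schema versions.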

Further information on ALTO can be found here:

ALTO XML in ABBYY Products

ALTO export was introduced in the following releases:

  • FineReader Engine 10 Release 2 (December 2010)
    • The format is included by default in the SDK Development and Runtime licenses at no additional cost.
    • More details can be found in the FineReader Engine 10 documentation:
      AltoExportParams Object
  • Recognition Server 3.0 Release 8 (Part#: 691/18 - July 2011)
    • The format is included only in Recognition Server 3.0 Extended Edition
  • FineReader Engine 10 R7 (Win) - updated export to ALTO format (Released 3.1.2013)
    Improvements in this version:
    • Location of the schema is written in the file.
    • Incompatibilities with the schema have been corrected.
    • The page is divided into print space and margins.
    • Baselines are written to the files.
    • Character coordinates can be written relative to the source image.
    • Note: These improvements are currently available only in the SDK, not in Recognition Server 3.x.
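
As a rough illustration of what the improvements listed above look like to a consumer of the file, the Python sketch below reads the schema location, the print space and margin regions, and the line baselines from an ALTO document. The sample XML is hand-written and is not taken from FineReader Engine output; the schema file name `alto-schema.xsd` is a placeholder, `summarize_page` is a hypothetical helper, and the element and attribute names (`PrintSpace`, `TopMargin`, `LeftMargin`, `RightMargin`, `BottomMargin`, `BASELINE`) are assumed from the public ALTO schema.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
REGION_NAMES = ("TopMargin", "LeftMargin", "RightMargin", "BottomMargin", "PrintSpace")

# Hand-written sample for illustration only; "alto-schema.xsd" is a
# placeholder, not the schema location written by any ABBYY product.
SAMPLE_PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/standards/alto/ns-v2# alto-schema.xsd">
  <Layout>
    <Page ID="P1" WIDTH="2480" HEIGHT="3508">
      <TopMargin HPOS="0" VPOS="0" WIDTH="2480" HEIGHT="120"/>
      <LeftMargin HPOS="0" VPOS="120" WIDTH="100" HEIGHT="3268"/>
      <RightMargin HPOS="2380" VPOS="120" WIDTH="100" HEIGHT="3268"/>
      <BottomMargin HPOS="0" VPOS="3388" WIDTH="2480" HEIGHT="120"/>
      <PrintSpace HPOS="100" VPOS="120" WIDTH="2280" HEIGHT="3268">
        <TextBlock ID="TB1" HPOS="100" VPOS="120" WIDTH="900" HEIGHT="60">
          <TextLine BASELINE="175" HPOS="100" VPOS="120" WIDTH="900" HEIGHT="60">
            <String CONTENT="ALTO" HPOS="100" VPOS="120" WIDTH="200" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def summarize_page(alto_xml):
    """Collect the schema location, page regions, and line baselines."""
    root = ET.fromstring(alto_xml.encode("utf-8"))
    info = {
        # Namespaced attributes are keyed as '{namespace}name' in ElementTree.
        "schema_location": root.get(f"{{{XSI_NS}}}schemaLocation"),
        "regions": {},
        "baselines": [],
    }
    for elem in root.iter():
        name = elem.tag.rsplit("}", 1)[-1]  # strip the '{namespace}' prefix
        if name in REGION_NAMES:
            info["regions"][name] = tuple(
                float(elem.get(attr, "0"))
                for attr in ("HPOS", "VPOS", "WIDTH", "HEIGHT")
            )
        elif name == "TextLine" and elem.get("BASELINE") is not None:
            info["baselines"].append(float(elem.get("BASELINE")))
    return info


if __name__ == "__main__":
    summary = summarize_page(SAMPLE_PAGE)
    print("Schema location:", summary["schema_location"])
    for region, box in summary["regions"].items():
        print(region, box)
    print("Baselines:", summary["baselines"])
```

Files exported without these improvements would simply yield an empty schema location or no margin entries in this sketch, so the same check can be used to tell the two kinds of output apart.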

ALTO Samples

The ZIP archive (1.4 MB) contains the processing results and the source image.

Original

ZIP content

Processed with Recognition Server 3.5 Version: 3.5.2.410; Part# 691/24

External Info
