PDF/A Export

What is PDF/A

  • PDF/A is a file format and an ISO Standard for the long-term archiving of electronic documents.
  • PDF/A is in fact a subset of PDF, obtained by leaving out PDF features not suited to long-term archiving.
  • PDF/A comes in several parts and conformance levels:
    • PDF/A-1b - Level B compliance in Part 1
      PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document.
    • PDF/A-1a - Level A compliance in Part 1
      PDF/A-1a includes all the requirements of PDF/A-1b and additionally requires that document structure be included (also known as being “tagged”/“Tagged PDF”), with the objective of ensuring that document content can be searched and repurposed. PDF/A-1a also requires Unicode character maps.
    • PDF/A-2 - Part 2, based on ISO 32000-1
      PDF/A-2 is a newer part of the standard. It is based on PDF 1.7 (ISO 32000-1) and is defined by ISO 19005-2:2011, published on June 20, 2011 under the formal name Document management – Electronic document file format for long-term preservation – Part 2: Use of ISO 32000-1 (PDF/A-2).
    • PDF/A-3
      • The standard was published in October 2012. It differs from PDF/A-2 in that it allows files of any format to be embedded, for example XML, Office formats or raw binary data.
      • Important: long-term compatibility is guaranteed only for the PDF part of the collection. An organization that embeds other file formats gains the benefit of keeping those files with the document, but accepts the risk that they may no longer be usable in 100 years.

PDF/A Minimum Requirements

  • To be PDF/A compliant, a document must meet the following requirements:
    • Audio and video content are forbidden.
    • JavaScript and executable file launches are forbidden.
    • All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering. This also applies to the so-called PostScript standard fonts such as Times or Helvetica.
    • Color spaces must be specified in a device-independent manner.
    • Encryption is forbidden.
    • Use of standards-based metadata is mandated.
    • External content references are forbidden.
    • LZW and JPEG 2000 image compression are forbidden in PDF/A-1,
      but JPEG 2000 compression is allowed in PDF/A-2.
    • Transparent objects and layers (Optional Content Groups) are forbidden in PDF/A-1, but they are supported in PDF/A-2.
    • Provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard are supported in PDF/A-2.
    • Embedded files are forbidden in PDF/A-1, but PDF/A-2 offers the possibility to embed PDF/A files, allowing archiving of sets of documents in a single file.

Source: http://en.wikipedia.org/wiki/PDF/A
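
Several of the requirements above correspond to name tokens that can appear in a PDF file, so a rough pre-check can be sketched in a few lines of Python. This is only a heuristic under stated assumptions, not a PDF/A validator: the function name, the marker tables and the sample bytes are hypothetical, the scan only sees tokens stored uncompressed in the file, and it does not check fonts, color spaces, metadata or transparency. A dedicated validation tool is still needed to confirm compliance.

```python
import re

# PDF name tokens that hint at features forbidden in every PDF/A part
# discussed above. Assumption: the tokens appear uncompressed in the file.
FORBIDDEN_MARKERS = {
    b"/JavaScript": "JavaScript action",
    b"/JS": "JavaScript action",
    b"/Launch": "executable file launch",
    b"/Encrypt": "encryption",
    b"/Sound": "audio content",
    b"/Movie": "video content",
    b"/LZWDecode": "LZW compression",
}

# Tokens for features forbidden in PDF/A-1 but permitted in PDF/A-2.
PDFA1_ONLY_MARKERS = {
    b"/JPXDecode": "JPEG 2000 compression",
    b"/EmbeddedFiles": "embedded files",
    b"/OCProperties": "layers (optional content)",
}


def find_forbidden_features(data: bytes, part: int = 1) -> list:
    """Return descriptions of suspicious features found in raw PDF bytes."""
    markers = dict(FORBIDDEN_MARKERS)
    if part == 1:
        markers.update(PDFA1_ONLY_MARKERS)
    findings = []
    for token, description in markers.items():
        # The lookahead stops "/JS" from matching a longer name such as "/JSX".
        pattern = re.escape(token) + rb"(?![A-Za-z0-9])"
        if re.search(pattern, data) and description not in findings:
            findings.append(description)
    return findings


if __name__ == "__main__":
    # Hypothetical fragment of a PDF file, for illustration only.
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj << /Filter /JPXDecode >> endobj\n"
        b"2 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n"
    )
    print(find_forbidden_features(sample, part=1))
    print(find_forbidden_features(sample, part=2))
```

A hit means the file deserves a closer look before archiving. An empty result does not prove compliance.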

PDF/A Support in ABBYY Technology Products

PDF/A Export (PDF/A-1b & PDF/A-1a) is available in the following ABBYY technology products:

FineReader Engines - OCR & Document Conversion

FlexiCapture Engine - Separation, Classification & Data Capture

Recognition Server - Solution for server based processing and document capture

FlexiCapture - Solutions for Data Capture

PDF/A-2 Support

In addition to the common PDF and PDF/A-1 formats, FineReader Engine 11 now exports to PDF/A-2. The new options of the ISO standard format are:

  • Support of JPEG2000 compression to generate smaller files
  • A-2a – tagged and Unicode PDF/A-2
  • A-2u – untagged PDF/A-2 with the ability to extract text as Unicode.

PDF/A-2 enables the creation of smaller PDF files using JPEG2000 compression. For long-term archiving, this can help reduce storage requirements and enable faster access over low-bandwidth networks.

The general technical changes of PDF/A-2 are:

  • based on PDF 1.7 (ISO 32000-1)
  • highly efficient JPEG2000 compression allowed
  • support for transparency effects and layers
  • embedding of OpenType fonts
  • provisions for digital signatures in accordance with the
    PAdES (PDF Advanced Electronic Signatures) standard.
  • possibility to embed PDF/A files in PDF/A-2,
    allowing archiving of sets of documents as individual documents in a single file.
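
The differences between the conformance levels described so far can be summarized as a small lookup table. The sketch below is illustrative only: the class, field and function names are hypothetical and are not part of any ABBYY API, and the table covers just the levels and properties named in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conformance:
    name: str
    tagged: bool        # document structure ("Tagged PDF") is required
    unicode_text: bool  # text is guaranteed to be extractable as Unicode
    jpeg2000: bool      # JPEG 2000 compression is allowed
    embeds_pdfa: bool   # PDF/A files may be embedded


# Values follow the descriptions in this article.
LEVELS = [
    Conformance("PDF/A-1b", False, False, False, False),
    Conformance("PDF/A-1a", True, True, False, False),
    Conformance("PDF/A-2u", False, True, True, True),
    Conformance("PDF/A-2a", True, True, True, True),
]


def matching_levels(tagged=False, unicode_text=False,
                    jpeg2000=False, embeds_pdfa=False):
    """Return the names of all levels that provide every requested property."""
    wanted = {
        "tagged": tagged,
        "unicode_text": unicode_text,
        "jpeg2000": jpeg2000,
        "embeds_pdfa": embeds_pdfa,
    }
    required = [key for key, needed in wanted.items() if needed]
    return [
        level.name
        for level in LEVELS
        if all(getattr(level, key) for key in required)
    ]


if __name__ == "__main__":
    # Which levels give searchable Unicode text and allow JPEG 2000?
    print(matching_levels(unicode_text=True, jpeg2000=True))
```

Keeping the properties as plain booleans makes it easy to add further rows, such as PDF/A-3 with its support for arbitrary attachments.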

PDF/A-3 Support

PDF/A-3 is an extension of the PDF/A-2 standard which allows inclusion of PDF/A files or files in a variety of other formats, such as XML or Office formats. Long-term archiving and readability of the PDF/A part is still guaranteed, and the attachments can deliver additional benefits.

The PDF/A-3 extended container capabilities will make this format attractive in new areas, for example when a graphical representation of a document should be combined with some source data. The new e-invoice format defined by the Forum for Electronic Invoices Germany (FeRD) is based on PDF/A-3 and XML.

Since Release 3 of FineReader Engine 11, the API has been extended so that embedded files/attachments can be extracted from a PDF and also added to one.
