PDF/A standard

PDF/A is a family of standards based on the Portable Document Format (PDF), designed specifically for archiving and long-term preservation of documents.

Each version of the standard defines a set of rules that a document must satisfy to be recognized as PDF/A-compliant.

The rules prohibit PDF features that are unsuitable for long-term storage and archiving. For example, they forbid encryption, font linking (as opposed to font embedding), and references to external content.

The standard does not only prohibit features. It also defines requirements for the document, including color management guidelines, embedded fonts, and standards-based metadata.
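As one illustration of standards-based metadata: a PDF/A file declares the part and conformance level it claims in its XMP metadata, using the pdfaid:part and pdfaid:conformance properties. The sketch below is not part of UniPDF. It assumes the XMP packet has already been extracted as text, and the names pdfaID and claimedConformance are made up for this example. It only reads the claim; a claim in the metadata does not prove that the document actually complies.

```go
package main

import (
	"fmt"
	"regexp"
)

// pdfaID holds the conformance claim found in an XMP packet.
// Hypothetical helper type, not part of UniPDF.
type pdfaID struct {
	Part        string // e.g. "1", "2", "3"
	Conformance string // e.g. "A" or "B"; empty if not declared
}

// XMP may serialize a property as an element (<pdfaid:part>1</pdfaid:part>)
// or as an attribute (pdfaid:part="1"), so each pattern accepts both forms.
var (
	partRe = regexp.MustCompile(`pdfaid:part(?:>|=["'])\s*([0-9]+)`)
	confRe = regexp.MustCompile(`pdfaid:conformance(?:>|=["'])\s*([A-Za-z]+)`)
)

// claimedConformance returns the PDF/A identification declared in xmp.
// The boolean is false when no pdfaid:part property is present.
func claimedConformance(xmp string) (pdfaID, bool) {
	m := partRe.FindStringSubmatch(xmp)
	if m == nil {
		return pdfaID{}, false
	}
	id := pdfaID{Part: m[1]}
	if c := confRe.FindStringSubmatch(xmp); c != nil {
		id.Conformance = c[1]
	}
	return id, true
}

func main() {
	const xmp = `<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
  <pdfaid:part>1</pdfaid:part>
  <pdfaid:conformance>B</pdfaid:conformance>
</rdf:Description>`

	if id, ok := claimedConformance(xmp); ok {
		fmt.Printf("claimed: part=%s conformance=%s\n", id.Part, id.Conformance)
	}
}
```

A plain text scan keeps the sketch short. A production tool would parse the XMP packet as XML and then run a full verifier, since the declaration alone says nothing about the rest of the file.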

Conformance levels and versions

PDF/A-1

PDF/A-1 was the first standard dedicated to archiving. It was published on September 28, 2005 and is based on version 1.4 of the PDF Reference.

It specifies two levels of conformance for PDF files:

  • PDF/A-1b - basic - level B
  • PDF/A-1a - accessible - level A

Conformance level B defines only the requirements necessary for reliable reproduction of a document’s visual appearance.

Level A includes all the requirements of level B and adds features intended to improve a document’s accessibility.
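Because level A contains every requirement of level B, the two levels can be treated as an ordering: a document that conforms to PDF/A-1a also conforms to PDF/A-1b, but not the other way around. The following is a minimal sketch of that relationship. The Level type and its methods are hypothetical names chosen for illustration and are not part of UniPDF.

```go
package main

import "fmt"

// Level is a PDF/A-1 conformance level. Illustrative type only.
type Level int

const (
	LevelB Level = iota + 1 // basic: reliable visual reproduction
	LevelA                  // level B plus accessibility requirements
)

// String returns the lowercase letter used in names such as "PDF/A-1b".
func (l Level) String() string {
	switch l {
	case LevelA:
		return "a"
	case LevelB:
		return "b"
	}
	return "?"
}

// Satisfies reports whether a document conforming to level l also meets
// the requirements of level want. Level A includes everything in level B.
func (l Level) Satisfies(want Level) bool {
	return l >= want
}

func main() {
	for _, have := range []Level{LevelB, LevelA} {
		for _, want := range []Level{LevelB, LevelA} {
			fmt.Printf("PDF/A-1%s satisfies PDF/A-1%s: %t\n", have, want, have.Satisfies(want))
		}
	}
}
```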

PDF/A-2

The second version of the standard was published on June 20, 2011. It addresses some of the features introduced in versions 1.5, 1.6 and 1.7 of the PDF Reference.

Note that a file that conforms to PDF/A-1 does not necessarily conform to PDF/A-2, and a file that conforms to PDF/A-2 does not necessarily conform to PDF/A-1.

PDF/A-3

The third version of the standard, published on October 15, 2012, is a superset of PDF/A-2 that allows files of arbitrary formats (e.g. XML, CSV, CAD) to be embedded in conforming documents.

PDF/A-4

The fourth version of the standard was published on December 16, 2020 as ISO 19005-4:2020 (PDF/A-4). It is based on PDF 2.0 and replaces the previous versions of the standard.

UniPDF implementation

UniDoc aims to rise to this challenge by providing a way to create and verify PDF/A-compliant documents.

To meet these needs, UniPDF provides a PDF/A-compliant document reader and writer.

Currently, UniPDF includes a verifier that fully supports PDF/A-1b and partially supports PDF/A-1a.

It also includes a PDF/A-1b standard applier, with most of the necessary features, which changes the content of a document so that it complies with PDF/A-1b. In future releases we want to provide full support for the PDF/A standards.

Compliance Pdf Reader

The compliance reader reads a document and verifies whether it satisfies a given standard. For example, to verify that a document complies with a specific standard, we create a new model.CompliancePdfReader for the document, define the profile of the standard we want to check against, and call the profile's VerifyStandard method with the reader.

Example - Verify against PDF/A-1b

package main

import (
	"errors"
	"fmt"
	"os"

	"github.com/unidoc/unipdf/v3/model"
	"github.com/unidoc/unipdf/v3/model/pdfa"
)

func main() {
	const documentName = "test_document.pdf"
	
	// Check the compliance of the file with the provided name.
	err := checkCompliancePdfA1b(documentName)
	if err == nil {
		fmt.Printf("The document: '%s' is compliant with the standard PDF/A-1b\n", documentName)
		os.Exit(0)
	}
	
	// Check if the errors are related to the pdfa verification by trying to extract pdfa.VerificationError.
	var vErr pdfa.VerificationError
	if !errors.As(err, &vErr) {
		fmt.Printf("Err: %v\n", err)
		os.Exit(1)
	}

	// Print out all violated rules.
	fmt.Printf("Document: %s, violated standard PDF/A-%d%s with following rules:\n", documentName, vErr.ConformanceLevel, vErr.ConformanceVariant)
	for _, violatedRule := range vErr.ViolatedRules {
		fmt.Printf("Rule: '%s' - %v\n", violatedRule.RuleNo, violatedRule.Detail)
	}
	os.Exit(1)
}

func checkCompliancePdfA1b(fileName string) error {
	// Open up the file with given name.
	f, err := os.Open(fileName)
	if err != nil {
		return err
	}
	defer f.Close()

	// Prepare compliant document reader.
	r, err := model.NewCompliancePdfReader(f)
	if err != nil {
		return err
	}

	// Define to which standard we want to check document compliance. 
	profile1B := pdfa.NewProfile1B(pdfa.DefaultProfileOptions())

	// Verify the standard.
	if err = profile1B.VerifyStandard(r); err != nil {
		return err
	}

	return nil
}

Applying a standard to a document

UniPDF also provides a way to change a document so that its content complies with a specified standard. The standard is applied to the regular UniPDF model writer, which changes the content during the Write method, after all parts of the document have been assembled.

Example - Apply the PDF/A-1b standard to a document

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/unidoc/unipdf/v3/model"
	"github.com/unidoc/unipdf/v3/model/pdfa"
)

func main() {
	const documentName = "test-document.pdf"

	if err := applyCompliancePdfA1b(documentName); err != nil {
		fmt.Printf("Err: %v\n", err)
		os.Exit(1)
	}
}

func applyCompliancePdfA1b(fileName string) error {
	// Open up the file with given name.
	f, err := os.Open(fileName)
	if err != nil {
		return err
	}
	defer f.Close()

	// Prepare compliant document reader.
	r, err := model.NewCompliancePdfReader(f)
	if err != nil {
		return err
	}

	// Define to which standard we want to check document compliance. 
	pdfa1B := pdfa.NewProfile1B(pdfa.DefaultProfileOptions())

	// Verify the standard.
	if err := pdfa1B.VerifyStandard(r); err == nil {
		// The document is already compliant with given standard.
		return nil
	}

	// If the document is not PDF/A-1b compliant apply required changes on the document and store a new version.
	w, err := r.ToWriter(nil)
	if err != nil {
		return err
	}

	// Register the PDF/A-1b profile on the writer. The changes are made when the document is written.
	w.ApplyStandard(pdfa1B)

	// Add the pdfa suffix to the file name.
	fileName = strings.TrimSuffix(fileName, ".pdf")
	fileName += "_pdfa_1b.pdf"
	
	// Write the PDF/A-1b compliant document to a new file.
	if err := w.WriteToFile(fileName); err != nil {
		return err
	}
	
	fmt.Printf("PDF/A-1b compliant document written to: %s\n", fileName)

	return nil
}
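The output name in the example is built with strings.TrimSuffix, which only strips a lowercase ".pdf" extension, so an input such as "SCAN.PDF" would become "SCAN.PDF_pdfa_1b.pdf". If that matters, a slightly more tolerant helper can be sketched as follows. The function name outputName is hypothetical and is not part of UniPDF.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// outputName derives "<base>_pdfa_1b.pdf" from an input path.
// A trailing ".pdf" extension is removed regardless of letter case;
// any other extension, or none, is left in place.
func outputName(in string) string {
	ext := filepath.Ext(in)
	if strings.EqualFold(ext, ".pdf") {
		in = strings.TrimSuffix(in, ext)
	}
	return in + "_pdfa_1b.pdf"
}

func main() {
	for _, name := range []string{"test-document.pdf", "SCAN.PDF", "archive/report"} {
		fmt.Println(name, "->", outputName(name))
	}
}
```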
