Automated PDF Testing


AMP extends its automatic testing capabilities to PDF documents. AMP allows you to test PDF documents found as part of an AMP Crawling Test, as well as any hosted or local PDF you choose. AMP performs a number of accessibility checks against each PDF document, including whether the document is tagged and, if a tagging structure is present, whether that structure meets accessibility requirements. This support article provides instructions for testing PDF documents in AMP as well as an overview of all the accessibility tests we perform automatically.

Testing PDFs Via the Document Inventory

PDFs captured during an AMP spider test are made available for automated testing and appear in the Document Inventory section of an AMP Report. To access the Document Inventory, navigate to the desired AMP Report and select the Document Inventory link in the Report Information widget.

NOTE: Adobe Acrobat PDF must be enabled in the Organization Testing Control in order for the automated testing to be performed. 

Document Inventory Link

This will bring you to the Document Inventory section as shown below.

Document Inventory section

To test a PDF document:

  1. Locate the desired PDF document in the Document Inventory.
  2. Select the Run Test button for that document in the Actions column. This initiates the automated PDF testing on the document.

Testing Multiple PDF Documents

As with testing a single PDF document, testing multiple or all PDF documents at once is simple in AMP. First select the PDFs you would like to test: use the checkboxes in the first column to select individual PDFs, or select the checkbox in the table header to select all the documents. Once the desired documents are selected, activate the link labelled "Test PDF Documents" to test the selected PDF documents.

Testing PDFs Via the Add Module Function

PDF documents can also be tested through the "Add Module" dialog in the "Report Modules" section of an AMP report. This workflow allows you to test either a hosted PDF (via a direct URL to the PDF document) or a local PDF file that you upload. To do so, perform the following steps:

  1. Navigate to the report and select the Report Modules link. 
  2. Select the Add Module button, which opens the corresponding modal window where you can add a PDF module via URL or by uploading a local file.

     Add Module Form

  3. After adding the file, submit the form to add the module to the report. The file is tested automatically once the module is added.


Requirements for Automated PDF Testing

The primary requirement of automated PDF testing is that all content is tagged, as per the "Ensure all content is tagged" Best Practice. If the PDF is not tagged, AMP returns a violation for untagged content and no other automated tests are run.

Automated testing on PDFs checks that all content is tagged at two levels, which are explained in greater detail below:

  • Check that the document is tagged
  • Check that all of the content in the document is tagged

Document is Tagged

  • Check for the document catalog (see 7.7.2 Document Catalog in the PDF 32000-1:2008 Specification). If one does not exist, this is an indicator that the document is not tagged.
  • Check the mark information dictionary (see 14.7 Logical Structure, 14.7.1 General, Table 321, Entries in the mark information dictionary).
      • If the dictionary does not exist, this is an indicator that the document is not tagged.
      • If the Marked (boolean) entry in the dictionary is FALSE, this is an indicator that the document is not tagged.
      • If the Suspects (boolean) entry in the dictionary is TRUE, this is an indicator that the document is not properly tagged.

When the PDF is not encrypted, we can also check whether the StructTreeRoot (see 14.7.2 Structure Hierarchy in the PDF 32000-1:2008 Specification) exists. If it does not, this too is an indicator that the document is not tagged.
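
The document-level checks above can be sketched in a few lines of Python. This is only an illustration of the logic, not AMP's implementation: it assumes the document catalog has already been parsed into a plain dictionary by some PDF library, and the function name and the wording of the messages are hypothetical.

```python
def tagging_indicators(catalog, encrypted=False):
    """Return a list of indicators that a document is untagged or badly tagged.

    `catalog` is a plain dict standing in for the PDF document catalog
    (a hypothetical representation; a real parser supplies its own objects).
    Pass None when no catalog could be found.
    """
    if catalog is None:
        return ["no document catalog"]

    problems = []
    mark_info = catalog.get("/MarkInfo")
    if mark_info is None:
        problems.append("no mark information dictionary")
    else:
        # Both entries default to false when absent (Table 321).
        if not mark_info.get("/Marked", False):
            problems.append("Marked is false")
        if mark_info.get("/Suspects", False):
            problems.append("Suspects is true")

    # The structure tree check is only applied to unencrypted documents.
    if not encrypted and "/StructTreeRoot" not in catalog:
        problems.append("no StructTreeRoot")
    return problems
```

An empty list means none of the indicators fired. It does not by itself prove that every piece of content is tagged.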

Content in the Document is Tagged

The automated check uses the content stream to identify unique content items. It then verifies that each content item corresponds to, or is part of, a tag in the tag tree (see 14.8.2 Tagged PDF and Page Content in the PDF 32000-1:2008 Specification). Content items identified as 'Artifact' are not considered to be untagged. Note that there are also special considerations for 'Span' content elements related to accessibility information.
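
As a rough sketch of the content-level check, the Python below flags content items that are neither artifacts nor referenced from the tag tree. The `ContentItem` type, the use of (page, MCID) pairs to link content to tags, and the function name are assumptions made for illustration. They are not AMP's data model, and the special handling of 'Span' elements is not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentItem:
    """One piece of content found in a page's content stream (hypothetical model)."""
    page: int
    mcid: Optional[int] = None  # marked-content identifier, if the item has one
    artifact: bool = False      # True when the item is marked as an Artifact


def untagged_items(items, tagged_mcids):
    """Return the items that are neither artifacts nor part of the tag tree.

    `tagged_mcids` is a set of (page, mcid) pairs referenced by the tag tree.
    """
    result = []
    for item in items:
        if item.artifact:
            continue  # artifacts are not considered untagged
        if item.mcid is None or (item.page, item.mcid) not in tagged_mcids:
            result.append(item)
    return result
```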

In addition to checking whether there is a tagging structure, AMP will test for the following:

The title property is mandatory and must clearly identify the document. This property must be set in the metadata stream under the dc:title property. When a title is not provided, or a title is provided that does not clearly identify the document, users of assistive technology or users with cognitive impairments may not be able to quickly determine the identity of the document.
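
To illustrate where the title lives, the following Python sketch reads dc:title from an XMP metadata packet using only the standard library. It assumes the metadata stream has already been extracted from the PDF as XML text. The function name and the sample packet are made up for the example, and the code cannot judge whether a title clearly identifies the document, only whether one is present.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A minimal, made-up XMP packet for demonstration.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title>
    <rdf:Alt><rdf:li xml:lang="x-default">Annual Report 2023</rdf:li></rdf:Alt>
   </dc:title>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def xmp_title(xmp_xml):
    """Return the first non-empty dc:title value, or None if there is none."""
    root = ET.fromstring(xmp_xml)
    for li in root.findall(".//dc:title/rdf:Alt/rdf:li", NS):
        text = (li.text or "").strip()
        if text:
            return text
    return None


title = xmp_title(SAMPLE)
print(title if title else "Missing title")
```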

Ensure that all documents specify the correct language of the document. Screen readers use this information to determine the correct pronunciation during speech output. When the wrong language is specified, the user may not understand document content.

Within the tagging structure of the PDF, AMP will perform automated tests against the following Best Practices:

Form fields should explicitly indicate the form field label as accessible text. When form fields do not explicitly include a label, assistive technology may guess at the label or provide no label at all to users. When an incorrect label or no label is provided, users of assistive technology may not be able to complete a form.


It is important that links are tagged properly as links in order for the link to be keyboard accessible and to be indicated as actionable with a role of 'link' to users of assistive technology.

Alternative text must be present to communicate the meaning of images to users of assistive technologies and text-only browsers.

Custom tags can be created by automated tools. These custom tags (custom roles) must be mapped to standard PDF tags. For example, when converting from MS Word format to PDF, heading styles are given a tag name of Heading1, Heading2, etc. A role mapping must be created within the tags pane to map each of the headings to the proper PDF tag. The Heading1 tag must map to the H1 PDF tag in order to be identified as a heading to users of assistive technology.
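
The role mapping idea can be sketched as a simple lookup that follows the mapping until it reaches a standard tag. In this Python example the role map is modelled as a plain dictionary and the list of standard tags is deliberately partial. Both are assumptions for illustration and do not reflect how AMP stores or evaluates role maps.

```python
# A partial, illustrative set of standard PDF structure tags.
STANDARD_TAGS = {
    "Document", "P", "H", "H1", "H2", "H3", "H4", "H5", "H6",
    "L", "LI", "Table", "Figure", "Link", "Span",
}


def resolve_role(tag, role_map, standard=STANDARD_TAGS):
    """Follow the role map from `tag` to a standard tag.

    Returns the standard tag, or None when the custom tag is unmapped
    or the mapping loops back on itself.
    """
    seen = set()
    while tag not in standard:
        if tag in seen or tag not in role_map:
            return None
        seen.add(tag)
        tag = role_map[tag]
    return tag


role_map = {"Heading1": "H1", "Heading2": "H2"}
print(resolve_role("Heading1", role_map))  # H1
print(resolve_role("Sidebar", role_map))   # None
```

A result of None corresponds to a custom tag that would not be identified by its intended role to assistive technology.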

Security settings that are set too high in a PDF document, or that prevent the user from opening the document in a standalone PDF reader, can prevent assistive technologies from accessing the PDF document.
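
One security setting that affects assistive technology is the permissions value (the P entry) of the standard security handler in PDF 32000-1:2008, where bit 10 permits text and graphics extraction in support of accessibility. The Python sketch below tests that bit. Whether AMP inspects this particular flag is an assumption, and the function name is hypothetical.

```python
# Bit positions in the P entry are 1-based, counted from the low-order bit.
ACCESSIBILITY_EXTRACT_BIT = 10


def allows_accessibility_extraction(p_value):
    """Return True when bit 10 of the permissions integer is set.

    `p_value` is the signed integer stored in the encryption dictionary's
    P entry; Python's bitwise AND handles negative values correctly.
    """
    return bool(p_value & (1 << (ACCESSIBILITY_EXTRACT_BIT - 1)))
```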


