File format validation verifies whether a digital file's contents and structure satisfy the requirements of that file format's specification.
DPF Manager is a particularly user-friendly open source tool for checking TIFF files, with a simple interface that shows whether your TIFF file complies with the required TIFF specification. If your file does not comply, the tool also explains why not.
- Why validate?
- When to validate?
- DPF Manager for TIFF file validation
- Example error messages
Why validate?
Validating file formats is very important for long-term preservation. One major stumbling block when developing a digital preservation strategy is that we often lack a clear overview of the file formats in our digital archive, even though this overview is needed to check regularly whether the files can still be opened with available software. After all, that software might cease to exist in the future. File identification and validation help you to detect in good time whether a format is at risk of becoming obsolete, so you can take action by converting the affected files into a different format.
It's also important to check that files delivered in an outsourced digitisation assignment satisfy the set quality requirements.
When to validate?
Quality requirements are set in advance of a digitisation project, for example with regard to which file format to use. Guidelines in the High-quality text and image content digitisation article recommend using Uncompressed baseline TIFF v6.0. When the digitisation is complete, you should therefore check that the TIFF files received satisfy this specification. If errors are discovered at this stage, the digitisation company can still convert the files into the correct format.
So you're not just checking that files with a .tif extension are actually TIFF files, but also that they satisfy the requirements of the Uncompressed baseline TIFF v6.0 specification. The file's structure is analysed for errors introduced when the file was created, which could prevent some software from reading it.
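The first of these two checks, whether a file with a .tif extension is really a TIFF file, can be sketched in a few lines of Python. This is only an illustration and not how DPF Manager works internally: the function names `looks_like_tiff` and `find_mislabelled_tiffs` are made up for this example. It relies on the fact that a classic TIFF file starts with a byte-order mark ('II' or 'MM') followed by the number 42. It says nothing about whether the file satisfies the baseline specification, so it does not replace a full validation.

```python
from pathlib import Path

# The two valid classic TIFF headers: byte-order mark followed by the number 42.
TIFF_MAGIC = (b"II*\x00", b"MM\x00*")


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the first four bytes match a classic TIFF header."""
    return header[:4] in TIFF_MAGIC


def find_mislabelled_tiffs(folder: str) -> list[Path]:
    """List files with a .tif/.tiff extension whose header is not a TIFF header."""
    suspects = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".tif", ".tiff"}:
            with path.open("rb") as handle:
                if not looks_like_tiff(handle.read(4)):
                    suspects.append(path)
    return suspects
```

Calling `find_mislabelled_tiffs` on a delivery folder returns the files that deserve a closer look before the full check is even started.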
DPF Manager for TIFF file validation
There is a DPF Manager tutorial on YouTube.
Install DPF Manager
Download and install DPF Manager. It is available for Windows, macOS and Linux.
Select files to validate
Open the DPF Manager program on your computer.
Drag the folder containing the TIFF files that you want to validate to the 'Files/Folders' window.
...or click 'Select' to choose the folder containing the TIFF files for validation.
Select the 'Default' option, and click the 'Full check' button.
The 'Tasks' window opens below, where you can follow the progress of the validation process. The validation is finished when the bar is fully green. Close the window by clicking on 'Tasks' at the bottom left.
Analyse the results
When the validation is complete, you can view the report with validation results by clicking on 'Reports' in the menu bar at the top.
You will then see a general overview showing:
- when the validation was performed;
- how many TIFF files were validated;
- which folder was validated;
- how many errors were detected;
- how many warnings there are;
- how many TIFF files passed the validation;
- the score.
Click the folder symbol to go straight to the reports. You can check the results by clicking on the line.
You will then see an overview of the results per file. This shows a summary of the general report for the entire folder at the top, followed by summaries of the reports for the individual TIFF files. The overview shows you, for each TIFF file:
- a colour code indicating whether the validation was successful;
- which files have been validated;
- how many errors were detected;
- how many warnings there are.
Click on the HTML symbol to see a brief visual summary of the validation results for the entire folder.
All reports, both for the entire folder and for individual TIFF files, are available in four file formats: HTML, PDF, XML and JSON. Simply click on the 'HTML', 'PDF', 'XML' and/or 'JSON' symbol. For the validation report for an individual TIFF file, click on the 'HTML', 'PDF', 'XML' and/or 'JSON' symbol next to that file.
The HTML validation report for the entire folder
Click HERE to download a PDF of an example validation report for a folder of TIFF files without any errors.
The HTML validation report for an individual file
Click HERE to download a PDF of an example validation report for an individual TIFF file without any errors.
Example error messages
Not all file validations result in a report without any error messages. You will find a number of example error messages, with solutions for correcting them, below.
Example 1: use of special characters
The validation report indicates that the TIFF file does not comply with baseline TIFF v6.0 specifications. The error message is 'Only 7-bits ASCII-codes are accepted'. Hover your cursor over the error message to see more details.
ASCII is a character encoding for representing letters, numbers and punctuation marks on a computer. It consists of 128 characters in total, and you can find an overview on Wikipedia. The error message indicates a problem with the embedded metadata from 'tag 33432 Copyright'. You can find the details for this tag higher up in the report, in the list of IFD tags: '© Rony Vissers'. The copyright symbol is not a 7-bit ASCII character, and that's the reason for the error message.
Fortunately, it's easy to rectify. If you open the file with image editing software (e.g. Adobe Photoshop or GIMP) and view the embedded metadata, you can simply change '© Rony Vissers' to 'copyright: Rony Vissers'. You can access the embedded metadata in Adobe Photoshop by clicking on 'File info' in the 'File' menu. In GIMP, access the embedded metadata by clicking on 'Metadata' in the 'Image' menu, and then 'Edit Metadata'. Don't forget to save the updated TIFF file once you have modified it. See also the embedded metadata article for information about modifying embedded metadata.
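The 7-bit ASCII rule behind this error message can be illustrated with a minimal Python sketch. The helper name `non_ascii_characters` is hypothetical and only serves this example. It reports every character whose code point lies outside the 128 ASCII characters, using the two strings from the example above.

```python
def non_ascii_characters(value: str) -> list[tuple[int, str]]:
    """Return (position, character) pairs for every character outside 7-bit ASCII."""
    return [(index, char) for index, char in enumerate(value) if ord(char) > 127]


# The original tag value contains the copyright symbol at position 0.
print(non_ascii_characters("© Rony Vissers"))           # [(0, '©')]
# The corrected value only uses ASCII characters.
print(non_ascii_characters("copyright: Rony Vissers"))  # []
```

A check like this can be useful for screening metadata values before they are embedded, so that the error does not occur in the first place.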
When you check the updated TIFF file with DPF Manager, you will see that the previously reported error has disappeared and the file is now valid.
If the TIFF files are the result of a digitisation project carried out by a specialist digitisation company, ask them to fix the errors rather than doing it yourself.
Example 2: use of compression
Although TIFF is mainly known as an uncompressed file format, it does support compression. Compression is not recommended for digitisation, however. DPF Manager can detect TIFF files that have been compressed.
Here is a validation report from the same image: saved without compression on the left, and saved with JPEG compression on the right. The TIFF file with JPEG compression gives an error message.
The only way to fix this error is to perform the capture or scan again and save it as Baseline TIFF v6.0 without compression. If the RAW file used to create the TIFF file is available, you can use that to create a Baseline TIFF v6.0 without compression.
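To give an idea of what a compression check looks at, here is a minimal Python sketch that reads the Compression tag (tag 259) from the first IFD of a classic TIFF file. It is an illustration under simplifying assumptions, not DPF Manager's implementation. The function names are made up for this example, only the first IFD is inspected, and BigTIFF files are not handled. A value of 1 means no compression. Per the TIFF specification, a missing tag also defaults to 1.

```python
import struct

# A few commonly encountered values of the Compression tag (259).
COMPRESSION_NAMES = {1: "none", 5: "LZW", 6: "JPEG (old-style)", 7: "JPEG", 32773: "PackBits"}


def read_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a classic TIFF."""
    if data[:2] == b"II":
        order = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        order = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF header")
    (entry_count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    for index in range(entry_count):
        start = ifd_offset + 2 + index * 12  # each IFD entry is 12 bytes long
        tag, field_type, count = struct.unpack(order + "HHI", data[start:start + 8])
        if tag == 259:
            # Compression is a single SHORT, stored in the first two bytes of the value field.
            (value,) = struct.unpack(order + "H", data[start + 8:start + 10])
            return value
    return 1  # tag absent: the specification's default is "no compression"


def is_uncompressed(path: str) -> bool:
    """Read a file from disk and report whether its first image is stored uncompressed."""
    with open(path, "rb") as handle:
        # Reads the whole file for simplicity; fine for a sketch, wasteful for large scans.
        return read_compression(handle.read()) == 1
```

For a file saved with JPEG compression, `read_compression` would return 7 (or 6 for old-style JPEG), and `COMPRESSION_NAMES` translates that number into a readable label.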