Cleaning up digital and paper archives means getting rid of all unnecessary elements such as duplicates, draft versions, documents without any added value, and empty files and folders. Cleaning also requires removing any elements that could damage the archive. These can be viruses in digital archives, or harmful packaging such as elastic bands, staples, paper clips, plastic folders, sticky tape and post-it notes in paper archives.
What can be removed?
Duplicates are identical documents, but even just a note added to an otherwise identical document stops it from being a duplicate. Some series of documents can contain a lot of exact duplicates, e.g. sheet music, theatre scripts, promotional archive materials, etc. If your organisation has ever published a leaflet, for instance, you don’t need to keep the whole stock of them in your archive; two copies will suffice. In this case, you should keep the least damaged versions.
Note: don’t simply get rid of all duplicates. Some documents belong in several dossiers, and destroying a copy irrevocably changes the content and context of the dossier it is removed from.
Draft versions precede the definitive version of a document. They can be removed if they only differ marginally from the final version.
Note: Draft versions can also contain valuable information. Early versions of grant applications, for example, can include interesting ideas that didn’t ultimately make it to the final dossier, or draft versions of a letter or email can differ significantly from the final version, and be useful as working versions within a creative process.
Notes without much importance
Once they have served their practical purpose, notes made during a process can also be removed as long as they don’t include anything of any significance.
Your email archive also becomes more accessible if you regularly clean up your inbox. Email correspondence includes lots of ballast such as spam, advertising, press releases, emails sent to schedule a meeting and so on. These types of messages do not need to be stored permanently and are best deleted once they no longer serve any practical or informative purpose. Go to the Archiving emails: how and why? section for more tips on how to manage your emails.
Empty folders and files
Delete empty folders and files that have been created over time or as a result of cleaning up. They don’t contain anything worth archiving, and deleting them benefits clarity and searchability.
Note: Empty folders can add important contextual information to an archive in some cases.
Financial and administrative documents that don’t need to be kept permanently
There are lots of series of documents in financial and administrative archives that don’t need to be kept permanently, e.g. account statements, supporting documents (invoices, receipts) and personnel holiday schedules, etc. Non-profit organisations, the self-employed and limited liability companies are obliged to adhere to legally required storage periods.
Note: Do not adopt this list blindly; use your common sense. Decide for yourself whether there are any documents or objects that have artistic or cultural and historical value alongside any legal or administrative value.
Elements that can damage documents
Remove any harmful materials such as staples, paper clips, plastic folders, ring binders, etc. from your paper archive. Viruses are the biggest risk to digital archives.
SOURCE: STAPPENPLAN OVERDRACHT DIGITAAL ARCHIEF (SODA – STEP-BY-STEP PLAN FOR DIGITAL ARCHIVE TRANSFER)
Viruses cause a loss of information, so checking for them is an important part of cleaning up your digital archive. You need a virus scanner for this. It compares all the files on a computer against a database that describes all known viruses and detects any similarities between your files and those descriptions. If a match is found, your file is infected with that specific virus. The antivirus software will then remove the virus, place the infected file in quarantine so that the virus cannot spread or, if necessary, completely delete the infected file. Possible virus scanners include:
Make sure you regularly update your antivirus software, so you’re always using the latest version.
Checksums need to be created before you can search for duplicate files. A checksum is a digital fingerprint: a series of digits calculated from the content of a file. As soon as anything in the file changes, the checksum software generates a different series of digits. Files in the archive that have the same checksum are therefore duplicates. See the Checksums as a way to monitor the integrity of files page to learn how to create checksums.
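As an illustration of the idea, here is a minimal Python sketch that calculates a checksum for a single file. The function name `file_checksum` and the choice of SHA-256 are assumptions made for this example; the tools mentioned in this guide may use other algorithms.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hexadecimal checksum of the file at `path`.

    `file_checksum` is a hypothetical helper for this example; SHA-256 is
    an assumed default, not a requirement of any particular archive tool.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_checksum` on two files with identical content returns the same string, whatever the files are called. Changing a single character in one of them produces a different string.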
There are various software tools available for finding and removing duplicates. They recognise duplicates using identical checksums. Options include:
- FSLint (Linux)
- YADFR (Windows)
- Duplicate File Searcher (Windows, Linux, OSX)
- Duplicate File Finder (Windows, Linux)
Note: Two identical files (or files with an identical checksum) might have different file names.
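The way such tools recognise duplicates can be sketched in a few lines of Python: calculate a checksum for every file and group the files that share one. This is only an illustration and not how any of the tools listed above is actually implemented; the function names `sha256_of` and `find_duplicates` are hypothetical. The sketch only reports duplicates and deletes nothing, so you can still check whether a copy belongs in a different dossier.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file (assumed algorithm for this sketch)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each checksum to the files below `root` that share it.

    Only groups of two or more files are returned. File names play no
    role: the comparison is based on content alone.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {checksum: paths for checksum, paths in groups.items() if len(paths) > 1}
```

A call such as `find_duplicates("my_archive")` (a hypothetical folder name) returns a dictionary in which each entry lists the files with identical content. Empty files all share the same checksum, so they show up together as one group.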
Duplicates that share the same content, but are saved in different file formats, are not found by these checksums.
Deleting empty folders
Software that can remove empty folders includes:
- FSLint (Linux)
- RED (Windows)
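If you would rather review empty folders and files before deleting them, a short script can list them first. The following Python sketch is an assumption-laden example, not a description of how the tools above work: the function name `find_empty` is hypothetical, a folder counts as empty only if it contains no files and no subfolders, and a file counts as empty if its size is zero bytes.

```python
import os


def find_empty(root):
    """Return (empty_dirs, empty_files) found below `root`.

    Nothing is deleted: the lists are meant for manual review, because an
    empty folder can still carry contextual information.
    """
    empty_dirs, empty_files = [], []
    for current, dirnames, filenames in os.walk(root):
        # A folder is reported only if it holds neither files nor subfolders.
        if not dirnames and not filenames:
            empty_dirs.append(current)
        for name in filenames:
            full_path = os.path.join(current, name)
            # Skip anything that is not a regular file (e.g. a broken link).
            if os.path.isfile(full_path) and os.path.getsize(full_path) == 0:
                empty_files.append(full_path)
    return sorted(empty_dirs), sorted(empty_files)
```

A folder that contains only empty subfolders is not reported until those subfolders have been removed, so you may need to run the check more than once.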
Unfortunately, there aren’t any useful tools available for cleaning up paper archives, so manual intervention is required. A golden rule is not to wait too long after closing a dossier or project, so that it stays fresh in your memory.
Note: Make sure you don’t lose any contextual information when removing harmful packaging elements. Documents that are kept in folders or stapled together are usually stored like this for a specific reason. If you remove these packaging materials, replace them with paper folders that keep the documents together and note any inscriptions from the original, harmful folder on this new folder.
Author: Eline De Lepeleire (Letterenhuis)