Te whakarerekē hōputu kōnae
File format migration
Currently, we accept and encourage the transfer of all kinds of files, formats, and information; we do not prescribe one file format over another, nor do we require any particular type. Our ideal acquisition is the most authentic information: information created naturally to suit and fulfil business needs. Business-as-usual information possesses archival value and context, but context can be lost during the migration process. Files need not be migrated into more conventional or widely used formats before being transferred to our archive. Microsoft Word documents, Excel spreadsheets, and PDFs are some of the more common digital file formats that we collect, but if your agency uses other applications and formats, those are equally desirable to us.
What is format migration, anyway?
File format migration is a digital preservation strategy that involves the transfer of data from one format to another. Migration is often used to move information from an obsolete or aging format to a modern standard, or from one application to a newer version, without changing the intellectual content in the file itself. A common example is migrating an MS Word 97-2003 (.doc) document to the MS Word 2007 (.docx) format.
Format migration, however, comes with potential risks. As with any digital intervention, format migration can cause unintended problems, like the alteration of file content or structure, the loss of essential information, or the introduction of new errors. The unpredictability of format migration is one of the reasons we do not encourage its practice simply for the sake of transfer.
Ultimately, the intellectual content of a file should not be compromised in any way.
Considerations
However, if your organisation has files that need to be migrated into different formats, for whatever reason, please consider these points:
Access - current and future
Both file formats and software are susceptible to obsolescence; there is no guarantee that the same tools will be used to access information in the future.
Since future access depends on the prevalence of certain formats and software, you need to take precautions to ensure the file formats you produce will still be supported in the foreseeable future. The same applies to the software applications used for rendering and/or editing those file formats.
Content integrity
Which properties of the digital record are necessary to keep and transfer to a new file format and which could possibly be lost (if any)?
Colour - for example, colour images vs. greyscale
Layout - for example, headings, logically structured text, indentations, paragraphs
Functionality - for example, hyperlinked text, scripts, macros, digital signatures, drop-down menus, embedded metadata, ability to edit the content
Size - for example, dimensions for specific printing purposes
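One way to make this question concrete is to write down the properties of a file before and after migration and compare the two lists. The sketch below is only an illustration: the `compare_properties` function, the `PropertyReport` class, and the property names and values are all hypothetical, and in practice the values would come from your own inspection of the file or from a characterisation tool.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyReport:
    """Outcome of comparing a file's properties before and after migration."""
    preserved: list = field(default_factory=list)
    changed: dict = field(default_factory=dict)
    lost: list = field(default_factory=list)


def compare_properties(original: dict, migrated: dict) -> PropertyReport:
    """Sort each property of the original into preserved, changed, or lost."""
    report = PropertyReport()
    for name, value in original.items():
        if name not in migrated:
            report.lost.append(name)
        elif migrated[name] != value:
            # Keep both values so the change can be reviewed later.
            report.changed[name] = (value, migrated[name])
        else:
            report.preserved.append(name)
    return report


# Hypothetical properties recorded for one image before and after migration.
original = {
    "colour_mode": "RGB",
    "width_px": 2480,
    "height_px": 3508,
    "embedded_metadata": True,
}
migrated = {
    "colour_mode": "L",  # greyscale
    "width_px": 2480,
    "height_px": 3508,
}

report = compare_properties(original, migrated)
print("Lost:", report.lost)
print("Changed:", report.changed)
print("Preserved:", report.preserved)
```

Whether a lost or changed property matters is a judgement for your organisation; the comparison only makes the loss visible so that the decision is a deliberate one.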
Format validity
Wherever possible, a newly created file should conform to the specification of its file format. For more details see the Format Validation paragraph in one of our previous blog posts.
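As a rough first check, you can confirm that a migrated file at least has the basic structure its format requires. The sketch below uses only the Python standard library; the function names are hypothetical, and it assumes the main document part of a .docx sits in its conventional location, `word/document.xml`. It is a sanity check, not full validation against the format specification, for which a dedicated validation tool is needed.

```python
import io
import zipfile


def looks_like_pdf(data: bytes) -> bool:
    """Check for the '%PDF-' header that a PDF file starts with."""
    return data.startswith(b"%PDF-")


def looks_like_docx(data: bytes) -> bool:
    """Check that the data is a ZIP package holding the expected .docx parts."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            names = set(package.namelist())
    except zipfile.BadZipFile:
        # Not a ZIP container at all, so it cannot be a .docx file.
        return False
    # Assumption: the main document part is at its conventional path.
    return "[Content_Types].xml" in names and "word/document.xml" in names
```

A file that fails either check is certainly not valid. A file that passes may still break the specification in other ways, so treat a pass as a reason to continue testing rather than as proof of validity.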
Documentation/Accountability
Keep a record of all format migrations, including the details of the original files and their migrated formats, to ensure evidence of authenticity and authority
Know the file: how it should look, how it should be rendered, its original quality; this can help when testing and evaluating possible migration paths and/or tools in order to select the one that best fits your needs
Establish a metric or tool that will be used to measure any potential loss of information
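A record of each migration can be as simple as one log entry per file. The sketch below shows one possible shape for such an entry, using SHA-256 checksums to tie the entry to the exact original and migrated files. The field names, the function names, and the choice of a JSON Lines log are assumptions made for illustration, not a required schema.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migration_record(original: Path, migrated: Path, tool: str) -> dict:
    """Describe one migration: both files, their checksums, the tool, the time."""
    return {
        "original_file": original.name,
        "original_format": original.suffix.lower(),
        "original_sha256": sha256_of(original),
        "migrated_file": migrated.name,
        "migrated_format": migrated.suffix.lower(),
        "migrated_sha256": sha256_of(migrated),
        "tool": tool,
        "migrated_at": datetime.now(timezone.utc).isoformat(),
    }


def append_to_log(record: dict, log_path: Path) -> None:
    """Append the record as one JSON line so earlier entries are never rewritten."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```

For example, `append_to_log(migration_record(Path("report.doc"), Path("report.docx"), "name and version of your conversion tool"), Path("migrations.jsonl"))` would add one entry for a single migrated document. The checksums show which files were involved and whether they have changed since; they do not measure loss of information, which still needs its own metric or tool.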
Further reading on the topic of format migration is available from the following institutions:
Originally published on the Records Toolkit blog 20 March 2018