Managing Datasets

Contents

What are Datasets?
Why is Managing Datasets Important?
Planning for Effective Dataset Management
More Information
Contact Us
 
Datasets, like other information or corporate records created or received by a public office, can be public records under the Public Records Act 2005 and need to be managed accordingly. This factsheet provides some guidance for the management of datasets in public sector organisations.
 

What are Datasets?

A dataset is structured, encoded information found in lists, tables, spreadsheets or databases. Data may be numeric, spatial, spectral, statistical or structured text (including bibliographic data and database reports).
 
Datasets are most commonly found in tables, spreadsheets and databases:

  1. Tables are the simplest of the three types of datasets. They consist of an ordered arrangement of any number of rows and columns.
  2. Spreadsheets consist of interactive tables in which a data item may include a formula and may be dynamically linked to another data item, so that a change in one causes a change in the other.
  3. Databases are best referred to as database systems. A database system has three components: the database itself (the actual content); a Database Management System (the software between the data and the user); and the database application, which incorporates the user interface and the functionality that enables the user to search through and process the content of the database, as well as the programs that support the system in processing inputs and outputs.

Why is Managing Datasets Important?

Planning for Effective Dataset Management

Create a strategy or plan for the stewardship and preservation of your datasets, from their creation through to disposal, considering all possible uses for the data. Here are the steps:

  1. Assign responsibility

    Ensure that responsibility for the management of every dataset is assigned to someone in your organisation.

  2. Create appropriate metadata

    Datasets, like all records, require metadata to ensure they provide evidence of business activity and can be accessed for as long as they are required. Identify relevant standards for data/metadata content and format. The Archives New Zealand Electronic Recordkeeping Metadata Standard contains minimum requirements for recordkeeping metadata.

  3. Make multiple (back-up) copies of valuable datasets

    Store some of the copies off-site and in different systems. Your vital records, business continuity, or disposal documentation may assist with identifying your valuable datasets.

  4. Plan for data migration

    Plan the transition of datasets to new storage media and software systems in advance. Include budgetary planning for new storage and software technologies, file format migrations, and timeframes to complete the work. Storing datasets on new technologies before existing storage media become obsolete may help to prevent information loss.

  5. Plan for transitions in data stewardship

    If the data will eventually be turned over to a formal repository or other custodial environment, ensure that it meets the requirements of the new environment and that the new steward has agreed to take it on.

  6. Tailor plans for preservation and access to the expected use

    For example, gene-sequence data used daily by thousands of researchers worldwide may need a different preservation and access infrastructure to an internal human resources database used to manage staff information.

  7. Pay attention to security

    Be aware of what you must do to maintain the integrity of your datasets and prevent unauthorised access.

  8. Identify all relevant legislation

    Ensure your approaches to stewardship, access and disposal comply with all relevant legislation, such as the Public Records Act 2005 and the Privacy Act 1993. Be aware that datasets may also be accessible under the Official Information Act 1982. There may also be sector-specific legislation that applies to your datasets.

  9. Know the value and retention requirements

    Datasets may be of long-term or short-term value. Make sure that they are covered by a current disposal authority authorised by the Chief Archivist.
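
As an illustration of the steps on metadata, back-up copies and integrity, the following Python sketch records a small metadata entry for a dataset file, including a SHA-256 checksum, and then checks a back-up copy against that checksum. It is a minimal sketch only. The field names (title, steward and so on), the function names and the sample file are assumptions made for this example; they are not taken from the Electronic Recordkeeping Metadata Standard, which should be consulted for the actual minimum requirements.

```python
import hashlib
import json
import shutil
import tempfile
from datetime import date
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_dataset(path: Path, title: str, steward: str) -> dict:
    """Build a small metadata record.

    The fields are hypothetical and chosen for illustration; they do not
    reproduce the schema of any recordkeeping metadata standard.
    """
    return {
        "title": title,
        "steward": steward,  # who is responsible for the dataset
        "file_name": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": sha256_of(path),  # used later to check copies
        "described_on": date.today().isoformat(),
    }


def copy_matches(record: dict, copy_path: Path) -> bool:
    """Check that a back-up copy still has the checksum in the record."""
    return sha256_of(copy_path) == record["sha256"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A made-up sample dataset, created only for this demonstration.
        original = Path(tmp) / "staff_counts.csv"
        original.write_text("year,staff\n2008,120\n2009,131\n", encoding="utf-8")

        record = describe_dataset(original, "Staff counts", "Records team")

        # Make a back-up copy and confirm it is identical to the original.
        backup = Path(tmp) / "backup_staff_counts.csv"
        shutil.copy2(original, backup)

        print(json.dumps(record, indent=2))
        print("backup intact:", copy_matches(record, backup))
```

Storing the checksum with the metadata means that any copy, whether held off-site or moved to new storage during a migration, can be re-checked later without access to the original file. A checksum of this kind detects accidental change or corruption; it does not by itself prevent unauthorised access.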

More Information

Archives New Zealand. S8 Electronic Recordkeeping Metadata Standard. June 2008.
 
The Common Data Format website.
http://cdf.gsfc.nasa.gov/
 
The Data Documentation Initiative (DDI).
http://www.ddialliance.org/
 
Digital Preservation Europe (DPE). Database Preservation. March 2009.
http://www.digitalpreservationeurope.eu/publications/briefs/english.php#25
 
The Dutch National Archive. From digital volatility to digital permanence. Preserving databases. December 2003.
http://mixed.dans.knaw.nl/files/file/volatility-permanence-databases-en.pdf
 
Electronic Resource Preservation and Access Network (ERPANET). Conference Proceedings on Long-term Preservation of Databases. April 2003.
http://www.erpanet.org/events/2003/bern/Bern_Report_final.pdf
 
The Statistical Data and Metadata Exchange (SDMX) standard website.
http://sdmx.org/
 
Contact Us

 

For recordkeeping advice and assistance, please contact Archives New Zealand at rkadvice@archives.govt.nz
 
Issued July 2009
 