Digital Preservation Policy

The Northwestern University Libraries’ Digital Preservation Policy supports sustainable access to and use of select digital collection content for the foreseeable future. The purpose of the policy is to:

  • Define preservation objectives;
  • Identify the content covered in this policy;
  • Contextualize digital preservation actions;
  • Inform how preservation actions are executed; and
  • Set reasonable expectations about limitations impacting digital preservation.

As technology evolves, so will this policy. It is subject to change as new and emerging technologies impact the ability to preserve and provide access to digital content. The policy is assessed on an annual basis, at minimum.

For the purpose of this policy, “digital preservation” is defined as the combination of policies, strategies, and actions which ensures authenticity, integrity, reliability, long-term access, and use of digital assets under the care of Northwestern University Libraries.

Policy Statement

The Northwestern University Libraries follows a practice of active preservation with the aim of ensuring the authenticity, reliability, and integrity of the digital collection assets entrusted to and under their care, while attempting to provide usable versions for research, teaching, and learning. The digital preservation of items in Northwestern University Libraries’ care is key to the Libraries’ mission “to provide collections and information services of the highest quality to sustain and enhance the University’s teaching, research, professional and performance programs.”[i]

Contextual Policies

Northwestern University Libraries’ Digital Preservation Policy should be approached in the context of other existing university and library policies and strategies, including the Collection Development Policies, Use of Electronic Resources Policy, Rights, Permissions and Reproductions Policy, McCormick Library Reading Room Policy, NU Purchasing and Strategic Sourcing policies, NU Network Privacy, Northwestern University Libraries Strategic Plan, SustainNU Strategic Plan, Northwestern University Strategic Plan, and other related policies and strategic plans that may be developed in the future.

Preservation Objectives

Scalability

Because data vary in many ways, there is no single, standardized way to manage and preserve all digital assets. These variables include, but are not limited to, file formats, metadata, user accessibility needs, and software and hardware obsolescence. Therefore, we seek to implement scalable preservation actions in three levels[ii] to balance preservation actions with the needs of our users, the priorities of Northwestern University Libraries, available resources, and technological capabilities.

Access, Authenticity, Integrity

Northwestern University Libraries seeks to provide its stakeholders with long-term access to authentic, usable versions of the digital material entrusted to our care to the best of our abilities.

Professional Commitment

Digital preservation is a constantly evolving discipline. As such, Northwestern University Libraries is committed to collaborating with the broader digital preservation community to assess, develop, and participate in the field. We seek to innovate and develop leadership in the digital preservation field through our practices and partnerships.

Collaboration, Assessment

Effective and scalable digital preservation requires collaboration among colleagues, stakeholders, and vendors. Northwestern University Libraries regularly assesses these collaborations to measure the success of our digital preservation program against the Libraries’ resources and evolving technologies. We investigate and participate in new or additional agreements when they are beneficial to the Libraries’ strategies and capabilities.

Environmental Sustainability

We recognize that there are environmental costs to digital preservation including, but not limited to, energy consumption and electronic waste. We seek to conduct preservation actions and storage sustainably when possible.

Rights, Privacy, Security

Northwestern University Libraries upholds intellectual property rights, privacy concerns, and security of the content in our care. We document actions and preserve information about digital objects’ rights and privacy concerns. We assess and advocate for systems and storage which are managed securely. We leverage tools and systems to minimize risks of data loss and/or damage due to disasters, such as human error or natural disasters.

Identification of Content

The general scope of this policy covers digital items created by Northwestern University Libraries, digital materials that comprise the university scholarly record, preservation digital surrogates of physical library collection material, and natively electronic digital collections. More specifically, this may include content such as:

  • Products of digitization projects and associated metadata;
  • Items or collections from the NU community for which Northwestern University Libraries has accepted a curatorial role;
  • Electronic theses and dissertations (ETDs);
  • Items deposited in the institutional repository (e.g., Arch); or
  • Born-digital library and archival collection materials and associated metadata.

The scope of this policy does not include:

  • Working documents or administrative records created by the Libraries not yet transferred to the custody of the University Archives;
  • Access copies of born-digital archives or digitization projects; or
  • Databases, software, or systems used as tools by library staff.

Preservation Actions

File Format Migration

File format migration is a method of overcoming technological obsolescence by transferring digital files from one file format to another. Proprietary formats present challenges to some preservation activities. When possible, widely used and supported formats will be transformed to a format that preserves the content and, when possible, the formatting and style of the original, but not necessarily the functionality.

Migration is achieved through reformatting, or converting from one file format to a different file format. For example, a proprietary file format such as .docx, which is dependent on specific software, may be converted to an open or more widely adopted format, such as .pdf or .odt. File format migration at Northwestern University Libraries is completed by software after assessments conducted by file format monitoring.

File Format Monitoring

File format monitoring is a complementary preservation action to file format migration, as it provides information about whether file formats have become or are at risk of becoming obsolete. File format monitoring is conducted by software, which generates reports to inform file format migration actions.
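
As a rough illustration of the kind of report described above, the following Python sketch counts files whose extensions appear on a watch list. The watch list, the `format_report` function, and the sample paths are all hypothetical; the policy does not name the monitoring software or any specific at-risk formats, and production tools generally identify formats by file signature rather than by extension.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical watch list for illustration only; the policy does not name
# specific at-risk formats or the monitoring software in use.
AT_RISK_EXTENSIONS = {
    ".wpd": "WordPerfect document",
    ".rm": "RealMedia stream",
}


def format_report(paths):
    """Count files whose (case-insensitive) extension is on the watch list."""
    counts = Counter(PurePath(p).suffix.lower() for p in paths)
    return {ext: counts[ext] for ext in sorted(AT_RISK_EXTENSIONS) if counts[ext] > 0}


if __name__ == "__main__":
    # Sample paths are invented for the example.
    holdings = ["letters/1998-03.WPD", "reports/annual.pdf", "audio/interview.rm"]
    for ext, total in format_report(holdings).items():
        print(f"{ext} ({AT_RISK_EXTENSIONS[ext]}): {total} file(s)")
```

A report like this would only flag candidates; whether and how to migrate them would still depend on the assessments described in this policy.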

Persistent Identifiers

A persistent identifier is a unique, long-lasting reference to a digital resource. Its uniqueness ensures provenance of the item, and location tracking embedded within the identifier verifies the item’s correct location and monitors changes in location. Northwestern University Libraries provides persistent, unique identifiers to objects and/or their metadata according to standardized schemes.

Preservation Metadata

Preservation metadata is metadata, or data about data, that supports the process of long-term digital preservation. In contrast to other types of metadata, preservation metadata goes beyond describing the characteristics of digital objects by capturing information that enables digital preservation actions. It often includes provenance information, rights management information, and technical information that maintain the ability to use the full value of the preserved digital object. Northwestern University Libraries creates preservation metadata for objects to sustain the other digital preservation actions it conducts.
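
To make the three kinds of information mentioned above (provenance, rights, and technical) more concrete, here is a minimal Python sketch of a preservation record. The `PreservationRecord` class, its field names, and the `describe` helper are assumptions made for illustration; they do not represent the Libraries' actual schema or any particular metadata standard.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class PreservationRecord:
    """Hypothetical record; field names are illustrative, not a standard."""
    identifier: str          # persistent identifier assigned to the object
    size_bytes: int          # technical information
    sha256: str              # technical information used for fixity checks
    format_name: str         # technical information
    rights_statement: str    # rights management information
    events: list = field(default_factory=list)  # provenance information

    def log_event(self, event_type, detail, when=None):
        """Append a timestamped provenance event to the record."""
        when = when or datetime.now(timezone.utc)
        self.events.append(
            {"type": event_type, "detail": detail, "timestamp": when.isoformat()}
        )


def describe(identifier, payload, format_name, rights_statement):
    """Build a record for an object's bytes and log its ingest."""
    record = PreservationRecord(
        identifier=identifier,
        size_bytes=len(payload),
        sha256=hashlib.sha256(payload).hexdigest(),
        format_name=format_name,
        rights_statement=rights_statement,
    )
    record.log_event("ingest", "record created at ingest")
    return record


if __name__ == "__main__":
    # Identifier and rights text are placeholder values.
    example = describe("example-0001", b"abc", "Plain text", "Rights undetermined")
    print(json.dumps(asdict(example), indent=2))
```

Keeping the checksum and event history alongside the descriptive fields is what lets later actions, such as auditing, refer back to a known-good state.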

Auditing

Auditing actions check the integrity, or “fixity,” of digital objects to ensure that they have remained unchanged. Checksum algorithms applied to data verify data integrity and provide information to detect errors if any have been introduced in the course of data transmission or storage. Checksums provide verification that:

  • Data has been correctly received from the original source;
  • Data has been transferred successfully to preservation storage;
  • Data integrity has been maintained in storage; and
  • Data has been correctly retrieved from storage.

Modern object storage options conduct automated audits continuously on data. Leveraging audit automation reduces both the risk of human error and the person-hours required. It also generates regular status reports for staff action if a problem is identified. A continually running process helps catch issues more quickly and can better balance system resources, as some elements of the audit process can take a long time and use network and CPU resources.
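
A fixity audit of the kind described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Libraries' implementation: it assumes SHA-256 as the checksum algorithm and a manifest that maps relative file paths to expected digests, neither of which is specified by this policy. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict, root: Path) -> dict:
    """Compare stored files against a manifest of expected digests.

    `manifest` maps relative paths to expected SHA-256 hex digests.
    Returns a status per path: "ok", "changed", or "missing".
    """
    report = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) == expected:
            report[relative_path] = "ok"
        else:
            report[relative_path] = "changed"
    return report
```

Recording a digest when data is first received and rerunning the same comparison after transfer, during storage, and on retrieval covers the four verification points listed above. Reading in chunks keeps memory use flat for large files.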

Redundancy

Redundancy is the practice of maintaining multiple, true copies of data in diverse storage infrastructures. It reduces risks of data loss and/or damage due to disaster threats, such as human error or natural disasters. Preserving multiple copies also provides possibilities for patching, repairing, and/or replacing lost or damaged data. Northwestern University Libraries leverages storage architectures which provide redundancy services.
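
One way redundant copies can support repair is by comparing their checksums and treating the majority value as the reference. The Python sketch below illustrates that idea only; the majority rule, the function name, and the storage location names are assumptions, and the policy does not state how many copies are kept or how repairs are decided.

```python
from collections import Counter


def find_divergent_copies(digests_by_location):
    """Given {location: digest}, return (consensus digest, divergent locations).

    A consensus requires a strict majority of copies to agree. Without one,
    the result is (None, all locations), meaning staff review is needed.
    """
    if not digests_by_location:
        return None, []
    counts = Counter(digests_by_location.values())
    digest, votes = counts.most_common(1)[0]
    if votes * 2 <= len(digests_by_location):
        return None, sorted(digests_by_location)
    divergent = sorted(
        location
        for location, value in digests_by_location.items()
        if value != digest
    )
    return digest, divergent


if __name__ == "__main__":
    # Location names and digests are placeholders for the example.
    copies = {"onsite": "aa11", "cloud": "aa11", "offsite": "bb22"}
    consensus, to_repair = find_divergent_copies(copies)
    print(consensus, to_repair)
```

With only two copies that disagree, a checksum comparison alone cannot say which copy is correct, which is one reason a digest recorded at ingest is useful as a tiebreaker.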

Security

Assigning security measures reduces risks of data loss and/or damage and enables abilities to maintain rights management of digital objects. Security protocols may include, but are not limited to, firewalls, intrusion detection, and assigned access and use privileges. Northwestern University Libraries implements cyber security protocols and technologies using software and capitalizing on vendors’ services.

Storage

Preservation storage is a location that works in tandem with other preservation actions, such as auditing, file format monitoring, preservation metadata, redundancy, and security. Northwestern University Libraries utilizes dedicated storage that is designated and managed specifically for digital preservation. Digital preservation storage maintains the original bitstream of digital assets under the care of Northwestern University Libraries to the best of our abilities. Periodically, Northwestern University Libraries assesses its current storage location(s) and takes action to upgrade, update, transfer, and/or further manage storage options.

Virus Checking

Virus checking includes activities to monitor, report, and quarantine infected or corrupted data. Computer viruses and malware are malicious software that infect files or applications. If activated, a virus may delete or encrypt files, modify applications, or disable system functions. Northwestern University Libraries checks for viruses and malware upon acquisition or ingest into our systems. Virus and malware scans are conducted through software that generates reports. If viruses and/or malware are detected, the software quarantines the affected data.

Levels of Preservation Support

Northwestern University Libraries practices digital preservation by assigning different “levels” to content. Each Level includes selection criteria and designated preservation actions conducted on the content. Assigning Levels helps scale, sustain, and provide flexibility for our digital preservation activities. We use this approach to conduct ethical digital preservation while working within the limitations of available resources and technological capabilities.

While Level 1 includes all preservation actions, this does not designate content categorized in this Level as more valuable than content in other Levels. Regardless of the Level assigned to content, it is valuable to Northwestern University Libraries. Various limitations beyond our control such as, but not limited to, time, resources, personnel, and technological capabilities, are the primary drivers for assigning Levels of Preservation.

Joint content and technological appraisals are conducted to determine Levels of Preservation. While each Level below includes specific criteria to assign content to that Level, there may be exceptions that are identified after further appraisals or assessments. For example, compound files determined to be of high value may receive Level 1 support for individual components of that file. A high-value MP4, for instance, would receive Level 2 support, but its individual streams (AAC audio, H.264 video, SRT/TXT captions, artwork, metadata) could be extracted and converted to a format that can be preserved at Level 1.

Figure: A triangle separated into three sections. The wide base is labeled Level 1: Maximum preservation actions. The middle section is labeled Level 2: Medium preservation actions. The small top section is labeled Level 3: Minimum preservation actions.

Level 1

Description:

Northwestern University Libraries pledges the most comprehensive digital preservation actions for content identified in Level 1. Best effort is made to maintain viability (integrity of the file), renderability (ability to display the file for viewing), understandability (the file is displayed in a manner that does not affect the viewer’s ability to understand the file contents), and functionality of the original digital object. Please note: renderability in Level 1 denotes an accurate rendering of the information in a file, but it may not necessarily emulate an authentic experience of displaying and using files as they were originally created.

File Criteria for Level 1 Support:
  • Is in a format that is publicly documented;
  • Is in a format that is widely adopted;
  • Is in a format that may be rendered by multiple software packages;
  • Is in a format that has lossless data compression; and/or
  • Contains no embedded files or dynamic content.

Actions:

  • File format monitoring for changes that may warrant format conversion or further reassessment;
  • File format migration to successive format, if possible;
  • Persistent identifier assigned;
  • Preservation metadata created;
  • Auditing of data integrity;
  • Redundant copies of data;
  • Security protocols applied;
  • Storage and maintenance of original bitstream; and
  • Virus checking, and quarantining if necessary, upon acquisition.

Level 2

Description:

Northwestern University Libraries pledges a mid-level of preservation support for content identified in Level 2. Best effort is made to maintain viability (integrity of the file) and understandability (the file is displayed in a manner that does not affect the viewer’s ability to understand the file contents).

File Criteria for Level 2 Support:
  • Is in a format that is publicly documented;
  • Is in a format that is widely adopted;
  • Is in a format that may be rendered by multiple, currently available software packages;
  • May be in a format with lossy data compression; and/or
  • Is in a format that does not meet the criteria for Level 1.

Actions:

  • Persistent identifier assigned;
  • Preservation metadata created;
  • Auditing of data integrity;
  • Redundant copies of data;
  • Security protocols applied;
  • Storage and maintenance of original bitstream; and
  • Virus checking, and quarantining if necessary, upon acquisition.

Level 3

Description:

Basic level of preservation support pledges Northwestern University Libraries’ best effort to maintain viability (integrity of the file).

File Criteria for Level 3 Support:
  • Is in a format about which little information is publicly available;
  • Is in a format that is not widely adopted across all disciplines;
  • Is in a format with lossy data compression;
  • Is supported by obsolete, single, or very few software platforms;
  • Is in a format that is not widely usable in currently available operating systems; and/or
  • Is in a format that does not meet the criteria for Level 1 or 2.

Actions:

  • Auditing of data integrity;
  • Redundant copies of data;
  • Security protocols applied;
  • Storage and maintenance of original bitstream; and
  • Virus checking, and quarantining if necessary, upon acquisition.

Preservation Action | About | Level 1 | Level 2 | Level 3
File Format Migration | Migration to successive format upon obsolescence, if possible | x | - | -
File Format Monitoring | Monitoring of file formats for obsolescence | x | - | -
Persistent Identifiers | Provision of persistent, unique identifier for object and/or its metadata | x | x | -
Preservation Metadata | Creation of preservation metadata | x | x | -
Auditing | Regular fixity checks; data replacement or repair, when possible | x | x | x
Redundancy | Multiple copies of data in a diverse storage infrastructure | x | x | x
Security | Assigned access and availability restrictions and allowances | x | x | x
Storage | Storage of original bitstream | x | x | x
Virus Checking | Virus and malware checks on ingest | x | x | x
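
The relationship between Levels and actions in the table above can also be expressed as a small lookup, which may be useful when checking that a workflow applies the right actions to an object. The Python sketch below is illustrative only: the `LAST_LEVEL_FOR_ACTION` mapping and `actions_for` function are hypothetical names, and the encoding relies on the observation that each action applies from Level 1 down to some last Level.

```python
# For each action, the highest-numbered Level that still receives it,
# transcribed from the table above (Level 1 receives every action).
LAST_LEVEL_FOR_ACTION = {
    "File Format Migration": 1,
    "File Format Monitoring": 1,
    "Persistent Identifiers": 2,
    "Preservation Metadata": 2,
    "Auditing": 3,
    "Redundancy": 3,
    "Security": 3,
    "Storage": 3,
    "Virus Checking": 3,
}


def actions_for(level):
    """Return the preservation actions applied at the given Level (1, 2, or 3)."""
    if level not in (1, 2, 3):
        raise ValueError(f"unknown preservation Level: {level!r}")
    return [
        action
        for action, last_level in LAST_LEVEL_FOR_ACTION.items()
        if level <= last_level
    ]


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(f"Level {level}: {', '.join(actions_for(level))}")
```

Storing one number per action keeps the mapping short, but it would need a different structure if a future revision of the policy assigned an action to non-adjacent Levels.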

[i] https://dictionary.archivists.org/entry/digital-preservation.html
http://www.ala.org/alcts/about/advocacy/glossary

[ii] https://osf.io/qgz98/

 

Last reviewed: August 2023