File Formats and Technical Information for Reproductions

Northwestern University Library uses a variety of digitization equipment to fulfill requests. Unless different specifications are requested, the following file formats and standards will be used.

Please email the Special Libraries Division Administrative Assistant, Theresa Neef, with further questions. 

Material type | Spatial resolution and bit depth | File format
Unbound text or image | Equivalent of 600 pixels per inch (ppi) or 5400 pixels in long dimension, whichever is higher | TIF
Bound text or image | Resolution varies; typically 250-400ppi equivalent | JPEG or PDF

Spatial resolution describes the fineness of a scan, usually expressed as the number of pixels used to represent one inch of the original (pixels per inch, or ppi). Although the terms are not equivalent, "dpi" (dots per inch) is occasionally used in place of the more technically correct ppi. At the equivalent of 600ppi, an 8.5" x 11" document yields a scan of 5100 pixels (8.5 inches * 600 ppi = 5100 pixels) by 6600 pixels (11 inches * 600 ppi = 6600 pixels). At the equivalent of 600ppi, a 12" x 17" document yields a scan of 7200 pixels by 10200 pixels. 600ppi has been chosen as a generally acceptable resolution that can represent both fine lines and small text, and can be used to produce an acceptable facsimile of the original.
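The arithmetic above can be sketched in a few lines of Python. The function name `scan_dimensions` is hypothetical and used only for illustration; it simply multiplies each side of the original, in inches, by the scanning resolution.

```python
def scan_dimensions(width_in, height_in, ppi=600):
    """Return (pixel_width, pixel_height) for an original scanned at `ppi`.

    Illustrative helper only; the 600 ppi default mirrors the standard
    described in the text.
    """
    return round(width_in * ppi), round(height_in * ppi)


print(scan_dimensions(8.5, 11))  # (5100, 6600)
print(scan_dimensions(12, 17))   # (7200, 10200)
```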

When scanning reduced-size materials, such as slides, film, or microformats such as microfilm or microfiche, a slightly different method of computing resolution is needed. As indicated in the chart of fees, 5400 pixels in the long dimension (the longest side) is the smallest overall dimension used for scanning Northwestern University Library materials, and a 600ppi scanning resolution is not sufficient to reach it for reduced-size originals. For example, a 35mm slide typically measures 1" x 1.5". To achieve a scan measuring 5400 pixels in the long dimension, a much higher scanning resolution of 3600ppi must be applied (5400 pixels divided by 1.5 inches = 3600 pixels per inch).
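As a rough sketch of the "600ppi or 5400 pixels in the long dimension, whichever is higher" rule, the required scanning resolution can be computed as follows. The function name and parameter names are assumptions made for this example, not part of any library workflow.

```python
def required_ppi(width_in, height_in, target_long_px=5400, minimum_ppi=600):
    """Return the scanning resolution needed for an original of the given size.

    Illustrative only: takes the higher of the minimum resolution and the
    resolution needed to reach `target_long_px` on the longest side.
    """
    long_side_in = max(width_in, height_in)
    return max(minimum_ppi, target_long_px / long_side_in)


print(required_ppi(1, 1.5))   # 35mm slide: 3600.0
print(required_ppi(8.5, 11))  # letter-size page: 600
```

For the letter-size page, 5400 pixels over 11 inches would need only about 491ppi, so the 600ppi minimum applies instead.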


Bit depths are named according to the amount of computer storage space allotted to store the color information for each pixel. The number of values available at each bit depth is a power of two. Two is used because computer information is binary; one "bit" can hold one of two binary values: either 0 or 1.

Different color depths use different ways of encoding information. 8-bit color images usually use a custom palette of 256 colors. Numerical values in the computer's memory refer to particular colors in the palette. The palette acts as an index where colors can be looked up. 16-bit and 24-bit images don't use a palette; they store the red, green, and blue components of each pixel directly in the computer's memory.
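The difference between palette-based and direct color storage can be illustrated with a toy example. The three-entry palette and pixel values below are made up for this sketch; a real 8-bit palette can hold up to 256 entries.

```python
# Hypothetical palette: each entry is an (R, G, B) triple.
palette = [
    (0, 0, 0),        # index 0: black
    (255, 255, 255),  # index 1: white
    (200, 30, 30),    # index 2: a red
]

# Indexed (palette) image: each pixel stores only an index into the palette.
indexed_pixels = [0, 2, 2, 1]

# Direct-color image: each pixel stores its red, green, and blue values itself.
direct_pixels = [palette[i] for i in indexed_pixels]

print(direct_pixels)
```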

1-bit: black & white (2^1 = 2)
A single bit has value 1 or 0. It stores either black or white. Often used for text. This bit depth is also called "bitonal" or "binary".

4-bit: 16 colors (2^4: 2x2x2x2 = 16)

8-bit: 256 colors (2^8: 2x2x2x2x2x2x2x2 = 256)
An 8-bit image has a fixed palette of 256 shades of gray or 256 different colors.

16-bit: Thousands of colors (2^16)
A 16-bit image can contain up to 65,536 different colors. Some 16-bit formats use only 15 of the bits for color, which gives 32,768 colors.

24-bit (2^24) and 32-bit (2^32): Millions of colors
Both 24-bit and 32-bit images provide about 16.7 million available colors, more than the eye can actually see. Many computer monitors can support this many colors. Only the first 24 bits are used to determine color. In a 32-bit image, the last eight bits are reserved for information other than color (like transparency and overlays). 24-bit is the default bit depth used to digitize color Northwestern library materials.
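The number of distinct values at each bit depth listed above is simply two raised to the number of bits. The short sketch below tabulates this; the function name `color_count` is hypothetical. Note that the full 16-bit range is 65,536 values (a format that uses only 15 of those bits for color yields 32,768), and that a 32-bit image still uses only 24 bits for color.

```python
def color_count(bit_depth):
    """Return the number of distinct values representable with `bit_depth` bits."""
    return 2 ** bit_depth


for bits in (1, 4, 8, 15, 16, 24):
    print(f"{bits}-bit: {color_count(bits):,} values")
# 1-bit: 2 values
# 4-bit: 16 values
# 8-bit: 256 values
# 15-bit: 32,768 values
# 16-bit: 65,536 values
# 24-bit: 16,777,216 values
```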


TIF is the default file format for Northwestern library materials. TIF or TIFF (Tagged Image File Format) is an "interchange" standard (supported on many platforms) with many uses. TIF is used in many desktop publishing applications, and is the standard import/storage format for many Optical Character Recognition (OCR) applications. TIF files may be uncompressed, in which case they tend to be very large, or compressed, usually with the lossless LZW compression.

JPEG is the second most common file format used to deliver Northwestern library materials. JPEG (Joint Photographic Experts Group) is both a file format and a type of compression, so be sure you understand the context in which the name is used. A JPEG image can contain millions of colors and is usually the best choice for images with a lot of color variation, such as a photograph. JPEG (or JPG or JFIF) files are also the most common image format for web pages.

PDF (Portable Document Format) may be used to deliver multi-page documents. A standard developed by Adobe, it can contain both text and image materials at a variety of levels of resolution.

DV25 is a digital video format with support for audio. It can be stored in one of two wrappers, .mov or .avi, depending on your needs. It is a standard definition compression scheme and can be used as an access master from which to make smaller derivatives for streaming. This type of file averages 12GB per hour.


This simple formula can be used to estimate the size of an uncompressed digital image file:

(Pixel width * pixel height * bit depth) / 8 = file size in bytes


Equivalently, in terms of the original's dimensions in inches and the scanning resolution in ppi:

(Width in inches * height in inches * resolution squared * bit depth) / 8 = file size in bytes
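A minimal sketch of both formulas in Python follows. The function names are hypothetical, and the results are estimates for uncompressed image data only; file headers, metadata, and any compression are ignored.

```python
def size_from_pixels(pixel_width, pixel_height, bit_depth):
    """Estimate uncompressed size in bytes from pixel dimensions.

    Rounds up to a whole byte when the total bit count is not a multiple of 8.
    """
    total_bits = pixel_width * pixel_height * bit_depth
    return (total_bits + 7) // 8


def size_from_inches(width_in, height_in, ppi, bit_depth):
    """Estimate uncompressed size in bytes from the original's size and resolution."""
    return size_from_pixels(round(width_in * ppi), round(height_in * ppi), bit_depth)


# An 8.5" x 11" page scanned at 600 ppi in 24-bit color:
print(size_from_pixels(5100, 6600, 24))     # 100980000 bytes, roughly 101 MB
print(size_from_inches(8.5, 11, 600, 24))   # 100980000

# The same page scanned as a 1-bit (bitonal) image:
print(size_from_pixels(5100, 6600, 1))      # 4207500 bytes, roughly 4.2 MB
```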