Session 7

Test-Driving the Technology, Steve Puglia, National Archives and Records Administration

The digitization process converts an image into a series of pixels, each of which is represented by one or more binary digits (1s and 0s). The pixels are arranged in a two-dimensional matrix called a bitmap. One important aspect of a bitmap is its bit depth. The number of shades a bitmap can represent is given by the formula 2^x, where x is the bit depth. A bitmap with a bit depth of 1 is a black-and-white image with only two shades. By contrast, a 16-bit image offers 65,536 shades from black to white.
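The 2^x relationship can be checked with a few lines of Python. This is only an illustrative sketch; the function name shades_for_bit_depth is invented here and does not come from the session.

```python
def shades_for_bit_depth(bit_depth: int) -> int:
    """Return the number of distinct shades a bitmap of this bit depth can hold."""
    if bit_depth < 1:
        raise ValueError("bit depth must be at least 1")
    return 2 ** bit_depth


# Print the shade count for a few common bit depths.
for depth in (1, 8, 16):
    print(f"{depth}-bit: {shades_for_bit_depth(depth):,} shades")
```

Each additional bit doubles the number of available shades, which is why the jump from 8 to 16 bits is so much larger than the jump from 1 to 8.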

Color systems are of three types: additive (RGB), subtractive (CMYK) and device-independent (LAB). For digitization purposes, RGB files are preferred, because they have the widest color gamut of the three choices. RGB color images are created by capturing the image through three color filters: one red, one green and one blue. The amounts of light passing through these filters are then combined to create a single color image. Well-executed 8-bit (per filter channel) imaging meets most digitization needs, though 12-bit to 16-bit images are better for re-purposing.
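As a rough illustration of how three filtered channels become one color image, the sketch below treats each channel as a small bitmap of intensity values and merges them pixel by pixel. It is a simplified model, not a description of how any particular scanner works, and the names combine_channels and colors_per_pixel are hypothetical.

```python
def combine_channels(red, green, blue):
    """Merge three equally sized single-channel bitmaps into one bitmap of (r, g, b) pixels."""
    return [
        [(r, g, b) for r, g, b in zip(red_row, green_row, blue_row)]
        for red_row, green_row, blue_row in zip(red, green, blue)
    ]


def colors_per_pixel(bits_per_channel: int, channels: int = 3) -> int:
    """Return how many distinct colors a pixel can take at this bit depth per channel."""
    return 2 ** (bits_per_channel * channels)


# A 1 x 2 image: the first pixel is pure red, the second pure green (8-bit values).
image = combine_channels([[255, 0]], [[0, 255]], [[0, 0]])
print(image)
print(f"8 bits per channel: {colors_per_pixel(8):,} colors")
print(f"16 bits per channel: {colors_per_pixel(16):,} colors")
```

The extra headroom at 12 to 16 bits per channel is what leaves room for later tonal adjustment when an image is re-purposed.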

The spatial resolution of an image (often expressed in dots per inch, pixels per inch or lines per inch) describes how finely or widely spaced the individual pixels are relative to one another. When digitizing images, it is common to create three versions of each image, each with a different spatial resolution: one for access, one for reproduction and one for preservation, in ascending order of spatial resolution.
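To see what the three versions mean in practice, the following sketch computes pixel dimensions and uncompressed file size for a single original. The resolution values (72, 300 and 600 pixels per inch) and the 8 x 10 inch original are assumptions chosen for illustration; the session summary does not give specific figures.

```python
def pixel_dimensions(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Return (width, height) in pixels for an original scanned at the given ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px: int, height_px: int,
                       bits_per_channel: int = 8, channels: int = 3) -> int:
    """Return the raw image size in bytes, before any compression."""
    return width_px * height_px * channels * bits_per_channel // 8


# Hypothetical resolutions for each derivative; real projects set their own.
versions = {"access": 72, "reproduction": 300, "preservation": 600}

for name, ppi in versions.items():
    w, h = pixel_dimensions(8, 10, ppi)
    size_mb = uncompressed_bytes(w, h) / 1_000_000
    print(f"{name}: {w} x {h} pixels, about {size_mb:.1f} MB uncompressed")
```

Because pixel count grows with the square of the resolution, doubling the pixels per inch quadruples the storage needed.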

Digitization projects also often need to make use of color management tools and data compression. The rule of thumb with color management tools is that they vary widely in quality, and that one usually gets what one pays for. As for data compression, compressing digital images before storage and transmission saves space and time, but one must keep in mind that compressed images are more prone to corruption and data loss. Compression methods differ in whether, and to what degree, they are “lossy” or “lossless”. LZW and ZIP compression are considered lossless; JPG compression is considered lossy.
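The difference between lossless and lossy compression can be sketched with Python's standard zlib module, which implements Deflate, the algorithm commonly used in ZIP files. The quantize function below is only a crude stand-in for lossy behavior and is not how JPG works; it simply discards fine tonal detail so that the loss is visible. Both function names are invented for this example.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with Deflate; the result should match the input exactly."""
    return zlib.decompress(zlib.compress(data))


def quantize(data: bytes, step: int = 16) -> bytes:
    """Crude lossy stand-in: round each 8-bit value down to a multiple of step."""
    return bytes((value // step) * step for value in data)


# A synthetic strip of 8-bit grayscale pixel values (a repeated gradient).
pixels = bytes(range(256)) * 4

restored = lossless_round_trip(pixels)
coarse = quantize(pixels)

print("lossless copy identical to original:", restored == pixels)
print("quantized copy identical to original:", coarse == pixels)
print("compressed size, original: ", len(zlib.compress(pixels)))
print("compressed size, quantized:", len(zlib.compress(coarse)))
```

The lossless round trip returns every byte unchanged, while the quantized data cannot be restored to the original values no matter how it is stored afterwards.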