Center of Southwest Studies

Digitization standards


Through its partnership with the Collaborative Digitization Program (CDP), the archival staff of the Center of Southwest Studies has, since December 2000, digitized more than five thousand images selected from a number of collections, for viewing on the Web for educational purposes and research.  To search for digital images at the Center of Southwest Studies and elsewhere, go to Heritage West.

The Collaborative Digitization Program is supported through a National Leadership Grant to the University of Denver Penrose Library from the Institute of Museum and Library Services (a federal grant-making agency in Washington, D.C., which fosters innovation, leadership and a lifetime of learning by supporting museums and libraries), with additional assistance from the Colorado State Library and the Colorado Regional Library Systems.

Scanning resolution standards for the Center of Southwest Studies
by Todd Ellison, adopted July 14, 2003

Master files’ resolutions:  The following are the Center's standards for determining the scanning resolution to choose when producing a master TIFF image for digital images of collection items. 

300 PPI (Pixels Per Inch; the corresponding term, DPI, refers to the output, e.g. how many dots per inch a printer prints) is the minimum acceptable resolution.  The Collaborative Digitization Program (CDP) scanning best practices guidelines state that the master image should be "as large as is appropriate and you can store."  Until now, we have been limited in storage space; that pressure should be relieved by the College's switch to a SAN and by the Campbell earmark funding that we expect will pay for expanding the (theoretically unlimited) SAN storage space.

The CDP Digital Imaging Best Practices (January 2003, online in PDF format; follow the link in the "Best Practices" section at http://www.cdpheritage.org/ ) adds (page 24) that "The master image should be the highest quality you can afford."  And (page 26), "There is no one 'perfect' resolution to scan all collection materials.  Spatial resolutions [PPI] should be adjusted based on the size, quality, condition, and uses of the digital object."  Pages 31 ff. of the Best Practices offer these guidelines by source type:

Type of document | Scanning resolution | Cf. CDP page
Text | 600 pixels per inch | 31
Postcards | 800 pixels per inch for the front (picture) side of black and white postcards; 600 PPI for color postcards; 600 PPI for the back (address) side of all postcards |
Photographs | 3,000 to 5,000 pixels across the long dimension | 32
Maps | 3,000 pixels across the long dimension | 33
Graphic materials | 3,000 pixels across the long dimension | 34
Artwork & 3 dimensional objects | If scanning from photographic surrogates such as 35mm slides, use the recommendations for transparency photographs | 34

Our benchmark for the master TIFF image size is that the full-size image will have a resolution of 300 PPI when printed out on an 8.5" x 11" sheet of paper, unless, as in the case of postcards, the actual image is smaller than that and we have no foreseeable need for displaying an image that is larger than the original.
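As a rough illustration of this benchmark, the following Python sketch works out the pixel dimensions that a letter-size printout at 300 PPI implies, and the scanning resolution needed for an original of a given size to reach a target pixel count.  The function names are hypothetical and are not part of any Center tool.

```python
def print_pixels(width_in, height_in, ppi=300):
    """Pixel dimensions needed to print at `ppi` on a sheet of the given size."""
    return round(width_in * ppi), round(height_in * ppi)


def scan_ppi_for_target(long_side_in, target_pixels):
    """Scanning resolution needed so the long side reaches `target_pixels`."""
    return target_pixels / long_side_in


# An 8.5" x 11" printout at 300 PPI
print(print_pixels(8.5, 11))          # (2550, 3300)

# A photograph 10 inches long, scanned to 3,000 pixels across the long dimension
print(scan_ppi_for_target(10, 3000))  # 300.0
```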

Higher resolution scans of photonegatives are especially worthwhile.  As Richard Pearce-Moses, Director of Digital Government Information, Arizona State Library, Archives and Public Records, has explained, "If you scan a 35 mm negative at 300 dpi, then enlarge it to 8x10 inches, the image is now around 40 dpi because you've spread those dots over a larger area. ... One of the joys of large negatives is their high quality, rich toned, and detailed images. A low resolution scan will result in losing those qualities." (Archives and Archivists List, 2003 Sept. 11)  The Center of Southwest Studies places a priority on digitizing photonegatives that have no matching print, because it is an economical means of providing access to that image without having to pay for the production of a photoprint.
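The arithmetic behind that remark can be sketched in Python as follows.  This is only an illustration: it assumes a standard 35 mm frame of 36 x 24 mm (a figure not stated above), and the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def effective_ppi(scan_ppi, original_long_in, output_long_in):
    """Resolution left after the scanned pixels are spread over a larger output."""
    pixels_across = scan_ppi * original_long_in
    return pixels_across / output_long_in


def required_scan_ppi(target_ppi, original_long_in, output_long_in):
    """Scanning resolution needed to keep `target_ppi` at the enlarged size."""
    return target_ppi * output_long_in / original_long_in


# Assumption: the long side of a 35 mm frame is 36 mm (about 1.42 inches).
negative_long_in = 36 / MM_PER_INCH

# Scanned at 300 PPI, then enlarged to 10 inches on the long side
print(round(effective_ppi(300, negative_long_in, 10)))      # 43

# Scanning resolution needed for 300 PPI at that 10-inch size
print(round(required_scan_ppi(300, negative_long_in, 10)))  # 2117
```

Under these assumptions the result (about 43) is consistent with the "around 40 dpi" quoted above.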

Derivative files’ resolutions:

We present all of our images, regardless of source type, as Access files at a resolution of 150 PPI (and, generally, 640 pixels across the greatest width) and Thumbnail files at 72 PPI (and, generally, 150 pixels across the greatest width).  For purposes of the end use of our digital image files, we pay most attention to Pixels Per Inch rather than to Dots Per Inch or Resolution (terms which are more useful in referring to the digital image presentation phase).  Thus, we are concerned with the number of pixels across the greatest dimension.
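Because the greatest dimension is what we fix, the pixel dimensions of a derivative follow from the master's proportions.  The Python sketch below shows the calculation only (it does not resize any image); the master dimensions used are a hypothetical example.

```python
ACCESS_LONG_SIDE = 640      # access files, saved at 150 PPI
THUMBNAIL_LONG_SIDE = 150   # thumbnail files, saved at 72 PPI


def fit_long_side(width, height, long_side):
    """Scale (width, height) so that the greatest dimension equals `long_side`."""
    scale = long_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical landscape master TIFF of 3300 x 2550 pixels
master = (3300, 2550)
print(fit_long_side(*master, ACCESS_LONG_SIDE))     # (640, 495)
print(fit_long_side(*master, THUMBNAIL_LONG_SIDE))  # (150, 116)
```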

We save photos as JPEG files; we save black and white textual images as GIF files because they compress better (i.e., they yield a smaller file size).  Our HTML software blocks viewing of the Access images, to prevent unauthorized use of our images from our Web pages (because we are only placing them on the Web for viewing, not for downloading).


Scanning – the steps

Select items to be scanned.  For efficiency, scan items of the same type and the same size at the same time.

Turn on the scanner before powering up the PC.  If the scanner is off but the PC is already on, restart the PC.

If you're scanning a photonegative, temporarily remove the white lid on the underside of the scanner cover so the light will shine through the negative as it scans.  Place the emulsion side (the frosty side) of the negative UP.

Open Adobe Photoshop, then >File>Import>Epson Twain Pro

Place the item to be scanned onto the bed of the scanner – face down, top of item on left edge of scanner glass plate – positioned the same distance in from the front edge of the scanner each time (so you won’t have to pre-scan subsequent items that are of the same size)

Click Preview to pre-scan.  Click a corner of the moving dotted lines to center the image inside the box.

Scan – use 24 bits color (Std.) for color and sepia-toned images; use 8 bits Grayscale (Std) for black and white images.

To begin scanning, hit “Scan.”  When the full image appears behind the 2 boxes, click on the X in the upper right corner of the Epson box to shut the smaller 2 boxes.
    Image>Adjustments>Levels (to change brightness, color, etc., if necessary)
    Image> Rotate canvas>Arbitrary (to straighten the image if the item wasn't placed just right on the glass)
    Use the four-sided box icon (left menu bar) to crop, leaving the image centered inside a thin white border, then Image>Crop.

When satisfied with the way the image is looking,
(1) Save as a TIFF master image:   (see the file naming standards, below)
    File>Save As> save in SWImagesMasterTiffs in format TIFF (click on the yellow file folder up arrow and double-click on the correct directory to open it) > give the correct file name > Save > in the TIFF Options box, wait for "Writing Tiff format" (hourglass) to complete > OK

We can do steps (2) and (3) using the batch processing automated mode in Photoshop CS, for great time savings.  See below.

(2) Save as a JPG derived from the TIFF:
    Image>Image size>Resolution 150>Greatest dimension 640 pixels>OK
    File>Save As>change format to JPEG and change directory to SWImages, then double-click SWImages and click Save (in JPEG Options box, click OK)

(3) Save as a thumbnail JPEG also derived from the TIFF:
    Close the JPEG image (X in upper left corner)
    File>Open Recent>[choose your TIFF image and wait for it to load -- "Reading Tiff format"]
    Image>Image size>Resolution 72>Greatest dimension 150 pixels>OK
    File>Save As>change format to JPEG, change directory to SWImages (double click), and add the TN letters before the .jpeg suffix, then click Save, and in JPEG Options box click OK, then click thumbnail box to close.

Batch creation of the derivative jpeg image files (thanks to Jill M Koelling, Executive Director, Collaborative Digitization Program):

  1. Open Adobe Photoshop software.

  2. Open one of your tiff images – make sure it has the same orientation (i.e., landscape/horizontal or portrait/vertical) as all the images you will be scanning in this batch.  The point is, deal with all of the landscape images in one batch, and all of the portrait-oriented images in a separate batch -- because the batch modifications are going to make all of the derivative images either be the same height, or be the same width; we accomplish this by copying all of the files that are of the same orientation into one temporary folder for running the batch generation of derivative jpeg files.

  3. Under the Window menu, click Actions to open the Actions window.

  4. If you don't see the type of Action you need in the Action window, click on the icon that says create a new action (this looks like a piece of paper and it is directly to the left of the trash can).

  5. A new Action window will open – name your action – we suggest you give it a descriptive name like 150wide (this stands for 150ppi).

  6. Click record.

  7. Now run through the series of clicks you normally would to alter the tiff file to a 150ppi image: Image > Image Size > Resolution 150 PPI, greatest dimension 640 pixels, then click OK.

  8. Then save the file using SAVE AS.

  9. Select jpeg as the format.

  10. Choose the folder where you will store your jpegs.

  11. Save the image, then click OK.

  12. Close the image.

  13. Then hit the stop icon on the action window (it’s the dark green square in the bottom left corner of the window).

  14. You now have an action you can run from Batch that will turn all your vertical tiffs into 150ppi jpegs.

  15. Now open the File menu, click on Automate, and choose Batch.  Choose the appropriate action from the drop-down list, choose the folder location of the master tiffs for the Source, and choose the folder location for saving the jpegs for the Destination (we park ours in a folder called TempFolderJPEGsBatchCreation or, for thumbnails, in TempFolderTNBatchCreation).  Designate the file naming convention (first box: leave the text as Document Name; second box: leave the text as Extension; leave the other boxes blank).  Then you can either stop for errors or save any error messages to a text file; we prefer the latter option and save the text file on the desktop for easy access.  Then click OK and go have some tea (a lot of tiffs might take a while; a few won’t take long at all), or just sit and watch the amazing productivity of your computer at work!

  16. After generating thumbnails, we have to use Windows Explorer to rename each new thumbnail image file, adding the suffix TN before the .jpeg, because we haven't been able to figure out how to get the Adobe CS software to do this (if you've conquered this, please let us know).
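One possible workaround for the renaming step, outside Photoshop, is a short script.  The following Python sketch is not a procedure the Center uses; the function name is hypothetical, and it assumes the thumbnails sit alone in their own folder (such as TempFolderTNBatchCreation) and that no untreated file name already ends in TN.

```python
from pathlib import Path


def add_tn_suffix(folder, extensions=(".jpg", ".jpeg")):
    """Rename each JPEG in `folder` so that TN comes just before the extension."""
    renamed = []
    # sorted() builds the full list first, so renaming does not disturb the loop
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue  # leave sub-folders and non-JPEG files alone
        if path.stem.endswith("TN"):
            continue  # assumed to have been renamed already
        target = path.with_name(f"{path.stem}TN{path.suffix}")
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling add_tn_suffix("TempFolderTNBatchCreation") would, for example, rename F014T001.jpg to F014T001TN.jpg and return the list of new names.  Try it on a copy of the folder first.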


Questions to ask and things to consider before beginning a digitization project

1. Will we be describing an entire collection or only a single item in each record?

Entire Collection: Advantages

Entire Collection: Disadvantages

Single Item: Advantages

Single Item: Disadvantages

2. Will we be creating records for the digital object or the original?

Original: Advantages

Original: Disadvantages

Digital: Advantages

Digital: Disadvantages

3. What types of information should we include in the item description?

4. What is the anticipated database size and what is our electronic files storage capacity?  (See the following section.)


Scan file size chart 

Type of source document | Percent pixel dimensions for access JPEG file* | File source for 72 PPI resolution thumbnail JPEG file | File naming convention
Homer Root ledger page | 15% | TIF master file | M124####.___
Polaroid photo of SW textile | 65% | JPEG access file | F014T###.___

 *Our benchmark for the Access image size is that the Web-transmitted image seen on the computer screen will be the size of an 8.5" x 5.5" sheet of paper.


File naming standards

Center staff names every digital image file using the file naming conventions we have developed to clarify our management of the digital images.  Basically, every file name starts with a letter for the collection type (F for artifacts, P for photos, M for manuscript collections), then the three-digit collection number, followed by the number for the volume and page, if appropriate, or the accession number (as is probably more likely for photos of artifacts), with no spaces or punctuation in the file names.

To name a digital image file, we concatenate the following:

  1. We begin with a volume number that is derived from the group (the higher hierarchical division of the photos in this collection).

  2. We decide what constitutes the lowest hierarchical level of grouping of types of photos (i.e., a series) within this collection; for instance, photos of persons of a certain tribal affiliation, or photos of costume in dress.  Photos within that series may be broken down further into their various and several photos (e.g., photos of Apache women working on domestic tasks, or photos of wampum belts).  We number the images within a given series sequentially (i.e., for a set of web pages that all describe Theodore Hetzel's photos of Apache Indians we start with 001 and carry on through the last image of Apache within that series in this collection).

  3. We have to check the images database to make sure that the number we assign is unique and does not replicate the PicNo of any other image in our database (including any image from any collection, not just the Hetzel collection) (numbers in the General Photo Collection are distinguished by being 5-digit numbers).

Rarely (and never in instances where we anticipate generating web pages automatically to display these images), this main portion of the file number is followed, if necessary, by digits that further describe aspects of the item scanned:

Thumbnail image file names contain the letters TN just before the .jpg suffix.  Images that are components of an item that has a single call number (such as a card that has four photos glued to it) are lettered alphabetically after the call number (beginning with the left-most item on the upper row, then the left-most item on the next row), such as the images displayed at http://swcenter.fortlewis.edu/images/P001/P00130100.htm

All the 0's in the file names are zeroes, not letter o's.  All letters in file names are entered as capitals.  We use no spaces, no dashes, and no hyphens in the file name.
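These rules lend themselves to an automated check.  The Python sketch below is a minimal illustration, assuming that F, P and M are the only collection letters in use and that the name is checked without its extension; the names NAME_PATTERN, is_valid_stem and thumbnail_name are hypothetical.

```python
import re

# A collection letter, a three-digit collection number, then capitals and
# digits only: no spaces, no dashes, no hyphens, no lowercase letters.
NAME_PATTERN = re.compile(r"[FPM][0-9]{3}[A-Z0-9]+")


def is_valid_stem(stem):
    """True if a file name (without its extension) follows the rules above."""
    return NAME_PATTERN.fullmatch(stem) is not None


def thumbnail_name(stem, suffix=".jpg"):
    """Thumbnail file name: the letters TN just before the suffix."""
    return f"{stem}TN{suffix}"


print(is_valid_stem("P00418G5001"))    # True
print(is_valid_stem("p004-18G5 001"))  # False
print(thumbnail_name("F014T001"))      # F014T001TN.jpg
```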

For example:

P004: For naming digital files of photoprints in the Fort Lewis College Archives photographs, the numbers following the Collection letter and 3-digit collection number are the series number (of variable length depending on what the series number is) and the 2- or 3-digit folder number (depending on whether we anticipate ever having more than 99 images in this series).  For example, P00418G5001 describes the first item in Series 18.G.5, photos of the Wanbli Ota Indian Club at Fort Lewis College.

For naming digital files of photonegatives that have been numbered in the Center's simple single sequential numbering system, the numbers following the Collection letter and 3-digit collection number are the 4-digit negative number, such as P0492000 as seen at http://swcenter.fortlewis.edu/images/P049/P0492000.htm

F021DC$$###     For naming digital files of textiles in the Durango Collection F 021 (DC stands for Durango Collection), the $'s are the two letters for the type of weaving (for example, NC is Navajo, RG is Rio Grande) and the #s are the three-digit number of the artifact within that type. For example, F021DCMI012 as seen at http://swcenter.fortlewis.edu/images/F021MI/F021DCMI12Page.htm

F014T###     For naming digital files of textiles in the Southwest textiles collection F 014, the #'s are the three-digit textile number, from 001 to 150. For example, F014T001 as seen at http://swcenter.fortlewis.edu/images/F014/F014T001Page.htm

M124####     For naming digital files of the Homer Root ledgers, the first # is the volume number; the next three #s are the three-digit page number stamped on the pages in that volume. For example, M1245124 as seen at http://swcenter.fortlewis.edu/images/M124/M1245124Page.htm

For a digital photo of the Center's Acoma black on white ceramic by Lucy Lewis, the image file name would be F016197002004.  Then, we would add a tag in the MARC record at SWF 016 Accn.7002004 in the College's bibliographic catalog to lead to the access view of this image on the TALON OPAC website.
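Several of the patterns above can be generated mechanically.  The following Python sketch builds names for three of them; the function names are hypothetical, and the range check on textile numbers simply reflects the 001 to 150 range mentioned above.

```python
def textile_name(number):
    """F014T###: Southwest textiles collection, textile numbers 001 to 150."""
    if not 1 <= number <= 150:
        raise ValueError("textile number must be between 1 and 150")
    return f"F014T{number:03d}"


def negative_name(collection, negative):
    """P, the 3-digit collection number, then the 4-digit negative number."""
    return f"P{collection:03d}{negative:04d}"


def durango_textile_name(weaving_type, number):
    """F021DC$$###: two-letter weaving type, then the 3-digit artifact number."""
    if len(weaving_type) != 2 or not weaving_type.isalpha():
        raise ValueError("weaving type must be two letters")
    return f"F021DC{weaving_type.upper()}{number:03d}"


print(textile_name(1))                  # F014T001
print(negative_name(49, 2000))          # P0492000
print(durango_textile_name("MI", 12))   # F021DCMI012
```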


Additional resources online for guidance in digital imaging:

Conservation Online guide to imaging and imagebases:  http://palimpsest.stanford.edu/bytopic/imaging/


Page last modified: October 24, 2007