Digitization standards
Through its partnership with the Collaborative Digitization Program (CDP), the archival staff of the Center of Southwest Studies has digitized more than five thousand images selected from a number of collections, beginning in December of 2000, for viewing on the Web for educational purposes and research. To search for digital images at the Center of Southwest Studies and elsewhere, go to Heritage West.
The Collaborative Digitization Program is supported through a National Leadership Grant to the University of Denver Penrose Library from the Institute of Museum and Library Services (a federal grant-making agency in Washington, D.C., which fosters innovation, leadership and a lifetime of learning by supporting museums and libraries), with additional assistance from the Colorado State Library and the Colorado Regional Library Systems.
Scanning resolution standards for the Center of Southwest Studies
by Todd Ellison, adopted July 14, 2003
Master files’ resolutions:
The following are the Center's standards for determining the scanning resolution
to choose when producing a master TIFF image for digital images of collection
items.
300 PPI (pixels per inch; the corresponding term, DPI, refers to output, e.g. how many dots per inch a printer prints) is the minimum acceptable resolution. The Collaborative Digitization Program (CDP) scanning best practices guidelines state that the master image should be "as large as is appropriate and you can store." Until now, we have been limited in storage space; that pressure should be relieved by the College's switch to a SAN and by the Campbell earmark funding that we expect will pay for expanding the (theoretically unlimited) SAN storage space.
The CDP Digital Imaging Best Practices (January 2003, online in PDF format; follow the link in the "Best Practices" section at http://www.cdpheritage.org/) adds (p. 24) that "The master image should be the highest quality you can afford." It also states (p. 26), "There is no one `perfect' resolution to scan all collection materials. Spatial resolutions [PPI] should be adjusted based on the size, quality, condition, and uses of the digital object." Pages 31 ff. of the Best Practices offer these guidelines by source type:
| Type of document | Scanning resolution (PPI unless stated otherwise) | Cf. CDP page |
| Text | 600 | 31 |
| Postcards | 800 | |
| Photographs | 3,000 to 5,000 pixels across the long dimension | 32 |
| Maps | 3,000 pixels across the long dimension | 33 |
| Graphic materials | 3,000 pixels across the long dimension | 34 |
| Artwork & 3-dimensional objects | If scanning from photographic surrogates such as 35mm slides, use the recommendations for transparency photographs | 34 |
Our benchmark for the master TIFF image size is that the full-size image will have a resolution of 300 PPI when printed on an 8.5" x 11" sheet of paper, unless, as in the case of postcards, the actual image is smaller than that and we have no foreseeable need to display an image that is larger than the original.
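As a rough illustration of this benchmark (a sketch, not part of the adopted standard), the pixel dimensions a master image needs can be worked out by multiplying the print size in inches by the target PPI. The function name `pixels_for_print` is hypothetical.

```python
def pixels_for_print(width_in, height_in, ppi=300):
    """Return the (width, height) in pixels needed to print at the given PPI."""
    return round(width_in * ppi), round(height_in * ppi)


# The benchmark above: 300 PPI on an 8.5" x 11" sheet.
print(pixels_for_print(8.5, 11))  # (2550, 3300)
```

A master that meets the benchmark is therefore about 2,550 by 3,300 pixels.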
Higher resolution scans of photonegatives are especially worthwhile. As Richard Pearce-Moses, Director of Digital Government Information, Arizona State Library, Archives and Public Records, has explained, "If you scan a 35 mm negative at 300 dpi, then enlarge it to 8x10 inches, the image is now around 40 dpi because you've spread those dots over a larger area. ... One of the joys of large negatives is their high quality, rich toned, and detailed images. A low resolution scan will result in losing those qualities." (Archives and Archivists List, 2003 Sept. 11) The Center of Southwest Studies places a priority on digitizing photonegatives that have no matching print, because it is an economical means of providing access to that image without having to pay for the production of a photoprint.
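The arithmetic behind the quotation can be sketched as follows. This is an illustration only: the function names are hypothetical, and the 36 mm long side of a 35 mm frame is the standard frame dimension, not a figure taken from the Center's standards.

```python
MM_PER_INCH = 25.4


def effective_ppi(source_long_in, scan_ppi, print_long_in):
    """PPI that remains once the scanned pixels are spread over a larger print."""
    return source_long_in * scan_ppi / print_long_in


def scan_ppi_needed(source_long_in, print_long_in, target_ppi=300):
    """Scanning resolution required to keep target_ppi at the enlarged size."""
    return target_ppi * print_long_in / source_long_in


# Assumption: the long side of a 35 mm frame is 36 mm (about 1.42 inches).
frame_long_in = 36 / MM_PER_INCH

# Scanned at 300 PPI and enlarged to 10 inches on the long side:
print(round(effective_ppi(frame_long_in, 300, 10)))  # 43

# Scanning resolution needed to keep 300 PPI at that enlargement:
print(round(scan_ppi_needed(frame_long_in, 10)))  # 2117
```

The first figure agrees with the "around 40 dpi" in the quotation; the second shows why small negatives call for a much higher scanning resolution than prints do.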
Derivative files’ resolutions:
We present all of our images, regardless of source type, as Access files at a resolution of 150 PPI (and, generally, 640 pixels across the greatest width) and Thumbnail files at 72 PPI (and, generally, 150 pixels across the greatest width). For purposes of the end use of our digital image files, we pay most attention to Pixels Per Inch rather than to Dots Per Inch or Resolution (terms which are more useful in referring to the digital image presentation phase). Thus, we are concerned with the number of pixels across the greatest dimension.
We save photos as JPEG files; we save black-and-white textual images as GIF files because they compress better (i.e., yield a smaller file size). Our HTML software blocks viewing of the Access images, to prevent unauthorized use of our images from our Web pages (because we are only placing them on the Web for viewing, not for downloading).
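To illustrate the derivative sizes described above, the following sketch scales a master's pixel dimensions so that the greatest dimension becomes 640 pixels (Access) or 150 pixels (Thumbnail) while keeping the proportions. The function name is hypothetical, and the 2,550 x 3,300 master is simply the 8.5" x 11" at 300 PPI benchmark expressed in pixels.

```python
def fit_longest_side(width_px, height_px, longest_px):
    """Scale (width, height) so the greatest dimension equals longest_px."""
    scale = longest_px / max(width_px, height_px)
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


master = (2550, 3300)  # 8.5" x 11" at 300 PPI

print(fit_longest_side(*master, 640))  # Access file: (495, 640)
print(fit_longest_side(*master, 150))  # Thumbnail file: (116, 150)
```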
Scanning – the steps
Select items to be scanned. For efficiency, scan items of the same type and the same size at the same time.
Turn on the scanner before powering up the PC. If the scanner is off but the PC is already on, restart the PC.
If you're scanning a photonegative, temporarily remove the white lid on the underside of the scanner cover so the light will shine through the negative as it scans. Place the emulsion side (the frosty side) of the negative UP.
Open Adobe Photoshop, then File>Import>Epson Twain Pro
Place the item to be scanned onto the bed of the scanner – face down, top of item on left edge of scanner glass plate – positioned the same distance in from the front edge of the scanner each time (so you won’t have to pre-scan subsequent items that are of the same size)
Click Preview to pre-scan. Click a corner of the moving dotted lines to center the image inside the box.
Scan – use 24 bits color (Std.) for color and sepia-toned images; use 8 bits Grayscale (Std) for black and white images.
Screen/ Web
Color: 600 PPI
B/W: 800 PPI
To begin scanning, hit “Scan.”
When the full image appears behind the 2 boxes, click on the X in the upper right corner of the Epson box to shut the 2 smaller boxes.
Image>Adjustments>Levels (to change brightness, color, etc., if necessary)
Image>Rotate Canvas>Arbitrary (to straighten the image if the item wasn't placed just right on the glass)
Use the four-sided box icon (left menu bar) to crop, leaving the image centered inside a thin white border, then Image>Crop.
When satisfied with the way the image is looking,
(1) Save as a TIFF master image (see the file naming standards, below):
File>Save As>save in SWImagesMasterTiffs in format TIFF (click on the yellow file folder up arrow and double-click on the correct directory to open it), give the correct file name, and click Save; in the TIFF Options box, wait for "Writing Tiff format" (hourglass) to complete, then click OK
We can do steps (2) and (3) using the batch processing automated mode in Photoshop CS, for great time savings. See below.
(2) Save as a JPEG derived from the TIFF:
Image>Image Size>Resolution 150>Greatest dimension 640 pixels>OK
File>Save As>change format to JPEG and change directory to SWImages, then double-click SWImages and click Save (in the JPEG Options box, click OK)
(3) Save as a thumbnail JPEG also derived from the TIFF:
Close the JPEG image (X in upper left corner)
File>Open Recent>[choose your TIFF image and wait for it to load ("Reading Tiff format")]
Image>Image Size>Resolution 72>Greatest dimension 150 pixels>OK
File>Save As>change format to JPEG, change directory to SWImages (double-click), and add the letters TN before the .jpeg suffix, then click Save, and in the JPEG Options box click OK, then click the thumbnail box to close.
Batch creation of the derivative jpeg image files (thanks to Jill M Koelling, Executive Director, Collaborative Digitization Program):
Open Adobe Photoshop software.
Open one of your tiff images – make sure it has the same orientation (i.e., landscape/horizontal or portrait/vertical) as all the images you will be scanning in this batch. The point is, deal with all of the landscape images in one batch, and all of the portrait-oriented images in a separate batch -- because the batch modifications are going to make all of the derivative images either be the same height, or be the same width; we accomplish this by copying all of the files that are of the same orientation into one temporary folder for running the batch generation of derivative jpeg files.
Under the Window menu, click Actions to open the Actions window.
If you don't see the type of Action you need in the Action window, click on the icon that says create a new action (this looks like a piece of paper and it is directly to the left of the trash can).
A new Action window will open – name your action – we suggest you give it a descriptive name like 150wide (this stands for 150ppi).
Click record.
Now run through the series of clicks you normally would to alter the tiff file to a 150ppi image: Image>Image Size>Resolution 150>greatest dimension 640 pixels, then click OK.
Then save the file using SAVE AS.
Select jpeg as the format.
Choose the folder where you will store your jpegs.
Save the image, then click OK.
Close the image.
Then hit the stop icon on the action window (it’s the dark green square in the bottom left corner of the window).
You now have an action you can run from batch that will turn all your vertical tiffs to 150ppi jpegs.
Now open the File menu, click on Automate, and choose Batch. Choose the appropriate action from the drop-down list, choose the folder location of the master tiffs for the Source, and choose the folder location for saving the jpegs for the Destination (we park ours in a folder called TempFolderJPEGsBatchCreation or, for thumbnails, in TempFolderTNBatchCreation). Designate the file naming convention (first box: leave the text as Document Name; second box: leave the text as Extension; leave the other boxes blank). Then you can either stop for errors or save any error messages to a text file; we prefer the latter option and save the text file on the desktop for easy access. Then click OK and go have some tea (if you have a lot of tiffs, this might take a bit; if you only have a few, it won't take long at all), or just sit and watch the amazing productivity of your computer at work!
After generating thumbnails, we have to use Windows Explorer to rename each new thumbnail image file to add the suffix TN before the .jpeg because we haven't been able to figure how to get the Adobe CS software to do this (if you've conquered this, please let us know).
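One possible way to avoid renaming each thumbnail by hand is a short script run outside Photoshop. The following is only a sketch, assuming Python is available on the workstation; the function name `add_tn_suffix` is hypothetical, and the script should be tried on a copy of the folder first.

```python
from pathlib import Path


def add_tn_suffix(folder, extensions=(".jpg", ".jpeg")):
    """Insert the letters TN before the suffix of every JPEG in folder.

    Files whose names already end in TN are skipped. Returns the new paths.
    """
    renamed = []
    # sorted() builds the full list first, so renamed files are not revisited.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        if path.stem.endswith("TN"):
            continue  # already carries the thumbnail marker
        target = path.with_name(path.stem + "TN" + path.suffix)
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        renamed.append(target)
    return renamed


# Example call (the folder name is the one mentioned in the batch step above):
# add_tn_suffix("TempFolderTNBatchCreation")
```

The function refuses to overwrite an existing file, so an interrupted run can be repeated safely.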
Questions to ask and things to consider
1. Will we be describing an entire collection or only a single item in each record?
Entire Collection: Advantages
Efficiency: saves time
Highlights hierarchy of relationships
Works well for digitization of repetitive materials or material with little unique item-level information
Entire Collection: Disadvantages
Loss of individual detail
Less precise search results
Single Item: Advantages
Amount of detail
Preservation uses and issues can be recorded
Precise searches
Single Item: Disadvantages
More costly in time and money
False, deceptive, inaccurate search results
Loss of relationships
2. Will we be creating records for the digital object or the original?
Original: Advantages
Might already exist – saves time and money
Users looking for original
Don’t have multiple records
Easier to update
Users expect original to be described
Original: Disadvantages
User confusion
Asset management needs – can add digital information to original record to help
Increase use of the original
Digital: Advantages
Asset management – information more direct
Treating digital object as unique
Audience potentially larger
Promotes access to digital and helps to preserve the original
Digital: Disadvantages
Potentially doubles the size of the database
Lose detail on original and buries it in the record
Credibility
3. What types of information should we include in the item description?
Who, what, where, when
Format of object, orientation, etc.
Digital object – migration asset management
who scanned
when it was scanned
hardware used
software used
plug-ins needed to view/play/hear
Subject(s)
Hierarchy, levels of description
Condition of the original
identification of preservation/conservation needs
telling users about what they are seeing
Where stored
Donor information
As much as we can provide, given our staffing resources
4. What is the anticipated database size and what is our electronic files storage capacity? (See the following section.)
Scan file size chart
| Type of source document | Percent pixel dimensions for access JPEG file* | File source for 72 PPI resolution thumbnail JPEG file | File naming convention |
| Homer Root ledger page | 15% | TIF master file | M124####.___ |
| Polaroid photo of SW textile | 65% | JPEG access file | F014T###.___ |
*Our benchmark for the Access image size is that the Web-transmitted image seen on the computer screen will be the size of an 8.5" x 5.5" sheet of paper.
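For the storage-capacity question (item 4 above), the uncompressed size of a master image can be estimated from its pixel dimensions and bit depth. This is a rough sketch: the function name is hypothetical, the 2,550 x 3,300 dimensions are the 8.5" x 11" at 300 PPI benchmark, and real TIFF files add a small amount of header data and may be smaller if compression is used.

```python
def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Approximate size in bytes of the raw pixel data of one image."""
    return width_px * height_px * bits_per_pixel // 8


# 24-bit color and 8-bit grayscale are the two scanning modes used in the steps.
color = uncompressed_bytes(2550, 3300, 24)
gray = uncompressed_bytes(2550, 3300, 8)
print(color, gray)  # 25245000 8415000

# Rough storage for 5,000 color masters, in gigabytes (10**9 bytes):
print(round(5000 * color / 10**9, 1))  # 126.2
```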
File naming standards
Center staff names every digital image file using the file naming conventions we have developed to clarify our management of the digital images. Basically, every file name starts with a letter for the collection type (F for artifacts, P for photos, M for manuscript collections), then the three-digit collection number, followed by the number for the volume and page, if appropriate, or the accession number (as is probably more likely for photos of artifacts), with no spaces or punctuation in the file names.
To name a digital image file, we concatenate the following:
"Volume" is the number for the group (a letter, converted to its 1-digit equivalent: A = 1, B = 2, C = 3), the category if necessary (1 digit), and the sub-category if necessary (a letter, converted to its 1-digit equivalent: A = 1, B = 2, C = 3)
"Page" is the 2-digit item number (or, 3-digit if we think this series will include more than 99 distinct photo images).
To facilitate use of the Web page generator software, we follow these rules in formulating a special additional unique number (PicNo) of each image:
We begin with a volume number, that is derived from the group (higher hierarchical division of the photos in this collection).
We decide what constitutes the lowest hierarchical level of grouping of types of photos (i.e., a series) within this collection; for instance, photos of persons of a certain tribal affiliation, or photos of costume in dress. Photos within that series may be broken down further into their various and several photos (e.g., photos of Apache women working on domestic tasks, or photos of wampum belts). We number the images within a given series sequentially (i.e., for a set of web pages that all describe Theodore Hetzel's photos of Apache Indians we start with 001 and carry on through the last image of Apache within that series in this collection).
We have to check the images database to make sure that the number we assign is unique and does not replicate the PicNo of any other image in our database (including any image from any collection, not just the Hetzel collection) (numbers in the General Photo Collection are distinguished by being 5-digit numbers).
Rarely (and never in instances where we anticipate generating web pages automatically to display these images), this main portion of the file number is followed, if necessary, by digits that further describe aspects of the item scanned:
Thumbnail image file names contain the letters TN just before the .jpg suffix. Images that are components of an item that has a single call number (such as a card that has four photos glued to it) are lettered alphabetically after the call number (beginning with the left-most item on the upper row, then the left-most item on the next row), such as the images displayed at http://swcenter.fortlewis.edu/images/P001/P00130100.htm .
All the 0's in the file names are zeroes, not letter o's. All letters in file names are entered as capitals. We use no spaces, no dashes, and no hyphens in the file name.
For example:
P004: For naming digital files of photoprints in the Fort Lewis College Archives photographs, the numbers following the Collection letter and 3-digit collection number are the series number (of variable length depending on what the series number is) and the 2- or 3-digit folder number (depending on whether we anticipate ever having more than 99 images in this series). For example, P00418G5001 describes the first item in Series 18.G.5, photos of the Wanbli Ota Indian Club at Fort Lewis College.
For naming digital files of photonegatives that have been numbered in the Center's simple single sequential numbering system, the numbers following the Collection letter and 3-digit collection number are the 4-digit negative number, such as P0492000 as seen at http://swcenter.fortlewis.edu/images/P049/P0492000.htm
F021DC$$### For naming digital files of textiles in the Durango Collection F 021, (DC stands for Durango Collection), the $'s are the two letters for the type of weaving (for example, NC is : Navajo, RG is Rio Grande) and the #s are the three-digit number of the artifact within that type. For example, F021DCMI012 as seen at http://swcenter.fortlewis.edu/images/F021MI/F021DCMI12Page.htm
F014T### For naming digital files of textiles in the Southwest textiles collection F 014, the #'s are the three-digit textile number, from 001 to 150. For example, F014T001 as seen at http://swcenter.fortlewis.edu/images/F014/F014T001Page.htm
M124#### For naming digital files of the Homer Root ledgers, the first # is the volume number; the next three #s are the three-digit page number stamped on the pages in that volume. For example, M1245124 as seen at http://swcenter.fortlewis.edu/images/M124/M1245124Page.htm
For a digital photo of the Center's Acoma black on white ceramic by Lucy Lewis, the image file name would be F016197002004. Then, we would add a tag in the MARC record at SWF 016 Accn.7002004 in the College's bibliographic catalog to lead to the access view of this image on the TALON OPAC website.
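The naming rules above lend themselves to a small helper for building and checking names. The following is a sketch under stated assumptions, not the Center's own tool: all function names are hypothetical, the textile pattern follows F014T###, and the ledger pattern follows the M124#### convention shown in the scan file size chart.

```python
import re

# Rule from the standards: capital letters and digits only, with no spaces,
# dashes, or other punctuation in the base name.
VALID_BASE = re.compile(r"[A-Z0-9]+")


def letter_to_digit(letter):
    """Convert a group letter to its 1-digit equivalent (A = 1, B = 2, C = 3)."""
    value = ord(letter.upper()) - ord("A") + 1
    if not 1 <= value <= 9:
        raise ValueError("only A through I have a 1-digit equivalent")
    return str(value)


def textile_name(number):
    """Southwest textiles collection F 014: F014T###."""
    return f"F014T{number:03d}"


def ledger_name(volume, page):
    """Homer Root ledgers: volume digit followed by the 3-digit page number."""
    return f"M124{volume}{page:03d}"


def thumbnail_name(base, suffix=".jpg"):
    """Thumbnail names carry the letters TN just before the suffix."""
    return f"{base}TN{suffix}"


def is_valid_base(name):
    """Check a base name (without suffix) against the character rules."""
    return VALID_BASE.fullmatch(name) is not None


print(textile_name(1))             # F014T001
print(ledger_name(5, 124))         # M1245124
print(thumbnail_name("F014T001"))  # F014T001TN.jpg
print(is_valid_base("P00418G5001"), is_valid_base("P004-18g5"))  # True False
```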
Additional resources online for guidance in digital imaging:
Conservation Online guide to imaging and imagebases: http://palimpsest.stanford.edu/bytopic/imaging/
Page last modified: October 24, 2007