Archiving Images

Approaches to Storage & Retrieval

Everyone has their own approach to image storage and retrieval. Today's 6MP digital cameras produce 18MB RAW files, and at about 30 Megabytes for each scanned 35mm frame even the largest built-in hard disk fills up quickly. How to store, index and safely archive all of these files? There are a number of conflicting issues, including cost, convenience and speed of retrieval, and finally, security. What to do?

Photographers have several choices, each with benefits as well as downsides. Here is my current approach. It has recently changed — and I'll explain both what I used to do, and what I now do to store, retrieve and archive my work.

CD-R

CD-R is the least expensive approach to long-term file storage. When bought in bulk, a CD-R disk can cost as little as 30 cents. Some 20 images can be stored on each disk, at a cost of roughly 1.5 cents per file. Certainly cheap enough, but there are other non-monetary costs, especially that of convenience. More on this momentarily.

The main drawbacks of using CD-Rs are slow speed and inconvenience. Even with a fast CD burner, making CDs is a slow and labour-intensive process. Also, once one's library of CDs grows to more than a dozen or so (which seemingly happens overnight), access and retrieval of specific files can be time-consuming, even when one knows which disk they're on.

 Muskoka Heron #1 — Ontario, 2000

Photographed with a Canon EOS-1V - HS and 100~400mm f/5.6L IS lens on Fuji Provia 100F

File & Format Longevity

Another concern is the archival properties of CD-R disks. Some manufacturers — one example being Kodak — claim up to 100 years. Certain pundits though say that CD-R disks can deteriorate in as little as 10-15 years. (See my cautionary note below).

Another concern is whether at some point in the future these disks will still be readable by then current technology. This is not as far-fetched as it appears at first. Have you tried to use a 5.25" floppy recently? How about an 8-Track audio tape?

I tend not to worry too much about these issues, for a couple of reasons. One is that the CD format has become so ubiquitous that it will likely remain a standard for many years to come. I also expect that as technology advances, greater and greater storage densities will become available, and it will be a simple matter to copy one's existing files to these new media. Just get around to doing it at a point where both technologies overlap and before your CDs develop warts.

Of possibly greater concern is that of file format. I save all of my files in Photoshop's .PSD format. This allows me to save files with separate Layers. Having confidence in this format implies, of course, that Photoshop will still be around 20 or 30 or more years from now, when either I or my heirs are interested in retrieving them. Again, if the time comes when Photoshop disappears, there should be lots of opportunity to convert files. Also, just as there are programs today that can read WordPerfect files from 20 years ago, it's not unreasonable to assume that such conversion programs will exist in the future.

DVD-R, DVD+R, DVD-RAM etc, etc

I wish I had something nice to say about DVD as a data storage medium. Unfortunately the industry has not yet (as of Spring 2002) settled on a standard for this format, and until it does I'm staying on the sidelines. The idea of being able to store 4 to 12 Gigabytes of data on a single moderately-priced disk is appealing. I just don't have the confidence yet that any one of these so-called standards will survive. When a DVD data storage standard does appear it will solve a lot of problems. I expect that by some time in 2003 this will have sorted itself out.

Magneto Optical (MO)

There are a number of interesting MO disk formats. That's the problem — again there is no standard. Also, MO drives never became as popular in North America as they did in Japan, or even Europe. Not a contender in my opinion.

Tape

Tape offers the lowest cost storage of any media. Again, there is no one standard, and over the years numerous physical formats and storage densities have come and gone. Tape also features the slowest and least convenient access and retrieval. For a photographer who needs ready access to files this is not a convenient solution.

Three Women — Toronto, 2002
Leica M7 with Tri-Elmar @ 35mm. Fuji Sensia 200

Hard Disk

The most expensive, yet also most convenient means of storing and retrieving files is to keep them on a hard drive. Drives of between 100GB and 200GB are now readily available and relatively inexpensive. The only downside besides cost is that most PCs can only hold one additional drive.

Nevertheless I have found that a hard disk, along with CD-R for off-site archiving, is my preferred approach. More on this below when I discuss Firewire drives.

Hard Disks — The Downside

In addition to their higher per-byte storage cost, there is one additional concern with using a hard drive for archival storage, and that's the danger of changing or, at worst, deleting one or more of your files. That's why making a CD-R in addition is a critical part of this approach. Then, if you do accidentally change or delete a file archived on your hard disk, you can always go to the CD for recovery.

Of course you can always change the permissions on any file or directory, or even the entire drive so as to lock the files and prevent accidental changes or deletions.
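For readers comfortable with a little scripting, here is a minimal Python sketch of that idea. It clears the write-permission bits on every file under an archive directory. The function name lock_tree and the example path are hypothetical, and the behaviour is an assumption about a typical setup: on Unix-like systems this guards mainly against accidental overwrites, since deleting a file is governed by the permissions of its directory.

```python
import stat
from pathlib import Path

# All three write bits: owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH

def lock_tree(root):
    """Make every file under root read-only; return how many were changed.

    Hypothetical helper for illustration, not part of any library.
    """
    locked = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            current = path.stat().st_mode
            # Keep every existing permission bit except the write bits.
            path.chmod(current & ~WRITE_BITS)
            locked += 1
    return locked

# Example call (the path is made up):
# lock_tree("/Volumes/Archive/Grand Canyon 05-2002")
```

Run it only once a directory is finished, since Photoshop will then refuse to save over those files until the permissions are restored.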

Indexing

I have tried a number of photo indexing programs. For a while I used one called Cumulus. I find the problem with all of these is the work necessary to write keyword associations. As much as I'd like to, I simply find this time-consuming and boring. Without creating these searchable indexes I don't find these programs to offer much more than the following technique.

I store my files in subdirectories of less than 650MB in size. This allows me to save an entire directory to a single CD-R. I also give each directory a descriptive name, such as (Grand Canyon 05-2002). I then use Photoshop's contact sheet function to print a contact sheet of this directory. These printed sheets are then placed in a binder, along with a printed catalog of the directory names that it contains.

At any time it's a fairly fast and painless process to find the image that I want. I imagine that for a stock agency, or even a stock photographer with many thousands of images, this could be less than optimally efficient, but for me it works well.

Both the Mac and the PC now show small thumbnails of files in subdirectories, and both know how to read .PSD format files. (Windows XP shows especially large thumbnail views). This helps make identification of files pretty quick even if you haven't made Photoshop contact prints.
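The 650MB-per-directory rule described above is easy to check with a short script before burning. The following is a rough Python sketch, not a description of any particular tool: the function names are made up for illustration, and the capacity figure is an assumption that ignores the small amount of space the disk's own file system uses.

```python
from pathlib import Path

# Nominal capacity of a 650MB CD-R, in bytes (assumed; real usable
# space is slightly lower once the disk's file system is written).
CD_CAPACITY_BYTES = 650 * 1024 * 1024

def directory_size(root):
    """Total size in bytes of all files under root, including subfolders."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())

def fits_on_cd(root, capacity=CD_CAPACITY_BYTES):
    """True if the whole directory would fit on one disk of this capacity."""
    return directory_size(root) <= capacity

# Example call (the directory name is made up):
# fits_on_cd("Grand Canyon 05-2002")
```

Leaving a few megabytes of headroom by passing a smaller capacity is a sensible precaution.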

My Previous Approach

Through the combination of hard disk storage for ease of retrieval and CD-Rs for off-site archiving, I have had a system that worked well for me for several years. I would make 2 CD-R backups of all my Photoshop files. One I would keep locally and the other I would store off-site (at my summer cottage). See my essay on Image Security for more on this. 

The problem with this approach is that it offers security but not convenience. Making two CD-Rs of every file is very time-consuming. I've also run out of internal bays for adding more drives on my two networked computers.

My Current Approach

Lacie 120GB Firewire Drive

I still make CD-Rs, but now just one copy to take off-site. Otherwise I simply store my growing number of files on external Firewire hard disk drives. (The technical name for this high-speed serial device standard is IEEE 1394. Sony calls it i.LINK and Apple calls it Firewire, as do many PC device manufacturers).

Firewire drives are fast and now inexpensive. A 120GB Lacie drive is just over U.S. $400 (April 2002). You can also buy empty Firewire drive enclosures for under $100 and put almost any hard disk drive inside them. If you currently have a second drive in your computer that you use for image storage, you can quickly and easily remove it and transfer it to such an enclosure.

Why bother? Several reasons. If you work with both a desktop and a notebook computer you can simply plug the drives back and forth. If you need to travel with your files there are small Firewire drives available that make transporting large numbers of files very convenient. And, unlike with internal drives, you won't run out of drive bays. Up to 63 external Firewire devices can be connected on a single bus, and since Firewire is a Plug-&-Play format, drives can be easily added and removed at any time, with no fuss at all.

Speed? Firewire drives are now the preferred format for people who do video editing, and these users need all the speed that they can get.

Don't have a Firewire card in your PC? Not an issue. Such cards are under $100 and if you have a laptop computer without an internal Firewire connection a PC card that provides one can easily be plugged in.

Back to costs. With a 120GB Lacie drive the cost per Gigabyte is $3.33. Certainly more than CD-R, but with advantages that make it worthwhile, at least for me. It would take 185 CDs to equal one of these drives. Imagine mounting and unmounting that many CDs as you browse through your files.

Another way to look at the cost is on a per-image basis. Let's assume that each photograph stored is 30MB. This means that 33 images can be stored in each Gigabyte. The math is simple. Each image therefore costs 10 cents to store. Not exactly cheap, but certainly convenient compared to CD. Just as mounting and unmounting CDs to go through one's files is painfully slow, hard drive access is wonderfully fast. Is the cost worth it? Only you can decide based on the number of files you have and the rate at which you expect to grow your image collection.
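To rerun this arithmetic with your own prices and file sizes, a few lines of Python are enough. This is only a sketch using the figures quoted in this article (30MB per image, a 30-cent 650MB CD-R, a $400 120GB drive, and 1 Gigabyte counted as 1000MB). The function names are made up for illustration.

```python
import math

def images_per_medium(capacity_mb, image_mb):
    """Whole images that fit on one medium (file system overhead ignored)."""
    return capacity_mb // image_mb

def cost_per_image(price, capacity_mb, image_mb):
    """Price of the medium divided by the number of images it holds."""
    return price / images_per_medium(capacity_mb, image_mb)

IMAGE_MB = 30                             # assumed size of one photograph
CD_MB, CD_PRICE = 650, 0.30               # CD-R capacity and bulk price
DRIVE_MB, DRIVE_PRICE = 120_000, 400.0    # 120GB drive, 1GB taken as 1000MB

cd_cost = cost_per_image(CD_PRICE, CD_MB, IMAGE_MB)
drive_cost = cost_per_image(DRIVE_PRICE, DRIVE_MB, IMAGE_MB)
cds_per_drive = math.ceil(DRIVE_MB / CD_MB)

print(f"CD-R:  {cd_cost * 100:.1f} cents per image")
print(f"Drive: {drive_cost * 100:.1f} cents per image")
print(f"CD-Rs needed to match one drive: {cds_per_drive}")
```

With these inputs the drive works out to 10 cents per image and 185 CD-Rs per drive, matching the figures above. Change the constants to see how the comparison shifts as prices fall.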

Whatever you do though, do make backups, and do store a copy of all your important files off-site. Better safe than sorry.

A Cautionary Note

A few weeks after publishing this article I received an e-mail from a customer, ordering a print of one of my photographs from 1999. It wasn't one of my more popular (or recent) photographs so it wasn't on my new storage system, one of the Firewire hard drives.

For some reason though (Murphy was at work — as always) I couldn't find the CD ROM on which it was stored. I told the customer that he'd have to wait a week or so until I could get to my archive CD, which is stored off-site at my summer cottage.

The next weekend, upon arriving in the country, I retrieved the disk and was stunned to discover that the disk was unreadable on my laptop's drive. Upon returning to the city, and many hours of fussing later (along with some luck), I was able to retrieve the file, make the print and ship it off to the customer.

But, this made me wonder about all of my CD ROM backups. I have more than 100. I checked every one and found that 3 of the 100 were either totally or partially unreadable. These disks were on brand name media (Verbatim and Maxell), and had been verified after burning. Fortunately in each case the files on these bad disks were also located elsewhere and so nothing has been lost. (That's a 3% failure rate in just a few years. How many more will have deteriorated by next year?)

Scary, isn't it? These disks were only between 3 and 6 years old. I had assumed (and read) that a minimum of 10-15 years could be expected from CD-ROMs. What's the cause of these failures? I don't know. Storage conditions were almost ideal: in plastic jewel-boxes located in a filing cabinet, in a normally heated and air-conditioned house.

It seems that just the way that your Ektachromes from the 1970s have now mostly faded away, so too will your CD-ROMs, given enough time, and that time apparently isn't very long at all.

So, until a non-volatile digital storage medium comes along I'm going to continue with my belt-and-suspenders approach to back-up, namely a local outboard hard disk for ease of access, and 2 CD-ROMs, one located off-site.

Believe me, it's worth the bother.
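Checking a hundred disks by hand, as described above, is tedious, and the job lends itself to a script. The following is a minimal Python sketch of one possible approach, offered as an illustration and not as the method used here: record a SHA-256 checksum for every file when the disk is burned, then re-read the disk later and report any file that is missing, unreadable or changed. The function names are hypothetical.

```python
import hashlib
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large images don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def find_damaged(root, manifest):
    """Return the files that are missing, unreadable, or no longer match."""
    root = Path(root)
    damaged = []
    for relative, expected in manifest.items():
        try:
            intact = file_digest(root / relative) == expected
        except OSError:
            # A read error from a failing disk counts as damage too.
            intact = False
        if not intact:
            damaged.append(relative)
    return damaged
```

Build the manifest from the source directory before burning and keep it somewhere other than the disk itself. Running find_damaged against each disk once a year or so should reveal deterioration while a good copy still exists elsewhere.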

Sources and Solutions

Based on the feedback that I've been getting from a variety of sources the consensus from those that appear to know what they're talking about is that CD deterioration is caused primarily by three factors...

Poor quality disks to begin with

Adhesive labels applied to the disks

Writing on the disks

The most reliable brands are reported to be Kodak Gold and Mitsui. Felt pens and paper labels and their glue apparently are all acidic, and leach through the disks over a period of years, destroying data integrity.

Best not to label or write on the disks at all, or only to use a pen if it specifically states that it is safe for writing on CDs.

Make two copies of everything. Pray.

