Besides being a front-end developer at Voog, I'm also an avid photographer. Over the last 15 years, I've taken hundreds of thousands of shots.
Even after I delete 90% of them the same day through strict self-editing, the remaining archive takes up thousands of gigabytes of storage. The number is growing fast, since improved camera technology has brought a steep increase in file sizes.
Losing even part of these photos would be a first-rate disaster. Millions of people like me keep vast numbers of photos almost entirely in digital form and are trying to figure out how these photos can survive forever.
Here's what I've learned along the way.
Issues with optical media
In the early 90s, when blank compact discs (CDs) arrived, manufacturers declared that optical media would solve all digital data storage and archival problems. The lifetime of CDs was predicted to extend far beyond 40 years.
Now, more than 20 years later, people are discovering that the precious data on those very CDs has vanished. It seems the claims were nothing more than marketing hype after all.
Enough years have now passed since blank optical media reached the mass market that several real aging studies have been carried out. On average, about 90% of data is reported to be intact on CDs after 10 years of archiving in good conditions. Some good-quality CDs, however, have kept their data for more than 20 years in ideal conditions. The key question is: what defines a good-quality CD?
The quality of a disc as it comes off the production line is not the only factor to consider. Temperature fluctuations, light and physical stress all shorten the lifetime of optical media. Your local retailer's transportation and storage conditions (including position: vertical is better than horizontal) also play a vital role, not to mention your own storage conditions.
It also seems that, of two discs from the same batch and manufacturer, the one burned on one CD burner can outlive the one burned on another. Switch to a different disc manufacturer, and the situation can be reversed.
Burners prefer some media over others for no apparent reason or logic. Data on optical media is also becoming denser with the introduction of DVDs and Blu-ray discs, and manufacturing processes change constantly. Predictions about the lifetime of current optical media therefore have no basis whatsoever.
Optical media has real advantages: data cannot be overwritten by accident, it is immune to static electricity and magnetic fields, and discs are the least likely thing to be stolen in a burglary. These are overshadowed by the fact that no guarantee can be given for its lifetime. Still, optical discs can be quite feasible as a second or third backup solution.
Hard discs are having a hard time starting up
Magnetic storage is another quite feasible way to store data. The data on a hard disc's magnetic platters seems to outlast the technology around it: advances in hardware force you to move your data to newer media before the data itself fades.
Floppies are practically extinct, and it is already difficult to find a computer that can read data from IDE (Integrated Drive Electronics) hard discs, not to mention SCSI (Small Computer System Interface). The commonly cited problems of static electricity, magnetic fields and physical fragility can be minimized with good storage conditions.
All this might seem fine if there weren't one enormous "but". The electric motors that drive hard discs have a very high failure rate when starting up after sitting on a shelf in one position for a prolonged time, say, years.
They are built for constant motion, not for sitting idle. You could keep them powered while in storage, but that would, in turn, expose them to all kinds of electrical damage and physical wear.
Hard discs are a good option for backup. However, you should use an external RAID enclosure protected from power fluctuations by a good UPS (uninterruptible power supply). This, in turn, raises the price and still gives no full guarantee, given the risk of electrical failure and burglary.
What about memory sticks, memory cards, and other solid-state media?
The cell that stores one bit of data in solid-state media can be thought of as a tiny battery that is either charged or not. We all know that a charged battery can go dead after some time, even if it is not used. The same applies to a bit stored on solid-state media.
The charge is said to drain away in about 10 years on average, although users have reported 15 years of storage without problems. Manufacturers give their best flash products a warranty of five years at most. It seems reasonable to assume that you can extend the lifetime considerably by simply rewriting the data, that is, refreshing the electrical charge, every five years.
The only problem is that there is not much real-life data to back this up, and some scientific studies say oxygen in the air can affect the discharge rate of these little batteries. So more than ten years on the shelf, even with regular recharging, might still bring some unpleasant surprises.
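The rewriting idea can be sketched as a small script. This is only an illustration under assumptions: the function names are hypothetical, and whether a rewrite really refreshes the stored charge depends on how the device's controller handles writes, which the script cannot check. It copies each file to a fresh temporary file in the same folder, compares checksums, and only then swaps the new copy into place.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rewrite_file(path):
    """Write a fresh copy of `path` next to it, verify it, then swap it in."""
    path = Path(path)
    original_hash = sha256_of(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    tmp_path = Path(tmp_name)
    try:
        with os.fdopen(fd, "wb") as out, path.open("rb") as src:
            for chunk in iter(lambda: src.read(1 << 20), b""):
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())  # ask the OS to push the data to the device
        # Note: this re-read may be served from the OS cache, not the device.
        if sha256_of(tmp_path) != original_hash:
            raise IOError(f"verification failed for {path}")
        shutil.copystat(path, tmp_path)  # keep timestamps and permissions
        os.replace(tmp_path, path)       # swap the new copy into place
    finally:
        if tmp_path.exists():            # only true if something went wrong
            tmp_path.unlink()


def refresh_tree(root):
    """Rewrite every file under `root`; return how many files were rewritten."""
    files = [p for p in sorted(Path(root).rglob("*")) if p.is_file()]
    for file_path in files:
        rewrite_file(file_path)
    return len(files)
```

Calling `refresh_tree` on the mount point of a memory stick would rewrite everything on it. The stick needs enough free space to hold a second copy of its largest file while the swap happens.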
While memory sticks seem to survive washing machines and trucks driving over them, you should keep them away from static electricity. A balloon or synthetic clothing can easily do the damage that trucks cannot.
Storing online
The data stored online in the cloud has the highest chance of survival. Online backup service providers keep copies and regularly upgrade the media under your data.
There are quite good solutions for storing massive amounts of data online, though cost is a concern. If it is your second backup, where the data is rarely accessed (only when the first solution has failed), costs can be rather low.
Amazon Glacier is a good example. You pay only $0.01 per GB per month, with no extra charge for adding data.
If you want to cut costs, there are some things to know before setting up and testing.
- If you want to retrieve your data, you will have to pay. The pricing scheme is kind of complicated.
- You might be presented with an enormous bill if you download too fast or your files are very big.
So, you may want to read this Wired article and some forum posts first. Apps are available for simplifying backing up to this service.
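To get a feel for what the storage price quoted above means for a photo archive, here is a minimal sketch of the arithmetic. It assumes the $0.01 per GB per month figure from this article, which may not match current pricing, and it covers storage only. Retrieval fees are left out because their scheme is more complicated. The function names and archive sizes are made up for illustration.

```python
# Storage price quoted in this article (USD per GB per month).
# Assumption: real pricing may differ, so check before relying on it.
PRICE_PER_GB_MONTH = 0.01


def monthly_storage_cost(archive_gb, price_per_gb=PRICE_PER_GB_MONTH):
    """Cost in USD of keeping `archive_gb` gigabytes stored for one month."""
    return archive_gb * price_per_gb


def yearly_storage_cost(archive_gb, price_per_gb=PRICE_PER_GB_MONTH):
    """Cost in USD of keeping `archive_gb` gigabytes stored for twelve months."""
    return 12 * monthly_storage_cost(archive_gb, price_per_gb)


if __name__ == "__main__":
    # Hypothetical archive sizes in gigabytes.
    for size_gb in (500, 2000, 5000):
        print(
            f"{size_gb} GB: "
            f"{monthly_storage_cost(size_gb):.2f} USD per month, "
            f"{yearly_storage_cost(size_gb):.2f} USD per year"
        )
```

At this rate a 2,000 GB archive works out to about 20 USD per month, or 240 USD per year, before any retrieval charges.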
Online storage is not without risks, either. Extreme forces of nature, like tsunamis and earthquakes, can destroy your provider's servers. Providers usually guard against this by storing data in multiple locations, which means additional costs. In addition, economic and political decisions can render your provider non-existent faster than you can react.
The Megaupload saga is one example of this. So choosing a provider that is globally and politically well connected is a good start. We think Amazon is one of the options to consider, as US government data is stored on its servers as well, making it less likely to be shut down without warning.
How to keep your data
Whatever solutions you choose to keep your data, there are some basic rules.
- Never rely on one solution, and have at least one additional backup of your data in a different location on different media.
- Upgrade the media under your data on a regular basis. Online solutions do that for you, so with them you should monitor the provider's financial and political background instead.
- Your data's file formats can become obsolete, and the programs that read specific formats can become extinct. It is a good idea to convert your data from old formats to newer ones from time to time.
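One way to put these rules into practice is to check regularly that a backup still matches the original, file by file. The sketch below is one possible approach, not a complete backup tool: it builds a SHA-256 checksum list for two folders and reports files that are missing from the backup or whose contents differ. The function names and the folder paths in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(original, backup):
    """Report files missing from `backup` and files whose contents differ."""
    source = build_manifest(original)
    copy = build_manifest(backup)
    return {
        "missing": sorted(set(source) - set(copy)),
        "corrupted": sorted(
            name for name in set(source) & set(copy)
            if source[name] != copy[name]
        ),
    }
```

For example, `compare_copies("/photos", "/mnt/backup/photos")` returns a dictionary with two lists. If both are empty, every file in the original has an identical copy in the backup. Running such a check right after moving data to new media is a cheap way to catch a bad copy while the old media is still readable.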
Digital data preservation needs constant involvement. Otherwise, your grandma's old negatives have a fair chance of outliving your last holiday photos.