Tuesday, April 5th, 2011...09:05
Losing Six Years of Photos and Recovering Them Five Years Later
In 1999 I acquired my first digital camera, an Epson PhotoPC 700. Looking back at the tech specs (1.3 megapixels, 4 megabytes of internal storage, no optical zoom) makes me wonder why I took photos with this camera at all. At its maximum resolution (1280×960) it could store 5 photos at “super fine” resolution, 11 at “fine” (still 1280×960 but with more compression) and 39 at “normal” resolution (640×480). The advantage of seeing the photo right after taking it proved to be a game changer though and it served as my only camera when I traveled.
Machu Picchu, As Viewed From Huayna Picchu
Over the next 5 years, I took more and more digital photos and upgraded to higher resolution digital cameras. In 2001 I picked up a 3.34 megapixel Casio QV-3500EX which had a Canon lens with 3x optical zoom.
Floating Torii, Itsukushima Shrine
In 2003 I switched again, to a 4.0 megapixel Canon G3 with 4x optical zoom and flip out LCD.
Left Field Corner, Old Yankee Stadium
It was also around this time that I built an 80GB RAID1 (2 hard drives which are exact copies of each other) fileserver to store photos, music, documents etc. and share them amongst my various computers. By 2004 I had quite a collection of digital photos and other miscellaneous files. I had also ripped all my CDs to MP3 and the fileserver couldn’t hold everything. Unhappy with the prospect of having to buy 2 hard drives every time I needed to increase my fileserver’s capacity, I decided to set up a new system using RAID5 (N hard drives with an overall capacity of (N-1) × the size of the smallest hard drive). This meant if I purchased 3 200GB hard drives I’d have 400GB of space on the fileserver and could survive a failure of one hard drive without losing data. In the future I could buy another 200GB hard drive and have 600GB of space. I reconfigured my fileserver (at the time running a distribution of Linux called CentOS configured with software RAID) and all was good.
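To make the RAID5 arithmetic concrete, here is a minimal Python sketch of the capacity rule and of the XOR parity that lets the array survive one failed drive. The function names are my own for illustration, and the parity demo works on a single tiny stripe rather than real disks.

```python
from functools import reduce

def raid5_capacity_gb(drive_sizes_gb):
    """Usable RAID5 capacity: (N - 1) times the smallest drive."""
    # Assumption: the conventional minimum of 3 drives for RAID5.
    if len(drive_sizes_gb) < 3:
        raise ValueError("this sketch assumes at least 3 drives")
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

print(raid5_capacity_gb([200, 200, 200]))       # 400
print(raid5_capacity_gb([200, 200, 200, 200]))  # 600

# One stripe on a 3-drive array: two data blocks plus their parity.
d0, d1 = b"photo da", b"ta block"
parity = xor_blocks([d0, d1])

# If the drive holding d1 fails, XOR of the survivors rebuilds it.
assert xor_blocks([d0, parity]) == d1
```

The same XOR trick is why losing a second drive (or corrupting a second block in the same stripe) is fatal: with two unknowns there is nothing left to reconstruct from.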
More than a year went by and then in September of 2005, disaster struck. I received notification that one of my hard drives had dropped out of the array and my wonderful RAID5 setup was now running in a degraded state. I poked around on the drive no longer in the array, fsck’d it etc. and there were no bad blocks or anything seemingly wrong with it, so I decided to try rebuilding the array with it remounted. This was a stupid idea; perhaps I should have turned off the server, purchased a 500GB external drive and tried to get the data off the fileserver right away but, at the time, that was a several hundred dollar proposition and would take several days. In retrospect, I don’t think it would have made a difference as about 20% into the rebuild either one of the two “good” hard drives crashed or the software RAID caused some sort of data corruption or my SATA card bork’d again etc. and now the entire array was no longer recognized. Trying to copy the data to a new hard drive would have likely resulted in the same.
I searched around on the Internet to find a way to rescue my failed drives/corrupted data but found nothing that seemed to work. No combination of mdadm commands caused my data to magically become available again. And at that point, if I was planning on powering on the drives again, the first thing to do would have been to make images of them which would have required 600GB of space. Then trying to recover data from the images would require another 400GB. 1TB drives didn’t exist so I’d have to cobble together several hard drives which would mean getting another computer to stick them in etc. I was looking at upwards of $1k in hardware and then researching how to reconstruct the data byte by byte, sector by sector. I researched a number of data recovery companies and after discussing my issue with them, decided to send my drives and RAID configuration out to one of them with the understanding that I may end up with a $1k bill to pay depending on how successful they were; and I was ok with that. What I was not prepared for was the e-mail I received a week later stating they were unable to recover anything and had mailed my hard drives back to me.
Shell shocked didn’t even begin to describe my state-of-mind. I didn’t take any photos for the next few months. I decided having a central fileserver was not worth the effort and instead made my wife’s computer and mine backups of each other. This way if anything failed on either of them, we’d still have access to all our data right away. I also started burning backups of photos and a subset of precious documents onto DVD. Amazon S3 and the like didn’t exist yet and I was lucky if I could get an upload speed of 30 kilobytes per second on my DSL line, so moving all my files online wasn’t a realistic option anyway.
I never opened the box of hard drives the data recovery company sent back to me. I considered erasing them and selling them on eBay, throwing them in front of a steamroller etc. but a small part of me held out hope that one day I would have the time, motivation and tools to recover the data. Fast forward 5 years to the end of 2010 and I found myself with a 5 day weekend around New Year’s. 1.5 terabyte hard drives were less than $100 (I needed 1 terabyte of space to image the disks and attempt recovery from the images, to minimize the likelihood of any further damage to the source drives). I still had an old desktop computer lying around with a bunch of unused SATA ports. It was now or never.
Unfortunately the data recovery company did not mail back the RAID configuration I had sent them, and I must have lost the configuration settings when we moved the previous year. All the software demos I tried claimed to have ways to auto-detect the correct values by scanning through the data and looking for certain levels of entropy, but none of them worked. Because I had used RAID5, in order to have any hope of getting the data back I needed to figure out:
- The order of the hard disks (sda, sdb and sdc in my case)
- The block size of the RAID stripes (4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, 1M, 2M)
- Whether the parity data was written Right or Left Synchronous
6 different ways to arrange the hard drives, 10 different block sizes and 2 directions yielded 120 different combinations that needed to be checked. I was fairly certain I had not used a stripe size above 512K or below 32K, as those are only used for RAID setups serving very large or very small files, so that cut my search space in half. R-Studio seemed to have the most intuitive interface and documentation so I decided to use their demo during my brute force experiments. For certain combinations of parameters, typically with block sizes of 64K, 128K and 256K, the demo was able to show me a few valid file and directory names after scanning several tens of sectors. Sometimes small files, e.g. thumbnail images, looked perfect as their data was contained in a single block. The R-Studio demo only allows recovery of files 64K or less in size, so I purchased the software with the hope that it would be able to recover complete files if I found the right settings.
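The size of that search space is easy to check with a few lines of Python. This is only a sketch of the counting, not of what R-Studio does internally; the `candidates` helper and its `min_k`/`max_k` parameters are names I made up for illustration.

```python
from itertools import permutations, product

DISKS = ("sda", "sdb", "sdc")
BLOCK_SIZES_K = (4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048)
LAYOUTS = ("Left Synchronous", "Right Synchronous")

def candidates(min_k=None, max_k=None):
    """Every (disk order, block size in K, layout) combination to try."""
    sizes = [s for s in BLOCK_SIZES_K
             if (min_k is None or s >= min_k) and (max_k is None or s <= max_k)]
    return list(product(permutations(DISKS), sizes, LAYOUTS))

everything = candidates()
likely = candidates(min_k=32, max_k=512)

print(len(everything))  # 120
print(len(likely))      # 60
```

Printing `likely` gives a checklist to cross off by hand, one combination per line.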
After going through many different combinations, diligently crossing them off my list, I hit upon the magic settings:
- sdc, sda, sdb
- 256K
- Left Synchronous
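For anyone wondering why all three settings have to be right at once, here is a rough Python sketch of how a byte offset in the array maps onto the individual disks. It assumes that R-Studio's "Left Synchronous" corresponds to the layout Linux software RAID calls left-symmetric, and that data starts at offset 0 on each disk; both are assumptions on my part, and the function names are hypothetical.

```python
DISK_ORDER = ("sdc", "sda", "sdb")   # disk order from the settings above
CHUNK_SIZE = 256 * 1024              # 256K block size from the settings above

def parity_disk(stripe, disks=DISK_ORDER):
    """Disk holding the parity block for a given stripe (left-symmetric)."""
    n = len(disks)
    return disks[(n - 1) - (stripe % n)]

def locate(logical_offset, disks=DISK_ORDER, chunk=CHUNK_SIZE):
    """Map a byte offset in the array to (disk, byte offset on that disk)."""
    n = len(disks)
    chunk_index, within = divmod(logical_offset, chunk)
    stripe, data_index = divmod(chunk_index, n - 1)  # n - 1 data chunks per stripe
    parity_slot = (n - 1) - (stripe % n)             # parity rotates backwards
    data_slot = (parity_slot + 1 + data_index) % n   # data starts after parity
    return disks[data_slot], stripe * chunk + within

for i in range(6):
    disk, offset = locate(i * CHUNK_SIZE)
    print(f"chunk {i}: {disk} @ {offset}, parity on {parity_disk(i // 2)}")
```

With the wrong disk order, block size or direction, any file larger than one block gets stitched together from the wrong pieces, which is why only small files looked fine under the near-miss settings.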
All of a sudden I was staring at full resolution images of the above photos for the first time in over 7 years. After letting the program run overnight I woke up to 300GB+ of photos, videos, documents, music etc. that had been missing for over half a decade! I immediately made a copy of everything onto another external hard drive and began sifting through the data.
With no record of what I’d lost, I could not verify everything had been recovered. I did find several malformed inodes and a handful of messed up images and MP3s.
Lake Tahoe
But there were clearly several hundred gigabytes of files in perfect condition which I previously thought were lost forever. Strangely enough, recovering the data turned out to be the easiest and least time-consuming part. After the initial high, I was faced with solving the “how to prevent this from happening ever again” problem.
There’s been a lot of talk about “cloud computing” over the last few years and plenty of companies want you to store your data with them. Unfortunately, I’m not an “average user” and a few tens of gigabytes of online storage that is less than 100% reliable isn’t going to cut it. I still need a robust local backup plan, but it definitely needs to be augmented with some amount of non-local replication. I decided to split my data into two groups:
- Critical data which I would be very sad/angry to lose
- Optional data which I would really like to have, but could live without (after some amount of second guessing as to why I didn’t consider it critical)
In addition, I’m also making more of an effort to prune unnecessary photos, video etc. While data storage costs are always going down and organizational tools are always getting better, it seems the rate and size of data creation are matching, if not exceeding, them (and in general, it just doesn’t make sense to be wasteful). I’m about two-thirds of the way through the data pruning and re-organization, at which point I will need to choose at least one offsite location for backups. There’ll probably be a blog post about that decision…