a glob of nerdishness

Cost to keep ShutterStem image masters backed up in Amazon Glacier

published by natevw on

I was excited by this morning's announcement of Amazon Glacier.

I've been interested in automatic offsite backup of my photo archive but found the cost for "cloud storage" prohibitive so far. (I'm assuming that attempts to actually take advantage of e.g. Flickr Pro's cheap "unlimited" storage will not end well. Any sort of "we'll happily take all your stuff" offer also tends to imply insecure storage of data.) A penny per gigabyte each month is a game changer: an order of magnitude cheaper than anything else I've found so far.

There's a slight catch in that Amazon charges an extra fee for retrieving more than 5% of the total amount you have stored each month. What happens if I'm using it as a backup for all my photos and videos, and would like to quickly recover it all after a data–tastrophe? It gets complicated, but here's my best estimate for the real-world situation I'm interested in. If I've crunched the numbers right, I like what I'm seeing!

Glaciers above Mount Rainier's old Paradise visitor center

Say I've got 500 GB backed up to cold storage, something goes horribly wrong, and I need to locally restore a complete copy from the cloud. My ISP at home can "burst" around 10 GB/hr but hates my guts if I download more than 250 GB/month. So, the quickest I can expect to have it all home is 2 months, ten times faster than Amazon's "preferred" 20 month free retrieval period. This means I'll be "charged a retrieval fee starting at $0.01 per gigabyte".
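As a quick sanity check on those timelines, here is a minimal Python sketch. The function names are my own (hypothetical), and the inputs are just the figures quoted above: a 250 GB/month ISP cap and a 5% per month free retrieval allowance.

```python
# Rough restore-timeline arithmetic; all names here are illustrative only.

def months_to_restore(stored_gb, isp_cap_gb_per_month):
    """Months needed to pull everything home when the ISP cap is the bottleneck."""
    return stored_gb / isp_cap_gb_per_month

def free_retrieval_months(free_fraction_per_month=0.05):
    """Months needed to retrieve everything while staying inside the free allowance."""
    return 1 / free_fraction_per_month

restore = months_to_restore(500, 250)   # 500 GB archive, 250 GB/month cap
free = free_retrieval_months()          # 5% of the archive per month
print(f"restore in {restore:.0f} months, free period is {free:.0f} months")
print(f"that is {free / restore:.0f}x faster than the free pace")
```

The burst rate of 10 GB/hr never enters into it; at that speed 500 GB would take only about 50 hours, so the monthly cap is what sets the pace.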

Let's work through Amazon's pricing example with my round numbers. For 500 GB stored, my free daily retrieval allowance would be 0.833 GB (5% of 500 GB is 25 GB per month, spread over 30 days). Assuming I steadily exceed that, downloading an even 8.33 GB/day (250 GB over 30 days), my peak hourly retrieval rate for the month will be 0.347 GB/hr. This results in a billable peak rate of 0.312 GB/hr after subtracting 1/24 of the daily allowance. So my excess retrieval fee would be only $2.25/mo (0.312 GB/hr * 720 hr/mo * $0.01/GB), less than a 10% increase on top of the corresponding $24.90 of standard data transfer cost each month!
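The same arithmetic can be written as a small Python sketch. This is my reading of the pricing example, not Amazon's billing code: the function name is hypothetical, and it assumes a 30-day (720-hour) month, a perfectly even download rate, and the $0.01 per GB starting fee quoted earlier.

```python
# Estimate of the monthly Glacier retrieval overage fee, under the
# assumptions stated above (30-day month, steady download rate).

DAYS_PER_MONTH = 30
HOURS_PER_MONTH = DAYS_PER_MONTH * 24   # 720
FREE_FRACTION = 0.05                    # 5% of stored data free per month
FEE_PER_GB = 0.01                       # quoted starting retrieval fee, in USD

def retrieval_overage_fee(stored_gb, downloaded_gb_per_month):
    """Return the estimated extra retrieval fee (USD) for one month."""
    # Free allowance, prorated per day and then per hour.
    daily_allowance_gb = stored_gb * FREE_FRACTION / DAYS_PER_MONTH
    free_hourly_gb = daily_allowance_gb / 24
    # With an even download, the peak hourly rate equals the average rate.
    peak_hourly_gb = downloaded_gb_per_month / HOURS_PER_MONTH
    billable_hourly_gb = max(0.0, peak_hourly_gb - free_hourly_gb)
    # The billable peak rate is charged for every hour of the month.
    return billable_hourly_gb * HOURS_PER_MONTH * FEE_PER_GB

fee = retrieval_overage_fee(stored_gb=500, downloaded_gb_per_month=250)
print(f"${fee:.2f} per month")
```

Because the fee is driven by the peak hourly rate, a bursty download would cost more than this steady-rate estimate suggests.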

The numbers get a bit better for 1TB of data (which I'm heading towards). My ISP is the limiting factor; doubling the data at rest actually decreases my retrieval overage fee to an even $2.00 each month. So as my photo archive grows into that ballpark I'd be looking at:

Four months seems like a long time, but it's way longer than I've been hobbying upon ShutterStem, which, of course, would be a perfect fit for Amazon Glacier! In fact, since all my main ShutterStem apps rely only on the "always present" 512-pixel photo thumbnails, having such cheap storage means that my hard disk archive is almost just a local cache, mitigating Glacier's 4-hour retrieval latency. Theoretically I could toss all my big drives, run my medium-resolution ShutterStem library off SSD, and just process any full-resolution exporting for prints/uploads overnight. Of course, why would I do this, and what would happen if AWS were to lose or corrupt that copy of my data? But it's fun to consider…

Sketch of ShutterStem's architecture using Glacier in place of traditional filesystem for storage of original images

Overall, Glacier is looking like an excellent fit for ShutterStem's design and intended audience. When used as a true just-in-case offsite backup, anticipating catastrophic recovery happening over average US broadband speeds, the pricing is very competitive. The crazy numbers come only if you've got a gigabit connection at home or suddenly feel like slurping your entire Glacier archive into an S3 bucket por el pronto. The fact is that cloud backup is "slow" for mere mortals, and slow is what Glacier wants. I expect to see a number of backup apps spring up around this new offering, and am already pondering what the interface for ShutterStem should look like.
