Digital Asset Management – The X-Equals Way – Part 1 of 4

Thu, Dec 11, 2008

Lightroom, Tutorials, Workflow, X=Series

We have been asked hundreds of times to define our model, or a model, for file naming and folder organization for photographers. We’ve been consulting on the setup and ongoing use of these systems for a number of years, and I thought it was a good time to distill many of our findings into a comprehensive post outlining our approach that can be downloaded and used for reference and/or stress testing out in the field.

We’ve seen hundreds of organizational and naming systems for folders and files. While consulting with both studios and photographers across a wide spectrum of specialties (wedding, portrait, product, and fashion), we have found that 9 times out of 10 there has been no strategy with regard to how these systems are architected, implemented, or scaled out as their data grows. That being said, 9 times out of 10 we’re tasked with bringing order, documentation, and training on how to implement an easy-to-use, simple, and scalable organizational structure for photographers that works across multiple media and across in-house and remote systems on PC, Mac, and Linux platforms.

This post stems from half a decade of analysis and implementation of these systems – and we’re giving away the farm on this one. I have always believed in open sourcing as much as I can with x= and this topic is no different. At the end of this post you can download our structure for your own testing, evaluation, and commentary.

For future reference, you can bookmark the Digital Asset Management – The X-Equals Way – A Prelude post for a nice clean table of contents for all 4 posts in the series.

Let’s get down to business, shall we?

The New World Order – {cheap} Infinite Data Storage

5 years ago, for the majority of boutique studios we worked with, it was ALL about DVDs and external drives (FireWire, USB, SATA, etc.) for storage. The downside is that these solutions offered no disaster recovery options and were hardly a solution for long-term storage. 13% of all external drives will fail within their first year of use. That was true in 2003, and it’s still true today. One look at a stack of DVDs would tell you that physical media was NOT a long-term solution either.

But this approach has persisted for a LONG time …

Clients didn’t have thousands of dollars budgeted for hardware, and this was especially true for storage. To provide backups of your backups (the lowest level of data protection), just multiply costs by 2 and you’ve doubled your cash outlay. Add in the costs for care and feeding, upgrading, and disaster recovery efforts, and the total cost of ownership (TCO) for managing an archival storage infrastructure beyond USB drives and DVDs broke the bank for many studios.

Today, the agile studio seeks to outsource non-core services. This is especially true when it comes to storage. We have numerous clients (including ourselves) using “Storage Cloud” services like Amazon S3 to archive data rather than mountains of USB drives, DVDs, or in-house servers. S3 provides, for all practical purposes, a limitless storage environment for archiving your work. We currently have 5 terabytes of data archived in our S3 account with a TCO 50% less than the traditional cost model of housing our data in-house. For more formal numbers on how these cost savings could look, check out what Jeremy Zawodny has to say.

Production vs. Archival Data

Before we move any further, let’s get some clarifications out of the way.

There are 2 classifications of data floating around any studio: Production Data and Archival Data.

Production Data

project (PSD, TIFF, JPEG, Lightroom Catalogs) and/or image files (RAW, NEF, CR2, XMP, DNG, etc.) that require immediate access on an ongoing basis

Archival Data

project (PSD, TIFF, JPEG, Lightroom Catalogs) and/or image files (RAW, NEF, CR2, XMP, DNG, etc.) stored for the purposes of disaster recovery and archival storage

Ideally, all Production Data is stored along with Archival Data offsite. If we lose all our production data for a job we shot last week, we can recover that data from the archive. S3 makes this methodology less complex and more realistic than in years past, when size limits for media and external drives dictated how much data could be stretched across multiple volumes or discs.

And as we discussed, Production Data has a bit of a different life while clients create, edit, manage, and deliver projects from their digital tools. This type of data generally sits on a local hard disk or external drives while in production, with incremental backups of the production files being sent to the archive as necessary.
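As a rough illustration of what an incremental backup has to select, here is a minimal Python sketch. It assumes that file modification times are a good enough signal for "changed since the last backup", and the function name, folder path, and timestamp are hypothetical. It only lists candidate files; actually sending them to an archive is left out.

```python
import os
from pathlib import Path


def files_changed_since(root, last_backup_ts):
    """Yield files under `root` modified after `last_backup_ts`.

    `last_backup_ts` is a POSIX timestamp (seconds since the epoch),
    e.g. the time the previous backup run started.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # st_mtime is the file's last-modified time in epoch seconds.
            if path.stat().st_mtime > last_backup_ts:
                yield path


if __name__ == "__main__":
    import time

    # Hypothetical values: a production folder and "24 hours ago".
    production_root = "."
    one_day_ago = time.time() - 24 * 60 * 60
    for changed in files_changed_since(production_root, one_day_ago):
        print(changed)
```

Modification time is a simple choice but not a foolproof one; comparing checksums would catch more cases at the cost of reading every file.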

But in reality, we find that the ever-changing nature of production data makes it challenging to keep a daily backup in the cloud. Copying files offsite takes time, and by the time an upload finishes, the files have often changed again and need to be uploaded once more.
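To put rough numbers on that upload problem, here is a minimal Python sketch that estimates how long a given amount of changed data takes to upload, and whether a day's worth of changes fits inside a 24-hour window. The uplink speed, the 80% efficiency factor, and the daily change sizes are all assumptions for illustration, not measurements from any studio.

```python
def upload_hours(size_gb, uplink_mbps, efficiency=0.8):
    """Estimate hours needed to upload `size_gb` gigabytes.

    `uplink_mbps` is the nominal upstream speed in megabits per second.
    `efficiency` is an assumed fraction of that speed actually achieved.
    Uses decimal units: 1 GB = 1000 MB = 8000 megabits.
    """
    if size_gb < 0 or uplink_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    megabits = size_gb * 1000 * 8
    seconds = megabits / (uplink_mbps * efficiency)
    return seconds / 3600


def keeps_up(daily_change_gb, uplink_mbps, window_hours=24.0, efficiency=0.8):
    """Return True if one day's changes can be uploaded within the window."""
    return upload_hours(daily_change_gb, uplink_mbps, efficiency) <= window_hours


if __name__ == "__main__":
    # Hypothetical studio with a 2 Mbps uplink.
    for daily_gb in (5, 20, 100):
        hours = upload_hours(daily_gb, uplink_mbps=2)
        status = "keeps up" if keeps_up(daily_gb, uplink_mbps=2) else "falls behind"
        print(f"{daily_gb} GB/day: about {hours:.1f} hours to upload ({status})")
```

If the daily change regularly takes longer than a day to upload, the cloud copy never catches up, which is one reason to treat fast-moving production files differently from archival ones.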

That being said, what you back up to the cloud and what stays on-premises is entirely up to you.

What we find when clients initially start out with cloud storage is that Archival Data enjoys most of the benefits of these endless pits of storage space, and this is where we’ll focus our discussions. At the end of the day, data is data, so don’t get hung up on what type of data it is at the moment.

What’s next … ?

In Part 2, we’ll use our data definitions and storage considerations to build a flexible and easy-to-manage folder structure …

Brandon Oelling
x=photography+consulting – technology. leadership. commitment.

3 Comments For This Post

  1. Mark Levison Says:

    The cloud is great but what kind of bandwidth do you have for your connection? Do you find that you’re able to shoot more than you can upload?

    For my home use I have 3 tiers: Laptop -> Server -> Cloud (Mozy in my case).

    A full backup from scratch (~120 gigabytes) takes nearly two weeks.

    What happens if you ever have to restore a terabyte? Do you wait a month? Will you still be in business?

  2. Brandon Oelling Says:

    A full backup from Server to Cloud?

    Or full restore from Cloud to Server?

    Your last point hints at the specific Service Level Agreement/Business Continuity plan(s) you would set up for your business that would allow for the necessary time to restore. Revenue-generating production files could be placed on a 1TB drive and stored offsite if required. This would mitigate the download time from the cloud.

    I would also imagine that you wouldn’t need the entire 1TB to download before further production could resume.

  3. Brandon Oelling Says:

    Update: Amazon S3 now allows you to just send in data/drives that they will load into your storage cloud instance for you. This is also true for recovery, again, Amazon S3 will send your data back to you from your cloud instance.

    This mitigates a lot of the concerns about upstream and downstream data transfer rates, which allows for more aggressive SLAs.
