
Re: [ARSCLIST] Hard disk drives and DAT/ calculating the future



Ah, the one thing you can absolutely count on is change.

All you can do is make assumptions, test those assumptions with experts
in their respective fields, and look at the past to see what you can
learn about the future.  As they say, the more things change, the more
they stay the same.

I think a Gordon Moore-ish model for data density is not unreasonable,
independent of the technology used to achieve it.  That's one trend you
can count on - memory and CPU power have followed these patterns for
decades now.
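
For illustration, here is a tiny back-of-the-envelope sketch of that
kind of projection (in Python; the starting price and doubling period
are numbers I've assumed for the example, not measured figures):

    # Hypothetical projection of storage cost under a Moore's-law-style trend.
    # ASSUMPTIONS (illustrative only): $400/TB today, density per dollar
    # doubles roughly every 2 years.
    cost_per_tb_today = 400.0   # assumed starting cost, USD per TB
    doubling_years = 2.0        # assumed doubling period

    for year in range(0, 51, 10):
        projected = cost_per_tb_today / 2 ** (year / doubling_years)
        print(f"year {year:2d}: ~${projected:,.2f} per TB")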

As you point out, it is important to account for format migration and
data corruption (and data recovery, and the cost of redundancy, etc.).

The good news is that all of this doesn't have to be left to salespeople
as you suggest!  Risk assessment is a well-established and successful
discipline (whether for business, investments, health, insurance, etc.) -
the science of calculated risk.  Applying actuarial methods (or
operations research) to the problem of data storage is quite feasible.
You can't reduce your risk to zero, but you can make the risks
acceptable, or at least well understood.

And where data is scarce, take what you know, build in what's known as
a "safety factor" by adjusting the numbers toward extra conservatism,
and build a plan around that.
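
As a minimal sketch of the kind of arithmetic involved (the failure
rate, safety factor, and copy count below are assumptions chosen for
illustration, not actuarial data):

    # Sketch of a simple redundancy risk estimate (not real actuarial data).
    # ASSUMPTIONS: 3% annual failure probability per independent copy, failed
    # copies detected and replaced at an annual check, and a 2x safety factor
    # applied to the failure rate for extra conservatism.
    annual_failure_rate = 0.03   # assumed per-copy annual failure probability
    safety_factor = 2.0          # conservatism multiplier where data is scarce
    copies = 3                   # number of independent copies

    rate = min(annual_failure_rate * safety_factor, 1.0)
    p_lose_all_in_a_year = rate ** copies        # assumes independent failures
    p_survive_50_years = (1 - p_lose_all_in_a_year) ** 50

    print(f"P(lose every copy within one year): {p_lose_all_in_a_year:.2e}")
    print(f"P(collection survives 50 years):    {p_survive_50_years:.4%}")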

Although not easy, it can be done.  If you don't take your best guess
at the future, you have a roadmap to nowhere.

Eric Jacobs
Principal

The Audio Archive
tel: 408.221.2128
fax: 408.549.9867
mailto:EricJ@xxxxxxxxxxxxxxxxxxx



-----Original Message-----
From: Andes, Donald [mailto:Donald.Andes@xxxxxxxxxx]
Sent: Tuesday, March 27, 2007 6:04 PM
To: EricJ@xxxxxxxxxxxxxxxxxxx; ARSCLIST@xxxxxxxxxxxx
Subject: RE: [ARSCLIST] Hard disk drives and DAT/ calculating the future


<snip>

There are MANY other concerns that most people who haven't already
delved into the issue are completely unaware of, e.g. file format
migration, hard drive formatting incompatibilities, and data corruption.

In probably 10 years we will be using new technology that is only in a
fringe state now, such as flash hard drives.  Speculating 50 or 100
years out is like trying to predict the automobile, the airplane, the
microchip, the PC, and the Internet, all while living in 1900.

Strangely, people today, armed with computers, appear to believe that
we can do just that.

However, to keep things in perspective: the world is changing faster
than it ever has, and will change even more quickly as we race into the
future.  I believe predicting the future is best left to salesmen.

Don Andes
EMI Music




-----Original Message-----
From: Association for Recorded Sound Discussion List
[mailto:ARSCLIST@xxxxxxx] On Behalf Of Eric Jacobs
Sent: Tuesday, March 27, 2007 5:27 PM
To: ARSCLIST@xxxxxxxxxxxxxxxx
Subject: Re: [ARSCLIST] Hard disk drives and DAT

I've been following this thread closely, and think a valuable exercise
would be to create detailed cost models for different storage scenarios.
Scenarios might include:

   - Hard disk on a shelf
   - Optical media (gold CD-R, DVD-R, Blu-ray)
   - Managed storage (mirrored on-line storage, tape back-up)

Common to each storage scenario:

   - Model three different size collections (1 TB, 10 TB, 50 TB)
   - Model a 50-year or 100-year life cycle
   - Make reasonable assumptions for media checking, refreshing,
     migration, and physical storage
   - Make assumptions about density and technology evolution 
     (Moore's law-ish)

Maybe there are other storage scenarios to be considered, and other
details to account for, but you get the idea.  If I were an IT expert,
I'd crunch these cost models myself.
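
To make the shape of such a model concrete, here is a minimal sketch in
Python.  Every unit cost, lifetime, and interval in it is a placeholder
assumption, and the scenario names simply mirror the list above:

    # Skeleton of a life-cycle cost model for long-term storage scenarios.
    # Every unit cost and interval below is an ASSUMED placeholder, not a quote.

    def lifecycle_cost(tb, years, media_cost_per_tb, media_life_years,
                       check_cost_per_tb_yr, migration_cost_per_tb,
                       migration_interval_years):
        """Rough total cost of keeping `tb` terabytes for `years` years."""
        refreshes = years // media_life_years            # media replacements
        migrations = years // migration_interval_years   # format/tech migrations
        media = media_cost_per_tb * tb * (1 + refreshes)
        checking = check_cost_per_tb_yr * tb * years
        migration = migration_cost_per_tb * tb * migrations
        return media + checking + migration

    # name: (media $/TB, media life yrs, check $/TB/yr, migration $/TB, interval)
    scenarios = {
        "hard disk on a shelf": (100, 5, 20, 50, 10),
        "optical media":        (300, 10, 30, 80, 10),
        "managed storage":      (500, 4, 10, 40, 10),
    }

    for size_tb in (1, 10, 50):
        for name, params in scenarios.items():
            total = lifecycle_cost(size_tb, 50, *params)
            print(f"{size_tb:3d} TB, {name:22s}: ~${total:,.0f} over 50 years")

Swapping in real quotes for media, labor, checking, and migration would
let the three scenarios be compared directly at each collection size.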

The ideal scenario will simply be the one with the lowest total cost,
since data integrity, migration, short-term and long-term costs will all
be accounted for, and the end goal (reliable long-term storage of data)
will be the same for each scenario.

I think all scenarios are viable - even the hard disk on the shelf (so
long as you have redundancy, check the integrity of the drive and the
data at some scheduled interval, have a budget to deal with drive
failure and data recovery as needed, and refresh the hardware at some
reasonable interval).
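
The scheduled integrity check could be as simple as re-hashing the files
and comparing them against a stored checksum manifest.  A minimal sketch
(the manifest path and layout here are hypothetical examples):

    # Sketch of a scheduled integrity check: re-hash files and compare against
    # a stored manifest.  The manifest path and format are hypothetical
    # (one '<sha256>  <relative path>' line per file, as written by sha256sum).
    import hashlib
    from pathlib import Path

    def sha256_of(path, chunk_size=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    def verify(manifest_path):
        """Return the list of files whose current hash no longer matches."""
        root = Path(manifest_path).parent
        failures = []
        for line in Path(manifest_path).read_text().splitlines():
            expected, _, rel = line.partition("  ")
            if rel and sha256_of(root / rel) != expected.strip():
                failures.append(rel)
        return failures

    # Run at a scheduled interval (e.g., annually) against each drive copy:
    # bad_files = verify("/mnt/archive_drive/manifest.sha256")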

I expect that different storage scenarios will be appropriate for
different collection sizes (1 vs 10 vs 50 TB).

To me, this seems like a pretty logical approach to the problem of
storage of digital assets for preservation.  As a collection grows,
different storage scenarios will make sense.  The question for many is
which storage scenario to start with, and at what point to transition
to the next one.  In the end, it's all
cost driven (again, we're assuming that we drive each scenario to
equivalent levels of reliability through redundancy or other means).

I'd be surprised if no one has ever thought of working through these
different scenarios with a detailed cost model in order to determine
storage strategy and policy at some institution.

It's not rocket science, and the answer should be fairly
straightforward, I would think, once you've captured all the costs and
reliability aspects (just because it's straightforward doesn't mean
that it's easy!).

So, does anyone know of an existing cost-based study of long-term data
storage for several different scenarios AND several different size
collections?  If not, I may need to sharpen my pencil...

Eric Jacobs
Principal

The Audio Archive
tel: 408.221.2128
fax: 408.549.9867
mailto:EricJ@xxxxxxxxxxxxxxxxxxx

