[AV Media Matters] Toward a MediaLESS Archive?



Greetings,

On Monday, July 12, 1999, Jim Lindner again raised for discussion the topic
of MediaLESS Warehousing. Of course, storage media will be required:
regardless of where you store your programming, it needs a place to exist.
It will not be perpetually 'on the way to somewhere'. He was just trying to
provoke us into reacting, and I could not resist. Frankly, I would be more
inclined to call it Distributed Dynamic Media Asset Warehousing (DDMAW). At
the 1998 IBC I proposed a similar model: 'that regional or national digital
data warehouses and audio-visual service centres be established, from which
copies can be ordered in any desired quality for on-line or off-line
delivery and as authorized by the copyright holders. These service centre
facilities would house a wide variety of well-maintained and competently
operated playback equipment to serve the video restoration, retrieval and
transfer needs for all magnetic tape formats offered. ...We should consider
the deep archiving service, similar to safety deposit box rental in a bank.
The only responsibility it has in relation to our digital audio-visual
productions, is to make sure that deposited data is readable on contemporary
storage media. If explicitly contracted to do so, it will take on
responsibility for ensuring that deposited digital works can be meaningfully
represented on and accessed from contemporary hardware platforms. (I used a
model of another industry, described by Neil Beagrie and Daniel Greenstein,
1998. Digital Directions: A Strategic Policy Framework for Creating and
Preserving Digital Resources, King's College, London). This could serve as a
model for the television, film and video industries to maintain and
disseminate their programme content in single, serial, compilation or
partial form in digits.'

Realizing that the days of manually migrating and transporting content from
one cassette to another are probably numbered, various SMPTE Work Groups
over the last few years, such as V16.09, of which I am secretary, have done
extensive work identifying the kind of features required of automated
management systems for future archives. In this connection I would recommend
reading the February 1998 SMPTE Journal article by Dr. Juergen Heitmann,
the chairman of V16.09, entitled "User requirements and
Technologies for Automated Storage and retrieval" (pp. 100-105). Various
manufacturers are in fact already building hybrid video and data tape
modules to accommodate collections. There are many other building blocks
that are being defined, such as Selection criteria, Indexing automation,
Universal Numbering of frames and final products, Metadata encoding of
production elements to identify content or essence during shooting, indexing
and searching (video equipment that can accommodate this additional
information still needs to be developed), Browsing requirements etc.

One implication is that we are now also talking about compressing some AV
works in order to transport them through the narrow pipelines available on
most current networks. Although this has content integrity consequences way
into the future (that we do not like to think about), the brighter side is
that migration within the DVCPro acquisition format, unlike other options,
is practically lossless over many generations and can be done at four times
playback speed. Yes, it is narrow miniature tape and hardware, but it saves
space too and appears to be the most economical and sustainable way of
preserving video content we know of, other than to store it in data form.
Besides, if we would like tape with more space, there is the JVC option, at
a cost. Why do I even raise the subject of videotape storage in the context
of this discussion? Well, many owners of programming find it the most
economical, even when hard disks and RAID systems are getting less
expensive. (See the January 1998 SMPTE Journal article by Todd Roth,
entitled "Video Servers 'Shared storage' for Cost Effective Realtime
Access", pp. 54-57.) I do not believe that the bulk of video programming in
Media Asset Warehouses (MAW) will reside on RAID systems, but on tape. Yes,
this makes response time slower, but not everything owned will be accessed
frequently enough to make it cost effective to be fully online. Our
programming within the LAN or WAN will very probably reside on hybrid
automated storage media: in data form (for that which must be accessed
online at an instant) and in DVC video form (for that which is made
accessible in the form it was acquired, enabling editing and browsing, but
which will be near-online or offline). However, since most of our databases
of AV works and stockshots are already accessed via Web pages and
hyperlinked, it is viable to order content from near-online and offline
sources and have it placed online manually, as required, by personnel at the
Media Asset Warehouses (MAW) designated by
copyright owners.
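
The online / near-online / offline split described above can be sketched in
a few lines of Python. This is only an illustration: the class name Asset,
the function choose_tier and both thresholds are hypothetical, and real
cut-off values would have to come from actual request statistics.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
ONLINE_MIN_REQUESTS_PER_MONTH = 20
NEAR_ONLINE_MIN_REQUESTS_PER_MONTH = 1


@dataclass
class Asset:
    title: str
    requests_per_month: float


def choose_tier(asset: Asset) -> str:
    """Pick a storage tier from how often an asset is requested.

    'online'      : data form on disk, available at an instant
    'near-online' : tape held in an automated library
    'offline'     : shelved tape, placed online manually by staff
    """
    if asset.requests_per_month >= ONLINE_MIN_REQUESTS_PER_MONTH:
        return "online"
    if asset.requests_per_month >= NEAR_ONLINE_MIN_REQUESTS_PER_MONTH:
        return "near-online"
    return "offline"
```

A rarely requested work, say Asset("1962 newsreel", 0.1), would land in the
offline tier and be loaded by Warehouse personnel only when ordered.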

What we are talking about is not really that novel, as other industries are
already automating, using, migrating and networking the content of data
warehouses and are data mining for better access to business intelligence
(intellectual property). This of course implies the efficient and
effective re-use of media assets, the reduction of overhead and the
minimizing of manual handling; hence the best thing we can do today is to be
'automation-ready'. In this connection it is worthwhile to consult the Data
Warehousing Website: http://dw-institute.com/.

The clincher for profitable investment in such facilities is the calculation
of a Return On Investment (ROI) of an AV Data Warehousing operation, let
alone a Distributed Dynamic Media Asset Warehouse Network. I am sure that
some serious ROI work will in due time result in some business initiatives
to make it happen. But it is tough to calculate payback when so many AV
media assets are dormant and underutilized, because they are not yet
'automation-ready'.
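
For readers who want the arithmetic spelled out, here is a minimal sketch of
a simple ROI and payback calculation in Python. The function names and the
choice of a plain, undiscounted formula are assumptions made for
illustration; a real business case would use actual cost and revenue figures
and probably discounted cash flows.

```python
def simple_roi(total_gain: float, total_cost: float) -> float:
    """Return ROI as a fraction: (gain - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_gain - total_cost) / total_cost


def payback_years(investment: float, annual_net_revenue: float) -> float:
    """Return the years needed to recover the investment.

    Dormant assets that earn nothing never pay back, so the result
    is infinite when annual_net_revenue is zero or negative.
    """
    if annual_net_revenue <= 0:
        return float("inf")
    return investment / annual_net_revenue
```

The second function shows why dormant assets make payback so hard to
calculate: as the revenue from re-use approaches zero, the payback period
grows without bound.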

People in the Data Warehousing business have already learned some of the
hard lessons that we can avoid:

1. Starting with the wrong sponsorship chain: At the top must be an
executive sponsor with a great deal of money to invest in the effective use
of information.

2. Setting expectations that you cannot meet and frustrating executives at
the moment of truth: Data warehousing projects have at least two phases:
(1) the selling phase, in which you attempt to persuade people that they can
expect to get wonderful access to the right data through simple, graphical
delivery tools, and (2) the struggle to meet the expectations you have
raised in phase one. Data warehouses do not give users all the information
they need. They do not necessarily deliver it in usable form in terms of
content formatting or file format. I subscribe to a number of ftp sites,
which enable us to post proposed standards and recommended practices. We
already find it awfully difficult to find our documents again when the
indexing system just uses surnames without subjects, for instance in a list
of only twenty documents. Then we click on our selection and, using Acrobat,
wait for the document to appear on the screen. If you are unlucky enough to
have clicked on Powerpoint slides, your only five-year-old printer may get
choked up in a jiffy. Waiting for them to print is another test of patience,
as your otherwise pretty decent computer cannot be used for anything else
while it feeds the printer at a snail's pace.

3. Engaging in politically naive behavior (e.g., saying "This will help
managers (read: film makers, television program makers, researchers or
editors) to make better decisions").

4. Loading the Warehouse with Information "Just because it was available":
Archivists still operate with very limited criteria to determine what has
value, what can be re-used and what is not copyrighted, and the culture is
used to manual search and real-time transactions. Also, archives are choking
with tapes that either cannot be played back anymore to be maintained
(equipment is extinct) or that are not labeled or identified with regard to
their importance for posterity or other applications. And access is
conditional on the availability of redundant or near-redundant playback
equipment. So such tapes will hardly qualify for our neat little Distributed
system, because they are bound to live on in anonymity if the conventional
mindset is not re-engineered.

5. Believing that Data warehousing Database design is the same as
Transactional Database design: Data warehousing databases are often
denormalized to make them easier to navigate for infrequent users.
Transactional systems usually contain only the basic data (stockshots);
data warehousing users increasingly expect to find aggregates (finished
productions) and time-series information ready for immediate display.

6. Choosing a Data warehousing Manager who is Technology-oriented rather
than User-oriented: Data warehousing is a service business, not a storage
business.

7. Focusing on traditional internal record-oriented data and ignoring the
potential value of external data and of text, images, and potentially sound
and video: A business plan must show who the users of AV data are, where
they search for and find it, what they pay for it and how long it takes to
get it. The sum total must mean that the new approach is either much faster
or much cheaper than reshooting it.

8. Delivering data with overlapping and confusing definitions: The Achilles
heel of data warehousing is the requirement to gain consensus on data
definitions. Conflicting definitions each have champions, and they are not
easily reconciled. Executives do not give up their definitions without a
fight, and few (AV) data warehousing managers are in a position to bully
executives (read archivists?) into agreement. We all know how frustrating it
can be to wade through 4,566 responses to a web search and try to find the
significant sites quickly. Obviously, searching for specific content on our
now integrated Media Asset Wide Area Network will require more than a
description: at least a thumbnail picture and, if it is worth exploring, a
moving version of it.

9. Believing the Performance, Capacity and Scalability promises: At a recent
conference, CIOs from three companies (a manufacturer, a retailer, and a
service company) described their data warehousing efforts. Although the data
warehouses were very different, all three ran into an identical problem.
Within four months of getting started, each of the CIOs unexpectedly had to
purchase at least one additional processor of a size equal to or larger than
the largest computer they had originally purchased for data warehousing.
They simply ran out of power. A very common capacity problem arises in
networking. One company reported that it sized a network to support an image
warehouse, but discovered that the network was soon overwhelmed. The
surprise was that the images were not at fault. The problem turned out to be
network traffic for data transfer between the end-user application and the
database of indices on the server. The images moved fast, but the process of
finding the right one clogged the network. Network overloads are a very
common surprise in client/server systems in general and in data warehousing
systems in particular. So there is some benefit from the SMPTE/EBU Taskforce
that recommends keeping metadata and the content separated, but linked.

10. Believing that once the data warehouse is up and running, your problems
are finished: Each happy data warehouse user asks for new data and...wants
it immediately. Thus the data warehousing project needs to maintain high
energy over long periods of time. Data warehousing is a journey, not a
destination.

11. The natural progression of information in a data warehouse is to
(1) extract the data from legacy systems, clean it and feed it to the
warehouse, (2) support ad hoc reporting until you learn what people want and
then (3) convert the ad hoc documents into regularly scheduled reports.
Alert systems can be a better approach and they can make a data warehouse
mission-critical. Alert systems monitor the data flowing into the warehouse
and inform all key people with a need to know, as soon as a critical event
takes place.

(For the complete document quoted and massaged above, see
http://dw-institute.com/papers/10mistks.htm)
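
The recommendation mentioned under lesson 9, keeping metadata and content
separated but linked, can be illustrated with a small Python sketch. All
names here (MetadataRecord, Catalogue, essence_location) are hypothetical
and are not taken from the SMPTE/EBU Taskforce documents; the point is only
that searching touches the small metadata records, while the bulky essence
is fetched separately through a link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    asset_id: str          # hypothetical unique identifier for the work
    title: str
    keywords: tuple        # search terms, kept small so lookups stay cheap
    essence_location: str  # the link: a tape barcode or a file path


class Catalogue:
    """Holds metadata only; the essence stays wherever the link points."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.asset_id] = record

    def search(self, keyword):
        """Return records whose keywords include the term (case-insensitive)."""
        term = keyword.lower()
        return [r for r in self._records.values()
                if term in (k.lower() for k in r.keywords)]

    def locate(self, asset_id):
        """Follow the link from a metadata record to its essence."""
        return self._records[asset_id].essence_location
```

Because search() never opens the essence itself, the index traffic and the
content traffic can be sized, and if necessary hosted, independently.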

That is it for my contribution to AV data warehousing for now.

Best regards,

Ed H. Zwaneveld,
Technical Research and Development
National Film Board of Canada
125 Rue Houde, T-3
Saint-Laurent, QC, Canada H4N 2J3
Tel: (514) 283-9143
Fax: (514) 283-0278
