


Re: [ARSCLIST] File naming for digital audio and associated files



Greetings again,

I think Steven Barr's comments and my own earlier ones were both
addressing mainly the issue of assigning "call numbers" to physical
items, in this case tapes. Susan has now clarified that she and her
colleague are dealing with computer-based file naming issues rather
than tape numbering and labeling, so I apologize if I misunderstood
the original question and started drifting off on a tangent. I continue
to think that simplicity is best in any numbering or filing scheme,
whether on the shelf or on a computer, but I'll let someone else take a
stab at offering specific suggestions regarding the digital ID numbers
and keeping the audio files linked to the transcripts.

Susan, thanks for explaining further what you are working on there.
Good luck and best wishes,


Steve G.






Steve Green
Western Folklife Center
Elko, Nevada


******


On Nov 3, 2004, at 1:03 PM, Susan Hooyenga wrote:

Hi Steve,

Thank you for your help, and I'm sorry we weren't clear about what sort
of items we were naming. The tapes already have call numbers assigned by
the Alaska Native Language Center; when they were digitized, they were
assigned ID numbers, which may or may not be kept in the next phase of
the project. Right now Andrea is working on digital transcripts and
time-alignment files, so she's trying to figure out the best scheme for
identifying all of the files.

Thanks again,
Susan Hooyenga

Quoting Steve Green <sgreen@xxxxxxxxxxxxxxxxxxx>:

Some thoughts on file naming for audio materials.

But first, can you clarify whether you are speaking of a numbering
system for physical recordings such as cassettes, or a file naming
system for computer-based digital sound files? It sounds like you are
dealing with a collection of tapes, probably cassettes?

In practice, it is easiest and most efficient to store physically
tangible recordings (we can refer to them as sound "carriers" as
distinguished from what might be called the "sonic content") in simple
numerical sequences. It seems to work best for cassettes, DATs, CDs,
and open reel tapes to have their own separate format-based sequences.
Additions to a collection or series simply get the next highest number
in the sequence and are added at the end. A finding aid (database,
collection inventory, etc) can enumerate the actual carrier numbers
associated with a given collection or series. This is important and
helps alert users when a collection has recordings in several different
formats.

What you want to avoid is having to write complex identifying numbers
on carrier items and their containers. For one thing, most cassettes
and DATs have very little room for writing on shells and j-cards.
Writing on CDs and CD-Rs should be kept to a minimum because of
potential problems associated with writing directly on CD surfaces. In
theory, a simple unique number is all that should be needed to retrieve
and re-file recording carriers. The number is made unique by the
addition of a format code that can be either a prefix or suffix.

For example:

CT001
CT002
CT003, etc.

DT001
DT002
DT003, etc.
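
As an illustration only, the "format code plus next number in the
sequence" idea could be sketched in Python roughly as follows. The
helper names and the three-digit zero padding are assumptions made for
this example; they are not part of any established tool, and the CT and
DT codes are simply the ones used in the list above.

```python
import re
from typing import Iterable

# Assumed ID shape: an uppercase format code followed by digits, e.g. CT001.
CARRIER_ID = re.compile(r"^([A-Z]+)(\d+)$")


def make_carrier_id(format_code: str, number: int, width: int = 3) -> str:
    """Build an ID such as CT001 from a format code and a sequence number."""
    return f"{format_code}{number:0{width}d}"


def next_carrier_id(existing: Iterable[str], format_code: str) -> str:
    """Return the next ID in the sequence kept for one format code."""
    highest = 0
    for carrier_id in existing:
        match = CARRIER_ID.match(carrier_id)
        # Only IDs with the same format code belong to this sequence.
        if match and match.group(1) == format_code:
            highest = max(highest, int(match.group(2)))
    return make_carrier_id(format_code, highest + 1)


shelf = ["CT001", "CT002", "CT003", "DT001", "DT002", "DT003"]
print(next_carrier_id(shelf, "CT"))  # CT004
```

Each format keeps its own sequence, so a new cassette gets the next CT
number regardless of how many DATs are already on the shelf.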

There is a strong temptation to include additional clues to the content
by incorporating initials, dates, locations, project names, and so
forth. But these quickly can become unwieldy when dealing with all the
different format types and dimensions out there. One school of thought
suggests that you want all these indicators labeled on your carrier
items because if somehow the recordings were separated from an index or
database, there are still clues as to what the recording is and how to
link it back to other documentation that may exist. While that is, in
theory, a good argument for using a more complex compound numbering
system, I believe that in a library, archives, or other relatively
stable curatorial situation, the likelihood of recordings becoming
irrevocably separated from the master shelflist is rather slim--
assuming that databases and other support files are backed up and
stored offsite as is the recommended practice.

They say recordings collections are only as accessible as the
documentation that exists about them. I feel that a well-maintained
database can contain a wealth of information about the physical
carriers as well as the provenance and content and can point users and
curators easily and quickly to a unique, specific shelf location, so
that complex, compound numbering systems are unnecessary. Even with all
those extra initials, project code abbreviations, dates, etc. written
on the carrier, someone still has to be able to decode what it all
means, and that still falls back on external documentation that is
maintained in a file somewhere.

As for file naming of digital audio files on a computer down to the
track or segment level: Assuming you start with a physical carrier item
to begin with, and assuming that the carrier has a unique number like
DT541 or CT229, it is then easy enough to add on a track or sequential
item number to the file name, for instance: DT541.01. Again, you need
an external database or other type of computer file in which to
maintain information (metadata) about the individual track or segment.
It seems to me that long, compound file names on a computer simply
increase the likelihood of error in naming or searching for files, and
there may be limitations on the syntax of the filename as dictated by
the operating system.
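
A minimal Python sketch of that arrangement, offered as one possible
approach rather than a recommended implementation: the file name carries
only the carrier number and a track number, and everything else lives in
an external index. The two-digit track padding, the column names, and
the sample rows are all assumptions made up for the example.

```python
import csv
import io


def track_id(carrier_id: str, track: int, width: int = 2) -> str:
    """Build a track-level ID such as DT541.01."""
    return f"{carrier_id}.{track:0{width}d}"


def split_track_id(tid: str):
    """Split DT541.01 back into its carrier ID and track number."""
    carrier_id, _, track = tid.rpartition(".")
    return carrier_id, int(track)


# Hypothetical metadata columns; a real index would hold whatever the
# project needs (speaker, content, dates, and so on).
FIELDS = ["track_id", "speaker", "content"]


def write_index(rows) -> str:
    """Serialize metadata rows as CSV text (stands in for a database)."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def load_index(text: str) -> dict:
    """Read the CSV text back into a dict keyed by track ID."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return {row["track_id"]: row for row in reader}


# Made-up sample rows for illustration.
rows = [
    {"track_id": track_id("DT541", 1), "speaker": "Speaker A", "content": "tools"},
    {"track_id": track_id("DT541", 2), "speaker": "Speaker B", "content": "hunting moose"},
]
index = load_index(write_index(rows))
print(index["DT541.02"]["content"])  # hunting moose
```

The short ID stays stable even if the descriptive metadata is later
corrected, since nothing descriptive is baked into the file name.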

When all is said and done, I have found that, when possible, keeping
things simple in the numbering, naming and labeling department makes
things that much easier to track and manage.

Hope this helps, and naturally I would be interested to hear other
ideas and points of view as well.


Best wishes,



Steve Green
Western Folklife Center
Elko, Nevada

*******


On Nov 3, 2004, at 10:49 AM, Susan Hooyenga wrote:


I'm posting this for a colleague on a linguistic project in Alaska:

------------------
My question concerns file naming conventions. We are working to create
an archive of the Dena'ina (Athabascan) Audio Collection, which contains
a few hundred tapes, and associated transcription and alignment files.
We need a file naming system for individual audio tracks (narratives)
that addresses key identification information without being too unwieldy
or too brief. Some of this information includes:

-the ID number of the original tape in the collection
-the name (or initials) of the speaker
-the content of the narrative (e.g. "tools" or "hunting moose")

Our main problem at the moment is deciding which bits of information
should be part of the file name and which should be included in an index
or some kind of metadata file.

We'd very much appreciate any input or direction to any sources of
information on file naming conventions and audio archiving.
Andrea Berez

------------------

I'll pass the answers on to Andrea - thanks!
Susan Hooyenga
E-MELD

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.










