[padg] RE: RE: file naming conventions?

Hello all,

I had answered Ann Marie privately, thinking that my situation was unusual, but realized on reading the other responses that all digitization projects are different enough that our experience might be useful. I’m doing a project at Sibley Music Library digitizing public-domain (PD) printed scores, so the input and output formats are all similar. We base our file names on the bar code assigned to the copy we digitize, since it is logically and permanently linked both to that specific copy (we retain the original with its bar code attached to the housing) and to the bib record.

For us this works better than the bib record number, since we often have multiple copies on the same bib record, often with different individual characteristics: some are bibliographic variations too slight to warrant separate records, and some are simply chance differences, e.g. how closely the margin was trimmed, or mutilated pages. The bar code is not in fact included in our OPAC, but it is closely tracked in the system and can be used to call up the specific record in both the Circulation and Cataloging staff modules.

Alice Carli

Conservator

Sibley Music Library

From: Walls, David [mailto:david.walls@xxxxxxxx]
Sent: Thursday, July 10, 2008 10:11 AM
To: padg@xxxxxxx; digipres@xxxxxxx
Subject: [padg] RE: file naming conventions?

Ann Marie

If there are guidelines around for file naming conventions, I haven't been able to find anything that offers more than the most basic suggestions.

My advice is not to try to make up a naming convention, but to use the bibliographic record identification number for the specific resource to be scanned, which is found in the MARC record for the title in your OPAC. Most of the materials that we are digitally reformatting are cataloged in our OPAC. Call numbers can change, several books can have the same title, and truncated titles used as file names frequently don't offer much information. The bibliographic record number is unique and does not change, and we use it as the persistent identifier for the files. Also, data from OPACs already have a fairly reliable track record of being migrated into the future.

In our OPAC, the bibliographic record number is a six-digit number. When we send materials to be scanned, we also send the vendor an Excel spreadsheet that includes the bibliographic record number, the title, and other information. The vendor returns the digital files of the materials scanned on a portable USB hard drive. The drive contains a series of folders all named by the six-digit bibliographic id number. Inside each of the folders are the master, derivative, and metadata files. For example, the parent folder would be named 123456 or whatever the actual number is. Inside the parent folder are four other folders named 123456.tif or 123456.jp2 depending on what we've chosen for the master file. The other folders are 123456.pdf and 123456.xml.
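
For anyone who wants to check a returned drive against this layout, the folder names could be generated along the following lines. This is only a rough sketch, not a script we actually use: the function name expected_layout and its master_ext parameter are made up for illustration, and it lists only the subfolders named above (master, PDF, and XML).

```python
import re
from pathlib import PurePosixPath


def expected_layout(bib_id: str, master_ext: str = "tif") -> list[str]:
    """Return the subfolder paths expected for one bibliographic record.

    Hypothetical helper: bib_id is the six-digit bibliographic record
    number, master_ext is the master format chosen ("tif" or "jp2").
    """
    # The record number described above is exactly six digits.
    if not re.fullmatch(r"[0-9]{6}", bib_id):
        raise ValueError(f"not a six-digit record number: {bib_id!r}")
    if master_ext not in ("tif", "jp2"):
        raise ValueError(f"unexpected master format: {master_ext!r}")

    parent = PurePosixPath(bib_id)  # parent folder, e.g. 123456
    # Only the subfolders named in the message: master, pdf, xml.
    return [str(parent / f"{bib_id}.{ext}") for ext in (master_ext, "pdf", "xml")]


if __name__ == "__main__":
    for path in expected_layout("123456", "jp2"):
        print(path)
```

Comparing that list with what is actually on the drive would show quickly whether a folder is missing or misnamed.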

Please let me know if you have other questions.

David Walls

Preservation Librarian, Yale University Library.

Head, Reformatting and Media Preservation.

From: Ann Marie Willer [mailto:amwillerala@xxxxxxxxx]
Sent: Wednesday, July 09, 2008 5:22 PM
To: digipres@xxxxxxx; padg@xxxxxxx
Subject: [padg] file naming conventions?

Colleagues,
I am involved in discussions about file naming conventions for the products of digitization projects. Could you (1) recommend guidelines recently published or posted and/or (2) share what you do at your institution?

If I've missed a previous discussion, please let me know, and I will consult the archives as well.

Thanks,
Ann Marie

Ann Marie Willer
Preservation Services Librarian
Massachusetts Institute of Technology
77 Massachusetts Ave.
Building 14-0513
Cambridge, MA 02139
617-253-5692 phone

Send ALA business to: AMWillerALA@xxxxxxxxx
