
Re: [ARSCLIST] MIC and cataloguing



From: Patent Tactics, George Brock-Nannestad

Jane Johnson gave us a magnificent overview of the universal and very flexible
database structure that is proposed for implementation.

Here I have picked out a very important paragraph, important because it
contains a statement that spells trouble unless somebody standardizes the
terminology.

----- the paragraph is:

The only criteria for Union Catalog participation are 1) machine-readable
records and 2) an entry in the MIC Archive Directory, because the Directory
and the Union Catalog databases are linked.  (A key innovation of MIC is to
integrate the Archive Directory and the Union Catalog so that information
about obtaining an organization's resources is displayed right alongside the
bibliographic record supplied by that organization.) The organization (or
individual) submits an application, sample records and field list, then MIC
populates an online form with this data so that the organization can name MIC
data element equivalents for its own fields.  This utility will allow small
under-supported archives--and individuals--with very little metadata
expertise to share their records with a much broader audience, while enabling
large archives to integrate multiple metadata schema into a single system.

----- and the statement that I shall discuss briefly is:

The organization (or
individual) submits an application, sample records and field list, then MIC
populates an online form with this data so that the organization can name MIC
data element equivalents for its own fields.

----- The key term here is "MIC data element equivalents". What if there is
no equivalent, or if there is only a partial overlap between the type of
content in the organization's hierarchy and definitions and the definition of
a particular MIC data element?
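
To make the question concrete, here is a minimal sketch of how a mapping
could record the quality of each equivalence instead of assuming that one
always exists. Everything in it is my own assumption for illustration: the
class names, the three match levels, and the local and MIC field names are
hypothetical and are not taken from the MIC proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Match(Enum):
    """How well a local field corresponds to a target data element."""
    EXACT = "exact"
    PARTIAL = "partial"   # definitions overlap only in part
    NONE = "none"         # no equivalent element exists


@dataclass(frozen=True)
class FieldMapping:
    local_field: str             # field name in the contributor's database
    mic_element: Optional[str]   # hypothetical target element; None if no equivalent
    match: Match
    note: str = ""               # free-text explanation of any mismatch


def needs_review(mappings: List[FieldMapping]) -> List[FieldMapping]:
    """Return every mapping that is not a clean one-to-one equivalence."""
    return [m for m in mappings if m.match is not Match.EXACT]


# Illustrative data only; none of these names come from a real schema.
mappings = [
    FieldMapping("title", "Title", Match.EXACT),
    FieldMapping("carrier", "Format", Match.PARTIAL,
                 "local field also records physical condition"),
    FieldMapping("matrix_no", None, Match.NONE),
]

flagged = needs_review(mappings)
```

With a record of this kind, the partial and missing equivalences are at least
visible to whoever maintains the union catalog, rather than being silently
forced into the nearest element.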

If a search is made, then all of the retrieved items that upon scrutiny turn
out not to be as expected must be regarded as false drops, or noise. So the
metadata (comment field) concerning a descriptor must include an
authorization code identifying the authority that assigned it. That way a
searcher can limit the search to an authority that he trusts. Likewise, if
the searcher has had good experiences searching a particular individual's
database, he should be able to limit the search to that.
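
A minimal sketch of such a limited search follows, assuming that every
descriptor carries the code of the authority that assigned it. The record
layout, the function name and the codes "AUTH-A" and "AUTH-B" are
hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Descriptor:
    term: str
    authority: str   # hypothetical code for whoever assigned the term


@dataclass
class Record:
    record_id: str
    descriptors: List[Descriptor]


def search(records: List[Record], term: str,
           trusted: Optional[Set[str]] = None) -> List[str]:
    """Return ids of records carrying `term` (case-insensitive).

    If `trusted` is given, only descriptors assigned by one of those
    authorities count as a match.
    """
    wanted = term.lower()
    hits = []
    for record in records:
        for d in record.descriptors:
            if d.term.lower() == wanted and (trusted is None
                                             or d.authority in trusted):
                hits.append(record.record_id)
                break   # one matching descriptor per record is enough
    return hits


# Illustrative data only.
records = [
    Record("A1", [Descriptor("opera", "AUTH-A")]),
    Record("B7", [Descriptor("Opera", "AUTH-B")]),
]

all_hits = search(records, "opera")
trusted_hits = search(records, "opera", trusted={"AUTH-A"})
```

The unrestricted search returns both records; restricting it to "AUTH-A"
drops the record whose descriptor came from the other source.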

All of this is really best done by means of a thesaurus structure (controlled
vocabulary). Time invested in creating and maintaining a full set of
authorized descriptors is to the good of all, though obviously at the cost of
those who do the work. In a previous posting I lamented that with the
appearance of fast hard drives the perceived need for thesauri disappeared,
sequential sorting being resorted to instead. But really, a thesaurus is the
only way of mastering a field and obtaining precision in retrieval. Just
think of the fact that a word misspelt in the wrong place will not be
retrieved by searching for its correct form. Certain misspellings may
sometimes still be caught by truncation. Here again, terms from a controlled
vocabulary would increase precision.
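
The difference between truncation and a controlled vocabulary can be shown
in a few lines. This is only a sketch under my own assumptions: the trailing
"*" truncation syntax, the function names and the tiny thesaurus table are
invented for the example and do not describe any particular system.

```python
from typing import Optional


def truncation_match(query: str, term: str) -> bool:
    """Match `term` against `query`; a trailing '*' means right-truncation."""
    q, t = query.lower(), term.lower()
    if q.endswith("*"):
        return t.startswith(q[:-1])
    return q == t


# Hypothetical thesaurus: each known variant or misspelling points to the
# single authorized descriptor.
THESAURUS = {
    "cataloguing": "cataloguing",
    "cataloging": "cataloguing",
    "catalogueing": "cataloguing",
    "katalogueing": "cataloguing",
}


def authorized_form(term: str) -> Optional[str]:
    """Return the authorized descriptor for `term`, or None if unknown."""
    return THESAURUS.get(term.lower())


# A misspelling after the truncation point is still caught ...
late_error_found = truncation_match("catalog*", "catalogueing")
# ... but one before the truncation point is not.
early_error_found = truncation_match("catalog*", "katalogueing")
# The thesaurus resolves both to the same authorized descriptor.
resolved = authorized_form("Katalogueing")
```

Truncation only helps when the error lies after the truncation point; the
thesaurus catches the early error as well, but only because somebody has
entered that variant, which is exactly the maintenance work mentioned above.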

Kind regards,


George

