Report of the
Biological Collections Data Standards Workshop
August 18-24, 1992
TABLE OF CONTENTS

I. Introduction
II. An Information Model for Biological Collections
    A. A Context for Information Modeling
    B. Components of the Information Model
    C. List of Entities
    D. Entity Descriptions and Their Attributes
    E. List of Relationships
    F. Relationship Descriptions
APPENDIX A Information Model - Definitions and Conventions
APPENDIX B Data Standards Workshop Participants

I. Introduction

The Association of Systematics Collections Committee on Computerization and Networking met at Cornell University in Ithaca, New York, from August 18-24, 1992. Co-chairs Julian Humphries and Janet Gomon organized the meeting with assistance from Elaine Hoagland of ASC. The focus of the workshop was to initiate the process of establishing data standards for biological collection information.

The opportunity exists for natural history museums to be at the forefront of digital access to information about specimens, taxa, and organismal biology. The list of potential users of natural history collection information is very long: conservation biologists, molecular geneticists, ecologists, functional morphologists, law enforcement officials, and others. But the traditional means of access, personal visits to collections and long, diligent searches of paper records, guarantee that most of these people will either find other means of acquiring the information they need or act with insufficient information. If these researchers could have simple, rapid access to the huge amount of knowledge that our collections represent, natural history collection institutions could be at the vanguard of information providers.
In order to support a broad audience accessing our collections, as well as ensure efficient access for traditional users, we will need to establish guidelines and rules for recording information about collections. Workshop participants agreed that previous standardization efforts had focused primarily on individual elements of collection information and that no interdisciplinary model of this information existed. It was decided that for a cross-disciplinary effort to succeed, a high-level description of biological collections was required. Workshop participants undertook this effort by producing a draft information model for biological collections, described in this report. The draft model is being circulated to scientific societies and made available via Taxacom. Comments are welcome.

II. An Information Model for Biological Collections

A. A Context for Information Modeling

The purpose of most databases is to describe things and processes in the real world. Descriptions of the real world are maintained in data structures reflecting the categories of information that are of interest to users. An information model is a tool for designing databases and represents a highly structured description of information in the real world. It contains specifications for both the data structures and the rules that must be followed to keep the data internally consistent within a database. It allows problem domain experts to describe the domain without becoming programmers. Because information models describe the real world, they are independent of the hardware or software tools used in a particular database implementation. They evolve or change only as the problem domain changes or the needs of users change.
Information modeling has proven effective in the development of numerous business and scientific databases, and information management professionals have come to regard modeling as the basis for designing correct, consistent, sharable, and flexible databases (Fleming & von Halle, 1989). Models are particularly useful when the problem domain is large and when the desired database is intended to serve a diverse community of users. In such cases, the database design almost certainly will require input from several experts. One of the most important reasons for building an information model is that it allows multiple experts to contribute to the problem description and allows the description to be validated or revised as necessary by additional reviewers. Once completed, the model also serves as a communication tool between domain experts and database programmers.

B. Components of an Information Model

Information models typically have two components: a highly structured textual description and one or more illustrations that summarize the model. The illustrations are called entity-relationship diagrams (ERDs) and depict the principal information entities of the problem domain, as well as the interrelationships among them. The structural components of an information model, including the diagramming conventions, are defined and explained in Appendix A.

Figure 1 represents a "first cut" at a high-level information model for biological collection catalogs. (It spans four pages, so we encourage readers to remove these pages and paste them together.) The textual description of the model follows the summary illustration and contains an alphabetical list of entities, a definition and description of each entity (including example data elements in some cases), an alphabetical list of relationships, and the relationship descriptions.

C.
List of Entities (Supertypes and Subtypes in Alphabetical Order):

AGENT
ASSOCIATED-COLLECTING-EVENT
CITATION
COLLECTING-EVENT
COLLECTING-EVENT-CITATION
COLLECTING-METHOD
COLLECTING-UNIT
COLLECTING-UNIT-CITATION
COLLECTION
COLLECTOR
DERIVED-OBJECT
DERIVED-OBJECT-TYPE
DETERMINATION
DETERMINER
ELEVATION (delete?)
ESTUARINE-HABITAT-DESCRIPTION
FRESHWATER-HABITAT-DESCRIPTION
GAZETTEER-CITATION
GEOLOGICAL-TIME-SCALE
GEOMETRIC-LOCALITY
HABITAT-DESCRIPTION
HABITAT-TYPE
LINE
LOCALITY
LOCALITY-CITATION
LOT
MARINE-HABITAT-DESCRIPTION
NAME
NAME-USE
NODE
NODE-USE
ORGANIZATION
PALEO-COLLECTING-EVENT
PERSON
PLATFORM
POINT
POLYGON
PREPARATION-TECHNIQUE
PREPARATOR
REAL-WORLD
RECENT-COLLECTING-EVENT
REFERENCE
SPECIMEN
SPECIMEN-ASSOCIATION
SPECIMEN-ASSOCIATION-TYPE
SPECIMEN-COMPONENT
SPECIMEN-COMPONENT-TYPE
STORAGE-LOCATION
STORAGE-MEDIUM
STORAGE-REGIME
TERRESTRIAL-HABITAT-DESCRIPTION
TIME
TRANSACTION
TRANSACTOR
UNSORTED-LOT

D. Entity Descriptions

Entity Name: AGENT (supertype)
Definition: A PERSON, ORGANIZATION, or PLATFORM that performs actions on various biological and collection entities.
Subtypes:
    PERSON
    ORGANIZATION
    PLATFORM
Primary Key: AGENT-ID
Foreign Keys:
    Target Entity: none
    Data Elements:
Example Data Elements:
    AGENT-ID
    AGENT-TYPE-CD
Remarks: The subtype entities are collected into the AGENT supertype because more than one of the AGENT subtypes may play the same role in relationships with other entities. For example, any combination of PERSON, ORGANIZATION, and PLATFORM may serve as a COLLECTOR in a COLLECTING-EVENT.

Entity Name: ASSOCIATED-COLLECTING-EVENT
Definition: Establishes and describes a (recursive) relationship between two COLLECTING-EVENTs.
Primary Key: ASSOCIATED-COLLECTING-EVENT-ID, COLLECTING-EVENT-ID
Foreign Keys:
    Target Entity: COLLECTING-EVENT
    Data Elements: COLLECTING-EVENT-ID
    Target Entity: COLLECTING-EVENT
    Data Elements: ASSOCIATED-COLLECTING-EVENT-ID
Example Data Elements:
    ASSOCIATED-COLLECTING-EVENT-ID
    COLLECTING-EVENT-ID
    ASSOCIATION-NAM
        An arbitrary name given to an association of COLLECTING-EVENTs. Examples: Albatross; International Indian Ocean Expedition; Bill & Ted's Excellent Adventure.
    ASSOCIATION-DESCRIPTION-TXT
        Describes how COLLECTING-EVENTs are related. Examples: Same expedition; same cruise; same locality; replicate sampling protocol.

Entity Name: CITATION (supertype)
Definition:
Subtypes:
    SPECIMEN-CITATION
    LOCALITY-CITATION
    GAZETTEER-CITATION
    COLLECTING-EVENT-CITATION
    ETC.
Primary Key: CITATION-ID
Foreign Keys:
    Target Entity: REFERENCE
    Data Elements: REFERENCE-ID
Example Data Elements:
    CITATION-ID
    CITATION-TYPE-CD
    REFERENCE-ID

Entity Name: COLLECTING-EVENT (supertype)
Definition: The act of collecting zero or more COLLECTING-UNITs at a particular LOCALITY and TIME.
Subtypes:
    PALEO-COLLECTING-EVENT
    RECENT-COLLECTING-EVENT
Primary Key: COLLECTING-EVENT-ID
    A unique tag (surrogate key) to allow other entities to connect to COLLECTING-EVENT.
Foreign Keys:
    Target Entity: LOCALITY
    Data Elements: LOCALITY-ID
Example Data Elements:
    COLLECTING-EVENT-ID
        A unique tag to allow other entities to connect to COLLECTING-EVENT.
    COLLECTING-EVENT-TYPE-CD
        A classification attribute, indicating the type (kind) of COLLECTING-EVENT.
    STATED-TIME-TXT
        Specification of points and/or intervals of time in absolute or indefinite units, or relative to each other. Examples: evening; late 1980s; spring; March or June; 17:56:01, 12 JUN 1992; three hours after second dredge haul.
    STATED-LOCALITY-TXT
        Original statement (literal quotation) of the location of the COLLECTING-EVENT.
    COLLECTING-EVENT-COMMENTS-TXT
        (Unstructured text.)
        Example: Nothing was collected at this station; dredge not adequately cleaned between hauls.

Entity Name: COLLECTING-EVENT-CITATION
Definition: A subtype of CITATION, which associates a REFERENCE work with a COLLECTING-EVENT.
Primary Key: CITATION-ID
Foreign Keys:
    Target Entity: COLLECTING-EVENT
    Data Elements: COLLECTING-EVENT-ID
Example Data Elements:

Entity Name: COLLECTING-METHOD
Description: A description of the technique(s), equipment, and/or process(es) by which COLLECTING-UNITs are collected.
Primary Key: COLLECTING-METHOD-ID
Foreign Keys: none
Example Data Elements:
    COLLECTING-METHOD-ID
    COLLECTING-METHOD-DESCRIPTION-TXT

Entity Name: COLLECTING-UNIT (supertype)
Definition: An operational sample, typically, but not necessarily, a SPECIMEN, LOT, or DERIVED-OBJECT from a single COLLECTING-EVENT. A COLLECTING-UNIT may be an UNSORTED-LOT, LOT, SPECIMEN, SPECIMEN-COMPONENT, or DERIVED-OBJECT.
Subtypes:
    UNSORTED-LOT
    LOT
    SPECIMEN
    SPECIMEN-COMPONENT
    DERIVED-OBJECT
Primary Key: COLLECTING-UNIT-ID
Foreign Keys:
    Target Entity: COLLECTING-EVENT
    Data Elements: COLLECTING-EVENT-ID
    Target Entity: HABITAT-DESCRIPTION
    Data Elements: HABITAT-DESCRIPTION-ID
Example Data Elements:
    COLLECTING-UNIT-ID
    COLLECTING-UNIT-TYPE-CD
    NUMBER-OF-ITEMS-CNT
    HABITAT-DESCRIPTION-ID (or SPECIMEN-RELATED-HABITAT-DESCRIPTION-ID if a separate entity is used for finer-scale descriptions)

Entity Name: COLLECTION
Definition: An assemblage of biological specimens maintained by an educational or research institution to be used as a research resource in biological systematics and/or ecology.
Primary Key: COLLECTION-ID
Foreign Keys: ORGANIZATION-ID
Example Data Elements:
    COLLECTION-ID
    COLLECTION-NAM
    ORGANIZATION-ID

Entity Name: COLLECTOR
Definition: A person, platform, or organization (AGENT) that collects biological collecting units.
This isn't a real entity; it duplicates the AGENT entity.
Primary Key: COLLECTOR-ID
Foreign Keys: none
Example Data Elements:

Entity Name: DERIVED-OBJECT
Definition: A COLLECTING-UNIT of one or more observations, images, or representations of UNSORTED-LOTs, LOTs, SPECIMENs, ASSOCIATED SPECIMENs, or SPECIMEN-COMPONENTs.
Primary Key: DERIVED-OBJECT-ID (=COLLECTING-UNIT-ID)
Foreign Keys:
    Target Entity: COLLECTING-UNIT
    Data Elements: ORIGINAL-COLLECTING-UNIT-ID
    Target Entity: DERIVED-OBJECT-TYPE
    Data Elements: DERIVED-OBJECT-TYPE-ID
Example Data Elements:
    DERIVED-OBJECT-ID (=COLLECTING-UNIT-ID)
    DERIVED-OBJECT-TYPE-ID
    ORIGINAL-COLLECTING-UNIT-ID (=COLLECTING-UNIT-ID)

Entity Name: DERIVED-OBJECT-TYPE
Definition: The class of DERIVED-OBJECTs obtained from a COLLECTING-UNIT.
Primary Key: DERIVED-OBJECT-TYPE-ID
Foreign Keys:
    Target Entity:
    Data Elements:
Example Data Elements:
    DERIVED-OBJECT-TYPE-ID
    DERIVED-OBJECT-TYPE-NAME
    DERIVED-OBJECT-TYPE-DESCRIPTION-TXT

Entity Name: DETERMINATION
Definition: An association of a COLLECTING-UNIT with a NAME by an authority at a particular time.
Primary Key: COLLECTING-UNIT-ID, NAME-ID, DETERMINATION-DAT
Foreign Keys:
    Target Entity: COLLECTING-UNIT
    Data Elements: COLLECTING-UNIT-ID
    Target Entity: NAME
    Data Elements: NAME-ID
Example Data Elements:
    COLLECTING-UNIT-ID
    NAME-ID
    DETERMINER-ID (=AGENT-ID)

Entity Name: DETERMINER
Definition: An authority (person) that makes the association between a taxon name and a COLLECTING-UNIT.
Primary Key: DETERMINER-ID (=AGENT-ID)
Foreign Keys: none
Example Data Elements: Same as PERSON

Entity Name: ELEVATION
Definition: Is this a real entity, or is it the Z coordinate of any point contained in one of the three subtypes of geometric locality (point, line, or polygon)? Expressed as deviation from sea level in meters; different from STATED-ELEVATION, which may be in other units or expressed as a range.
Primary Key:
Foreign Keys:
Example Data Elements:

Entity Name: ESTUARINE-HABITAT-DESCRIPTION
Definition:
Primary Key: HABITAT-DESCRIPTION-ID
Foreign Keys: none
Example Data Elements:

Entity Name: FRESHWATER-HABITAT-DESCRIPTION
Definition:
Primary Key: HABITAT-DESCRIPTION-ID
Foreign Keys: none
Example Data Elements:

Entity Name: GAZETTEER-CITATION
Definition: A named place.
Primary Key: GAZETTEER-CITATION-ID
Foreign Keys:
    Target Entity: GAZETTEER-CITATION
    Data Elements: CONTAINING-GAZETTEER-CITATION-ID
Example Data Elements:
    GAZETTEER-CITATION-ID
    CONTAINING-GAZETTEER-CITATION-ID
    PLACE-NAM
    PLACE-TYPE-CD

Entity Name: GEOLOGICAL-TIME-SCALE
Definition:
Primary Key:
Foreign Keys:
    Target Entity:
    Data Elements:
Example Data Elements:

Entity Name: GEOMETRIC-LOCALITY
Definition: A geographical location defined (in a standard coordinate system) by a point, line, or polygon.
Primary Key: LOCALITY-ID
Foreign Keys: none
Example Data Elements:

Entity Name: HABITAT-DESCRIPTION
Definition: A description of the physical and biotic environment at the time and place of a COLLECTING-EVENT.
Subtypes:
    TERRESTRIAL-HABITAT-DESCRIPTION
    FRESHWATER-HABITAT-DESCRIPTION
    MARINE-HABITAT-DESCRIPTION
    ESTUARINE-HABITAT-DESCRIPTION
    ETC.
Primary Key: HABITAT-DESCRIPTION-ID
Foreign Keys: none
Example Data Element Groups:
    Geomorphic Features
    Physical/Chemical Measurements
    Sampling Scale
    Meteorological Data
    Life Zone
    Vegetation Type
    Soil Type

Entity Name: HABITAT-TYPE
Definition: An
Primary Key: HABITAT-TYPE-ID
Foreign Keys:
Example Data Elements:
    HABITAT-TYPE-ID
    HABITAT-TYPE-NAM
    HABITAT-TYPE-DEFINITION-TXT

Entity Name: LINE
Definition: A complex data type including latitude, longitude, and direction. (Are we really talking about a chain of points, each with a three-dimensional location? We might consult David Mark about this.)
Primary Key: LOCALITY-ID
Foreign Keys: none
Example Data Elements: (as represented in a GIS)

Entity Name: LOCALITY
Definition: A mappable geographical location.
Primary Key: LOCALITY-ID
Foreign Keys:
Example Data Elements:
    LOCALITY-ID
    LOCALITY-NAME

Entity Name: LOCALITY-CITATION
Definition: An association between a reference and a locality.
Primary Key: CITATION-ID
Foreign Keys:
    Target Entity: LOCALITY
    Data Elements: LOCALITY-ID
Example Data Elements:

Entity Name: LOT
Definition: A COLLECTING-UNIT of one or more individuals of the same taxon from the same COLLECTING-EVENT.
Primary Key: LOT-ID (=COLLECTING-UNIT-ID)
Foreign Keys:
    Target Entity: UNSORTED-LOT
    Data Elements: UNSORTED-LOT-ID (=COLLECTING-UNIT-ID)
Example Data Elements:
    COLLECTING-UNIT-ID
    AGE-RANGE
    SIZE-RANGE
    DISTRIBUTION-OF-DUPLICATES
    STAGES
    SEXES

Entity Name: MARINE-HABITAT-DESCRIPTION
Definition:
Primary Key: MARINE-HABITAT-DESCRIPTION-ID (=HABITAT-DESCRIPTION-ID)
Foreign Keys: none
Example Data Elements:

Entity Name: NAME
Definition: Literature citation for the original source of a name.
Primary Key: NAME-ID
Foreign Keys:
    Target Entity: REFERENCE
    Data Elements: REFERENCE-ID
Example Data Elements:
    NAME-ID
    NAME-NAM
    AUTHOR-NAM
    PAGE-CNT
    REFERENCE-ID
    (Original-Rank) not needed if we assume every name is represented by a name use (classification).

Entity Name: NAME-USE
Definition: The application of a NAME (including a synonym) to a NODE-USE.
Primary Key: REFERENCE-ID, NODE-ID, NAME-ID
Foreign Keys:
    Target Entity:
    Data Elements:
Example Data Elements:
    RANK-CD
    NAME-STATUS-CD

Entity Name: NODE
Definition: A vertex on a directed acyclic or cyclic graph; a tip or junction in a classification hierarchy or overlapping classification hierarchies.
Primary Key: NODE-ID
Foreign Keys: none
Example Data Elements:
    NODE-ID

Entity Name: NODE-USE
Definition: Any particular instance (a published reference) of a NODE is a placeholder for the name of a taxon in a classification hierarchy.
Primary Key: REFERENCE-ID, NODE-ID
Foreign Keys:
    Target Entity: NODE-USE
    Data Elements: PARENT-NODE-REFERENCE-ID, PARENT-NODE-ID
Example Data Elements:
    REFERENCE-ID
    NODE-ID
    PARENT-NODE-REFERENCE-ID
    PARENT-NODE-ID

Entity Name: ORGANIZATION
Definition:
Primary Key: ORGANIZATION-ID (=AGENT-ID)
Foreign Keys: none
Example Data Elements:
    ORGANIZATION-ID (=AGENT-ID)
    ACRONYM-CD
    DEPARTMENT-NAM
    INSTITUTION-NAM

Entity Name: PALEO-COLLECTING-EVENT
Definition:
Primary Key: PALEO-COLLECTING-EVENT-ID (=COLLECTING-EVENT-ID)
Foreign Keys:
    Target Entity: LOCALITY
    Data Elements: LOCALITY-ID
Example Data Elements:
    Bed
    Stated-Age
    Dating-Method
    Lithology

Entity Name: PERSON
Definition:
Primary Key: PERSON-ID (=AGENT-ID)
Foreign Keys: none
Example Data Elements:
    PERSON-ID (=AGENT-ID)
    LAST-NAM
    FIRST-NAM
    TITLE-TXT

Entity Name: PLATFORM
Definition:
Primary Key: PLATFORM-ID (=AGENT-ID)
Foreign Keys:
    Target Entity: (We may wish to record a relationship between PLATFORM and ORGANIZATION.)
    Data Elements:
Example Data Elements:
    PLATFORM-ID (=AGENT-ID)
    PLATFORM-NAM

Entity Name: POINT
Definition: Latitude, longitude, elevation.
Primary Key: LOCALITY-ID
Foreign Keys: none
Example Data Elements:
    LATITUDE-DEGREES
    LATITUDE-MINUTES
    LATITUDE-SECONDS
    LATITUDE-DIRECTION
    LONGITUDE-DEGREES
    LONGITUDE-MINUTES
    LONGITUDE-SECONDS
    LONGITUDE-DIRECTION
    ACCURACY
    ELEVATION

Entity Name: POLYGON
Definition: A complex data type with an array of latitude and longitude data.
Primary Key: LOCALITY-ID
Foreign Keys: none
Example Data Elements:
    Accuracy

Entity Name: PREPARATION-TECHNIQUE
Definition: An action taken to develop or preserve a specimen that departs from, or goes beyond, the standard processing of a specimen. A preparation technique may produce a derived object.
Primary Key: COLLECTING-UNIT-ID, PREPARATION-TECHNIQUE-ID
Foreign Keys:
    Target Entity: COLLECTING-UNIT
    Data Elements: COLLECTING-UNIT-ID
    Target Entity: PREPARATOR (=AGENT)
    Data Elements: PREPARATOR-ID (=AGENT-ID)
Example Data Elements:
    COLLECTING-UNIT-ID
    PREPARATION-TECHNIQUE-ID
    PREPARATOR-ID

Entity Name: PREPARATOR
Definition: An agent that performs a preparation technique.
Primary Key: PREPARATOR-ID (=AGENT-ID)
Foreign Keys: none
Example Data Elements:
    PREPARATOR-ID (=AGENT-ID)

Entity Name: REAL-WORLD
Definition: Biological entities subject to a COLLECTING-EVENT, DETERMINATION, or other actions.
Primary Key: REAL-WORLD-ID
Foreign Keys:
    Target Entity:
    Data Elements:
Example Data Elements:

Entity Name: RECENT-COLLECTING-EVENT
Definition:
Primary Key: RECENT-COLLECTING-EVENT-ID (=COLLECTING-EVENT-ID)
Foreign Keys:
    Target Entity: HABITAT-DESCRIPTION
    Data Elements: RECENT-HABITAT-DESCRIPTION-ID (=HABITAT-DESCRIPTION-ID)
Example Data Elements:

Entity Name: REFERENCE
Definition: A published or unpublished work that contains information about a biological collection entity. Examples: an article, book, occasional report, field notes, map, catalog, etc.
Primary Key: REFERENCE-ID
Foreign Keys: none
Example Data Elements:
    REFERENCE-KIND-CODE
    REFERENCE-DESCRIPTION-TEXT
    REFERENCE-AUTHOR-NAME
    REFERENCE-PUBLISHED-DATE
    REFERENCE-TITLE-TEXT
    REFERENCE-JOURNAL-NAME
    REFERENCE-VOLUME-IDENTIFIER
    REFERENCE-ISSUE-IDENTIFIER
    REFERENCE-PAGES-IDENTIFIER
    REFERENCE-PUBLISHER-NAME
    REFERENCE-PUBLISHER-CITY-NAME

Entity Name: SPECIMEN
Definition: A COLLECTING-UNIT of one or more individuals or parts of individuals from a single COLLECTING-EVENT.
Primary Key: SPECIMEN-ID (=COLLECTING-UNIT-ID)
Foreign Keys:
    Target Entity: LOT
    Data Elements: LOT-ID (=COLLECTING-UNIT-ID)
Example Data Elements:
    SPECIMEN-ID
    SPECIMEN-SEX-CD
    SPECIMEN-PHENOLOGY-CD
    SPECIMEN-LIFE-STAGE-CD
    SPECIMEN-STANDARD-LENGTH-DMSN
    SPECIMEN-AGE-QTY

Entity Name: SPECIMEN-ASSOCIATION
Definition: An association of COLLECTING-UNITs (SPECIMENs, LOTs, or UNSORTED-LOTs).
Primary Key: COLLECTING-UNIT-ID, ASSOCIATED-COLLECTING-UNIT-ID
Foreign Keys:
    Target Entity: COLLECTING-UNIT
    Data Elements: COLLECTING-UNIT-ID
    Target Entity: ASSOCIATED-COLLECTING-UNIT (=COLLECTING-UNIT)
    Data Elements: ASSOCIATED-COLLECTING-UNIT-ID (=COLLECTING-UNIT-ID)
    Target Entity: SPECIMEN-ASSOCIATION-TYPE
    Data Elements: SPECIMEN-ASSOCIATION-TYPE-ID
Example Data Elements:

Entity Name: SPECIMEN-ASSOCIATION-TYPE
Definition: A kind of SPECIMEN-ASSOCIATION.
Primary Key: SPECIMEN-ASSOCIATION-TYPE-ID
Foreign Keys: none
Example Data Elements:

Entity Name: SPECIMEN-CITATION
Definition: Associates a specimen and a reference work, and describes the relationship between them.
Subtypes:
    TYPE-SPECIMEN-CITATION
Primary Key: CITATION-ID
Foreign Keys:
    Target Entity: SPECIMEN
    Data Elements: SPECIMEN-ID
Example Data Elements:
    SPECIMEN-CITATION-KIND-CODE
    SPECIMEN-CITATION-PAGE-ID
    SPECIMEN-CITATION-PLATE-ID
    SPECIMEN-CITATION-FIGURE-ID
    SPECIMEN-CITATION-REMARKS-TXT

Entity Name: SPECIMEN-COMPONENT
Definition: A COLLECTING-UNIT of individual organisms or parts of individual organisms from a single COLLECTING-EVENT.
Primary Key: SPECIMEN-COMPONENT-ID (=COLLECTING-UNIT-ID)
Foreign Keys:
    Target Entity: SPECIMEN (=COLLECTING-UNIT)
    Data Elements: SPECIMEN-ID (=COLLECTING-UNIT-ID)
    Target Entity: SPECIMEN-COMPONENT-TYPE
    Data Elements: SPECIMEN-COMPONENT-TYPE-ID
Example Data Elements:
    SPECIMEN-COMPONENT-ID (=COLLECTING-UNIT-ID)
    SPECIMEN-ID (=COLLECTING-UNIT-ID)
    SPECIMEN-COMPONENT-TYPE-ID

Entity Name: SPECIMEN-COMPONENT-TYPE
Definition: A class or kind of SPECIMEN-COMPONENT.
Primary Key: SPECIMEN-COMPONENT-TYPE-ID
Foreign Keys: none
Example Data Elements:
    SPECIMEN-COMPONENT-TYPE-ID
    SPECIMEN-COMPONENT-TYPE-NAM

Entity Name: STORAGE-LOCATION
Definition: The physical location of a COLLECTING-UNIT in relation to a COLLECTION.
Primary Key: STORAGE-LOCATION-ID
Foreign Keys: none
Example Data Elements:
    STORAGE-LOCATION-ID
    STORAGE-LOCATION-DESCRIPTION (a local issue)

Entity Name: STORAGE-MEDIUM
Definition: The physical medium, container, or mount used for the STORAGE-REGIME of COLLECTING-UNITs.
Primary Key: STORAGE-MEDIUM-ID
Foreign Keys: none
Example Data Elements:
    STORAGE-MEDIUM-ID
    STORAGE-MEDIUM-CD (e.g., sheet, packet, box, jar, case, shelf, vial)

Entity Name: STORAGE-REGIME
Definition: The physical location, kind of storage, and availability of a COLLECTING-UNIT in relation to a COLLECTION.
Primary Key: STORAGE-REGIME-ID
Foreign Keys:
    Target Entity: STORAGE-LOCATION
    Data Elements: STORAGE-LOCATION-ID
    Target Entity: STORAGE-MEDIUM
    Data Elements: STORAGE-MEDIUM-ID
Example Data Elements:
    STORAGE-REGIME-ID
    STORAGE-LOCATION-ID
    STORAGE-MEDIUM-ID
    START-DAT
    END-DAT
    AUTHORITY-NAM
    COMMENTS-TXT

Entity Name: TERRESTRIAL-HABITAT-DESCRIPTION
Definition:
Primary Key:
Foreign Keys: none
    Target Entity: RECENT-COLLECTING-EVENT
    Data Elements: RECENT-COLLECTING-EVENT-ID (=COLLECTING-EVENT-ID)
Example Data Elements:

Entity Name: TIME
Definition: Translation of Stated-Time into instant(s) and/or duration(s) of calendar time, where such can be made unambiguously.
Primary Key: TIME-ID
Foreign Keys: (Some confusion exists around cardinalities in the relationship between TIME and COLLECTING-EVENT, and therefore, placement of the foreign key.)
Example Data Elements:
    CLOCK-TIME-QTY (point)
    START-CLOCK-TIME-QTY
    END-CLOCK-TIME-QTY
    CLOCK-TIME-QUALIFIER-CD
    TIME-ZONE-CD (determined from location and date)
    DATE
    START-DAT
    END-DAT
    DATE-QUALIFIER-CD

Entity Name: TRANSACTION
Definition: An action that changes the location, physical custody, or ownership status of a COLLECTING-UNIT.
Primary Key:
Foreign Keys: none
Example Data Elements:

Entity Name: TRANSACTOR
Definition: An agent that performs TRANSACTIONs.
Primary Key: TRANSACTOR-ID (=AGENT-ID)
Foreign Keys: none
Example Data Elements:

Entity Name: UNSORTED-LOT
Definition: A COLLECTING-UNIT of mixed taxa from a single COLLECTING-EVENT.
Primary Key: UNSORTED-LOT-ID (=COLLECTING-UNIT-ID)
Foreign Keys:
    Target Entity: ORIGINAL-UNSORTED-LOT (=COLLECTING-UNIT)
    Data Elements: ORIGINAL-UNSORTED-LOT-ID (=COLLECTING-UNIT-ID)
Example Data Elements:
    UNSORTED-LOT-ID (=COLLECTING-UNIT-ID)
    ORIGINAL-UNSORTED-LOT-ID (=COLLECTING-UNIT-ID)

E. LIST OF ENTITY RELATIONSHIPS

ENTITY relationship ENTITY

AGENT writes, edits, or publishes REFERENCE
COLLECTING-EVENT refers to COLLECTING-EVENT
COLLECTING-EVENT takes place at LOCALITY
COLLECTING-EVENT involves REAL-WORLD
COLLECTING-EVENT occurs in TIME
COLLECTING-EVENT-CITATION refers to COLLECTING-EVENT
COLLECTING-METHOD is used in COLLECTING-EVENT
COLLECTING-UNIT results from COLLECTING-EVENT
COLLECTING-UNIT refers to COLLECTING-UNIT
COLLECTING-UNIT has STORAGE-REGIME
COLLECTING-UNIT-CITATION refers to COLLECTING-UNIT
COLLECTOR (=AGENT) participates in COLLECTING-EVENT
DERIVED-OBJECT is derived from COLLECTING-UNIT
DERIVED-OBJECT-TYPE validates DERIVED-OBJECT
DETERMINATION is made on COLLECTING-UNIT
DETERMINATION involves NAME-USE
DETERMINER (=AGENT) makes DETERMINATION
GAZETTEER-CITATION is contained within GAZETTEER-CITATION
LOCALITY is closest to/contained within GAZETTEER-CITATION
LOCALITY is bounded by GEOLOGICAL-TIME-SCALE
LOCALITY-CITATION refers to LOCALITY
LOT is sorted from UNSORTED-LOT
NAME is based on COLLECTING-UNIT
NAME is used in NAME-USE
NAME-USE applies to NODE-USE
NODE-USE is contained in (parent-)NODE-USE
NODE-USE involves NODE
PALEO-COLLECTING-EVENT is bounded by GEOLOGICAL-TIME-SCALE
PREPARATION-TECHNIQUE is applied to a COLLECTING-UNIT
PREPARATOR (=AGENT) uses PREPARATION-TECHNIQUE
RECENT-COLLECTING-EVENT is described by HABITAT-DESCRIPTION
REFERENCE contains CITATION
REFERENCE establishes NAME
REFERENCE contains NAME-USE
REFERENCE contains NODE-USE
SPECIMEN is sorted from LOT
SPECIMEN-ASSOCIATION-TYPE validates SPECIMEN-ASSOCIATION
SPECIMEN-COMPONENT-TYPE validates SPECIMEN-COMPONENT
SPECIMEN-COMPONENT is derived from SPECIMEN
STORAGE-REGIME uses STORAGE-MEDIUM
STORAGE-REGIME has STORAGE-LOCATION
TRANSACTION involves COLLECTING-UNIT
TRANSACTOR (=AGENT) ? COLLECTION
TRANSACTOR (=AGENT) participates in TRANSACTION

F. RELATIONSHIP DESCRIPTIONS

Relationship: AGENT <> REFERENCE
    Each AGENT writes, edits, or publishes zero to many REFERENCEs.
    Each REFERENCE is written, edited, or published by one to many AGENTs.

Relationship: COLLECTING-EVENT <> COLLECTING-EVENT
    Each COLLECTING-EVENT is associated with zero to many COLLECTING-EVENTs.

Relationship: COLLECTING-EVENT <> LOCALITY
    Each COLLECTING-EVENT takes place at one and only one LOCALITY.
    Each LOCALITY may be the subject of one or more COLLECTING-EVENTs.

Relationship: COLLECTING-EVENT <> REAL-WORLD
    Each COLLECTING-EVENT involves one and only one REAL-WORLD.
    Each (the?) REAL-WORLD is subject to zero to many COLLECTING-EVENTs.

Relationship: COLLECTING-EVENT <> TIME
    Each COLLECTING-EVENT occurs at, or spans a range of, one and only one TIME.
    Each point or range of TIME may encompass zero to many COLLECTING-EVENTs.

Relationship: COLLECTING-EVENT-CITATION <> COLLECTING-EVENT
    Each COLLECTING-EVENT-CITATION refers to one and only one COLLECTING-EVENT.
    Each COLLECTING-EVENT is referenced in zero to many COLLECTING-EVENT-CITATIONs.

Relationship: COLLECTING-METHOD <> COLLECTING-EVENT
    Each COLLECTING-METHOD is used in zero to many COLLECTING-EVENTs.
    Each COLLECTING-EVENT is conducted using one to many COLLECTING-METHODs.

Relationship: COLLECTING-UNIT <> COLLECTING-EVENT
    Each COLLECTING-UNIT results from one and only one COLLECTING-EVENT.
    Each COLLECTING-EVENT produces zero to many COLLECTING-UNITs.
    In several disciplines there is no distinction between a "biological" individual and a specimen; a single individual can be collected only once. In other disciplines (e.g., Botany, Vertebrate Paleontology, Invertebrate Zoology), it is possible to collect only part of an individual in one event and return to collect additional parts or samples later. Whether a SPECIMEN may be collected in more than one COLLECTING-EVENT will depend on the definition (scope) of COLLECTING-UNIT.

Relationship: COLLECTING-UNIT <> COLLECTING-UNIT
    Each COLLECTING-UNIT is associated with zero to many COLLECTING-UNITs (through the SPECIMEN-ASSOCIATION entity).

Relationship: COLLECTING-UNIT <> STORAGE-REGIME
    Each COLLECTING-UNIT has one and only one STORAGE-REGIME.
    Each STORAGE-REGIME may involve zero to many COLLECTING-UNITs.

Relationship: COLLECTING-UNIT-CITATION <> COLLECTING-UNIT
    Each COLLECTING-UNIT-CITATION refers to one and only one COLLECTING-UNIT.
    Each COLLECTING-UNIT may be referred to by zero to many COLLECTING-UNIT-CITATIONs.

Relationship: COLLECTOR (=AGENT) <> COLLECTING-EVENT
    Each COLLECTOR participates in one or more COLLECTING-EVENTs.
    Each COLLECTING-EVENT is conducted by one or more COLLECTORs.

Relationship: DERIVED-OBJECT <> COLLECTING-UNIT
    Each DERIVED-OBJECT is made from one and only one COLLECTING-UNIT.
    Each COLLECTING-UNIT produces zero to many DERIVED-OBJECTs.

Relationship: DERIVED-OBJECT-TYPE <> DERIVED-OBJECT
    Each DERIVED-OBJECT-TYPE validates zero to many DERIVED-OBJECTs.
    Each DERIVED-OBJECT is validated by one and only one DERIVED-OBJECT-TYPE.

Relationship: DETERMINATION <> COLLECTING-UNIT
    Each DETERMINATION is made on one and only one COLLECTING-UNIT.
    Each COLLECTING-UNIT may have one to many DETERMINATIONs.

Relationship: DETERMINATION <> NAME-USE
    Each DETERMINATION involves one and only one NAME-USE.
    Each NAME-USE is involved in zero to many DETERMINATIONs.
Relationship: DETERMINER (=AGENT, PERSON) <> DETERMINATION
    Each DETERMINER makes zero to many DETERMINATIONs.
    Each DETERMINATION is made by one and only one DETERMINER.

Relationship: GAZETTEER-CITATION <> GAZETTEER-CITATION
    Each GAZETTEER-CITATION is contained in one and only one GAZETTEER-CITATION.
    Each GAZETTEER-CITATION contains zero to many GAZETTEER-CITATIONs.

    This relationship represents a set-subset hierarchy among named places in a gazetteer. A strict hierarchical representation may not be adequate for our purposes, as it represents only the "contains/is contained in" relationship between places. The relationship does not encompass the "overlaps" and "is adjacent to" relationships.

Relationship: LOCALITY <> GAZETTEER-CITATION
    Each LOCALITY is closest to or contained within one and only one GAZETTEER-CITATION.
    Each GAZETTEER-CITATION is close to or contains zero to many LOCALITYs.

    The "close to" and "contains" relationships are semantically distinct, not mutually exclusive, and the related objects may be different. Therefore, they should be represented in the model as two distinct relationships between the same entities (this is allowed).

Relationship: LOCALITY <> GEOLOGICAL-TIME-SCALE
    Each LOCALITY is bounded by zero to many (points on the) GEOLOGICAL-TIME-SCALE.
    Each point on the GEOLOGICAL-TIME-SCALE limits (the range of) zero to many LOCALITYs.

Relationship: LOCALITY-CITATION <> LOCALITY
    Each LOCALITY is cited in zero to many LOCALITY-CITATIONs.
    Each LOCALITY-CITATION refers to one and only one LOCALITY.

Relationship: LOT <> UNSORTED-LOT
    Each LOT is sorted from zero or one UNSORTED-LOT.
    Each UNSORTED-LOT is sorted into zero to many LOTs.

Relationship: NAME <> COLLECTING-UNIT
    Each NAME is based on zero to many COLLECTING-UNITs.
    Each COLLECTING-UNIT is the type specimen(s) for zero to many NAMEs.
    (Names based on the same type are objective synonyms.)

Relationship: NAME <> NAME-USE
    Each NAME is used in one to many NAME-USEs.
    Each NAME-USE uses one and only one NAME.

Relationship: NAME-USE <> NODE-USE
    Each NAME-USE applies to one and only one NODE-USE.
    Each NODE-USE has one to many NAME-USEs.

Relationship: NODE-USE <> PARENT-NODE-USE
    Each NODE-USE is contained in one and only one (parent-)NODE-USE.
    Each (parent-)NODE-USE contains zero to many NODE-USEs.

Relationship: NODE-USE <> NODE
    Each NODE-USE involves one and only one NODE.
    Each NODE may be involved in one to many NODE-USEs.

Relationship: PALEO-COLLECTING-EVENT <> GEOLOGICAL-TIME-SCALE
    Each PALEO-COLLECTING-EVENT is bounded by zero to many (points on the) GEOLOGICAL-TIME-SCALE.
    Each (point on the) GEOLOGICAL-TIME-SCALE limits (the range of) zero to many PALEO-COLLECTING-EVENTs.

Relationship: PREPARATION-TECHNIQUE <> COLLECTING-UNIT
    Each PREPARATION-TECHNIQUE is performed on one and only one COLLECTING-UNIT.
    Each COLLECTING-UNIT is prepared in one to many PREPARATION-TECHNIQUEs.

Relationship: PREPARATOR (=AGENT) <> PREPARATION-TECHNIQUE
    Each PREPARATOR (=AGENT) uses one and only one PREPARATION-TECHNIQUE.
    Each PREPARATION-TECHNIQUE is performed by one and only one PREPARATOR.

Relationship: RECENT-COLLECTING-EVENT <> HABITAT-DESCRIPTION
    Each RECENT-COLLECTING-EVENT is described by zero to many HABITAT-DESCRIPTIONs.
    Each HABITAT-DESCRIPTION describes one and only one RECENT-COLLECTING-EVENT.

Relationship: REFERENCE <> CITATION
    Each REFERENCE contains one to many CITATIONs.
    Each CITATION is contained in one and only one REFERENCE.

Relationship: REFERENCE <> NAME
    Each REFERENCE contains zero to many NAMEs.
    Each NAME is contained in one and only one REFERENCE.

Relationship: REFERENCE <> NAME-USE
    Each REFERENCE contains zero to many NAME-USEs.
    Each NAME-USE is contained in one and only one REFERENCE.

Relationship: REFERENCE <> NODE-USE
    Each REFERENCE contains zero to many NODE-USEs.
    Each NODE-USE is contained in one and only one REFERENCE.

Relationship: SPECIMEN <> LOT
    Each SPECIMEN is sorted from zero to one LOT.
Each LOT is sorted into zero to one SPECIMENs.

Relationship: SPECIMEN-ASSOCIATION-TYPE <> SPECIMEN-ASSOCIATION
Each SPECIMEN-ASSOCIATION-TYPE validates zero to many SPECIMEN-ASSOCIATIONs.
Each SPECIMEN-ASSOCIATION is validated by one and only one SPECIMEN-ASSOCIATION-TYPE.

Relationship: SPECIMEN-COMPONENT-TYPE <> SPECIMEN-COMPONENT
Each SPECIMEN-COMPONENT-TYPE validates zero to many SPECIMEN-COMPONENTs.
Each SPECIMEN-COMPONENT is validated by one and only one SPECIMEN-COMPONENT-TYPE.

Relationship: SPECIMEN-COMPONENT <> SPECIMEN
Each SPECIMEN-COMPONENT is derived from one and only one SPECIMEN.
Each SPECIMEN is represented by zero to many SPECIMEN-COMPONENTs.

Relationship: STORAGE-REGIME <> STORAGE-MEDIUM
Each STORAGE-REGIME uses one and only one STORAGE-MEDIUM.
Each STORAGE-MEDIUM is used in zero to many STORAGE-REGIMEs.

Relationship: STORAGE-REGIME <> STORAGE-LOCATION
Each STORAGE-REGIME has one and only one STORAGE-LOCATION.
Each STORAGE-LOCATION may be involved in one to many STORAGE-REGIMEs.

Relationship: TRANSACTION <> COLLECTING-UNIT
Each TRANSACTION involves one to many COLLECTING-UNITs.
Each COLLECTING-UNIT is involved in one to many TRANSACTIONs.

Relationship: TRANSACTOR (=AGENT) <> COLLECTION
TRANSACTOR ? COLLECTION

Relationship: TRANSACTOR (=AGENT) <> TRANSACTION
Each TRANSACTOR participates in zero to many TRANSACTIONs.
Each TRANSACTION is conducted by zero to many TRANSACTORs.

APPENDIX A
INFORMATION MODEL - DEFINITIONS AND CONVENTIONS

Information models typically have two components, a structured textual description, and one or more illustrations that summarize the model. The illustrations are called entity-relationship diagrams (ERDs), and depict the principal entities of the problem domain, as well as the interrelationships among them. Figure 2 illustrates the two basic components of an ERD: entities (boxes) and relationships (the lines between the boxes).
An entity is a grouping of people, places, physical objects, events, actions, or even concepts that can be described by the same information categories or attributes. Example entities from biological collections might include SPECIMEN, COLLECTING-EVENT, and LOCALITY. The individual things, events, etc., that comprise an entity are called instances (not depicted because a model focuses on generalities). In a relational database implementation of an information model, entities and attributes translate into data tables and their associated data fields; instances translate into the rows of a table.

The attributes of an entity are the placeholders for data. They are important not only because they flesh out the information of interest, but because restrictions on the values attributes may or must contain ultimately affect the scope and definition of the entity. The first restriction on attributes is that they must be single-valued at any given time. If an attribute legitimately may have many values simultaneously (that is, a list of values needs to be recorded at a given time for a given instance), the supposed attribute probably isn't an attribute in the context of an information model, but is rather another entity or a relationship. The convention followed in information modeling is to remove multi-valued attributes into their own entities.

Another important aspect of entity attributes is that, for each entity, a combination of attributes must be identified or chosen that distinguishes every instance in the entity. Every instance must be uniquely identified. This rule grounds the model in reality. If information is to be recorded about a thing in the real world, the thing must be identifiable. The identifying information and descriptive information must always be associated. The identifying attributes of an entity are called its primary key. Any attribute that is part of the primary key must always be populated with data for every valid instance; it can never be blank.
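
The primary-key rule above can be illustrated with a minimal sketch in Python, using the standard sqlite3 module as one possible relational implementation. The choice of SQLite, the column names (specimen_identifier, specimen_sex_code), and the sample values are assumptions made for illustration only; they are not part of the workshop model. The sketch shows an entity as a table, instances as rows, and a primary key that must be both unique and populated.

```python
import sqlite3

# In-memory database: the SPECIMEN entity becomes a table, its attributes columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE specimen (
        specimen_identifier TEXT NOT NULL PRIMARY KEY,  -- identifying attribute
        specimen_sex_code   TEXT                        -- descriptive attribute (hypothetical)
    )
    """
)


def try_insert(identifier, sex_code):
    """Return True if the instance was accepted, False if a key rule rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO specimen VALUES (?, ?)", (identifier, sex_code))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("CU-0001", "F"),   # a valid instance
    try_insert("CU-0002", None),  # descriptive attributes may be blank
    try_insert("CU-0001", "M"),   # rejected: duplicates an existing primary key
    try_insert(None, "M"),        # rejected: the primary key can never be blank
]
row_count = conn.execute("SELECT COUNT(*) FROM specimen").fetchone()[0]
```

Only the two instances with distinct, populated identifiers are stored. The explicit NOT NULL is needed here because SQLite, unlike most relational systems, does not by itself forbid blanks in a non-integer primary key.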
Repeated interactions or associations among the things in the real world are represented in the model as relationships. Relationships are depicted as lines between entities. Note that while relationships connect entities in the diagram, they are understood to represent possible associations between instances contained in the entities. Relationships are instance-to-instance, rather than group-to-group. Example relationships from biological collections might include (expressed in words): 1) a SPECIMEN is collected in a COLLECTING-EVENT, and 2) a COLLECTING-EVENT occurs at a LOCALITY.

Relationships between instances are not always one to one. For example, a single COLLECTING-EVENT may produce more than one SPECIMEN. The symbols on the line next to an entity (a circle, cross-hatch, or crow's foot) depict the cardinality of the relationship: the number of individuals in the entity that may be related to a single individual in the other entity (at the opposite end of the line), namely zero, one, or many, respectively. Note that relationships are directional, and the symbols at opposite ends of a line are usually different. (The words describing a relationship may also change with direction.)

The outer symbol (closer to the entity) indicates the maximum, and may be either a cross-hatch or a crow's foot. A cross-hatch indicates that, at most, one instance in that entity may be related to a single instance in the other. A crow's foot indicates that many instances may be related to a given instance in the other entity. In Figure 2 A, the relationship between the COLLECTING-EVENT and SPECIMEN entities is one-to-many; a single COLLECTING-EVENT may produce many SPECIMENs. From the perspective of the SPECIMEN entity, the relationship is many-to-one; a SPECIMEN is collected in one and only one COLLECTING-EVENT.
The inner symbol (further from the entity) may be either a zero or a one, and indicates the minimum number of individuals that must be present in that entity for a given instance in the other. For example, the relationship between the COLLECTING-EVENT and SPECIMEN entities (Figure 2 A) indicates that a COLLECTING-EVENT must exist for every SPECIMEN. In other words, the existence of a SPECIMEN is predicated on the existence of a COLLECTING-EVENT. The zero by the SPECIMEN entity indicates that there may be no corresponding instance for a given COLLECTING-EVENT. This implies that there is a reason for keeping information about a COLLECTING-EVENT even though no SPECIMENs were collected. (This is just a pedagogical example and not intended to bias the reader one way or the other about the correctness of this representation.)

A relationship may also be one-to-one, or many-to-many. One-to-one relationships are relatively uncommon, except in the depiction of supertype-subtype hierarchies, discussed below. Many-to-many relationships are more common, and can be depicted in two ways, depending on whether or not additional information needs to be captured about the relationship beyond its existence. If only the existence of the relationship needs to be recorded, a many-to-many relationship can be drawn as in Figure 2 B. The relationship between the SPECIMEN and REFERENCE entities is many-to-many. Crow's feet are present at both ends of the line. A single SPECIMEN may be cited by many REFERENCEs. A single REFERENCE can cite many SPECIMENs. If the relationship itself needs to be described, the relationship should be drawn as an associative entity (a box with a diamond in it) as in Figure 2 C, and labeled so that it can be populated with descriptive attributes. Note that all many-to-many relationships imply an associative entity, whether or not one is drawn in the ERD.
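
The minimum and maximum cardinalities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Cardinality class, its method names, and the sample identifiers (S1, E1, and so on) are hypothetical and do not come from the workshop model. The sketch reads the COLLECTING-EVENT and SPECIMEN relationship of Figure 2 A in both directions and checks some made-up instances against it.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """One end of a relationship line: minimum (inner symbol), maximum (outer symbol)."""
    minimum: int            # 0 (circle) or 1 (cross-hatch)
    maximum: Optional[int]  # 1 (cross-hatch) or None for "many" (crow's foot)

    def allows(self, count: int) -> bool:
        """Return True if `count` related instances satisfies this cardinality."""
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum

    def describe(self) -> str:
        """Render the cardinality in the wording of the relationship descriptions."""
        if self.maximum == 1:
            return "one and only one" if self.minimum == 1 else "zero or one"
        return ("one" if self.minimum == 1 else "zero") + " to many"


# Figure 2 A, read in both directions.
EVENTS_PER_SPECIMEN = Cardinality(minimum=1, maximum=1)
SPECIMENS_PER_EVENT = Cardinality(minimum=0, maximum=None)

# Hypothetical instances: specimen identifier -> collecting events it is linked to.
specimen_events = {"S1": ["E1"], "S2": ["E1"], "S3": []}
collecting_events = ["E1", "E2"]

# Specimens that break "one and only one COLLECTING-EVENT" (S3 has none).
bad_specimens = [s for s, events in specimen_events.items()
                 if not EVENTS_PER_SPECIMEN.allows(len(events))]

# Count specimens per event; E2 has none, which "zero to many" permits.
per_event = Counter(e for events in specimen_events.values() for e in events)
bad_events = [e for e in collecting_events
              if not SPECIMENS_PER_EVENT.allows(per_event[e])]
```

Keeping the two directions as separate objects mirrors the point that relationships are directional and that the symbols at opposite ends of a line usually differ.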
The placement of the crow's feet around the associative entity in a many-to-many relationship may seem counterintuitive at first, but can be explained as follows. An associative entity records each instance of a relationship. If an individual SPECIMEN is cited in many REFERENCEs, each of these individual relationships is recorded in the associative entity, CITATION. The "relationship" between the SPECIMEN entity and the CITATION entity is then one-to-many. The same one-to-many "relationship" exists between the REFERENCE and CITATION entities. Note that there are no zeroes by the main entities; each and every instance of the relationship (an instance in the CITATION entity) is existence dependent on each of the "target" entities. A CITATION, in this case, cannot exist without both a SPECIMEN and a REFERENCE.

A recursive relationship is used to indicate relationships between individuals of the same entity. The TAXON entity (Figure 2 D) shows a recursive relationship. A TAXON may contain other TAXONs (also an illustration of how the entity naming convention has priority over grammar in modeling). Recursive relationships are particularly important in biology because they model hierarchies (individuals that are related to each other in a potentially large and indefinite tree or network structure).

The last kind of relationship commonly used in information modeling is that depicting a superset-subset relationship between entities known as supertypes and subtypes, respectively. The supertype-subtype concept is used to portray important commonalities and distinctions between groups of similar things in the real world. An entity (supertype) may have zero to many subtypes. A subtype inherits all of the attributes of its supertype, but also has additional attributes. The additional attributes of one subtype are different from the additional attributes of another subtype. A common example from the business world concerns employees, which may be full-time or part-time.
A business typically records certain information (attributes) about all employees, but then records additional information for full-time employees that it does not for part-time employees, and vice versa. (Some authors refer to the entities in a supertype-subtype relationship as an "Is A" hierarchy; e.g., a part-time employee "is a" kind of employee.) Subtypes may or may not be mutually exclusive.

The supertype-subtype relationship places two requirements on the attributes of related entities. First, the primary key of a subtype must be exactly the same as that of its supertype. If EMPLOYEE-ID is the primary key of EMPLOYEE, then the primary key of PART-TIME-EMPLOYEE must also be EMPLOYEE-ID. This restriction maintains a one-to-one relationship between instances of the subtype and supertype, and therefore ensures the inheritance of attributes. Second, the supertype must contain one or more classification variables to indicate explicitly that a given instance of the supertype is also none, one, or many of the possible subtypes. (Subtypes need not be mutually exclusive, and a supertype can be partitioned into more than one hierarchy of subtypes.) Using the business example again, an attribute called EMPLOYEE-TYPE-CODE, with the possible values 'Full-Time' and 'Part-Time', could be added to the EMPLOYEE entity to serve as the classification variable.

The diagramming conventions used here for super- and subtypes are illustrated in Figure 2 E. A single cross-hatch with the words "Is A" to the side indicates that the entity (or entities) below is a subtype of the entity above. The branching relationship line in the employee example illustrates a mutually exclusive relationship, in this case, mutually exclusive subtypes. Completely separate lines would have been used if the subtypes were not mutually exclusive. Although it is possible to build models that don't include subtypes, they are useful for indicating commonalities and distinctions between entities.
This is especially important when one subtype can participate in relationships that the supertype or other subtypes cannot.

The last modeling tool that should be understood is the use of primary key attributes in the specification of a relationship. Recall two points made above: 1) attributes of the primary key uniquely identify instances in an entity, and 2) relationships exist between instances, not entities. To relate any two instances, the identifying information (primary keys) for both must be placed in the same entity (information cannot exist in the model outside an entity). Because all attributes must be single-valued, one-to-many relationships are implemented by distributing (copying) the primary key attributes of the "one" entity into the "many" entity. In the SPECIMEN and COLLECTING-EVENT example, a placeholder is created in the SPECIMEN entity for the primary key of COLLECTING-EVENT (e.g., COLLECTING-EVENT-ID). Within the SPECIMEN entity, the new attribute COLLECTING-EVENT-ID is called a foreign key. In the textual portion of the model, foreign keys are typically listed among the attributes of an entity. In many-to-many relationships, both foreign keys are placed in the associative entity. Even though information cannot exist in the model without being placed in an entity, modelers sometimes allow themselves the shorthand of not listing associative entities. In these cases, the foreign keys that specify the relationship are not shown.

Conventions of the Textual Description

The textual portion of the model provides the definitions and descriptions of entities, the attributes of each entity, and descriptions of the relationships. The text should also describe briefly the outstanding issues involving a particular concept in the model. Additional sections may be added as necessary in later iterations.
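
To make the foreign-key mechanism described earlier in this appendix concrete, the following is a minimal sketch in Python using the standard sqlite3 module. It is one possible relational rendering, not a prescribed implementation: the lower-case table and column names, the sample identifiers, and the choice of SQLite are all assumptions for illustration. The sketch copies the primary key of COLLECTING-EVENT into SPECIMEN for the one-to-many relationship, and places both foreign keys in the associative entity CITATION for the many-to-many relationship between SPECIMEN and REFERENCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE collecting_event (
        collecting_event_id TEXT NOT NULL PRIMARY KEY
    );
    -- One-to-many: the key of the "one" entity is copied into the "many" entity.
    CREATE TABLE specimen (
        specimen_id         TEXT NOT NULL PRIMARY KEY,
        collecting_event_id TEXT NOT NULL
            REFERENCES collecting_event (collecting_event_id)
    );
    -- Table name quoted defensively because it resembles the keyword REFERENCES.
    CREATE TABLE "reference" (
        reference_id TEXT NOT NULL PRIMARY KEY
    );
    -- Many-to-many: the associative entity carries both foreign keys.
    CREATE TABLE citation (
        specimen_id  TEXT NOT NULL REFERENCES specimen (specimen_id),
        reference_id TEXT NOT NULL REFERENCES "reference" (reference_id),
        PRIMARY KEY (specimen_id, reference_id)
    );
    """
)

with conn:  # hypothetical instances
    conn.execute("INSERT INTO collecting_event VALUES ('E1')")
    conn.executemany("INSERT INTO specimen VALUES (?, ?)",
                     [("S1", "E1"), ("S2", "E1")])
    conn.executemany('INSERT INTO "reference" VALUES (?)', [("R1",), ("R2",)])
    conn.executemany("INSERT INTO citation VALUES (?, ?)",
                     [("S1", "R1"), ("S1", "R2"), ("S2", "R1")])


def orphan_rejected():
    """Try to store a SPECIMEN whose COLLECTING-EVENT does not exist."""
    try:
        with conn:
            conn.execute("INSERT INTO specimen VALUES ('S3', 'E9')")
        return False
    except sqlite3.IntegrityError:
        return True


orphan_was_rejected = orphan_rejected()
refs_for_s1 = [r[0] for r in conn.execute(
    "SELECT reference_id FROM citation WHERE specimen_id = 'S1' ORDER BY reference_id")]
specimens_for_r1 = [r[0] for r in conn.execute(
    "SELECT specimen_id FROM citation WHERE reference_id = 'R1' ORDER BY specimen_id")]
```

Declaring the foreign key in SPECIMEN as NOT NULL corresponds to the minimum cardinality of one: a SPECIMEN cannot be recorded without its COLLECTING-EVENT. The CITATION table can be read from either side, which is how one associative entity serves both directions of the many-to-many relationship.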
Because many individuals will be contributing to the development of this model, it is important that descriptive standards be adopted to ensure that the description is complete and consistent. We propose that the following information be recorded in descriptions of all data entities.

Entity Name Definition Primary Key Foreign Keys Target Entity Data Elements Data Elements Remarks

In addition, we propose to name all data objects, entities and data elements, according to a naming convention. A naming convention is a system for translating a data concept (e.g., a data element or data entity) into a name. Naming conventions are used in data administration to facilitate the development and use of a standard reference (e.g., model or data dictionary) by reducing ambiguity and redundancy among data objects. The name of a data object should suggest its definition and possible content. Armed with a knowledge of the naming convention, a person looking in a dictionary for a data object should find it easier to locate the corresponding object, or determine that the object does not exist in the dictionary. Note that the names for data objects used in a dictionary are not intended to be used as the names for corresponding tables or data elements in a database.

The naming conventions used here conform to guidelines set forth by the National Institute of Standards and Technology, formerly the National Bureau of Standards (Newton, 1987; Rosen & Law, 1989). All data object names are written in upper case. Different methods are used to derive names for entities and data elements.

Entity Names.
Entities are the primary subjects or concepts of interest to an enterprise. To make an entity name meaningful, it should always contain at least one noun. Adjectives or modifiers are used to clarify and restrict the meaning of the noun. The format of an entity name follows the English convention, in which modifiers are placed before the noun.
Modifiers are optional, and used only to clarify the scope of the entity and eliminate ambiguity. Hyphens are used to join words in a name.

    Entity Name = [Modifiers] + Noun

Examples:
    SPECIMEN
    DERIVED-OBJECT
    TYPE-SPECIMEN-CITATION

Entities are always named in the singular, to represent a typical instance of the entity, except in cases where the instance itself is a plural concept.

Data Element Names.
Data element names are composed of two parts, a prime term and a class term. The prime term is simply an existing entity name, and the class term is composed of optional modifiers plus a data class name. Note that modifiers may be nouns as well as adjectives, and again are used to clarify meaning. In some cases, an entity name and a class name are sufficient to convey the meaning of the data element, and no modifiers are required. A data class is used to describe the content of the data element. A provisional list of class names (after a document describing the naming conventions used by a federal agency) is included below. (Other documents on data administration contain similar lists of data classes.)

    Data Element Name = Entity Name + [Modifiers] + Class Term

Examples:
    SPECIMEN-IDENTIFIER
    LOCALITY-COUNTRY-NAME
    COLLECTING-EVENT-EQUIPMENT-NAME

Table of Class Terms

Class Name    Abbrev  Definition
Amount        AMT     A monetary value. (May include average, balance, and other derived values).
Angle         ANGL    The rotational measurement between two lines or planes, diverging from a common point or line, respectively.
Area          AREA    The measurement of a surface.
Count         CNT     An integer value representing the number of items.
Code          CD      A combination of one or more numbers, letters, or special characters which is substituted for a specific meaning. Indicates the existence of a predetermined, finite set of values.
Coordinate    COORD   The designation of location by a line or plane. (Includes latitude and longitude.)
Date          DT      The notation of a specific period of time.
Dimension     DMSN    A measured linear distance. (Includes: altitude, depth, diameter, distance, elevation, height, length, radius, width.)
Flag          FLG     A boolean variable for recording a yes/no, or on/off state.
Identifier    ID      A combination of one or more numbers, letters, or special characters that designate the identity of a specific object, entity, or instance, but has no other meaning.
Mass          MASS    The measure of inertia of a body.
Name          NM      A designation of an object, entity, or instance, expressed as a word, phrase.
Quantity      QTY     A non-monetary value. (Includes: count, average, balance, deviation, factor, index, and scale.)
Rate          RT      A quantity, amount or degree of something in relation to a unit of something else. (Includes: acceleration, density, flow, speed, force, frequency, humidity, etc.)
Temperature   TP      The measure of heat in an object or ambient medium.
Text          TXT     An unformatted character string, generally in the form of words.
Time          TM      A notation of a specified chronological point within a period.
Volume        VOL     The measurement of space occupied by a three dimensional object.
Weight        WT      The force with which an object is attracted toward the earth by gravitation.

APPENDIX B
WORKSHOP PARTICIPANTS

  James H. Beach       Harvard University, MCZ
  Stan Blum            Smithsonian Institution, NMNH
  David Cannatella     Texas Memorial Museum, Univ. Texas
  Jim R. Croft         Australian National Botanic Garden
  John Damuth          Univ. California, Santa Barbara
* Janet Gomon          Smithsonian Institution, NMNH
  Bruce Gritton        Monterey Bay Aquarium Research Institute
  Ronald Hellenthal    Notre Dame University
  Elaine Hoagland      Association of Systematics Collections
* Julian Humphries     Cornell University
  David Mark           State University of New York, Buffalo
  Sue McLaren          Carnegie Museum of Natural History
  Richard Mooi         California Academy of Sciences
  Peter Rauch          Univ. California, Berkeley
  Gary Rosenburg       Academy of Natural Sciences, Phila.
  Wayt Thomas          New York Botanical Garden

* Workshop Chairs