Library of Information Models for Biological Collections:

Background


Biological Collections and Databases

Biological specimens in museum collections provide verifiable, physical documentation (vouchers) of the Earth's biodiversity and are indispensable resources for the science of biological systematics. Access to collections data is crucial to both biodiversity assessment and the rapid progress of systematics, but access is currently impeded by heterogeneity among the hundreds of independently conceived and developed collections databases. To enhance the accessibility and value of collections data, the collections community must develop interoperability among its collection databases. A critical component of this interoperability is semantic reconciliation, that is, the mapping of correspondences among our heterogeneous data structures, across both institutions and systematics disciplines.
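As a rough illustration of what "mapping of correspondences" can mean in practice, the sketch below renames the fields of two differently structured specimen records to a shared set of reference concepts. It is a minimal sketch only: the field names, the reference concepts, the sample values, and the helper `to_reference` are all hypothetical and are not taken from the ASC model or from any institution's database.

```python
# Hypothetical local schemas: each maps a local field name to a shared
# reference concept. None of these names come from a real database.
HERBARIUM_TO_REFERENCE = {
    "coll_name": "collector",
    "coll_date": "collection_date",
    "taxon": "scientific_name",
}
FISH_TO_REFERENCE = {
    "CollectedBy": "collector",
    "DateCollected": "collection_date",
    "Species": "scientific_name",
}


def to_reference(record, mapping):
    """Rename a record's local fields to the shared reference concepts.

    Fields with no known correspondence are kept under 'unmapped' so that
    nothing is silently discarded.
    """
    result = {}
    unmapped = {}
    for field, value in record.items():
        if field in mapping:
            result[mapping[field]] = value
        else:
            unmapped[field] = value
    if unmapped:
        result["unmapped"] = unmapped
    return result


# Invented sample records, for illustration only.
herbarium_record = {
    "coll_name": "A. Botanist",
    "coll_date": "1995-06-01",
    "taxon": "Genus alpha",
    "sheet_no": "12345",
}
fish_record = {
    "CollectedBy": "B. Ichthyologist",
    "DateCollected": "1994-03-15",
    "Species": "Genus beta",
}

herbarium_ref = to_reference(herbarium_record, HERBARIUM_TO_REFERENCE)
fish_ref = to_reference(fish_record, FISH_TO_REFERENCE)

# Fields present in both reconciled records can now be compared directly.
shared_fields = sorted(set(herbarium_ref) & set(fish_ref))
```

Real reconciliation is harder than renaming: one database's single field may correspond to several fields, or to a separate entity, in another. A simple field-to-concept table like this one only covers the one-to-one case.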

Molokai cloud forest. Photo by Clyde Imada,
courtesy Hawaii Biological Survey, Bishop Museum

The ASC Information Modeling Project

The first attempt to describe biological collections data across disciplines, above the level of a single institution, and with a modern methodology was a 1992 NSF-sponsored workshop of the Computerization and Networking Committee of the Association of Systematics Collections (ASC-CNC), held at Cornell University (Julian Humphries and Janet Gomon, Co-Chairs). The workshop produced a high-level entity-relationship (ER) model (ASC, 1993) containing more than 50 entities and 100 pages of description, but relatively little detail about data elements. In 1992 the model was arguably ahead of its time, but it was subsequently used as the point of departure for further modeling and database development efforts. Awareness of the model's significance has grown in the intervening years, and there have been repeated calls for the model to be updated and made more explicit.

The ASC-CNC effort to integrate collections data across disciplines has now been revived through a year-long grant from the NSF Database Activities Program to the Bishop Museum (Allison and Blum, Co-PIs). The centerpiece of the project will be this Library of Information Models for Biological Collections. The library will function as a publicly accessible repository where models of collection databases can be deposited, compared, studied, and, ultimately, recombined into templates for "next-generation" collection databases. Within the framework of this library, the ASC model will serve as a reference or, to use a navigational metaphor, as a series of conceptual landmarks that will enable users to stay oriented while examining alternative data structures. The existing ASC model will be revised as input is gathered from local institutions through regional workshops and site visits during the first half of 1997. (If you would like to attend one of these regional workshops or schedule a site visit at your institution, please contact the "librarian", sblum@bishop.bishop.hawaii.org.)

The ASC-CNC & OPM Project Collaboration

The concept for this particular schema library, and the collaboration between the ASC and OPM projects, grew from a recent workshop at the San Diego Supercomputer Center (October 1996), at which a small group of collections information specialists and computer scientists met to discuss the ASC project and its relationship to current thinking in computer science. The technology to compile, maintain, and use the library is being developed as part of the OPM Project at the Lawrence Berkeley National Laboratory.