Next: Supporting Multidatabase Queries Up: Pursuing the OPM Previous: Constructing OPM Views

The Multidatabase Directory

As mentioned in section 3, central to our strategy of assembling MBDs into a multidatabase system is a Multidatabase Directory that includes general information on MBDs, MBD links, and MBD schemas. In our current implementation, the MBD Schema Library consists only of the OPM schema, associated schema documentation, and mapping information for each MBD. OPM supports extensive schema documentation capabilities: each class or attribute in an OPM schema can be associated with a description and user-specified properties, and each controlled value of a controlled value class can be associated with its own description. Therefore, detailed schema descriptions can be embedded in an OPM schema definition. We are not aware of any other data model that supports such documentation capabilities. Nevertheless, embedded documentation alone is not adequate for helping users examine and understand the semantics of MBDs, or specify and interpret multidatabase queries.
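The documentation model described above can be pictured with a small sketch. It is written in Python rather than in OPM's own schema definition syntax, and every name in it (ClassDoc, AttributeDoc, undocumented, and the sample Strand and Gene classes) is a hypothetical stand-in chosen for illustration, not part of the OPM tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AttributeDoc:
    """Documentation attached to one attribute (hypothetical structure)."""
    name: str
    description: str = ""
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class ClassDoc:
    """Documentation attached to one class (hypothetical structure)."""
    name: str
    description: str = ""
    properties: Dict[str, str] = field(default_factory=dict)
    attributes: List[AttributeDoc] = field(default_factory=list)
    # For a controlled value class: controlled value -> its description.
    controlled_values: Dict[str, str] = field(default_factory=dict)


def undocumented(cls: ClassDoc) -> List[str]:
    """List the schema elements of a class that lack a description."""
    missing = [] if cls.description else [cls.name]
    missing += [f"{cls.name}.{a.name}" for a in cls.attributes if not a.description]
    missing += [f"{cls.name}={v}" for v, d in cls.controlled_values.items() if not d]
    return missing


# Invented sample classes, used only to exercise the sketch.
strand = ClassDoc(
    name="Strand",
    description="Orientation of a sequence feature",
    controlled_values={"plus": "Forward strand", "minus": ""},
)
gene = ClassDoc(
    name="Gene",
    attributes=[AttributeDoc("symbol", "Official gene symbol"), AttributeDoc("locus")],
)
```

Calling undocumented(gene) reports the class itself and its locus attribute, which is the kind of gap a directory maintainer would want to find before publishing schema documentation.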

We plan to develop an extended Multidatabase Directory as an independent resource that will support examining and understanding MBDs and will help scientists specify queries across multiple MBDs. The MBD Schema Library part of this Directory will contain schemas for MBDs expressed not only in OPM but also in a variety of other DDLs, including each MBD's native DDL and several DDLs (e.g., ASN.1 and ACeDB) that are widely used within the molecular biology community. Consequently, scientists interested in a particular MBD will be able to view its schema in a DDL with which they are familiar. The versions of an MBD schema represented in different DDLs will be generated using schema conversion tools that follow the iterative schema conversion methodology underlying the OPM Retrofitting tool. The MBD Schema Library will also contain abstract overview schemas, in which related schema components are grouped into higher-level components in order to provide a more concise and comprehensible high-level view of the MBDs.
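One simple way to read the idea of an abstract overview schema is as a collapse of class-level links into links between groups of classes. The sketch below assumes that reading; the function overview, the grouping table, and the sample class names are all hypothetical and are not taken from any MBD or from the planned tools.

```python
from typing import Dict, Set, Tuple

Link = Tuple[str, str]  # (source class, target class)


def overview(links: Set[Link], group_of: Dict[str, str]) -> Set[Link]:
    """Collapse class-level links into links between higher-level components.

    Classes without an assigned group are kept as their own component.
    Links that fall entirely inside one component are dropped.
    """
    result: Set[Link] = set()
    for src, dst in links:
        g_src = group_of.get(src, src)
        g_dst = group_of.get(dst, dst)
        if g_src != g_dst:
            result.add((g_src, g_dst))
    return result


# Invented example schema: four links among five classes.
links = {
    ("Gene", "Locus"),
    ("Locus", "Map"),
    ("Gene", "Citation"),
    ("Citation", "Author"),
}
group_of = {
    "Gene": "Genomic",
    "Locus": "Genomic",
    "Map": "Genomic",
    "Citation": "Literature",
    "Author": "Literature",
}
summary = overview(links, group_of)
```

Here the four class-level links reduce to a single link from the Genomic component to the Literature component, which is the more concise view the overview schema is meant to give.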

MBD schema documentation will contain sample data that will help reveal schema nuances that are not evident in the schema definition. Furthermore, observing how the same or similar data are represented in different MBDs will give insight into how to exchange data between MBDs. Sample data will be annotated in order to explain the significance of its various components.

As a development and maintenance resource, the Multidatabase Directory will provide facilities for constructing, extending, and maintaining (revising, updating) information on MBDs. These facilities will include tools for constructing abstract overview schemas, as well as schema and data converters for transforming schemas expressed in an MBD's native DDL into schemas expressed in alternative DDLs. Since MBD schemas evolve over time, the MBD Schema Library will support schema versioning and will include tools for keeping track of MBD schema changes. Schema annotation facilities will allow scientists to share their understanding and/or view of MBD schemas and thus contribute to enhancing the comprehensibility and value of MBD schema documentation. Search engines will be provided for identifying MBDs relevant to a particular topic, and for quickly determining the relevant parts of a particular MBD.
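Tracking schema changes between versions can be illustrated with a minimal sketch. It assumes a deliberately reduced schema representation (class names mapped to sets of attribute names); the function diff_schemas and the two sample versions are hypothetical and do not describe the planned versioning tools.

```python
from typing import Dict, List, Set

Schema = Dict[str, Set[str]]  # class name -> attribute names


def diff_schemas(old: Schema, new: Schema) -> List[str]:
    """Report classes and attributes added or removed between two versions."""
    changes: List[str] = []
    for cls in sorted(old.keys() - new.keys()):
        changes.append(f"class removed: {cls}")
    for cls in sorted(new.keys() - old.keys()):
        changes.append(f"class added: {cls}")
    # For classes present in both versions, compare their attributes.
    for cls in sorted(old.keys() & new.keys()):
        for attr in sorted(old[cls] - new[cls]):
            changes.append(f"attribute removed: {cls}.{attr}")
        for attr in sorted(new[cls] - old[cls]):
            changes.append(f"attribute added: {cls}.{attr}")
    return changes


# Invented schema versions, used only to exercise the sketch.
v1: Schema = {"Gene": {"symbol", "locus"}, "Clone": {"name"}}
v2: Schema = {"Gene": {"symbol", "map_position"}, "Probe": {"name"}}
changes = diff_schemas(v1, v2)
```

A set-based comparison like this cannot distinguish a rename from a removal followed by an addition; recording renames would need extra information, such as the annotations contributed by schema maintainers.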

Certain MBDs provide additional tools such as sequence analysis programs for analyzing a DNA sequence. Such data analysis tools can also be employed in a multidatabase system, so the Multidatabase Directory needs to be extended in order to include information regarding software support.






& Markowitz
Thu Mar 14 15:45:38 PST 1996