
Formulating and Interpreting Multidatabase Queries

The most difficult problems in querying multiple heterogeneous MBDs are (1) formulating a query, which involves determining which MBDs contain relevant data, understanding how data are represented in each of these MBDs, and understanding how data in these MBDs relate to one another; and (2) interpreting the result of a query. Addressing these problems requires comprehensive information on the MBDs that are explored, and unfortunately such information is seldom available. While it cannot fill existing gaps in the documentation of MBDs, the Multidatabase Directory can help by making existing documentation available through a single resource and in a uniform representation. Browsing and keyword-search tools can be used to identify MBDs potentially of interest. Documentation on MBDs and their schemas can then be examined in order to determine whether they do indeed contain relevant data. When the MBD schema and documentation are not sufficient to clarify certain semantic issues, sample data can provide additional insight by allowing comparisons of data representations for the same or similar data in different MBDs. The MBD Link Library can be consulted in order to determine known correspondences between relevant data in heterogeneous MBDs. Furthermore, using inter-database links in multidatabase queries simplifies their formulation by resolving representational incompatibilities, such as different formats for accession numbers.
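
The role of inter-database links in hiding representational differences can be illustrated with a small sketch. It is not the actual MBD Link Library interface: the database names DB_A and DB_B, the accession formats, and the names Link, LinkLibrary, normalize and resolve are all hypothetical, chosen only to show how a query could follow a link without the user reconciling accession-number formats by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One known correspondence between objects in two databases (hypothetical)."""
    source_db: str
    source_acc: str
    target_db: str
    target_acc: str


def normalize(acc: str) -> str:
    """Reduce an accession number to an assumed canonical form.

    Assumption for illustration: formats differ only in case, surrounding
    whitespace, and an optional 'ACC:' prefix.
    """
    acc = acc.strip().upper()
    if acc.startswith("ACC:"):
        acc = acc[4:]
    return acc


class LinkLibrary:
    """Index of links keyed by (source database, canonical accession, target database)."""

    def __init__(self, links):
        self._index = {}
        for link in links:
            key = (link.source_db, normalize(link.source_acc), link.target_db)
            self._index.setdefault(key, []).append(link.target_acc)

    def resolve(self, source_db, acc, target_db):
        """Return the target accessions linked to `acc`, or an empty list."""
        return self._index.get((source_db, normalize(acc), target_db), [])


# Hypothetical link: the same object is stored under different formats.
library = LinkLibrary([Link("DB_A", "acc:x12345", "DB_B", "B-000778")])

# The query side can use whichever format it has at hand.
matches = library.resolve("DB_A", " X12345 ", "DB_B")
missing = library.resolve("DB_A", "Z99999", "DB_B")
```

In this sketch a lookup that finds no link returns an empty list rather than raising an error, so a query processor could distinguish "no known correspondence" from a failure and report it to the user.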

The information in the Multidatabase Directory, together with the semantics of the operations underlying multidatabase query processing, can be used for interpreting query results. For example, information on the semantics of objects in a given class can be used for annotating query results, information about inconsistent inter-database links can be used for explaining null query results, and so on. Consider, for instance, a class Citation in a database containing only citations published between April 1990 and March 1996; the result of a query requesting all the citations in Citation can then state that the results refer to citations published in this time range.
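
The Citation example can be sketched as follows. This is only an illustration under stated assumptions: the ClassInfo record, its coverage fields, and the annotate function are hypothetical and do not describe how the Multidatabase Directory actually stores or exposes class semantics.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClassInfo:
    """Directory entry for a class (hypothetical structure)."""
    name: str
    coverage_start: date  # earliest publication date held in the class
    coverage_end: date    # latest publication date held in the class


def annotate(info, rows):
    """Attach a note describing what the rows do and do not cover."""
    note = (
        f"Results refer to {info.name} objects published between "
        f"{info.coverage_start.isoformat()} and {info.coverage_end.isoformat()}."
    )
    if not rows:
        # An empty result is explained by the coverage of the class,
        # not presented as evidence that no such object exists.
        note += " No objects matched within this range."
    return {"rows": list(rows), "note": note}


# Coverage taken from the example in the text: April 1990 to March 1996.
citation = ClassInfo("Citation", date(1990, 4, 1), date(1996, 3, 31))

result = annotate(citation, [{"title": "hypothetical citation"}])
empty_result = annotate(citation, [])
```

Keeping the note separate from the rows leaves the query result itself unchanged, so the annotation can be shown, logged, or ignored by whatever tool displays the result.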






& Markowitz
Thu Mar 14 15:45:38 PST 1996