The multidatabase query processing strategy followed by the current version of OPM*QS is simple and general, but can be inefficient for certain types of queries. For example, a query that retrieves all the genes in class Gene of GSDB in order to find related genes in GDB via accession numbers is often inefficient. A more efficient strategy would order the sub-queries and evaluate them in sequence, using the results of each sub-query to restrict the next sub-query in the sequence. However, such a strategy would be considerably more difficult to implement, since it requires statistics on the sizes of individual classes and the selectivity of constraints in order to determine an optimal evaluation order. Although we intend to examine such strategies in the future, in the short term we plan to increase the efficiency of multidatabase query processing by using inter-database links.
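The difference between the two approaches can be illustrated with a small sketch. It is not OPM*QS code: the in-memory lists stand in for the remote databases, and the records, the field names other than gdb_xref, and the functions fetch, join_all and join_sequential are all hypothetical. The sketch assumes a naive plan that retrieves both classes in full and contrasts it with a sequential plan that uses the accession numbers returned by the first sub-query to restrict the second, counting the objects each plan transfers.

```python
# Toy stand-ins for two remote databases. All records and field names
# (except gdb_xref) are hypothetical and chosen only for illustration.
GSDB_GENES = [
    {"id": "gsdb:1", "name": "BRCA1", "gdb_xref": "GDB:100"},
    {"id": "gsdb:2", "name": "TP53", "gdb_xref": "GDB:200"},
    {"id": "gsdb:3", "name": "MYC", "gdb_xref": None},
]
GDB_GENES = [
    {"accession": "GDB:100", "symbol": "BRCA1"},
    {"accession": "GDB:200", "symbol": "TP53"},
    {"accession": "GDB:300", "symbol": "EGFR"},
]


def fetch(table, predicate=None):
    """Simulate a remote sub-query; return (rows, number of rows transferred)."""
    rows = [r for r in table if predicate is None or predicate(r)]
    return rows, len(rows)


def join_all(gsdb_pred):
    """Naive plan: retrieve both classes in full, then filter and join locally."""
    gsdb, n1 = fetch(GSDB_GENES)
    gdb, n2 = fetch(GDB_GENES)
    by_acc = {g["accession"]: g for g in gdb}
    pairs = [(s, by_acc[s["gdb_xref"]])
             for s in gsdb
             if gsdb_pred(s) and s["gdb_xref"] in by_acc]
    return pairs, n1 + n2


def join_sequential(gsdb_pred):
    """Sequential plan: the first sub-query's results restrict the second."""
    gsdb, n1 = fetch(GSDB_GENES, gsdb_pred)
    wanted = {s["gdb_xref"] for s in gsdb if s["gdb_xref"] is not None}
    gdb, n2 = fetch(GDB_GENES, lambda g: g["accession"] in wanted)
    by_acc = {g["accession"]: g for g in gdb}
    pairs = [(s, by_acc[s["gdb_xref"]]) for s in gsdb if s["gdb_xref"] in by_acc]
    return pairs, n1 + n2
```

Both plans return the same pairs; only the number of transferred objects differs. Choosing which sub-query to evaluate first is the part that requires class sizes and selectivity estimates, which this sketch sidesteps by fixing the order in advance.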
Inter-database links are known connections between heterogeneous databases that are recorded in the database directory together with the metadata on the individual databases. An example of such a link is the link between the Gene class in GSDB and the Gene class in GDB, represented by attribute gdb_xref of class Gene in GSDB; this attribute contains GDB accession numbers and thus indirectly points to GDB Gene objects. Following such a link allows the system to retrieve individual objects from a remote database, determined by the starting object of the link, rather than having to retrieve entire classes. In a sense, following inter-database links in a query fixes a predefined evaluation strategy and order for the query, rather than leaving the system to choose between multiple query evaluation plans.
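One possible way to picture such a directory entry and the act of following it is sketched below. This is an assumption-laden illustration rather than the OPM*QS directory format: the InterDatabaseLink record, the STORES dictionary, the follow function, and the GDB key name accession are hypothetical; only the gdb_xref attribute and the GSDB and GDB Gene classes come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterDatabaseLink:
    """Hypothetical directory record describing one inter-database link."""
    source_db: str
    source_class: str
    source_attr: str   # attribute of the source object holding the remote key
    target_db: str
    target_class: str
    target_key: str    # attribute of the target class matched against that key


# The GSDB Gene -> GDB Gene link carried by attribute gdb_xref.
GSDB_TO_GDB = InterDatabaseLink("GSDB", "Gene", "gdb_xref",
                                "GDB", "Gene", "accession")

# Toy stand-in for the remote database; the records are made up.
STORES = {
    ("GDB", "Gene"): [
        {"accession": "GDB:100", "symbol": "BRCA1"},
        {"accession": "GDB:200", "symbol": "TP53"},
    ],
}


def follow(link, obj):
    """Return the target objects reached from obj, looked up by key only."""
    key = obj.get(link.source_attr)
    if key is None:
        return []  # the starting object carries no cross-reference
    targets = STORES[(link.target_db, link.target_class)]
    return [t for t in targets if t[link.target_key] == key]
```

In a real system the list comprehension would correspond to a keyed lookup sent to the remote database, so that only the objects identified by the starting object are transferred.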
From the perspective of a user constructing OPM*QL queries, inter-database links should appear to be much the same as regular OPM abstract attributes (intra-database links), except that the result of following such a link will be an object in another database rather than an object in a different class of the same database. Thus, the database directory would associate an attribute name with each inter-database link, augmenting the attribute names already present in the OPM definition of a class; this name could then be used to include the inter-database link in attribute paths in a query. It should be noted that such links do not subsume the general multidatabase joins already implemented, but rather complement them: using a combination of multidatabase joins, inter-database links, and other locally performed data manipulations, it should be possible to express very general and efficient multidatabase queries.
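A minimal sketch of how a named inter-database link might take part in an attribute path follows. It does not use OPM*QL syntax, and it assumes a simplified model in which a path is a list of attribute names: the link name gdb_gene, the LINKS and OBJECTS dictionaries, the GDB attribute names, and the eval_path function are all hypothetical.

```python
# Hypothetical directory: for each (database, class), the extra attribute
# names that denote inter-database links, mapped to
# (key attribute, target database, target class, target key).
LINKS = {
    ("GSDB", "Gene"): {"gdb_gene": ("gdb_xref", "GDB", "Gene", "accession")},
}

# Toy object stores; the records are made up for illustration.
OBJECTS = {
    ("GSDB", "Gene"): [{"name": "TP53", "gdb_xref": "GDB:200"}],
    ("GDB", "Gene"): [{"accession": "GDB:200", "symbol": "TP53"}],
}


def eval_path(db, cls, obj, path):
    """Evaluate an attribute path starting at obj and return the values reached.

    Link names move to objects in another database. In this sketch an
    ordinary attribute may appear only as the last step of the path.
    """
    frontier = [(db, cls, obj)]
    for step, attr in enumerate(path):
        is_last = step == len(path) - 1
        reached = []
        for d, c, o in frontier:
            link = LINKS.get((d, c), {}).get(attr)
            if link is not None:
                key_attr, tdb, tcls, tkey = link
                key = o.get(key_attr)
                reached += [(tdb, tcls, t) for t in OBJECTS[(tdb, tcls)]
                            if key is not None and t[tkey] == key]
            elif is_last:
                reached.append((d, c, o[attr]))
            else:
                raise ValueError(f"{attr!r} is not a link and is not the last step")
        frontier = reached
    return [value for _, _, value in frontier]
```

Here the path ["gdb_gene", "symbol"] starts at a GSDB Gene object and ends at an attribute of the linked GDB Gene object, while the path ["name"] behaves as an ordinary local attribute access, so both kinds of step look the same to whoever writes the path.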