
Query Rewriting

OPM queries can be translated directly into one or several SQL queries by rewriting the OPM query conditions into SQL query conditions and by incorporating the necessary join conditions into the generated SQL queries, as proposed in [1]. Provided that all OPM query conditions can be translated into SQL conditions, query processing outside the DBMS is not needed.

Query rewriting outperforms the retrieval-method-based approach when the OPM queries involve simple conditions and only a small amount of data satisfies the query conditions. For more complex conditions, such as conditions involving set comparisons, translation into the SQL dialects of commercial DBMSs becomes more difficult and less efficient, since it requires generating complex SQL (sub)queries. For example, conditions involving attributes derived with aggregate functions, which entail the SQL group by and having constructs, must be translated into several SQL subqueries and therefore involve temporary relations for intermediate query results. Query processing involving temporary relations is usually inefficient when large amounts of data are involved. Furthermore, if the resulting SQL queries involve several levels of nested subqueries or multiple joins, the execution of such queries is very slow and therefore undesirable in a production system. Generating SQL queries with numerous joins is particularly problematic when some of the joins are outerjoins: a mix of left outerjoins and regular joins is not associative in general, so the result depends on the order in which the joins are evaluated, and this join ordering cannot be expressed in a single SQL query in DBMSs such as Oracle and Sybase.
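The non-associativity of mixed left outerjoins and regular joins can be seen on a tiny example. The following is only a sketch: the tables a, b, and c are hypothetical, and SQLite (through Python's sqlite3 module) stands in for the DBMSs named above solely because it allows the second join order to be written as a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (10, 1);
    -- c is deliberately left empty
""")

# Order 1: (a LEFT JOIN b) JOIN c.
# The final regular join discards every row, because c is empty.
left_first = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    JOIN c ON c.b_id = b.id
    ORDER BY a.id
""").fetchall()

# Order 2: a LEFT JOIN (b JOIN c).
# The regular join is evaluated first (as a derived table), so the
# outerjoin still preserves every row of a, padded with NULLs.
inner_first = conn.execute("""
    SELECT a.id, bc.b_id, bc.c_id
    FROM a
    LEFT JOIN (
        SELECT b.a_id AS a_id, b.id AS b_id, c.id AS c_id
        FROM b JOIN c ON c.b_id = b.id
    ) AS bc ON bc.a_id = a.id
    ORDER BY a.id
""").fetchall()

print(left_first)   # no rows
print(inner_first)  # one NULL-padded row per row of a
```

The two groupings return different results on the same data, which is why a generated query must be able to fix the join order when outerjoins are present.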

Finally, an OPM query can be translated into different SQL queries that are semantically equivalent but perform differently. For example, an OPM query involving disjunctive conditions can be translated into (i) an SQL query involving a union of subqueries, (ii) an SQL query involving select distinct (for removing duplicates) and or conditions, or (iii) an SQL query involving nested subqueries. The performance of these three queries differs significantly and depends on the underlying DBMS (e.g., for Sybase the first query performs best). Such DBMS-dependent performance issues are not addressed in research papers on query optimization.
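The three translations of a disjunctive condition can be sketched as follows. The schema (gene, gene_alias) and the condition "alias is 'p53' or 'TP53'" are hypothetical and not taken from any OPM schema; SQLite is used only to check that the three forms return the same answer, and it says nothing about their relative performance on Sybase or Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE gene_alias (gene_id INTEGER, name TEXT);
    INSERT INTO gene VALUES (1, 'G1'), (2, 'G2'), (3, 'G3');
    INSERT INTO gene_alias VALUES
        (1, 'p53'), (1, 'TP53'), (2, 'TP53'), (3, 'BRCA1');
""")

queries = {
    # (i) union of subqueries; UNION itself removes duplicates
    "union": """
        SELECT g.id FROM gene g
        JOIN gene_alias a ON a.gene_id = g.id WHERE a.name = 'p53'
        UNION
        SELECT g.id FROM gene g
        JOIN gene_alias a ON a.gene_id = g.id WHERE a.name = 'TP53'
    """,
    # (ii) a single join with OR; DISTINCT removes the duplicate
    # produced by gene 1, which matches both disjuncts
    "distinct_or": """
        SELECT DISTINCT g.id FROM gene g
        JOIN gene_alias a ON a.gene_id = g.id
        WHERE a.name = 'p53' OR a.name = 'TP53'
    """,
    # (iii) nested subqueries, one per disjunct
    "nested": """
        SELECT g.id FROM gene g
        WHERE g.id IN (SELECT gene_id FROM gene_alias WHERE name = 'p53')
           OR g.id IN (SELECT gene_id FROM gene_alias WHERE name = 'TP53')
    """,
}

# Sort in Python so the comparison does not depend on row order.
results = {
    label: sorted(row[0] for row in conn.execute(sql))
    for label, sql in queries.items()
}
print(results)
```

All three forms select genes 1 and 2 here; which one a translator should emit is a per-DBMS tuning decision, as the text notes.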

An additional cost of the query rewriting approach is the conversion of relational query results into an object-oriented representation (see step 3 above). For example, when an SQL query is used for retrieving instances together with several set-valued attributes, the returned result contains, for each instance, the cross product of the values of its set-valued attributes. Further processing is needed for converting such a result into class instances with proper attribute values.
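The conversion step can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the tables gene, gene_alias, and gene_ref, the Gene class, and the folding loop are hypothetical and are not the OPM implementation.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Gene:
    """Hypothetical class instance with two set-valued attributes."""
    oid: int
    aliases: set = field(default_factory=set)
    citations: set = field(default_factory=set)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gene (id INTEGER PRIMARY KEY);
    CREATE TABLE gene_alias (gene_id INTEGER, name TEXT);
    CREATE TABLE gene_ref (gene_id INTEGER, citation TEXT);
    INSERT INTO gene VALUES (1), (2);
    INSERT INTO gene_alias VALUES (1, 'p53'), (1, 'TP53');
    INSERT INTO gene_ref VALUES (1, 'R1'), (1, 'R2'), (1, 'R3'), (2, 'R4');
""")

# One query retrieves the instances together with both set-valued
# attributes. Gene 1 comes back as 2 aliases x 3 citations = 6 rows.
rows = conn.execute("""
    SELECT g.id, a.name, r.citation
    FROM gene g
    LEFT JOIN gene_alias a ON a.gene_id = g.id
    LEFT JOIN gene_ref r ON r.gene_id = g.id
    ORDER BY g.id
""").fetchall()

# Fold the flat rows back into one object per instance; the sets
# absorb the repetitions introduced by the cross product.
instances = {}
for oid, alias, citation in rows:
    gene = instances.setdefault(oid, Gene(oid))
    if alias is not None:
        gene.aliases.add(alias)
    if citation is not None:
        gene.citations.add(citation)

print(len(rows), instances)
```

The number of returned rows grows with the product of the attribute set sizes, while the number of distinct values grows only with their sum, which is the source of the extra transfer and post-processing cost.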


