SDM

Query Estimator

The Query Estimator (QE) is one of the three main components of the Storage Manager module; the other two are the Query Monitor and the Cache Manager. It is responsible for determining, for each query request, which events and files need to be accessed. The QE has two main components: the Index Constructor and the Query Processor.

The Index Constructor takes as input an "event table" (also referred to as an n-tuple), a "directory of files", and the assignment of events to files, and generates the indexes needed to determine which events qualify for a query. A specialized index, called a "compressed bit-sliced" index, is constructed for quick (real-time) estimation of the number of events that qualify for a given query.

The Query Processor uses the indexes to respond to queries. All query requests from the Query Object are submitted to the QE. There are three types of queries:

  1. SELECT. This query type submits predicate conditions on the properties of the events (such as "number-of-pions > 2 AND 2 < energy-level < 5"). The QE uses its indexes to generate an event-OID set, which is returned to the Query Object. We denote the returned set of event OIDs by {EID}. A second form of SELECT submits either predicate conditions or an event-OID set {EID}, together with one or more properties (P1, P2, …, Pn). The QE returns a set of tuples {(EID1, P11, P21, …, Pn1), (EID2, P12, P22, …, Pn2), …} for the events that qualify.
  2. EXECUTE. This query type submits either predicate conditions or an event-OID set, and asks the Storage Manager to execute the request. The QE uses its indexes to generate a file-OID set {FID} and, for each file, the event-OID set of the events in that file that qualify for the query. It passes this information to the Query Monitor (QM) along with the user ID and query ID. We use the following notation for what is passed to the QM: UID, QID {FID {EID}}.
  3. ESTIMATE. This query type submits either predicate conditions or an event-OID set. The QE returns estimate statistics to the Query Object. There are two levels of estimate, which we call ESTIMATE-QUICK and ESTIMATE-FULL. For ESTIMATE-QUICK, the QE uses only in-memory indexes to estimate the minimum and maximum number of events that qualify, the number of files that have to be cached, and the total number of MB that have to be cached. For ESTIMATE-FULL, the QE uses secondary indexes (on disk) to determine the precise number of events that qualify, as well as the file statistics mentioned above. In addition, it provides the percentage of events that qualify in the files, as well as a time estimate given what is in the cache at that time.
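
The shapes of the three results can be illustrated with a small Python sketch. It is not the QE implementation: it scans a toy in-memory event table instead of using the compressed bit-sliced or secondary indexes, it accepts only predicates (not an event-OID set), and it covers only the exact, ESTIMATE-FULL style of statistics. All names and values (EVENTS, EVENT_FILE, FILE_SIZE_MB, MB_PER_SECOND, and the function names) are hypothetical, and reporting the qualifying percentage per file is an assumption about what "in the files" means.

```python
from collections import defaultdict

# Hypothetical toy data; none of these names or values come from the QE itself.
EVENTS = {  # event table (n-tuple): EID -> properties
    1: {"number_of_pions": 3, "energy_level": 4.0},
    2: {"number_of_pions": 1, "energy_level": 3.0},
    3: {"number_of_pions": 5, "energy_level": 2.5},
    4: {"number_of_pions": 4, "energy_level": 6.0},
    5: {"number_of_pions": 3, "energy_level": 4.5},
}
EVENT_FILE = {1: "F1", 2: "F1", 3: "F2", 4: "F2", 5: "F3"}  # EID -> FID
FILE_SIZE_MB = {"F1": 100, "F2": 250, "F3": 50}
MB_PER_SECOND = 10.0  # assumed transfer rate for files not yet in cache


def qualifies(props):
    """Example predicate: number-of-pions > 2 AND 2 < energy-level < 5."""
    return props["number_of_pions"] > 2 and 2 < props["energy_level"] < 5


def select(predicate, properties=()):
    """SELECT: return {EID}, or tuples (EID, P1, ..., Pn) if properties are given."""
    eids = sorted(eid for eid, props in EVENTS.items() if predicate(props))
    if not properties:
        return eids
    return [(eid, *(EVENTS[eid][p] for p in properties)) for eid in eids]


def execute(uid, qid, predicate):
    """EXECUTE: build UID, QID {FID {EID}}, the message passed to the Query Monitor."""
    per_file = defaultdict(list)
    for eid in select(predicate):
        per_file[EVENT_FILE[eid]].append(eid)
    return uid, qid, dict(per_file)


def estimate_full(predicate, cached_files=frozenset()):
    """ESTIMATE-FULL style statistics, computed here by a full scan."""
    _, _, per_file = execute(None, None, predicate)
    events_in_file = defaultdict(int)
    for fid in EVENT_FILE.values():
        events_in_file[fid] += 1
    # Only files that are not already cached contribute to the time estimate.
    to_fetch_mb = sum(FILE_SIZE_MB[f] for f in per_file if f not in cached_files)
    return {
        "events": sum(len(eids) for eids in per_file.values()),
        "files": len(per_file),
        "total_mb": sum(FILE_SIZE_MB[f] for f in per_file),
        "pct_qualifying": {
            f: 100.0 * len(eids) / events_in_file[f] for f, eids in per_file.items()
        },
        "seconds": to_fetch_mb / MB_PER_SECOND,
    }


print(select(qualifies))               # [1, 3, 5]
print(execute("u1", "q1", qualifies))  # ('u1', 'q1', {'F1': [1], 'F2': [3], 'F3': [5]})
print(estimate_full(qualifies, cached_files={"F1"}))
```

In this toy setup, three events in three files qualify. With F1 already cached, only F2 and F3 (300 MB) count toward the time estimate, which is why the estimate depends on the cache contents at the moment the query is issued.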