Historically, bitmap indexing has provided an important database
capability to accelerate queries. However, only a few database systems
have implemented these indexes because of the difficulties of modifying
fundamental assumptions in the low-level design of a database system and
in the expectations of customers, both of which have developed in an
environment that does not support bitmap indexes. Another problem that
arises, and one that may more easily be addressed by a research article,
is that there is no definitive design for bitmap indexes; bitmap index
designs in Oracle, Sybase IQ, Vertica and MODEL 204 are idiosyncratic,
and some of them were designed for older machine architectures.
To investigate an efficient design on modern processors, this paper provides details of the Set Query benchmark and a comparison of two research implementations of bitmap indexes. One, called RIDBit, uses the N-ary storage model to organize table rows, and implements a strategy that gracefully switches between the well-known B-tree RID-list structure and a bitmap structure. The other, called FastBit is based on vertical organization of the table data, where all columns are individually stored. It implements a compressed bitmap index, with a linear organization of the bitmaps to optimize disk accesses. Through this comparison, we evaluate the pros and cons of various design choices. Our analysis adds a number of subtleties to the conventional indexing wisdom commonly quoted in the database community.