Using Bitmap Indexing Technology for Combined Numerical and Text Queries

Kurt Stockinger
John Cieslewicz
Kesheng Wu
Doron Rotem
Arie Shoshani


In this paper, we describe a strategy of using compressed bitmap indices to speed up queries on both numerical data and text documents. With an efficient compression algorithm, these bitmap indices remain compact even when they cover millions of distinct terms. Moreover, bitmap indices can be used very efficiently to answer Boolean queries over text documents involving multiple query terms. Existing inverted indices for text searches are usually inefficient for corpora with a very large number of terms as well as for queries involving a large number of hits. We demonstrate that our compressed bitmap index technology overcomes both of those shortcomings. In a performance comparison against a commonly used database system, our indices answer queries 30 times faster on average. To provide full SQL support, we integrated our indexing software, called FastBit, with MonetDB. The integrated system, MonetDB/FastBit, not only provides efficient searches on a single table, as FastBit does, but also answers join queries efficiently. Furthermore, MonetDB/FastBit provides a very efficient mechanism for retrieving result records.
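
To make the idea of a combined Boolean query concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not FastBit's API or the compression scheme used in the paper: the bitmaps are plain uncompressed Python integers, and the helper names (build_term_bitmaps, build_range_bitmap, rows_of) and the toy table are hypothetical. Each term and each numerical range gets one bitmap with one bit per row, so a query over several conditions reduces to bitwise AND and OR operations.

```python
from collections import defaultdict


def build_term_bitmaps(docs):
    """Map each term to an int whose bit i is set when row i contains the term."""
    bitmaps = defaultdict(int)
    for row, text in enumerate(docs):
        for term in set(text.lower().split()):
            bitmaps[term] |= 1 << row
    return bitmaps


def build_range_bitmap(values, low, high):
    """Return a bitmap of the rows whose numerical value lies in [low, high)."""
    bitmap = 0
    for row, value in enumerate(values):
        if low <= value < high:
            bitmap |= 1 << row
    return bitmap


def rows_of(bitmap):
    """Decode a bitmap into an ascending list of row numbers."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical toy table: one text column and one numerical column.
docs = [
    "bitmap index compression",
    "inverted index for text search",
    "bitmap index for text search",
    "join queries in a column store",
]
years = [2005, 2006, 2007, 2008]

terms = build_term_bitmaps(docs)

# Boolean query: text contains "bitmap" AND "text", AND 2006 <= year < 2009.
hits = terms["bitmap"] & terms["text"] & build_range_bitmap(years, 2006, 2009)
print(rows_of(hits))
```

In this sketch, adding another query term costs one more bitwise operation over the bitmaps, which is why multi-term Boolean queries map naturally onto this kind of index. A production index would keep the bitmaps compressed and operate on the compressed form.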

Full text of LBNL-61768 (PDF)

In Data Warehousing and Data Analysis, Annals of Information Systems, Vol. 3, pages 1-23, 2008.
