
FastBit IBIS Command Line Interface Document

 

IBIS [1] is an implementation of a bitmap indexing system named FastBit. This document explains the command line tool named ibis, which is shorthand for Interactive Bitmap Index Search. Under FastBit, user data is organized into containers called tables, and each table consists of an arbitrary number of rows and columns. In relational database terminology, rows and columns are also called tuples and attributes. In this document, we use the terms attribute and column interchangeably.

A table is physically organized into one or more data partitions, so that one column from a partition can comfortably fit in computer memory. Each data partition is stored in a directory on the file system, and the command line tool ibis works with these data directories.

An example

Taking the dataset on falkland.jgi-psf.org as an example, the following command prints all the machine names (mchn) and temperature values (tmpr) where the temperature is not one of the nominal values (55 for MegaBACE and 60 for ABI).
/home/kwu/bin/ibis -c /home/kwu/bin/ibis.rc -v \
  -q "select mchn, tmpr where ! (tmpr == 55 || tmpr == 60)"
On this particular machine, the most current version of the ibis executable is /home/kwu/bin/ibis. The file name following the option -c is the name of the configuration file. Alternatively, one may specify the data directory directly on the command line using '-d data-directory-path'. This particular file contains the current version of the JGI trace data header information. The attribute names [2] are available in the data directories /psf/QC/Projects/IBIS/Datasets.

The main option is -q, which is followed by a query string. The basic syntax follows that of SQL; however, only the basic features of the SQL select statement are implemented. Here we first mention a few limitations that might cause ibis to fail without a descriptive error message.

The option -v tells ibis to be verbose. If this option is not supplied, only the number of hits and the result of the select clause are printed. The result of the select clause may be appended to a file instead of being printed to standard output; to do so, specify '-output output-file'.
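As a rough illustration of how the options described above fit together, the following Python sketch assembles an ibis argument list. It uses only the options named in this document (-c, -d, -v, -output, -q). The helper name build_ibis_command, the placeholder paths, and the order in which the options are emitted are assumptions made for this example, not part of ibis itself.

```python
import shlex


def build_ibis_command(query, executable="ibis", data_dir=None, config=None,
                       verbose=False, output_file=None):
    """Assemble an argument list from the options described in this document.

    Hypothetical helper: it only builds the list and never runs ibis.
    """
    argv = [executable]
    if config is not None:
        argv += ["-c", config]            # configuration file
    if data_dir is not None:
        argv += ["-d", data_dir]          # data directory given directly
    if verbose:
        argv.append("-v")                 # verbose output
    if output_file is not None:
        argv += ["-output", output_file]  # append select results to a file
    argv += ["-q", query]                 # the query string
    return argv


cmd = build_ibis_command(
    "select mchn, tmpr where ! (tmpr == 55 || tmpr == 60)",
    data_dir="data-directory-path",   # placeholder, not a real path
    output_file="output-file",        # placeholder, not a real file
)

# shlex.join quotes the query so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run (rather than a single string) would avoid any shell quoting issues with the '!' and '||' characters in the query.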

List of options

Here is the full list of options.

Query statement syntax

The command line tool ibis supports a limited version of the SQL select statement. It recognizes three key words: SELECT, FROM and WHERE. The key words are not case-sensitive, and neither are the names of variables and functions described next.

The key word SELECT must be followed by a list of attribute names or any of the four functions AVG, MAX, MIN and SUM, separated by commas (,). The attribute names must be from the available datasets. If a name is not in the available dataset, IBIS treats it the same as if no attribute were provided. If no attribute is provided, the key word SELECT must not be used, in which case only the number of hits is printed.

The four functions each take one argument, which must be a column name of an available dataset. The variables not appearing in any function are implicitly passed to a SQL 'GROUP BY' clause, and the functions are evaluated on the groups defined by this implicit 'GROUP BY' clause. For example, the select clause 'SELECT mchn, avg(q20), min(snra)' will order the selected records according to machine name (mchn), and for each machine the average Q20 score and the minimum SNRA value will be computed. In the printout, the number of selected records is printed at the end. This is equivalent to adding 'count(*)' to the end of every (non-empty) select clause.
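The implicit 'GROUP BY' behavior can be sketched in a few lines of Python. This is only an illustration of the semantics described above for 'SELECT mchn, avg(q20), min(snra)', not how IBIS computes the result; the rows and the function name select_grouped are made up for the example.

```python
from collections import defaultdict

# Hypothetical records; the values are invented for illustration only.
rows = [
    {"mchn": "A", "q20": 500.0, "snra": 30.0},
    {"mchn": "B", "q20": 700.0, "snra": 45.0},
    {"mchn": "A", "q20": 600.0, "snra": 25.0},
]


def select_grouped(rows):
    """Mimic 'SELECT mchn, avg(q20), min(snra)' with its implicit GROUP BY mchn."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["mchn"]].append(row)   # mchn is not inside a function, so it is a group key

    result = []
    for mchn in sorted(groups):           # groups come out ordered by machine name
        members = groups[mchn]
        avg_q20 = sum(r["q20"] for r in members) / len(members)
        min_snra = min(r["snra"] for r in members)
        # The trailing count plays the role of the implicit count(*).
        result.append((mchn, avg_q20, min_snra, len(members)))
    return result


for record in select_grouped(rows):
    print(record)
```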

The key word FROM must be followed by a list of table names. Conceptually, the data under the management of IBIS are organized into tables, and each table must have a name. The table names may contain the wild cards '%' and '_', where '%' matches zero or more characters and '_' matches exactly one character, as in a SQL "LIKE" expression. If no table name is specified, the key word FROM must not be used, in which case all known data tables are queried.
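To make the wild card rules concrete, here is a minimal Python sketch that translates a pattern containing '%' and '_' into a regular expression and filters a list of table names. The table names other than 200506 are invented, the function names are hypothetical, and matching is done case-sensitively, which is an assumption of this sketch rather than documented IBIS behavior.

```python
import re


def like_to_regex(pattern):
    """Translate SQL LIKE wild cards: '%' -> any run of characters, '_' -> one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))   # every other character matches itself
    return re.compile("".join(parts), re.DOTALL)


def match_tables(pattern, table_names):
    """Return the table names that match the whole pattern."""
    rx = like_to_regex(pattern)
    return [name for name in table_names if rx.fullmatch(name)]


tables = ["200505", "200506", "200606", "2005"]   # mostly made-up names
print(match_tables("2005%", tables))    # names starting with 2005
print(match_tables("20050_", tables))   # 20050 followed by exactly one character
```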

The key word WHERE must be followed by a set of range conditions joined by the logical operators 'AND', 'OR', 'XOR', and '!'. A range condition can be one-sided, as in "A = 5" or "B > 10", or two-sided, as in "10 <= B < 20". The supported operators are = (alternatively ==), <, <=, >, and >=. Range conditions that involve only one attribute with constant bounds are known as simple conditions, which can be processed very efficiently by IBIS. A range condition can also involve multiple attributes, such as "A < B <= 5", or even arithmetic expressions, such as "sin(A) + fabs(B) < sqrt(cx*cx+cy*cy)". Note that all one-argument and two-argument arithmetic functions available in math.h are supported. The key word WHERE and the conditions following it are essential to a query and cannot be omitted.
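The following toy Python sketch suggests why simple conditions suit a bitmap index: with one bitmap per distinct value, a condition such as "! (tmpr == 55 || tmpr == 60)" reduces to a few bitwise operations. This is a deliberately simplified, uncompressed illustration using Python integers as bit sets. It is not the index structure FastBit actually uses, and the column values and function names are made up.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when values[i] equals it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


tmpr = [55, 60, 56, 55, 0, 60, -8]      # hypothetical column values
index = build_bitmap_index(tmpr)

all_rows = (1 << len(tmpr)) - 1         # mask with one bit per row
nominal = index.get(55, 0) | index.get(60, 0)   # tmpr == 55 OR tmpr == 60
hits = all_rows & ~nominal              # the '!' negates within the mask

print(rows_of(hits))                    # rows with a non-nominal temperature
```

The mask all_rows keeps the negation confined to rows that exist, which is the same role the mask plays in the getRIDs messages of the sample output.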

The select clause may also contain arithmetic expressions, e.g.,

-q "select pressure, sqrt(vx*vx+vy*vy+vz*vz) where temperature > 1000"
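To spell out what such a query asks for, the sketch below evaluates the same expression over a few rows in plain Python. The rows are invented for illustration, the function name is hypothetical, and the code mirrors only the meaning of the query, not how ibis evaluates it.

```python
import math

# Hypothetical rows; the column names follow the query above, the values are made up.
rows = [
    {"pressure": 1.0, "vx": 3.0, "vy": 4.0, "vz": 0.0, "temperature": 1200.0},
    {"pressure": 2.0, "vx": 1.0, "vy": 2.0, "vz": 2.0, "temperature": 900.0},
    {"pressure": 3.0, "vx": 2.0, "vy": 3.0, "vz": 6.0, "temperature": 1500.0},
]


def select_pressure_and_speed(rows):
    """Apply 'where temperature > 1000', then compute the two selected expressions."""
    selected = []
    for r in rows:
        if r["temperature"] > 1000:
            speed = math.sqrt(r["vx"] ** 2 + r["vy"] ** 2 + r["vz"] ** 2)
            selected.append((r["pressure"], speed))
    return selected


for pressure, speed in select_pressure_and_speed(rows):
    print(pressure, speed)
```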

Sample output

Here is a sample output produced on June 13, 2005 using the sample command shown above.
ibis::part::readTDC (Tue Jun 14 03:30:32 2005 UTC) --- failed to open file "/home/xhe/TRAC/tmp/dir1/-part.txt" -- No such file or directory
buildTables: directory /home/xhe/TRAC/tmp/dir1 does not contain a valid -part.txt file or contains an empty table.
Make sure the parameter Number_of_events has the correct value
Completed construction of an ibis::part named 200506.
1766540 records each with 36 attributes


./ibis: batch mode, log level 1

Tables:
200506

Query:
select mchn, tmpr where ! (tmpr == 55 or tmpr == 60)

doQuery (Tue Jun 14 03:31:13 2005 UTC) --- processing query ! (tmpr == 55 or tmpr == 60) on table 200506
query[LT55J4AkJu400000]::setWhereClause -- WHERE "! (tmpr == 55 or tmpr == 60)"
query[LT55J4AkJu400000]::setSelectClause -- SELECT mchn,tmpr
ibis::column[200506.Tmpr](INT)::readIndex -- the basic index was read (/home/xhe/TRAC/tmp/dir1/200506) in 0 sec(CPU), 0.00101352 sec(elapsed)
query[LT55J4AkJu400000]::estimate -- time to compute the bounds: 0.01 sec(CPU), 0.00476098 sec(elapsed).
query[LT55J4AkJu400000]::estimate -- # of hits for query "! (tmpr == 55 or tmpr == 60)" is 0
doQuery (Tue Jun 14 03:31:13 2005 UTC) --- number of hits in [0, 0]
query[LT55J4AkJu400000]::evaluate -- time to compute the 15744 hits: 0 sec(CPU), 0.000848532 sec(elapsed).
query[LT55J4AkJu400000]::evaluate -- user kwu SELECT mchn,tmpr FROM 200506 WHERE ! (tmpr == 55 or tmpr == 60) ==> 15744 hit(s).
doQuery (Tue Jun 14 03:31:13 2005 UTC) --- number of hits = 15744
doQuery:: evaluate(! (tmpr == 55 or tmpr == 60)) took 0.00626969 sec(elapsed)
ibis::part[200506]::getRIDs -- number of RIDs (0) does not match the size of the mask (1766540)
Tue Jun 14 03:31:13 2005 UTC
Warning -- query[LT55J4AkJu400000]::getRIDs -- got 0 row IDs from table 200506, expected 15744
ibis::part[200506]::getRIDs -- number of RIDs (0) does not match the size of the mask (1766540)
ibis::column[200506.MCHN](KEY)::readIndex -- the basic index was read (/home/xhe/TRAC/tmp/dir1/200506) in 0 sec(CPU), 0.000895739 sec(elapsed)
Query LT55J4AkJu400000 produces 32 distinct tuples of columns mchn,tmpr
MegaBACE # MB424 -128 
MegaBACE # MB424 -120 
MegaBACE # MB424 -112 
MegaBACE # MB424 -104 
MegaBACE # MB424 -96 
MegaBACE # MB424 -88 
MegaBACE # MB424 -80 
MegaBACE # MB424 -72 
MegaBACE # MB424 -64 
MegaBACE # MB424 -56 
MegaBACE # MB424 -48 
MegaBACE # MB424 -40 
MegaBACE # MB424 -32 
MegaBACE # MB424 -24 
MegaBACE # MB424 -16 
MegaBACE # MB424 -8 
MegaBACE # MB424 0 
MegaBACE # MB424 8 
MegaBACE # MB424 16 
MegaBACE # MB424 24 
MegaBACE # MB424 32 
MegaBACE # MB424 40 
MegaBACE # MB424 48 
MegaBACE # MB424 56 
MegaBACE # MB424 64 
MegaBACE # MB424 72 
MegaBACE # MB424 80 
MegaBACE # MB424 88 
MegaBACE # MB424 96 
MegaBACE # MB424 104 
MegaBACE # MB424 112 
MegaBACE # MB424 120 
Cleaning up table 200506
Cleaning up the file manager
Total pages accessed through read(unistd.h) is 145

The number of hits is printed in the following line:

query[LT55J4AkJu400000]::evaluate -- user ... ==> 15744 hit(s).

The SELECT clause produced the output under the following heading:

Query LT55J4AkJu400000 produces 32 distinct tuples of columns mchn,tmpr

In this particular case, it prints the name of the machine with the abnormal temperature, 'MegaBACE # MB424', and the abnormal temperature values, which incidentally all appear to be multiples of 8.
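When the verbose output is post-processed by a script, the two lines discussed above can be picked out with regular expressions. The Python sketch below parses a shortened excerpt of the sample output (only three of the 32 result rows are included). It assumes the log lines keep exactly the form shown in this document, which is not a guaranteed interface, and the function name parse_ibis_output is hypothetical.

```python
import re

# Shortened excerpt of the sample output above; most result rows are left out.
sample = """\
query[LT55J4AkJu400000]::evaluate -- user kwu SELECT mchn,tmpr FROM 200506 WHERE ! (tmpr == 55 or tmpr == 60) ==> 15744 hit(s).
Query LT55J4AkJu400000 produces 32 distinct tuples of columns mchn,tmpr
MegaBACE # MB424 -128
MegaBACE # MB424 -120
MegaBACE # MB424 120
Cleaning up table 200506
"""


def parse_ibis_output(text):
    """Return (number of hits, list of (machine name, temperature)) from verbose output."""
    hits = None
    rows = []
    in_rows = False
    for line in text.splitlines():
        m = re.search(r"==> (\d+) hit\(s\)", line)
        if m:
            hits = int(m.group(1))
            continue
        if re.match(r"Query \S+ produces \d+ distinct tuples", line):
            in_rows = True                 # result rows start after this heading
            continue
        if in_rows:
            if line.startswith("Cleaning up") or not line.strip():
                in_rows = False            # end of the result rows
                continue
            # Assumes the last whitespace-separated field is the integer tmpr value.
            name, value = line.strip().rsplit(None, 1)
            rows.append((name, int(value)))
    return hits, rows


hits, rows = parse_ibis_output(sample)
print(hits, rows)
print(all(value % 8 == 0 for _, value in rows))   # checks the multiple-of-8 observation
```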

Endnotes

  1. FastBit IBIS has been released as open source software under the LGPL license. You may download a free copy from the FastBit download page.
  2. Additional documentation about FastBit can be found online at the FastBit website. Questions may be directed to [email protected] (free subscription required).