Here are two sets of sample data from a high-energy physics experiment called STAR. The first dataset, star2000, was used in a number of earlier performance measurements involving FastBit, e.g., CIKM 2001 and SSDBM 2002. The second dataset, star2002, is larger: it contains more rows and more columns.
This page also contains a few examples of using star2000 and using star2002. More general instructions are in dataLoading.html.
name | description | binary | CSV |
star2000 | 12 columns, 2173762 rows | star2000.tgz.bz2 | star2000.csv.gz |
star2002 | 16 columns, 15857625 rows | all 15857625 rows in one CSV file; in three separate parts (part 1, part 2, part 3) |
star» time ../fastbit-b0.9.6/examples/ardea -d s0 -m "charge:f, clus:i, dst:i, hist:i, enumber:i, etime:d, rnumber:i, nlb:i, qxb:f, tracks:i, vertex:f, zdc:i" -t star2000.csv -v
/data/kewu/fastbit-b0.9.6/examples/.libs/lt-ardea: verbose level 1
Will parse 1 CSV file with the following column names and types
charge:f, clus:i, dst:i, hist:i, enumber:i, etime:d, rnumber:i, nlb:i, qxb:f, tracks:i, vertex:f, zdc:i
/data/kewu/fastbit-b0.9.6/examples/.libs/lt-ardea start reading CSV file star2000.csv ...
ibis::tafel::write completed writing partition s0 (user-supplied data parsed by ardea.cpp) with 12 columns and 2173762 rows (total 2173762) to s0 using 0.586911 sec(CPU) and 1.38501 sec(elapsed)
Completed construction of an ibis::part named s0 (user-supplied data parsed by ardea.cpp)
2173762 rows each with 12 columns
ibis::mensa -- constructed table T-s0 (s0) from directory s0. It consists of 1 partition with 12 columns and 2173762 rows
-- begin printing table --
Table (on disk) T-s0 (s0) consists of 1 partition with 12 columns and 2173762 rows
charge FLOAT
clus INT
dst INT
enumber INT
etime DOUBLE
hist INT
nlb INT
qxb FLOAT
rnumber INT
tracks INT
vertex FLOAT
zdc INT
266, 909, 159625, 2635, 20000827.0117590018, 159627, 1341, -26.399744, 1239029, 1228, 0.56000537, 48
317, 1243, 159625, 2636, 20000827.0117590018, 159627, 1470, -29.07542, 1239029, 1415, 0.4644182, 53
281, 1285, 159625, 2637, 20000827.0117590018, 159627, 1663, -6.7535419, 1239029, 1533, 0.53597438, 8
204, 1198, 159625, 2638, 20000827.0117590018, 159627, 1806, 11.985353, 1239029, 1123, 0.66897142, 32
288, 988, 159625, 2639, 20000827.0117999986, 159627, 1409, -12.928666, 1239029, 1314, 0.58498305, 39
251, 1232, 159625, 2640, 20000827.0117999986, 159627, 930, 26.815605, 1239029, 1153, 0.53236401, 54
281, 1040, 159625, 2641, 20000827.0117999986, 159627, 916, 9.3687754, 1239029, 1166, 0.57459098, 52
194, 1286, 159625, 2642, 20000827.0117999986, 159627, 1495, 0, 1239029, 1027, 0.66013932, 25
184, 907, 159625, 2643, 20000827.011800997, 159627, 1852, 0, 1239029, 1104, 0.52850509, 54
207, 1143, 159625, 2644, 20000827.011800997, 159627, 730, -12.205997, 1239029, 990, 0.59274352, 47
-- 2173752 skipped...
-- end printing table --
14.217u 1.377s 0:16.39 95.0% 0+0k 0+0io 0pf+0w
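The -m option in the ardea run above takes a comma-separated list of name:type pairs, where the printed table suggests f maps to FLOAT, i to INT, and d to DOUBLE. As a minimal sketch, the argument list could be assembled in Python as follows. The helper name ardea_args and the constant STAR2000_COLUMNS are made up for this illustration; only the flags -d, -m, -t and -v are taken from the transcript.

```python
import shlex

# Column names and type codes copied from the -m option in the transcript
# (per the printed table: f = FLOAT, i = INT, d = DOUBLE).
STAR2000_COLUMNS = [
    ("charge", "f"), ("clus", "i"), ("dst", "i"), ("hist", "i"),
    ("enumber", "i"), ("etime", "d"), ("rnumber", "i"), ("nlb", "i"),
    ("qxb", "f"), ("tracks", "i"), ("vertex", "f"), ("zdc", "i"),
]


def ardea_args(ardea, out_dir, csv_file, columns):
    """Build an argument list mirroring the ardea invocation shown above.

    Hypothetical helper: it only formats arguments and does not run anything.
    """
    spec = ", ".join(f"{name}:{code}" for name, code in columns)
    return [ardea, "-d", out_dir, "-m", spec, "-t", csv_file, "-v"]


if __name__ == "__main__":
    args = ardea_args("../fastbit-b0.9.6/examples/ardea", "s0",
                      "star2000.csv", STAR2000_COLUMNS)
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(args))
```

The resulting list can be passed to subprocess.run if the ardea executable is available at the given path.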
star» ../fastbit-b0.9.6/examples/ibis -d s0 -v -q "select avg(charge) where zdc>90"
/data/kewu/fastbit-b0.9.6/examples/.libs/lt-ibis: batch mode, log level 1
ibis::resource::read -- Reading configuration file "ibis.rc"
Completed construction of an ibis::part named s0 (user-supplied data parsed by ardea.cpp)
2173762 rows each with 12 columns
query[Pkkm=4YZDmK00000]::setWhereClause -- WHERE "zdc>90"
query[Pkkm=4YZDmK00000]::estimate -- time to compute the bounds: 0.802879 sec(CPU) and 0.87511 sec(elapsed).
query[Pkkm=4YZDmK00000]::estimate -- # of hits for query "zdc>90" is 5536
query[Pkkm=4YZDmK00000]::evaluate -- time to compute the 5536 hits: 0 sec(CPU) and 1.50204e-05 sec(elapsed).
query[Pkkm=4YZDmK00000]::evaluate -- user kewu SELECT AVG(charge) FROM s0 WHERE zdc>90 ==> 5536 hits.
Bundle1 Pkkm=4YZDmK00000 has 1 distinct value
AVG(charge) (with counts)
17.1549855491, 5536
doQuery:: evaluate(SELECT avg(charge) FROM s0 WHERE zdc>90) produced 5536 hits, took 0.829874 CPU seconds and 0.901902 elapsed seconds
star» ../fastbit-b0.9.6/examples/ibis -d s0 -v -q "select avg(charge) where zdc>90"
/data/kewu/fastbit-b0.9.6/examples/.libs/lt-ibis: batch mode, log level 1
ibis::resource::read -- Reading configuration file "ibis.rc"
Completed construction of an ibis::part named s0 (user-supplied data parsed by ardea.cpp)
2173762 rows each with 12 columns
query[Pkkm=4YZDq000000]::setWhereClause -- WHERE "zdc>90"
query[Pkkm=4YZDq000000]::estimate -- time to compute the bounds: 0.003999 sec(CPU) and 0.00354004 sec(elapsed).
query[Pkkm=4YZDq000000]::estimate -- # of hits for query "zdc>90" is 5536
query[Pkkm=4YZDq000000]::evaluate -- time to compute the 5536 hits: 0 sec(CPU) and 1.81198e-05 sec(elapsed).
query[Pkkm=4YZDq000000]::evaluate -- user kewu SELECT AVG(charge) FROM s0 WHERE zdc>90 ==> 5536 hits.
Bundle1 Pkkm=4YZDq000000 has 1 distinct value
AVG(charge) (with counts)
17.1549855491, 5536
doQuery:: evaluate(SELECT avg(charge) FROM s0 WHERE zdc>90) produced 5536 hits, took 0.030996 CPU seconds and 0.0307651 elapsed seconds
star» ../fastbit-b0.9.6/examples/ibis -d s0 -v -q "where zdc>90"
/data/kewu/fastbit-b0.9.6/examples/.libs/lt-ibis: batch mode, log level 1
ibis::resource::read -- Reading configuration file "ibis.rc"
Completed construction of an ibis::part named s0 (user-supplied data parsed by ardea.cpp)
2173762 rows each with 12 columns
query[Pkkm=4YZE1a00000]::setWhereClause -- WHERE "zdc>90"
query[Pkkm=4YZE1a00000]::estimate -- time to compute the bounds: 0.002999 sec(CPU) and 0.00353098 sec(elapsed).
query[Pkkm=4YZE1a00000]::estimate -- # of hits for query "zdc>90" is 5536
query[Pkkm=4YZE1a00000]::evaluate -- time to compute the 5536 hits: 0 sec(CPU) and 1.78814e-05 sec(elapsed).
query[Pkkm=4YZE1a00000]::evaluate -- user kewu FROM s0 WHERE zdc>90 ==> 5536 hits.
doQuery:: evaluate( FROM s0 WHERE zdc>90) produced 5536 hits, took 0.004999 CPU seconds and 0.00481892 elapsed seconds
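To cross-check a query such as "select avg(charge) where zdc>90" without FastBit, a plain scan over the CSV file is enough. The following is a minimal sketch, not part of FastBit. It assumes the CSV has no header row and that its columns appear in the order given to the -m option above. The function name avg_charge_where_zdc_gt is hypothetical, and the three rows in the demo are made-up values, not taken from star2000.

```python
import csv

# Column order as given to ardea's -m option (assumed to match the CSV file).
COLUMNS = ["charge", "clus", "dst", "hist", "enumber", "etime",
           "rnumber", "nlb", "qxb", "tracks", "vertex", "zdc"]


def avg_charge_where_zdc_gt(lines, threshold):
    """Scan CSV lines and return (average charge, hit count) for zdc > threshold.

    `lines` is any iterable of strings, e.g. an open text file.
    Returns (None, 0) when no row satisfies the condition.
    """
    charge_idx = COLUMNS.index("charge")
    zdc_idx = COLUMNS.index("zdc")
    total = 0.0
    hits = 0
    for row in csv.reader(lines):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        if int(row[zdc_idx]) > threshold:
            total += float(row[charge_idx])
            hits += 1
    return (total / hits if hits else None), hits


if __name__ == "__main__":
    # Made-up rows for illustration only; replace with open("star2000.csv").
    sample = [
        "10.0, 1, 2, 3, 4, 5.0, 6, 7, 8.0, 9, 0.5, 95",
        "20.0, 1, 2, 3, 4, 5.0, 6, 7, 8.0, 9, 0.5, 10",
        "30.0, 1, 2, 3, 4, 5.0, 6, 7, 8.0, 9, 0.5, 91",
    ]
    print(avg_charge_where_zdc_gt(sample, 90))
```

Such a scan reads every row, so it is useful only as a correctness check against the hit count and average reported by ibis, not as a performance comparison.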
One thing to note is that processing each of the three files takes about one minute on this particular test machine. We intentionally repeated the append operations a number of times, and the time required to process the files varies within a fairly narrow range.
FastBit programs attempt to use no more than half of the physical memory available on a machine. If your machine happens to have 2 GB of memory, you may put the following line in a file and pass the file name as the argument to the -c option of the ardea executable.
fileManager.maxBytes=1500Mb
In this particular example, the file containing the above line was called 'ibis.rc'.
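If the configuration file is generated by a script, the line above can be written with a few lines of Python. This is only a sketch: the helper names rc_line and write_rc are invented for this example, and the only FastBit-specific detail used is the fileManager.maxBytes key with the Mb suffix, exactly as it appears above.

```python
from pathlib import Path


def rc_line(max_megabytes):
    """Format the memory-limit setting in the same form as the line above."""
    return f"fileManager.maxBytes={max_megabytes}Mb"


def write_rc(path, max_megabytes):
    """Write a one-line configuration file (hypothetical helper)."""
    Path(path).write_text(rc_line(max_megabytes) + "\n")


# Example call (not executed here): write_rc("ibis.rc", 1500)
```

The file name written this way is then what gets passed to the -c option.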