/Users/john/src/ibis/examples/ardea.cpp File Reference

This is a simple test program for functions defined in ibis::tablex.

#include "ibis.h"
#include <set>
#include <memory>
#include <iomanip>

Typedefs
typedef std::set< const char *, ibis::lessi >	qList
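The qList typedef stores C strings in a std::set ordered by ibis::lessi, whose name suggests a case-insensitive comparison. The sketch below illustrates that idea without depending on FastBit. LessNoCase, NameSet and countDistinct are hypothetical names introduced here for illustration; they are not part of the library, and the actual behavior of ibis::lessi may differ in detail.

```cpp
#include <cctype>
#include <cstddef>
#include <set>

// Hypothetical stand-in for ibis::lessi: orders C strings ignoring ASCII case.
struct LessNoCase {
    bool operator()(const char* a, const char* b) const {
        for (; *a != '\0' && *b != '\0'; ++a, ++b) {
            const int ca = std::tolower(static_cast<unsigned char>(*a));
            const int cb = std::tolower(static_cast<unsigned char>(*b));
            if (ca != cb) return ca < cb;
        }
        // Equal up to the shorter length: the shorter string sorts first.
        return *a == '\0' && *b != '\0';
    }
};

// Same shape as the qList typedef, using the stand-in comparator.
typedef std::set<const char*, LessNoCase> NameSet;

// Counts the names that remain distinct when case is ignored.
inline std::size_t countDistinct(const char* const* names, std::size_t n) {
    NameSet unique(names, names + n);
    return unique.size();
}
```

With a comparator of this kind, strings that differ only in letter case are treated as the same set element, so a repeated clause is stored once.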

Functions
int	main (int argc, char **argv)

Detailed Description

This is a simple test program for functions defined in ibis::tablex.

The user may specify a set of records to be read either by combining the -m option (for metadata, i.e., column names and types) with the -t or -r options, or by naming a SQL dump file. Option -t specifies the name of a text/CSV file and option -r specifies a row of text/CSV data on the command line. A SQL dump file is specified with '-sqldump filename'.

The caller may further specify a number of queries to be run on the data after they are written to disk.

If the directory specified with the -d option (defaults to "tmp") already contains data, the new records will be appended. When the column names match, the records are assumed to be of the same type (this is not checked). When the names do not match, the rows with missing values are padded with NULL values. See ibis::tablex::appendRow for more information about NULL values.

If the user does not specify any new data, the program writes a built-in set of data (91 rows and 8 columns) and then runs 10 built-in queries with known numbers of hits.

Here is a list of arguments.

-b break/delimiters-in-text-data the delimiters to be expected in the input data, the default value is ", ".
-c conf-file a configuration file for FastBit
-d data-dir the output directory to write the data.
-h print a brief message about usage. Any unknown options will trigger this print function, which also terminates this program.
-M metadata-filename name the metadata file that contains the name and type information. The names and types can be either specified in the form of 'name:type' pairs or in the form of "-part.txt" files. The 'name:type' string is parsed by the function ibis::tablex::parseNamesAndTypes.
-m name:type[,name:type,...] metadata, i.e., the names and types of the columns. All 'name:type' pairs are concatenated in the order they appear on the command line. This order is used to match the order of the columns in the text file to be processed.
-m max-rows-per-file an upper bound on the number of rows in an input file, used to allocate internal read buffer. This is an optional advisory parameter.
-k column-name dictionary-filename supply an ASCII dictionary for the column of categorical values. The ASCII dictionary contains a pair of "integer-code: string value" on each line. Must provide two separate arguments to -k.
-n name-of-dataset the name to be associated with the dataset.
-tag metatags the name=value pairs associated with the data set.
-r a-row-in-ascii give one row of input data.
-sqldump sqldump-filename name of the SQL dump file to be read.
-select clause a select clause to be used for test queries. There can only be one select clause.
-t text-filename name of the text file to be read.
-where clause a where clause to be used for test queries. All where clauses specified on the command line will be tried in turn. A query will be composed of the select clause and one of the where clauses.
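To make the 'name:type' format used by -m more concrete, here is a minimal sketch of splitting such a string into pairs. It assumes the pairs are separated by commas and that the first colon in each pair separates the name from the type. splitNameTypePairs is a hypothetical helper written for this illustration; the real parsing is done by ibis::tablex::parseNamesAndTypes, which may accept other separators or forms.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of FastBit: splits "a:int,b:float"
// into (name, type) pairs, preserving the order of appearance.
std::vector<std::pair<std::string, std::string>>
splitNameTypePairs(const std::string& spec) {
    std::vector<std::pair<std::string, std::string>> out;
    std::size_t start = 0;
    while (start <= spec.size()) {
        std::size_t end = spec.find(',', start);
        if (end == std::string::npos) end = spec.size();
        const std::string item = spec.substr(start, end - start);
        const std::size_t colon = item.find(':');
        // Items without a colon or with an empty name are skipped.
        if (colon != std::string::npos && colon > 0) {
            out.emplace_back(item.substr(0, colon), item.substr(colon + 1));
        }
        start = end + 1;
    }
    return out;
}
```

Because the order of the pairs must match the order of the columns in the text file, the sketch keeps the pairs in a vector and does not sort them. The type names passed in (for example "int" or "float") are placeholders here; consult the FastBit documentation for the names it actually recognizes.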

Note: This program uses standard unix functions to perform the read and write operations. If your input data file does not use unix-style end-of-line characters, this program may not process the ends of lines correctly. If you see this program putting an entire line of text into one field, the end-of-line characters are the likely cause. Please convert the end-of-line characters to unix style.

This file is named after the Cattle Egret, whose Latin name is Ardea ibis.
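For the end-of-line note above, one way to prepare an input file is to rewrite its line endings before passing it to this program. The following is a minimal sketch of that conversion on an in-memory string. toUnixLineEndings is a hypothetical helper, not something provided by FastBit, and it assumes the text uses either CRLF or lone CR as line terminators.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites CRLF and lone CR line endings as LF.
std::string toUnixLineEndings(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            out.push_back('\n');
            // Skip the LF of a CRLF pair so it is not emitted twice.
            if (i + 1 < text.size() && text[i + 1] == '\n') ++i;
        } else {
            out.push_back(text[i]);
        }
    }
    return out;
}
```

Text that already uses unix-style line endings passes through unchanged, so the conversion is safe to apply to any input.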


