Class ibis::mensa contains multiple (horizontal) data partitions (ibis::part) to form a logical data table.
#include <mensa.h>
Classes | |
class | cursor |
Public Member Functions | |
virtual int | addPartition (const char *) |
Add data partitions defined in the named directory. | |
virtual int | backup (const char *dir, const char *tname=0, const char *tdesc=0) const |
Write the current content to the specified output directory in the raw binary format. | |
virtual int | buildIndex (const char *, const char *) |
The following functions deal with auxiliary data for accelerating query processing, primarily for building indexes. | |
virtual int | buildIndexes (const char *) |
Create indexes for every column of the table. | |
virtual int | buildIndexes (const ibis::table::stringArray &) |
The following functions deal with auxiliary data for accelerating query processing, primarily for building indexes. | |
virtual stringArray | columnNames () const |
Return the column names in a list. | |
virtual typeArray | columnTypes () const |
Return the column types in a list. | |
virtual ibis::table::cursor * | createCursor () const |
Create a cursor object to perform row-wise data access. | |
virtual void | describe (std::ostream &) const |
Print a description of the table to the specified output stream. | |
virtual int | dropPartition (const char *) |
Remove the named data partition from this data table. | |
virtual int | dump (std::ostream &, const char *) const |
Print the values in ASCII form to the specified output stream. | |
virtual int | dump (std::ostream &, uint64_t, const char *) const |
Print the first nr rows. | |
virtual int | dump (std::ostream &, uint64_t, uint64_t, const char *) const |
Print nr rows starting with row offset. | |
virtual void | dumpNames (std::ostream &, const char *) const |
Print all column names on one line. | |
virtual void | estimate (const char *cond, uint64_t &nmin, uint64_t &nmax) const |
Estimate the number of rows satisfying the selection conditions. | |
virtual void | estimate (const ibis::qExpr *cond, uint64_t &nmin, uint64_t &nmax) const |
Estimate the number of rows satisfying the selection conditions. | |
virtual int64_t | getColumnAsBytes (const char *, char *, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual int64_t | getColumnAsDoubles (const char *, double *, uint64_t=0, uint64_t=0) const |
virtual int64_t | getColumnAsDoubles (const char *, std::vector< double > &, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual int64_t | getColumnAsFloats (const char *, float *, uint64_t=0, uint64_t=0) const |
virtual int64_t | getColumnAsInts (const char *, int32_t *, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual int64_t | getColumnAsLongs (const char *, int64_t *, uint64_t=0, uint64_t=0) const |
virtual int64_t | getColumnAsOpaques (const char *, std::vector< ibis::opaque > &, uint64_t=0, uint64_t=0) const |
Retrieve the blobs as ibis::opaque objects. | |
virtual int64_t | getColumnAsShorts (const char *, int16_t *, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual int64_t | getColumnAsStrings (const char *, std::vector< std::string > &, uint64_t=0, uint64_t=0) const |
Convert values to their string form. | |
virtual int64_t | getColumnAsUBytes (const char *, unsigned char *, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual int64_t | getColumnAsUInts (const char *, uint32_t *, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual int64_t | getColumnAsULongs (const char *, uint64_t *, uint64_t=0, uint64_t=0) const |
virtual int64_t | getColumnAsUShorts (const char *, uint16_t *, uint64_t=0, uint64_t=0) const |
Retrieve all values of the named column. | |
virtual double | getColumnMax (const char *) const |
Compute the maximum of all valid values in the named column. | |
virtual double | getColumnMin (const char *) const |
Compute the minimum of all valid values in the named column. | |
virtual long | getHistogram (const char *, const char *, double, double, double, std::vector< uint32_t > &) const |
virtual long | getHistogram2D (const char *, const char *, double, double, double, const char *, double, double, double, std::vector< uint32_t > &) const |
Compute a two-dimensional histogram on columns cname1 and cname2. | |
virtual long | getHistogram3D (const char *, const char *, double, double, double, const char *, double, double, double, const char *, double, double, double, std::vector< uint32_t > &) const |
Compute a three-dimensional histogram on the named columns. | |
virtual int | getPartitions (ibis::constPartList &) const |
Retrieve the list of partitions. | |
virtual table * | groupby (const stringArray &) const |
Directly performing group-by on the base data (without selection) is not currently supported. | |
virtual table * | groupby (const char *) const |
Directly performing group-by on the base data (without selection) is not currently supported. | |
virtual const char * | indexSpec (const char *) const |
Retrieve the current indexing option. | |
virtual void | indexSpec (const char *, const char *) |
Replace the current indexing option. | |
mensa (const char *dir) | |
This function expects a valid data directory to find data partitions. | |
mensa (const char *dir1, const char *dir2) | |
This function expects a pair of data directories to define data partitions. | |
virtual int | mergeCategories (const ibis::table::stringArray &) |
Merge the dictionaries of categorical values from different data partitions. | |
virtual uint32_t | nColumns () const |
Number of columns. | |
virtual uint64_t | nRows () const |
The number of rows in this table. | |
virtual void | orderby (const stringArray &, const std::vector< bool > &) |
Reorder the rows using the specified columns. | |
virtual void | orderby (const stringArray &) |
Reorder the rows using the specified columns. | |
virtual void | orderby (const char *str) |
Reorder the rows. The column names are separated by commas. | |
virtual void | reverseRows () |
Reversing the ordering of the rows on disk requires too much work but has no obvious benefit. | |
virtual table * | select (const char *sel, const char *cond) const |
Given a set of column names and a set of selection conditions, compute another table that represents the selected values. | |
virtual table * | select2 (const char *sel, const char *cond, const char *pts) const |
A variation of the function select defined in ibis::table. | |
Public Member Functions inherited from ibis::table | |
virtual const char * | description () const |
Free text description. May return a null pointer. | |
virtual const char * | name () const |
Name of the table. | |
virtual table * | select (const char *sel, const ibis::qExpr *cond) const |
Process the selection conditions and generate another table to hold the answer. | |
virtual | ~table () |
Destructor. | |
Protected Member Functions | |
void | clear () |
Clear the existing content. | |
int64_t | computeHits (const char *cond) const |
Compute the number of hits. | |
Protected Member Functions inherited from ibis::table | |
table () | |
Default constructor. | |
table (const char *na, const char *de) | |
Constructor. Use the user-supplied name and description. | |
Protected Attributes | |
ibis::table::namesTypes | naty |
A combined list of column names. | |
uint64_t | nrows |
ibis::partList | parts |
List of data partitions. | |
Protected Attributes inherited from ibis::table | |
std::string | desc_ |
Description of the table. | |
std::string | name_ |
Name of the table. | |
Friends | |
class | cursor |
Additional Inherited Members | |
Public Types inherited from ibis::table | |
typedef ibis::array_t< void * > | bufferArray |
A list to hold the in-memory buffers. | |
typedef std::map< const char *, ibis::TYPE_T, ibis::lessi > | namesTypes |
An associative array of names and types. | |
typedef ibis::array_t< const char * > | stringArray |
A list of strings. | |
typedef std::vector< const char * > | stringVector |
typedef ibis::array_t< ibis::TYPE_T > | typeArray |
A list of data types. | |
Static Public Member Functions inherited from ibis::table | |
static void * | allocateBuffer (ibis::TYPE_T, size_t) |
Allocate a buffer of the specified type and size. | |
static int64_t | computeHits (const ibis::constPartList &parts, const char *cond) |
Compute the number of rows satisfying the specified conditions. | |
static int64_t | computeHits (const ibis::constPartList &parts, const ibis::qExpr *cond) |
Compute the number of rows satisfying the specified query expression. | |
static void | consecrateName (char *) |
Remove disallowed characters from the given string to produce a valid column name. | |
static ibis::table * | create (ibis::part &) |
Create a simple container of a partition. | |
static ibis::table * | create (const ibis::partList &) |
Create a container of externally managed data partitions. | |
static ibis::table * | create (const char *dir) |
Create a table object from the specified data directory. | |
static ibis::table * | create (const char *dir1, const char *dir2) |
Create a table object from a pair of data directories. | |
static void | freeBuffer (void *buffer, ibis::TYPE_T type) |
Free a buffer for storing in-memory values. | |
static void | freeBuffers (bufferArray &, typeArray &) |
Free a list of buffers. | |
static bool | isValidName (const char *) |
Is the given string a valid FastBit name for a data column? | |
static void | parseNames (char *in, stringVector &out) |
Parse the incoming string into a set of names. | |
static void | parseNames (char *in, stringArray &out) |
Parse the incoming string into a set of names. | |
static void | parseOrderby (char *in, stringArray &out, std::vector< bool > &direc) |
Parse the incoming string as an order-by clause. | |
static table * | select (const ibis::constPartList &parts, const char *sel, const char *cond) |
Perform the select operation on a list of data partitions. | |
static table * | select (const ibis::constPartList &parts, const char *sel, const ibis::qExpr *cond) |
Perform select operation using a user-supplied query expression. | |
Class ibis::mensa contains multiple (horizontal) data partitions (ibis::part) to form a logical data table.
The base data contained in this table is logically immutable, as reordering rows (through function orderby) does not change the overall content of the table. The functions reverseRows and groupby are not implemented.
ibis::mensa::mensa (const char *dir) [explicit]
This function expects a valid data directory to find data partitions.
If the incoming directory is not a valid string, it will use ibis::gParameters() to find data partitions.
References ibis::table::desc_, ibis::util::gatherParts(), ibis::gParameters(), naty, and parts.
ibis::mensa::mensa (const char *dir1, const char *dir2)
This function expects a pair of data directories to define data partitions.
If either dir1 or dir2 is not valid, it will attempt to find data partitions using the global parameters ibis::gParameters().
References ibis::table::desc_, ibis::util::gatherParts(), ibis::gParameters(), naty, and parts.
int ibis::mensa::addPartition (const char *) [virtual]
Add data partitions defined in the named directory.
It uses opendir and friends to traverse the subdirectories, which means it is only able to descend into subdirectories on Unix and compatible systems.
Reimplemented from ibis::table.
Reimplemented in ibis::liga.
References ibis::util::gatherParts(), and ibis::gParameters().
int ibis::mensa::backup (const char *dir, const char *tname=0, const char *tdesc=0) const [virtual]
Write the current content to the specified output directory in the raw binary format.
May optionally overwrite the name and description of the table.
Implements ibis::table.
int ibis::mensa::buildIndex (const char *, const char *) [virtual]
The following functions deal with auxiliary data for accelerating query processing, primarily for building indexes.
Create the index for the named column. The existing index will be replaced. If an indexing option is not specified, it will use the internally recorded option for the named column or the table containing the column.
Implements ibis::table.
References ibis::column::loadIndex(), and ibis::column::unloadIndex().
int ibis::mensa::buildIndexes (const char *) [virtual]
Create indexes for every column of the table.
Existing indexes will be replaced. If an indexing option is not specified, the internally recorded options will be used.
Implements ibis::table.
int ibis::mensa::buildIndexes (const ibis::table::stringArray &) [virtual]
The following functions deal with auxiliary data for accelerating query processing, primarily for building indexes.
Create the index for the named column. The existing index will be replaced. If an indexing option is not specified, it will use the internally recorded option for the named column or the table containing the column.
Implements ibis::table.
stringArray ibis::mensa::columnNames () const [virtual]
Return the column names in a list.
Implements ibis::table.
void ibis::mensa::describe (std::ostream &) const [virtual]
Print a description of the table to the specified output stream.
Implements ibis::table.
References ibis::TYPESTRING.
int ibis::mensa::dropPartition (const char *) [virtual]
Remove the named data partition from this data table.
The incoming argument is expected to be the name of the data partition.
Reimplemented from ibis::table.
References ibis::util::envLock.
int ibis::mensa::dump (std::ostream &, const char *) const [virtual]
Print the values in ASCII form to the specified output stream.
The default delimiter is a comma (","), which produces Comma-Separated-Values (CSV) output.
Implements ibis::table.
References ibis::mensa::cursor::dumpBlock(), and ibis::mensa::cursor::fetch().
Referenced by ibis::mensa::cursor::dumpSome().
int ibis::mensa::dump (std::ostream &, uint64_t, uint64_t, const char *) const [inline, virtual]
Print nr rows starting with row offset.
Note that the row number starts with 0, i.e., the first row is row 0.
Implements ibis::table.
References ibis::mensa::cursor::dumpSome(), ibis::mensa::cursor::fetch(), and parts.
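The row-offset and delimiter conventions described for dump can be illustrated with a small, self-contained sketch. This is not FastBit code: the helper dumpRows, the Row alias, and its return value (the number of rows printed) are assumptions made only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical stand-in for one table row

// Hypothetical helper: print up to nr rows starting at row offset (0-based),
// separating the values of a row with del. Returns the number of rows printed.
uint64_t dumpRows(std::ostream& out, const std::vector<Row>& rows,
                  uint64_t offset, uint64_t nr, const char* del = ",") {
    uint64_t printed = 0;
    for (uint64_t i = offset; i < rows.size() && printed < nr; ++i, ++printed) {
        for (std::size_t j = 0; j < rows[i].size(); ++j) {
            if (j > 0) out << del;  // delimiter goes between values only
            out << rows[i][j];
        }
        out << '\n';
    }
    return printed;
}
```

With three rows and offset 1, the sketch prints the second and third rows, since row numbering starts at 0.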
void ibis::mensa::estimate (const char *cond, uint64_t &nmin, uint64_t &nmax) const [virtual]
Estimate the number of rows satisfying the selection conditions.
The number of rows lies in the range [nmin, nmax] (inclusive).
Implements ibis::table.
References ibis::countQuery::estimate(), ibis::countQuery::getMaxNumHits(), ibis::countQuery::getMinNumHits(), ibis::countQuery::setPartition(), and ibis::countQuery::setWhereClause().
void ibis::mensa::estimate (const ibis::qExpr *cond, uint64_t &nmin, uint64_t &nmax) const [virtual]
Estimate the number of rows satisfying the selection conditions.
The number of rows lies in the range [nmin, nmax] (inclusive).
Implements ibis::table.
References ibis::countQuery::estimate(), ibis::countQuery::getMaxNumHits(), ibis::countQuery::getMinNumHits(), ibis::countQuery::setPartition(), and ibis::countQuery::setWhereClause().
int64_t ibis::mensa::getColumnAsBytes (const char *, char *, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::part::nRows(), and ibis::UBYTE.
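The begin/end convention shared by the getColumnAs* functions can be sketched as a small range-resolution helper. This is an illustration only, not FastBit code: the name resolveRowRange is hypothetical, and the handling of end > nrows (clamped) and of begin >= end with a nonzero end (empty range) are assumptions that the text above does not specify.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: map the (begin, end) arguments to the half-open row
// range [first, last) of a table with nrows rows.
std::pair<uint64_t, uint64_t>
resolveRowRange(uint64_t begin, uint64_t end, uint64_t nrows) {
    // end == 0 means "through the last row"; clamping a too-large end is an
    // assumption of this sketch.
    const uint64_t last = (end == 0 || end > nrows) ? nrows : end;
    // Assumption: a begin at or past last yields an empty range.
    const uint64_t first = (begin < last) ? begin : last;
    return {first, last};
}
```

Under these assumptions, the defaults (0, 0) cover all rows, (3, 7) covers rows 3 through 6, and (4, 0) covers row 4 through the end of the table.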
int64_t ibis::mensa::getColumnAsDoubles (const char *, double *, uint64_t=0, uint64_t=0) const [virtual]
Implements ibis::table.
References ibis::util::copy(), ibis::DOUBLE, ibis::FLOAT, ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::INT, ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, ibis::UINT, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsDoubles (const char *, std::vector< double > &, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::DOUBLE, ibis::FLOAT, ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::INT, ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, ibis::UINT, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsFloats (const char *, float *, uint64_t=0, uint64_t=0) const [virtual]
Implements ibis::table.
References ibis::util::copy(), ibis::FLOAT, ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsInts (const char *, int32_t *, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::util::copy(), ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::INT, ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, ibis::UINT, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsLongs (const char *, int64_t *, uint64_t=0, uint64_t=0) const [virtual]
Implements ibis::table.
References ibis::util::copy(), ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::INT, ibis::LONG, ibis::part::nRows(), ibis::OID, ibis::SHORT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
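int64_t ibis::mensa::getColumnAsOpaques (const char *, std::vector< ibis::opaque > &, uint64_t=0, uint64_t=0) const [virtual]
Retrieve the blobs as ibis::opaque objects.
Only works on columns of type BLOB.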
Implements ibis::table.
References ibis::BLOB, ibis::CATEGORY, ibis::util::copy(), ibis::DOUBLE, ibis::FLOAT, ibis::part::getColumn(), ibis::column::getOpaque(), ibis::column::getString(), ibis::column::getValuesArray(), ibis::INT, ibis::LONG, ibis::part::nRows(), ibis::OID, ibis::SHORT, ibis::TEXT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsShorts (const char *, int16_t *, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::util::copy(), ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, and ibis::USHORT.
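int64_t ibis::mensa::getColumnAsStrings (const char *, std::vector< std::string > &, uint64_t=0, uint64_t=0) const [virtual]
Convert values to their string form.
Many data types can be converted to strings; however, the conversion may take a significant amount of time.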
Implements ibis::table.
References ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, ibis::part::getColumn(), ibis::util::getString(), ibis::column::getValuesArray(), ibis::INT, ibis::LONG, ibis::part::nRows(), ibis::OID, ibis::SHORT, ibis::TEXT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsUBytes (const char *, unsigned char *, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::part::nRows(), and ibis::UBYTE.
int64_t ibis::mensa::getColumnAsUInts (const char *, uint32_t *, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::util::copy(), ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::INT, ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, ibis::UINT, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsULongs (const char *, uint64_t *, uint64_t=0, uint64_t=0) const [virtual]
Implements ibis::table.
References ibis::util::copy(), ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::INT, ibis::LONG, ibis::part::nRows(), ibis::OID, ibis::SHORT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
int64_t ibis::mensa::getColumnAsUShorts (const char *, uint16_t *, uint64_t=0, uint64_t=0) const [virtual]
Retrieve all values of the named column.
The member functions of this class only support access to one column at a time. Use the table::cursor class for row-wise access.
The arguments begin and end are given in row numbers starting from 0. If begin < end, then rows begin through end-1 are packed into the output array. If 0 == end (i.e., leaving end as the default value), then the values from begin through the end of the table are packed into the output array. The default values, where both begin and end are 0, define a range covering all rows of the table.
These functions return the number of elements copied upon successful completion; otherwise, they return a negative number to indicate failure.
Implements ibis::table.
References ibis::util::copy(), ibis::part::getColumn(), ibis::column::getValuesArray(), ibis::part::nRows(), ibis::SHORT, ibis::UBYTE, and ibis::USHORT.
double ibis::mensa::getColumnMax (const char *) const [virtual]
Compute the maximum of all valid values in the named column.
In case of error, such as an invalid column name or an empty table, this function will return FASTBIT_DOUBLE_NULL or -DBL_MAX to ensure that the test getColumnMin <= getColumnMax fails.
Implements ibis::table.
References ibis::column::getActualMax().
double ibis::mensa::getColumnMin (const char *) const [virtual]
Compute the minimum of all valid values in the named column.
In case of error, such as an invalid column name or an empty table, this function will return FASTBIT_DOUBLE_NULL or DBL_MAX to ensure that the test getColumnMin <= getColumnMax fails.
Implements ibis::table.
References ibis::column::getActualMin().
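The sentinel convention for getColumnMin and getColumnMax can be sketched over in-memory values. The functions columnMin, columnMax, and hasValidRange below are hypothetical stand-ins written for this example; they model only the DBL_MAX and -DBL_MAX case and do not cover FASTBIT_DOUBLE_NULL.

```cpp
#include <algorithm>
#include <cfloat>
#include <vector>

// Hypothetical stand-in for getColumnMin: DBL_MAX signals "no valid value".
double columnMin(const std::vector<double>& v) {
    return v.empty() ? DBL_MAX : *std::min_element(v.begin(), v.end());
}

// Hypothetical stand-in for getColumnMax: -DBL_MAX signals "no valid value".
double columnMax(const std::vector<double>& v) {
    return v.empty() ? -DBL_MAX : *std::max_element(v.begin(), v.end());
}

// The check described above: it fails exactly when the sentinels are returned.
bool hasValidRange(const std::vector<double>& v) {
    return columnMin(v) <= columnMax(v);
}
```

Because the two sentinels are ordered the "wrong way round", a caller can detect the error with a single comparison instead of checking each return value separately.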
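long ibis::mensa::getHistogram (const char *, const char *, double, double, double, std::vector< uint32_t > &) const [virtual]
Compute the histogram of the named column. This version uses the user-specified bins:
[begin, begin+stride) [begin+stride, begin+2*stride) ....
A record is placed in bin (x - begin) / stride, where the first bin is bin 0. The total number of bins is 1 + floor((end - begin) / stride).
A non-positive stride is considered an error. If end is less than begin, an empty array counts is returned along with return value 0.
Implements ibis::table.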
long ibis::mensa::getHistogram2D (const char *, const char *, double, double, double, const char *, double, double, double, std::vector< uint32_t > &) const [virtual]
Compute a two-dimensional histogram on columns cname1 and cname2.
The bins along each dimension are defined the same way as in function getHistogram. The array counts stores the two-dimensional bins with the first dimension as the slowest-varying dimension, following the C convention for ordering multi-dimensional arrays.
Implements ibis::table.
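The C (row-major) ordering of the two-dimensional bins can be made concrete with a one-line index computation. The helper binIndex2D and its parameter names are hypothetical; n2 denotes the number of bins along the second dimension.

```cpp
#include <cstddef>

// Hypothetical helper: position of bin (i1, i2) in the flat counts array.
// The first dimension varies slowest, so one step in i1 skips n2 entries.
inline std::size_t binIndex2D(std::size_t i1, std::size_t i2, std::size_t n2) {
    return i1 * n2 + i2;
}
```

For example, with four bins along the second dimension, bin (2, 1) is stored at position 9 of counts.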
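long ibis::mensa::getHistogram3D (const char *, const char *, double, double, double, const char *, double, double, double, const char *, double, double, double, std::vector< uint32_t > &) const [virtual]
Compute a three-dimensional histogram on the named columns.
The triplets <begin, end, stride> are used the same way as in getHistogram and getHistogram2D. The three-dimensional bins are linearized in counts with the first being the slowest-varying dimension and the third being the fastest-varying dimension, following the C convention for ordering multi-dimensional arrays.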
Implements ibis::table.
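The linearization of three-dimensional bins, and its inverse, can be sketched as follows. The names binIndex3D, unflatten3D, and Bin3D are hypothetical; n2 and n3 denote the number of bins along the second and third dimensions.

```cpp
#include <cstddef>

struct Bin3D { std::size_t i1, i2, i3; };  // hypothetical bin coordinates

// Position of bin (i1, i2, i3) in counts: the third index varies fastest.
inline std::size_t binIndex3D(std::size_t i1, std::size_t i2, std::size_t i3,
                              std::size_t n2, std::size_t n3) {
    return (i1 * n2 + i2) * n3 + i3;
}

// Recover the bin coordinates from a position k in counts.
inline Bin3D unflatten3D(std::size_t k, std::size_t n2, std::size_t n3) {
    return {k / (n2 * n3), (k / n3) % n2, k % n3};
}
```

The inverse mapping is convenient when scanning counts for the largest bin and then reporting which value ranges it corresponds to.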
table * ibis::mensa::groupby (const stringArray &) const [inline, virtual]
Directly performing group-by on the base data (without selection) is not currently supported.
Implements ibis::table.
table * ibis::mensa::groupby (const char *) const [inline, virtual]
Directly performing group-by on the base data (without selection) is not currently supported.
Reimplemented from ibis::table.
const char * ibis::mensa::indexSpec (const char *) const [virtual]
Retrieve the current indexing option.
If no column name is specified, it retrieves the indexing option for the table.
Implements ibis::table.
void ibis::mensa::indexSpec (const char *, const char *) [virtual]
Replace the current indexing option.
If no column name is specified, it resets the indexing option for the table.
Implements ibis::table.
int ibis::mensa::mergeCategories (const ibis::table::stringArray &) [virtual]
Merge the dictionaries of categorical values from different data partitions.
The argument is a list of column names. If the incoming list is empty, then dictionaries of categorical columns with the same names are combined. If a list is provided by the caller, then all columns with the given names will be placed in a single dictionary. Additionally, all indexes associated with the columns will be updated to make use of the new combined dictionary.
A default implementation is provided. This default implementation does nothing and returns 0. This action is valid for a table with only a single partition when the incoming list is empty.
Reimplemented from ibis::table.
References ibis::CATEGORY, ibis::category::getDictionary(), ibis::category::loadIndex(), ibis::dictionary::merge(), ibis::column::name(), ibis::array_t< T >::push_back(), ibis::util::ref(), ibis::category::setDictionary(), ibis::dictionary::size(), ibis::array_t< T >::size(), and ibis::dictionary::sort().
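The idea of combining per-partition dictionaries and re-encoding the stored codes can be sketched with plain standard containers. This is not how ibis::dictionary is implemented: the Dictionary alias, mergeDictionaries, and remapCodes are hypothetical, codes are taken to be plain positions in a sorted word list, and any reserved code for null values is ignored.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

using Dictionary = std::vector<std::string>;  // code i stands for word dict[i]

// Hypothetical sketch: build one sorted, duplicate-free dictionary from the
// dictionaries of several partitions.
Dictionary mergeDictionaries(const std::vector<Dictionary>& dicts) {
    Dictionary merged;
    for (const Dictionary& d : dicts)
        merged.insert(merged.end(), d.begin(), d.end());
    std::sort(merged.begin(), merged.end());
    merged.erase(std::unique(merged.begin(), merged.end()), merged.end());
    return merged;
}

// Translate codes that refer to oldDict into codes that refer to merged.
std::vector<uint32_t> remapCodes(const std::vector<uint32_t>& codes,
                                 const Dictionary& oldDict,
                                 const Dictionary& merged) {
    std::vector<uint32_t> out;
    out.reserve(codes.size());
    for (uint32_t c : codes) {
        // merged is sorted, so a binary search finds the new code.
        auto it = std::lower_bound(merged.begin(), merged.end(), oldDict[c]);
        out.push_back(static_cast<uint32_t>(it - merged.begin()));
    }
    return out;
}
```

The re-encoding step stands in for the index update mentioned above: once partitions share one dictionary, equal codes mean equal strings across all of them.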
uint32_t ibis::mensa::nColumns () const [virtual]
Number of columns.
It actually returns the number of columns of the first data partition. This is consistent with other functions such as columnTypes and columnNames.
Implements ibis::table.
void ibis::mensa::orderby (const stringArray &, const std::vector< bool > &) [virtual]
Reorder the rows using the specified columns.
Each data partition is reordered separately.
Implements ibis::table.
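Because each data partition is reordered separately, the table as a whole is not necessarily in global order afterwards. A minimal sketch of that consequence, using a hypothetical Partition alias in place of one column of an ibis::part:

```cpp
#include <algorithm>
#include <vector>

using Partition = std::vector<int>;  // hypothetical stand-in for one column of a partition

// Sort every partition on its own; rows never move between partitions.
void orderbyEachPartition(std::vector<Partition>& parts) {
    for (Partition& p : parts)
        std::sort(p.begin(), p.end());
}
```

Starting from partitions {5, 1, 9} and {4, 2}, the result is {1, 5, 9} and {2, 4}: each partition is sorted, yet reading them back to back gives 1, 5, 9, 2, 4.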
void ibis::mensa::orderby (const stringArray &) [virtual]
Reorder the rows using the specified columns.
Each data partition is reordered separately.
Implements ibis::table.
void ibis::mensa::reverseRows () [inline, virtual]
Reversing the ordering of the rows on disk requires too much work but has no obvious benefit.
Implements ibis::table.
table * ibis::mensa::select (const char *sel, const char *cond) const [virtual]
Given a set of column names and a set of selection conditions, compute another table that represents the selected values.
Implements ibis::table.
References ibis::table::select().
table * ibis::mensa::select2 (const char *sel, const char *cond, const char *pts) const [virtual]
A variation of the function select defined in ibis::table.
It accepts an extra argument for the caller to specify a list of names of data partitions that will participate in the select operation. The argument pts may contain wildcard characters accepted by the SQL operator LIKE, more specifically, '_' and '%'. If the argument pts is a nil pointer or an empty string
References ibis::table::computeHits(), ibis::table::select(), and ibis::util::strMatch().
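The wildcard matching applied to partition names can be illustrated with a small recursive matcher. This is a sketch under assumptions, not the code of ibis::util::strMatch: the function likeMatch is hypothetical, it assumes the two wildcards are '_' (exactly one character) and '%' (any run of characters, possibly empty) as in SQL LIKE, it compares case-sensitively, and it does not handle escape characters.

```cpp
// Hypothetical matcher for SQL LIKE-style patterns on C strings.
bool likeMatch(const char* s, const char* p) {
    if (*p == '\0') return *s == '\0';  // pattern exhausted: string must be too
    if (*p == '%') {
        // '%' either matches nothing, or absorbs one character and stays active.
        return likeMatch(s, p + 1) || (*s != '\0' && likeMatch(s + 1, p));
    }
    if (*s == '\0') return false;       // pattern still needs a character
    return (*p == '_' || *p == *s) && likeMatch(s + 1, p + 1);
}
```

Under these assumptions, the pattern "part%" selects every partition whose name starts with "part", while "p_" selects only two-character names beginning with "p".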