The class for expandable tables.
#include <table.h>
Public Member Functions | |
virtual int | addColumn (const char *cname, ibis::TYPE_T ctype, const char *cdesc=0, const char *idx=0)=0 |
Add a column. | |
virtual int | append (const char *cname, uint64_t begin, uint64_t end, void *values)=0 |
Add values to the named column. | |
virtual int | appendRow (const ibis::table::row &)=0 |
Add one row. | |
virtual int | appendRow (const char *line, const char *delimiters=0)=0 |
Append a row stored in ASCII form. | |
virtual int | appendRows (const std::vector< ibis::table::row > &)=0 |
Add multiple rows. | |
virtual uint32_t | bufferCapacity () const |
Capacity of the memory buffer. | |
virtual void | clearData ()=0 |
Remove all data recorded. | |
virtual void | describe (std::ostream &) const =0 |
Print a description of the table to the specified output stream. | |
virtual const char * | getASCIIDictionary (const char *) const =0 |
Retrieve the name of the ASCII dictionary file associated with a column of categorical values. | |
virtual uint32_t | getPartitionMax () const |
Get the recommended number of rows in a data partition. | |
virtual uint32_t | mColumns () const =0 |
The number of columns in this table. | |
virtual uint32_t | mRows () const =0 |
The number of rows in memory. | |
virtual int | parseNamesAndTypes (const char *txt) |
Parse names and data types in string form. | |
virtual int | readCSV (const char *inputfile, int memrows=0, const char *outputdir=0, const char *delimiters=0)=0 |
Read the content of the named file as comma-separated values. | |
virtual int | readNamesAndTypes (const char *filename) |
Read a file containing the names and types of columns. | |
virtual int | readSQLDump (const char *inputfile, std::string &tname, int memrows=0, const char *outputdir=0)=0 |
Read a SQL dump from database systems such as MySQL. | |
virtual int32_t | reserveBuffer (uint32_t) |
Reserve enough buffer space for the specified number of rows. | |
virtual void | setASCIIDictionary (const char *, const char *)=0 |
Set the name of the ASCII dictionary file for a column of categorical values. | |
virtual void | setPartitionMax (uint32_t m) |
Set the recommended number of rows in a data partition. | |
virtual table * | toTable (const char *nm=0, const char *de=0)=0 |
Stop expanding the current set of data records. | |
virtual int | write (const char *dir, const char *tname=0, const char *tdesc=0, const char *idx=0, const char *nvpairs=0) const =0 |
Write the in-memory data records to the specified directory and update the metadata on disk. | |
virtual int | writeMetaData (const char *dir, const char *tname=0, const char *tdesc=0, const char *idx=0, const char *nvpairs=0) const =0 |
Write out the information about the columns. | |
Static Public Member Functions | |
static ibis::tablex * | create () |
Create a minimalistic table exclusively for entering new records. | |
Protected Member Functions | |
tablex () | |
Protected default constructor. | |
Protected Attributes | |
uint32_t | ipart |
Current partition number being used for writing. | |
uint32_t | maxpart |
Recommended size of data partitions to be created. | |
The class for expandable tables.
It is designed to temporarily store data in memory and then write the records out through the function write. After creating an object of this type, the user must first add columns by calling addColumn. New data records may then be added one column at a time or one row at a time. An example of using this class is in examples/ardea.cpp.
|
inlineprotected |
Protected default constructor.
Derived classes need a default constructor.
|
pure virtual |
Add values to the named column.
The column name must be in the table already. The first value is to be placed at row begin (the row numbers start with 0) and the last value before row end. The array values must contain (end - begin) values of the type specified through addColumn.
The expected types of values are "const std::vector<std::string>*" for string-valued columns, and "const T*" for a fixed-size column of type T. For example, if the column type is float, the type of values is "const float*"; if the column type is category, the type of values is "const std::vector<std::string>*".
See also addColumn.
Implemented in ibis::tafel.
Referenced by fastbit_add_values().
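The half-open row range [begin, end) can be illustrated with a standalone sketch. It does not call FastBit: appendRange and the std::vector<float> column are hypothetical stand-ins for a float column, and padding unfilled rows with zero is an assumption of the sketch, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one float column; not part of FastBit.
// Copies (end - begin) values into rows [begin, end): the first value
// lands at row `begin`, the last at row `end - 1`.
// Returns the number of values copied, or -1 on invalid arguments.
int appendRange(std::vector<float>& column, uint64_t begin, uint64_t end,
                const float* values) {
    if (values == nullptr || end < begin) return -1;
    // Assumption of this sketch: rows not yet filled are padded with 0.
    if (column.size() < end) {
        column.resize(static_cast<std::size_t>(end), 0.0f);
    }
    assert(column.size() >= end);
    for (uint64_t i = begin; i < end; ++i) {
        column[static_cast<std::size_t>(i)] = values[i - begin];
    }
    return static_cast<int>(end - begin);
}
```

Calling appendRange(col, 2, 5, vals) on an empty column consumes three values and fills rows 2, 3, and 4; row 5 is not touched.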
|
pure virtual |
Add one row.
If an array of names has the same number of elements as the array of values, the names are used as column names. If the names are not specified explicitly, the values are assigned to the columns of the same data type in the order in which they were specified through addColumn, or in the order in which they were recreated from an existing dataset (which is typically alphabetical).
Return the number of values added to the new row.
Like append, this function cannot be used to introduce new columns in a table. A new column must be added with addColumn.
Implemented in ibis::tafel.
|
pure virtual |
Append a row stored in ASCII form.
The values in ASCII form are assumed to be separated by commas (,) or spaces, but additional delimiters may be added through the second argument.
Return the number of values added to the new row.
Implemented in ibis::tafel.
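A standalone sketch of how one line could be split into values with the default delimiters plus caller-supplied ones. It is not FastBit code: splitFields is a hypothetical helper, and two simplifications are assumptions of the sketch only (runs of delimiters are collapsed, and quoted strings are not handled).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of FastBit. Splits `line` on comma and
// space, plus any characters given in `delimiters`.
// Assumptions of this sketch: consecutive delimiters are collapsed into
// one separator, and quoting is ignored.
std::vector<std::string> splitFields(const std::string& line,
                                     const char* delimiters = nullptr) {
    std::string delims = ", ";
    if (delimiters != nullptr) delims += delimiters;

    std::vector<std::string> fields;
    std::size_t pos = 0;
    while (pos < line.size()) {
        // Skip leading delimiters, then take everything up to the next one.
        std::size_t start = line.find_first_not_of(delims, pos);
        if (start == std::string::npos) break;
        std::size_t stop = line.find_first_of(delims, start);
        if (stop == std::string::npos) stop = line.size();
        fields.push_back(line.substr(start, stop - start));
        pos = stop;
    }
    return fields;
}
```

With this helper, splitFields("1;2 3", ";") yields three fields because ';' is added to the default comma and space.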
|
pure virtual |
Add multiple rows.
Rows in the incoming vector are processed one after another. The ordering of the values in earlier rows is automatically carried over to the later rows until another set of names is specified.
Return the number of new rows added.
Implemented in ibis::tafel.
|
inlinevirtual |
Capacity of the memory buffer.
Report the maximum number of rows that can be stored with this object before more memory will be allocated. A return value of zero (0) may also indicate that the object does not know its capacity.
Reimplemented in ibis::tafel.
|
pure virtual |
Remove all data recorded.
Keeps the information about columns. It is intended to prepare for new rows after invoking the function write.
Implemented in ibis::tafel.
|
static |
Create a minimalistic table exclusively for entering new records.
Create a tablex for entering new data.
|
pure virtual |
Retrieve the name of the ASCII dictionary file associated with a column of categorical values.
Implemented in ibis::tafel.
|
pure virtual |
The number of rows in memory.
It is the maximum number of rows in any column.
Implemented in ibis::tafel.
|
virtual |
Parse names and data types in string form.
A column name must start with a letter or an underscore (_); it can be followed by any number of alphanumeric characters (including underscores). For each built-in data type, the type names recognized are as follows:
If it cannot find a type, but a valid name is found, then the type is assumed to be int.
Characters following '#' or '--' on a line will be treated as comments and discarded.
References ibis::tafel::addColumn(), ibis::BLOB, ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::SHORT, ibis::TEXT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
Referenced by readNamesAndTypes().
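The naming rule and the comment handling can be sketched in isolation. The helpers below are hypothetical and not part of FastBit; the sketch assumes the second comment marker is the two-character sequence "--", and it does not attempt to recognize type names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of FastBit: true if `name` starts with a
// letter or underscore and continues with letters, digits, or underscores.
bool isValidColumnName(const std::string& name) {
    if (name.empty()) return false;
    unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;
    for (char ch : name) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (!(std::isalnum(c) || c == '_')) return false;
    }
    return true;
}

// Hypothetical helper: drop everything from the first '#' or "--" onward.
// Assumption: the second comment marker is two ASCII hyphens.
std::string stripComment(const std::string& line) {
    std::size_t hash = line.find('#');
    std::size_t dash = line.find("--");
    // npos is the largest size_t, so min() picks whichever marker comes first.
    std::size_t cut = std::min(hash, dash);
    return cut == std::string::npos ? line : line.substr(0, cut);
}
```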
|
pure virtual |
Read the content of the named file as comma-separated values.
Append the records to this table. If the argument memrows is greater than 0, this function will reserve space to read this many records. If the total number of records is more than memrows and the output directory name is specified, then the records will be written to outputdir and the memory made available for later records. If outputdir is not specified, this function attempts to expand the memory allocated, which may exhaust the available memory. Furthermore, repeated allocations can be time-consuming.
By default the records are delimited by comma (,) and blank space. One may specify alternative delimiters using the last argument.
Upon successful completion of this function, the return value is the number of rows processed. However, not all of them may remain in memory because earlier rows may have been written to disk.
Implemented in ibis::tafel.
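The difference between the rows processed (the return value) and the rows still in memory can be shown with a small bookkeeping sketch. It is not FastBit code: ReadResult and simulateRead are hypothetical, no file is read, and the exact moment at which the real implementation flushes is not stated here, so the sketch flushes just before a row that would exceed memrows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping for the memrows/outputdir policy; not FastBit code.
struct ReadResult {
    uint64_t processed;  // what readCSV would return on success
    uint64_t inMemory;   // what mRows would report afterwards
    uint64_t flushes;    // how many times the buffer was written out
};

ReadResult simulateRead(uint64_t totalRows, uint64_t memrows,
                        bool haveOutputDir) {
    ReadResult r{0, 0, 0};
    for (uint64_t i = 0; i < totalRows; ++i) {
        // Flush only when a limit exists, an output directory is available,
        // and the buffer is already full.
        if (memrows > 0 && haveOutputDir && r.inMemory == memrows) {
            ++r.flushes;
            r.inMemory = 0;
        }
        ++r.inMemory;
        ++r.processed;
    }
    assert(r.processed == totalRows);
    return r;
}
```

For 10 rows with memrows = 4 and an output directory, the sketch reports 10 rows processed, 2 flushes, and 2 rows left in memory. Without an output directory all 10 rows stay in memory.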
|
virtual |
Read a file containing the names and types of columns.
The content of the file is either the simple list of "name:type" pairs or the more verbose version used in '-part.txt' files. If it is the plain 'name:type' pair form, the pairs can be specified either one at a time or a group at a time. This function attempts to read one line at a time and will automatically grow the internal buffer if the existing buffer is too small to hold a long line. However, it is typically a good idea to keep the lines relatively short so they can be examined manually if necessary.
References ibis::fileManager::buffer< T >::address(), parseNamesAndTypes(), ibis::util::readString(), ibis::fileManager::buffer< T >::resize(), and ibis::fileManager::buffer< T >::size().
|
pure virtual |
Read a SQL dump from database systems such as MySQL.
The entire file will be read into memory in one shot unless both memrows and outputdir are specified. In cases where both memrows and outputdir are specified, this function reads a maximum of memrows rows before writing the data to outputdir under the name tname, which leaves no more than memrows rows in memory. The value returned from this function is the number of rows processed, including those written to disk. Use the function mRows to determine how many are still in memory.
If the SQL dump file contains a statement to create a table, then the existing metadata is overwritten. Otherwise, it reads the insert statements and converts the ASCII data into binary format in memory.
Implemented in ibis::tafel.
|
inlinevirtual |
Reserve enough buffer space for the specified number of rows.
Return the number of rows that can be stored, or a negative number to indicate an error. Since the return value is a 32-bit signed integer, it cannot represent a number greater than or equal to 2^31 (~2 billion); the caller shall not attempt to reserve space for 2^31 rows or more.
The intention is to minimize the number of dynamic memory allocations needed to expand the memory used to hold the data. Implementing this function is optional, and the user is not required to call it.
Reimplemented in ibis::tafel.
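A caller-side guard for the 2^31 limit might look like the sketch below. Both helpers are hypothetical and not part of FastBit. Treating a non-negative return value smaller than the request as "not enough" is a choice made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical caller-side guard, not part of FastBit: a row count is a
// sensible reserveBuffer request only if a 32-bit signed return value can
// report it back, i.e. it is at most 2^31 - 1.
bool fitsReserveRequest(uint64_t rows) {
    return rows <=
           static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
}

// Interpret the documented return value: negative means error; otherwise
// it is the number of rows that can be stored.
// Choice made for this sketch: fewer rows than requested counts as failure.
bool reservedAtLeast(int32_t ret, uint32_t wanted) {
    return ret >= 0 && static_cast<uint32_t>(ret) >= wanted;
}
```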
|
pure virtual |
Set the name of the ASCII dictionary file for a column of categorical values.
Implemented in ibis::tafel.
|
pure virtual |
Stop expanding the current set of data records.
Convert a tablex object into a table object, so that the data records can participate in queries. The data records held by the tablex object are transferred to the table object; however, the metadata remains with this object.
Implemented in ibis::tafel.
|
pure virtual |
Write the in-memory data records to the specified directory and update the metadata on disk.
If the table name (tname) is a null string or an empty string, the last component of the directory name is used. If the description (tdesc) is a null string or an empty string, a time stamp will be printed in its place. If the specified directory already contains data, the new records will be appended to the existing data. In this case, the table name specified here will overwrite the existing name, but the existing name and description will be retained if the current arguments are null strings or empty strings. The data type associated with this table will overwrite the existing data type information. If the index specification is not null, the existing index specification will be overwritten.
dir: The output directory name. Must be a valid directory name. The named directory will be created if it does not already exist.
tname: Table name. Should be a valid string; otherwise, a random name is generated because FastBit requires a name for each table.
tdesc: Table description. An optional description of the table. It can be an arbitrary string.
idx: Indexing option for all columns of the table without their own indexing option. More information about indexing options is available elsewhere.
nvpairs: An arbitrary list of name-value pairs to be associated with the data table. An arbitrary number of name-value pairs may be given here; however, FastBit may not be able to do much with them. One useful pair, of the form "columnShape=(nd1, ..., ndk)", can be used to tell FastBit that the table is defined on a simple regular k-dimensional mesh of size nd1 x ... x ndk. Internally, the name-value pairs associated with a data table are known as meta tags or simply tags.
Return the number of rows written to the specified directory on successful completion.
Implemented in ibis::tafel.
Referenced by fastbit_flush_buffer().
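A string in the columnShape form described for nvpairs could be assembled as in the following sketch. makeColumnShape is a hypothetical helper, not part of FastBit, and the spacing after each comma simply copies the form shown in the text; whether other spacing is accepted is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of FastBit: format mesh dimensions as
// "columnShape=(nd1, ..., ndk)" for use as the nvpairs argument of write.
std::string makeColumnShape(const std::vector<uint32_t>& dims) {
    std::string out = "columnShape=(";
    for (std::size_t i = 0; i < dims.size(); ++i) {
        if (i > 0) out += ", ";  // separator copied from the documented form
        out += std::to_string(dims[i]);
    }
    out += ")";
    return out;
}
```

The result's c_str() could then be passed as the nvpairs argument, for example for a three-dimensional mesh built from {10, 20, 30}.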
|
pure virtual |
Write out the information about the columns.
It will write the metadata file containing the column information and index specifications if no metadata file already exists. It returns the number of columns written to the metadata file upon successful completion, returns 0 if a metadata file already exists, and returns a negative number to indicate errors. If there is no column in memory, nothing is written to the output directory.
Implemented in ibis::tafel.
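The three kinds of return value can be told apart as in this sketch. The enum and classifier are hypothetical and not part of FastBit. The return value for the case where no column is in memory is not stated above, so the sketch does not try to distinguish it.

```cpp
#include <cassert>

// Hypothetical classification of the writeMetaData return value.
enum class MetaDataOutcome {
    Written,        // positive: number of columns written
    AlreadyExists,  // zero: a metadata file was already present
    Error           // negative: an error occurred
};

MetaDataOutcome classifyWriteMetaData(int ret) {
    if (ret > 0) return MetaDataOutcome::Written;
    if (ret == 0) return MetaDataOutcome::AlreadyExists;
    return MetaDataOutcome::Error;
}
```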