A roster is a list of values in ascending order plus their original positions.
#include <iroster.h>
Public Member Functions | |
const array_t< uint32_t > & | array () const |
const ibis::column * | getColumn () const |
template<typename T > | |
int | locate (const ibis::array_t< T > &vals, ibis::bitvector &positions) const |
Locate the values and set their positions in the bitvector. More... | |
template<typename T > | |
int | locate (const ibis::array_t< T > &vals, std::vector< uint32_t > &positions) const |
Locate the values and return their positions as a list of 32-bit integers. More... | |
template<typename T > | |
int | locate (const std::vector< T > &vals, ibis::bitvector &positions) const |
Locate the values and set their positions in the bitvector. More... | |
template<typename T > | |
int | locate (const std::vector< T > &vals, std::vector< uint32_t > &positions) const |
Locate the values and return their positions as a list of 32-bit integers. More... | |
template<> | |
int | locate (const ibis::array_t< double > &vals, ibis::bitvector &positions) const |
This explicit specialization of the locate function does not require column type to match the incoming data type. More... | |
template<> | |
int | locate (const std::vector< double > &vals, ibis::bitvector &positions) const |
This explicit specialization of the locate function does not require column type to match the incoming data type. More... | |
const char * | name () const |
uint32_t | operator[] (uint32_t i) const |
Return the row number of the ith smallest value. | |
void | print (std::ostream &out) const |
Output minimal information about the roster list. More... | |
int | read (const char *idxfile) |
int | read (ibis::fileManager::storage *st) |
roster (const ibis::column *c, const char *dir=0) | |
Construct a roster list. More... | |
roster (const ibis::column *c, ibis::fileManager::storage *st, uint32_t offset=8) | |
Reconstruct from content of a fileManager::storage . More... | |
uint32_t | size () const |
int | write (const char *dt) const |
Write two files, .ind for the indices and .srt for the sorted values. More... | |
int | writeSorted (const char *dt) const |
Write the sorted version of the attribute values to a .srt file. More... | |
Static Public Member Functions | |
template<class T > | |
static long | mergeBlock2 (const char *dsrc, const char *dout, const uint32_t segment, array_t< T > &buf1, array_t< T > &buf2, array_t< T > &buf3) |
A two-way merge algorithm. More... | |
Protected Member Functions | |
template<typename T > | |
int | icSearch (const std::vector< T > &vals, std::vector< uint32_t > &pos) const |
In-core searching function. More... | |
template<typename T > | |
int | icSearch (const ibis::array_t< T > &vals, std::vector< uint32_t > &pos) const |
In-core searching function. More... | |
uint32_t | locate (const double &val) const |
Return the smallest i such that the column value at row ind[i] is greater than or equal to val. | |
template<typename inT , typename myT > | |
int | locate2 (const std::vector< inT > &, std::vector< uint32_t > &) const |
Cast the incoming values into the type of the column (myT) and then locate the positions of the records that match one of the values. More... | |
template<typename inT , typename myT > | |
int | locate2 (const ibis::array_t< inT > &, std::vector< uint32_t > &) const |
Cast the incoming values into the type of the column (myT) and then locate the positions of the records that match one of the values. More... | |
template<typename T > | |
int | oocSearch (const std::vector< T > &vals, std::vector< uint32_t > &pos) const |
Out-of-core search function. More... | |
template<typename T > | |
int | oocSearch (const ibis::array_t< T > &vals, std::vector< uint32_t > &pos) const |
Out-of-core search function. More... | |
A roster is a list of values in ascending order plus their original positions.
It can use an external sort if the data and indices cannot fit into memory. The indices are written to a file with extension .ind and the sorted values to a file with extension .srt. If the indices cannot be loaded into memory as a whole, the .ind file is opened for future read operations.
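The relation between the sorted values and their original positions can be illustrated with a minimal in-memory sketch. `SimpleRoster` and `buildRoster` are hypothetical names introduced here for illustration; they are not part of FastBit, and the sketch ignores the external-sort path and the .ind/.srt files.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a roster; not the FastBit implementation.
template <typename T>
struct SimpleRoster {
    std::vector<std::uint32_t> ind; // ind[i] = original row of the i-th smallest value
    std::vector<T> srt;             // the values in ascending order
};

// Build the index array by sorting row numbers according to the column values.
template <typename T>
SimpleRoster<T> buildRoster(const std::vector<T>& column) {
    SimpleRoster<T> r;
    r.ind.resize(column.size());
    std::iota(r.ind.begin(), r.ind.end(), 0u);
    // A stable sort keeps rows with equal values in their original order.
    std::stable_sort(r.ind.begin(), r.ind.end(),
                     [&column](std::uint32_t a, std::uint32_t b) {
                         return column[a] < column[b];
                     });
    r.srt.reserve(column.size());
    for (std::uint32_t row : r.ind) r.srt.push_back(column[row]);
    return r;
}
```

With this layout, `r.ind[i]` plays the role that `operator[]` is documented to play: the row number of the ith smallest value.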
ibis::roster::roster (const ibis::column *c, const char *dir = 0)
Construct a roster list.
It attempts to read a roster list from the specified directory. If a roster list cannot be read and dir is not nil, this function attempts to sort the existing data records to build a roster list.
References ibis::fileManager::bytesFree(), ibis::column::elementSize(), ibis::part::nRows(), print(), roster(), and ibis::array_t< T >::size().
Referenced by roster().
ibis::roster::roster (const ibis::column *c, ibis::fileManager::storage *st, uint32_t offset = 8)
Reconstruct from the content of a fileManager::storage.
The content of the file (following the 8-byte header) is the index array ind.
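As a rough illustration of that layout, the sketch below copies the 32-bit entries that follow a header of `offset` bytes out of a raw byte buffer. `readIndexArray` is a hypothetical helper and the byte buffer stands in for a `fileManager::storage`; the real class may well share the storage instead of copying it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: extract the index array that follows the header.
// Assumes the entries are stored in the machine's native byte order.
std::vector<std::uint32_t> readIndexArray(const std::vector<unsigned char>& storage,
                                          std::size_t offset = 8) {
    std::vector<std::uint32_t> ind;
    if (storage.size() <= offset) return ind; // header only, no entries
    const std::size_t n = (storage.size() - offset) / sizeof(std::uint32_t);
    if (n == 0) return ind;
    ind.resize(n);
    // memcpy sidesteps the alignment and aliasing issues of a pointer cast.
    std::memcpy(ind.data(), storage.data() + offset, n * sizeof(std::uint32_t));
    return ind;
}
```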
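template<typename T >
int ibis::roster::icSearch (const std::vector< T > &vals, std::vector< uint32_t > &pos) const [protected]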
In-core searching function.
Attempts to read .ind and .srt into memory. Returns a negative value if it fails to read the necessary data files into memory.
References ibis::util::find(), ibis::fileManager::getFile(), ibis::fileManager::instance(), and ibis::util::read().
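template<typename T >
int ibis::roster::icSearch (const ibis::array_t< T > &vals, std::vector< uint32_t > &pos) const [protected]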
In-core searching function.
Attempts to read .ind and .srt into memory. Returns a negative value if it fails to read the necessary data files into memory, 0 if there are no hits, and a positive number if there are some hits.
References ibis::util::find(), ibis::fileManager::getFile(), ibis::fileManager::instance(), ibis::util::read(), and ibis::array_t< T >::size().
template<typename T >
int ibis::roster::locate (const ibis::array_t< T > &vals, ibis::bitvector &positions) const
Locate the values and set their positions in the bitvector.
Return a negative value on error, zero or a positive value on success.
References ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::bitvector::decompress(), ibis::bitvector::set(), ibis::bitvector::setBit(), and ibis::array_t< T >::size().
Referenced by ibis::column::evaluateRange(), and ibis::keywords::readTermDocFile().
template<typename T >
int ibis::roster::locate (const ibis::array_t< T > &vals, std::vector< uint32_t > &positions) const
Locate the values and return their positions as a list of 32-bit integers.
Error code: [...] vals [...] roster object.
References ibis::array_t< T >::size().
template<typename T >
int ibis::roster::locate (const std::vector< T > &vals, ibis::bitvector &positions) const
Locate the values and set their positions in the bitvector.
Return the positions of the matching entries as a bitvector. Return a negative value for error, zero or a positive value for success. The input values are assumed to be sorted in ascending order.
References ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::bitvector::decompress(), ibis::bitvector::set(), and ibis::bitvector::setBit().
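Because both the roster values and the input values are in ascending order, the matching rows can be found in a single merge-like pass. The sketch below shows that idea under stated assumptions: `locateSorted` is a hypothetical function, `std::vector<bool>` stands in for `ibis::bitvector`, and the sorted values and index array are passed in directly rather than read from the .srt and .ind files.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: mark the rows whose value appears in `vals`.
// `srt` holds the column values in ascending order, `ind[i]` is the row of
// srt[i], and `vals` must also be sorted in ascending order.
// Returns a negative value on error, otherwise the number of rows marked.
template <typename T>
int locateSorted(const std::vector<T>& srt, const std::vector<std::uint32_t>& ind,
                 const std::vector<T>& vals, std::vector<bool>& positions) {
    if (srt.size() != ind.size()) return -1; // inconsistent inputs
    positions.assign(srt.size(), false);
    int hits = 0;
    std::size_t i = 0, j = 0;
    while (i < srt.size() && j < vals.size()) {
        if (srt[i] < vals[j]) {
            ++i; // roster value too small, advance the roster
        } else if (vals[j] < srt[i]) {
            ++j; // query value too small, advance the query list
        } else {
            positions[ind[i]] = true; // match: set the bit of the original row
            ++hits;
            ++i; // keep j so repeated roster values are all marked
        }
    }
    return hits;
}
```

The pass touches each element of either list at most once, which is why the sortedness assumption on the input matters.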
template<typename T >
int ibis::roster::locate (const std::vector< T > &vals, std::vector< uint32_t > &positions) const
Locate the values and return their positions as a list of 32-bit integers.
Error code: [...] vals [...] roster object.
template<>
int ibis::roster::locate (const ibis::array_t< double > &vals, ibis::bitvector &positions) const
This explicit specialization of the locate function does not require column type to match the incoming data type.
Instead, it casts the incoming data type explicitly before performing any comparisons.
References ibis::bitvector::adjustSize(), ibis::bitvector::decompress(), ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::bitvector::set(), ibis::bitvector::setBit(), ibis::SHORT, ibis::array_t< T >::size(), ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
template<>
int ibis::roster::locate (const std::vector< double > &vals, ibis::bitvector &positions) const
This explicit specialization of the locate function does not require column type to match the incoming data type.
Instead, it casts the incoming data type explicitly before performing any comparisons.
References ibis::bitvector::adjustSize(), ibis::bitvector::decompress(), ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::bitvector::set(), ibis::bitvector::setBit(), ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
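template<typename inT , typename myT >
int ibis::roster::locate2 (const std::vector< inT > &, std::vector< uint32_t > &) const [protected]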
Cast the incoming values into the type of the column (myT) and then locate the positions of the records that match one of the values.
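template<typename inT , typename myT >
int ibis::roster::locate2 (const ibis::array_t< inT > &, std::vector< uint32_t > &) const [protected]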
Cast the incoming values into the type of the column (myT) and then locate the positions of the records that match one of the values.
References ibis::array_t< T >::size().
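The cast-then-locate idea can be sketched as follows. `locateWithCast` is a hypothetical function, not the FastBit `locate2`. Dropping incoming values that do not survive the round trip through `myT` (for example 20.5 cast to an integer column) is an assumption of this sketch; the source does not say how such values are handled. Values outside the range of `myT` are assumed to have been filtered out by the caller.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch: cast the incoming values to the column type myT,
// then collect the rows whose value equals one of the cast values.
// `srt` holds the column values in ascending order; ind[i] is the row of srt[i].
template <typename inT, typename myT>
int locateWithCast(const std::vector<myT>& srt, const std::vector<std::uint32_t>& ind,
                   const std::vector<inT>& vals, std::vector<std::uint32_t>& positions) {
    positions.clear();
    if (srt.size() != ind.size()) return -1;
    std::vector<myT> casted;
    casted.reserve(vals.size());
    for (const inT& v : vals) {
        const myT c = static_cast<myT>(v);
        // Assumption: skip values that change when cast (they cannot match exactly).
        if (static_cast<inT>(c) == v) casted.push_back(c);
    }
    std::sort(casted.begin(), casted.end());
    casted.erase(std::unique(casted.begin(), casted.end()), casted.end());
    for (const myT& c : casted) {
        // equal_range yields the run of sorted entries equal to c.
        auto range = std::equal_range(srt.begin(), srt.end(), c);
        for (auto it = range.first; it != range.second; ++it)
            positions.push_back(ind[it - srt.begin()]);
    }
    std::sort(positions.begin(), positions.end());
    return static_cast<int>(positions.size());
}
```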
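template<class T >
long ibis::roster::mergeBlock2 (const char *dsrc, const char *dout, const uint32_t segment, array_t< T > &buf1, array_t< T > &buf2, array_t< T > &buf3) [static]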
A two-way merge algorithm.
Uses std::less<T> for comparisons. Assumes the sorted segment size is segment elements of type T.
References ibis::array_t< T >::clear(), ibis::array_t< T >::push_back(), ibis::array_t< T >::read(), ibis::horometer::realTime(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), ibis::horometer::start(), ibis::horometer::stop(), and UnixOpen.
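One pass of a two-way merge over sorted segments can be sketched in memory as below. `mergePass` is a hypothetical name. Unlike `mergeBlock2`, which works from a source file to an output file through three buffers, this sketch keeps everything in a `std::vector` and only shows how adjacent sorted segments of `segment` elements are combined into sorted segments of twice that size.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical sketch of one two-way merge pass.
// `src` consists of consecutive runs of `segment` elements, each run sorted;
// the result consists of sorted runs of 2 * segment elements.
template <typename T>
std::vector<T> mergePass(const std::vector<T>& src, std::size_t segment) {
    if (segment == 0) return src; // nothing sensible to merge
    std::vector<T> out;
    out.reserve(src.size());
    for (std::size_t lo = 0; lo < src.size(); lo += 2 * segment) {
        // The last pair may be short or may lack its second run.
        const std::size_t mid = std::min(lo + segment, src.size());
        const std::size_t hi = std::min(lo + 2 * segment, src.size());
        std::merge(src.begin() + lo, src.begin() + mid,
                   src.begin() + mid, src.begin() + hi,
                   std::back_inserter(out), std::less<T>());
    }
    return out;
}
```

Repeating the pass with `segment` doubled each time leaves the whole sequence sorted once `segment` reaches the sequence length.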
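template<typename T >
int ibis::roster::oocSearch (const std::vector< T > &vals, std::vector< uint32_t > &pos) const [protected]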
Out-of-core search function.
It requires at least the .ind file to be in memory. A version that can read both the .ind and .srt files during the search still needs to be implemented.
References ibis::fileManager::buffer< T >::address(), ibis::util::read(), ibis::fileManager::buffer< T >::size(), and UnixOpen.
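template<typename T >
int ibis::roster::oocSearch (const ibis::array_t< T > &vals, std::vector< uint32_t > &pos) const [protected]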
Out-of-core search function.
It requires at least the .ind file to be in memory. A version that can read both the .ind and .srt files during the search still needs to be implemented.
References ibis::fileManager::buffer< T >::address(), ibis::util::read(), ibis::array_t< T >::size(), ibis::fileManager::buffer< T >::size(), and UnixOpen.
void ibis::roster::print (std::ostream &out) const
Output minimal information about the roster list.
Print a terse message about the roster.
If the roster list is not initialized correctly, it prints a warning message.
Referenced by roster().
int ibis::roster::write (const char *df) const
Write two files, .ind for the indices and .srt for the sorted values.
Write both the .ind and .srt files.
The argument can be the name of the output directory, in which case the column name is appended. If the last segment of the name (before the last directory separator) matches the file name of the column, it is assumed to be the data file name and only the extensions .ind and .srt are added.
References ibis::fileManager::flushFile(), ibis::fileManager::instance(), ibis::util::flock::isLocked(), and UnixOpen.
int ibis::roster::writeSorted (const char *df) const
Write the sorted version of the attribute values to a .srt file.
Write the sorted values into a .srt file.
Attempts to read the whole column into memory first. If it fails to do so, it reads one value at a time from the original data file.
References ibis::DOUBLE, ibis::FLOAT, ibis::fileManager::getFile(), ibis::util::getFileSize(), ibis::fileManager::instance(), ibis::INT, ibis::LONG, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.