ibis::blob Class Reference

A class that provides minimal support for byte arrays.

#include <blob.h>

Inheritance diagram for ibis::blob:
ibis::column

Public Member Functions

virtual long append (const void *, const ibis::bitvector &)
 Append the records in vals to the current working dataset. More...
 
virtual long append (const char *dt, const char *df, const uint32_t nold, const uint32_t nnew, uint32_t nbuf, char *buf)
 Append the content in df to the end of files in dt. More...
 
 blob (const part *, FILE *)
 Construct a blob by reading from a metadata file.
 
 blob (const part *, const char *)
 Construct a blob from a name.
 
 blob (const ibis::column &)
 Copy an existing column object of type ibis::BLOB.
 
virtual void computeMinMax ()
 Compute the actual min/max values. More...
 
virtual void computeMinMax (const char *)
 Compute the actual min/max values. More...
 
virtual void computeMinMax (const char *, double &, double &, bool &) const
 Compute the actual min/max of the data in directory dir. More...
 
long countRawBytes (const bitvector &) const
 Count the number of bytes in the blobs selected by the mask. More...
 
virtual double getActualMax () const
 Compute the actual maximum value by reading the data or examining the index. More...
 
virtual double getActualMin () const
 Compute the actual minimum value by reading the data or examining the index.
 
int getBlob (uint32_t ind, char *&buf, uint64_t &size) const
 Extract a single binary object. More...
 
virtual int getOpaque (uint32_t, ibis::opaque &) const
 Return the raw binary value for the ith row. More...
 
virtual double getSum () const
 Compute the sum of all values by reading the data.
 
virtual int getValuesArray (void *) const
 Copy all rows of the column into an array_t object. More...
 
virtual long indexSize () const
 Compute the index size (in bytes). More...
 
virtual void loadIndex (const char *, int) const throw ()
 Load the index associated with the column. More...
 
virtual void print (std::ostream &) const
 Print information about this column.
 
virtual array_t< signed char > * selectBytes (const bitvector &) const
 Retrieve selected 1-byte integer values. More...
 
virtual array_t< double > * selectDoubles (const bitvector &) const
 Put the selected values into an array as doubles. More...
 
virtual array_t< float > * selectFloats (const bitvector &) const
 Put selected values of a float column into an array. More...
 
virtual array_t< int32_t > * selectInts (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual array_t< int64_t > * selectLongs (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual std::vector< ibis::opaque > * selectOpaques (const bitvector &mask) const
 Extract the blobs from the rows marked 1 in the mask. More...
 
int selectRawBytes (const bitvector &, array_t< char > &, array_t< uint64_t > &) const
 Extract the blobs from the rows marked 1 in the mask. More...
 
virtual array_t< int16_t > * selectShorts (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual std::vector< std::string > * selectStrings (const bitvector &) const
 Return the selected rows as strings. More...
 
virtual array_t< unsigned char > * selectUBytes (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual array_t< uint32_t > * selectUInts (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual array_t< uint64_t > * selectULongs (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual array_t< uint16_t > * selectUShorts (const bitvector &) const
 Return selected rows of the column in an array_t object. More...
 
virtual long stringSearch (const char *, ibis::bitvector &) const
 
virtual long stringSearch (const std::vector< std::string > &, ibis::bitvector &) const
 
virtual long stringSearch (const char *) const
 
virtual long stringSearch (const std::vector< std::string > &) const
 
virtual void write (FILE *) const
 Write metadata about the column.
 
virtual long writeData (const char *dir, uint32_t nold, uint32_t nnew, ibis::bitvector &mask, const void *va1, void *va2)
 Write the content of BLOBs packed into two arrays va1 and va2. More...
 
- Public Member Functions inherited from ibis::column
virtual int attachIndex (double *, uint64_t, int64_t *, uint64_t, void *, FastBitReadBitmaps) const
 
virtual int attachIndex (double *, uint64_t, int64_t *, uint64_t, uint32_t *, uint64_t) const
 
void binWeights (std::vector< uint32_t > &) const
 Retrieve the number of rows in each bin.
 
template<typename T >
long castAndWrite (const array_t< double > &vals, ibis::bitvector &mask, const T special)
 Cast the incoming array into the specified type T before writing the values to the file for this column. More...
 
 column (const column &rhs)
 The copy constructor. More...
 
 column (const part *tbl, FILE *file)
 Reconstitute a column from the content of a file. More...
 
 column (const part *tbl, ibis::TYPE_T t, const char *name, const char *desc="", double low=DBL_MAX, double high=-DBL_MAX)
 Construct a new column object based on type and name.
 
int contractRange (ibis::qContinuousRange &rng) const
 Contract the range expression so that the new range falls exactly on the bin boundaries. More...
 
const char * dataFileName (std::string &fname, const char *dir=0) const
 Name of the data file in the given data directory. More...
 
const char * description () const
 Description of the column. Can be an arbitrary string.
 
void description (const char *d)
 
int elementSize () const
 Size of a data element in bytes.
 
virtual double estimateCost (const ibis::qContinuousRange &cmp) const
 Estimate the cost of evaluating the query expression.
 
virtual double estimateCost (const ibis::qDiscreteRange &cmp) const
 Estimate the cost of evaluating a discrete range expression.
 
virtual double estimateCost (const ibis::qIntHod &cmp) const
 Estimate the cost of evaluating a discrete range expression.
 
virtual double estimateCost (const ibis::qUIntHod &cmp) const
 Estimate the cost of evaluating a discrete range expression.
 
virtual double estimateCost (const ibis::qString &) const
 Estimate the cost of evaluating a string lookup.
 
virtual double estimateCost (const ibis::qAnyString &) const
 Estimate the cost of looking up a group of strings.
 
virtual long estimateRange (const ibis::qContinuousRange &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound on the number of hits using the bitmap index. More...
 
virtual long estimateRange (const ibis::qDiscreteRange &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound for hits. More...
 
virtual long estimateRange (const ibis::qIntHod &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound for hits. More...
 
virtual long estimateRange (const ibis::qUIntHod &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound for hits. More...
 
virtual long estimateRange (const ibis::qContinuousRange &cmp) const
 Use the index of the column to compute an upper bound on the number of hits. More...
 
virtual long estimateRange (const ibis::qDiscreteRange &cmp) const
 
virtual long estimateRange (const ibis::qIntHod &cmp) const
 Compute an upper bound on the number of hits. More...
 
virtual long estimateRange (const ibis::qUIntHod &cmp) const
 Compute an upper bound on the number of hits. More...
 
virtual long evaluateAndSelect (const ibis::qContinuousRange &, const ibis::bitvector &, void *, ibis::bitvector &) const
 Evaluate a range condition and retrieve the selected values. More...
 
virtual long evaluateRange (const ibis::qContinuousRange &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer. More...
 
virtual long evaluateRange (const ibis::qDiscreteRange &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
 
virtual long evaluateRange (const ibis::qIntHod &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
 
virtual long evaluateRange (const ibis::qUIntHod &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
 
int expandRange (ibis::qContinuousRange &rng) const
 Expand the range expression so that the new range falls exactly on the bin boundaries. More...
 
virtual const char * findString (const char *) const
 Determine if the input string has appeared in this data partition. More...
 
std::string fullname () const
 Fully qualified name. More...
 
int getDataflag () const
 
virtual const ibis::dictionary * getDictionary () const
 Return a pointer to a dictionary. More...
 
array_t< double > * getDoubleArray () const
 Return all rows of the column as an array_t object.
 
array_t< float > * getFloatArray () const
 Return all rows of the column as an array_t object.
 
array_t< int32_t > * getIntArray () const
 Return all rows of the column as an array_t object. More...
 
void getNullMask (bitvector &mask) const
 If there is a null mask stored already, return a shallow copy of it in mask. More...
 
virtual ibis::fileManager::storage * getRawData () const
 Return the content of base data file as a storage object.
 
virtual int getString (uint32_t, std::string &) const
 Return the string value for the ith row. More...
 
const unixTimeScribe * getTimeFormat () const
 
virtual float getUndecidable (const ibis::qContinuousRange &cmp, ibis::bitvector &iffy) const
 Compute the locations of the rows that cannot be decided by the index.
 
virtual float getUndecidable (const ibis::qDiscreteRange &cmp, ibis::bitvector &iffy) const
 Find rows that can not be decided with the existing index.
 
virtual float getUndecidable (const ibis::qIntHod &cmp, ibis::bitvector &iffy) const
 Find rows that can not be decided with the existing index. More...
 
virtual float getUndecidable (const ibis::qUIntHod &cmp, ibis::bitvector &iffy) const
 Find rows that can not be decided with the existing index. More...
 
bool hasIndex () const
 
virtual bool hasRawData () const
 Does the raw data file exist?
 
bool hasRoster () const
 Is there a roster list built for this column? Returns true for yes, false for no. More...
 
uint32_t indexedRows () const
 Compute the number of rows captured by the index of this column. More...
 
virtual void indexSerialSizes (uint64_t &, uint64_t &, uint64_t &) const
 Compute the sizes (in number of elements) of three arrays that would be produced by writeIndex. More...
 
const char * indexSpec () const
 Retrieve the index specification.
 
void indexSpec (const char *spec)
 
void indexSpeedTest () const
 Perform a set of built-in tests to determine the speed of common operations. More...
 
virtual int indexWrite (ibis::array_t< double > &, ibis::array_t< int64_t > &, ibis::array_t< uint32_t > &) const
 Write the index into three arrays.
 
bool isFloat () const
 Are they floating-point values?
 
bool isInteger () const
 Are they integer values?
 
bool isNumeric () const
 Are they numerical values?
 
bool isSignedInteger () const
 Are they signed integer values?
 
bool isSorted () const
 Are the values sorted?
 
void isSorted (bool)
 Change the flag m_sorted.
 
bool isUnsignedInteger () const
 Are they unsigned integer values?
 
virtual long keywordSearch (const char *, ibis::bitvector &) const
 
virtual long keywordSearch (const char *) const
 
virtual long keywordSearch (const std::vector< std::string > &, ibis::bitvector &) const
 
virtual long keywordSearch (const std::vector< std::string > &) const
 
void logMessage (const char *event, const char *fmt,...) const
 Log messages using printf syntax.
 
void logWarning (const char *event, const char *fmt,...) const
 Log warning messages using printf syntax.
 
const double & lowerBound () const
 The lower bound of the values.
 
void lowerBound (double d)
 
const char * name () const
 Name of the column.
 
void name (const char *nm)
 Rename the column.
 
int nRows () const
 
const char * nullMaskName (std::string &fname) const
 Name of the NULL mask file. More...
 
uint32_t numBins () const
 Retrieve the number of bins used.
 
const part * partition () const
 
const part *& partition ()
 
virtual long patternSearch (const char *) const
 
virtual long patternSearch (const char *, ibis::bitvector &) const
 
void preferredBounds (std::vector< double > &) const
 Retrieve the bin boundaries of the index currently in use.
 
void purgeIndexFile (const char *dir=0) const
 Purge the index files associated with the current column.
 
virtual long saveSelected (const ibis::bitvector &sel, const char *dest, char *buf, uint32_t nbuf)
 Write the selected records to the specified directory. More...
 
long selectValues (const bitvector &, void *) const
 Return selected rows of the column in an array_t object. More...
 
long selectValues (const bitvector &, void *, array_t< uint32_t > &) const
 Return selected rows of the column in an array_t object along with their positions. More...
 
long selectValues (const ibis::qContinuousRange &, void *) const
 Select the values satisfying the specified range condition.
 
void setDataflag (int df)
 
int setNullMask (const bitvector &)
 Change the null mask to the user specified one. More...
 
void setTimeFormat (const char *)
 Add a custom format for the column to be interpreted as unix time stamps.
 
void setTimeFormat (const unixTimeScribe &)
 
virtual long truncateData (const char *dir, uint32_t nent, ibis::bitvector &mask) const
 Truncate the number of records in the named dir to nent. More...
 
ibis::TYPE_T type () const
 Type of the data. More...
 
virtual void unloadIndex () const
 Unload the index associated with the column. More...
 
const double & upperBound () const
 The upper bound of the values.
 
void upperBound (double d)
 
virtual ~column ()
 Destructor. More...
 
long getCumulativeDistribution (std::vector< double > &bounds, std::vector< uint32_t > &counts) const
 Compute the actual data distribution. More...
 
long getDistribution (std::vector< double > &bbs, std::vector< uint32_t > &counts) const
 Count the number of records in each bin. More...
 

Protected Member Functions

int extractAll (const bitvector &, array_t< char > &, array_t< uint64_t > &, const array_t< char > &, const array_t< int64_t > &) const
 Extract entries marked 1 in mask from raw to buffer. More...
 
int extractAll (const bitvector &, array_t< char > &, array_t< uint64_t > &, const char *, const array_t< int64_t > &) const
 Retrieve all binary objects marked 1 in the mask. More...
 
int extractSome (const bitvector &, array_t< char > &, array_t< uint64_t > &, const array_t< char > &, const array_t< int64_t > &, const uint32_t) const
 Extract entries marked 1 in mask from raw to buffer subject to a limit on the buffer size. More...
 
int extractSome (const bitvector &, array_t< char > &, array_t< uint64_t > &, const char *, const array_t< int64_t > &, const uint32_t) const
 Retrieve binary objects marked 1 in the mask subject to the specified limit on buffer size. More...
 
int extractSome (const bitvector &, array_t< char > &, array_t< uint64_t > &, const char *, const char *, const uint32_t) const
 Retrieve binary objects marked 1 in the mask subject to the specified limit on buffer size. More...
 
int readBlob (uint32_t ind, char *&buf, uint64_t &size, const array_t< int64_t > &starts, const char *datafile) const
 Read a single binary object. More...
 
int readBlob (uint32_t ind, char *&buf, uint64_t &size, const char *spfile, const char *datafile) const
 Read a single binary object. More...
 
- Protected Member Functions inherited from ibis::column
void actualMinMax (const char *fname, const ibis::bitvector &mask, double &min, double &max, bool &asc) const
 Given the name of the data file, compute the actual minimum and the maximum value. More...
 
long appendStrings (const std::vector< std::string > &, const ibis::bitvector &)
 Append the strings to the current data. More...
 
template<typename T >
long appendValues (const array_t< T > &, const ibis::bitvector &)
 Append the content of incoming array to the current data. More...
 
double computeMax () const
 Read the base data to compute the maximum value.
 
double computeMin () const
 Read the data values and compute the minimum value.
 
double computeSum () const
 Read the base data to compute the total sum.
 
template<typename T >
uint32_t findLower (int fdes, const uint32_t nr, const T tgt) const
 Find the smallest value >= tgt. More...
 
template<typename T >
uint32_t findUpper (int fdes, const uint32_t nr, const T tgt) const
 Find the smallest value > tgt. More...
 
void logError (const char *event, const char *fmt,...) const
 Print messages started with "Error" and throw a string exception.
 
virtual int searchSorted (const ibis::qContinuousRange &, ibis::bitvector &) const
 Resolve a continuous range condition on a sorted column.
 
virtual int searchSorted (const ibis::qDiscreteRange &, ibis::bitvector &) const
 Resolve a discrete range condition on a sorted column.
 
virtual int searchSorted (const ibis::qIntHod &, ibis::bitvector &) const
 Resolve a discrete range condition on a sorted column.
 
virtual int searchSorted (const ibis::qUIntHod &, ibis::bitvector &) const
 Resolve a discrete range condition on a sorted column.
 
template<typename T >
int searchSortedICC (const array_t< T > &vals, const ibis::qContinuousRange &rng, ibis::bitvector &hits) const
 Resolve a continuous range condition on an array of values.
 
template<typename T >
int searchSortedICD (const array_t< T > &vals, const ibis::qDiscreteRange &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition on an array of values.
 
template<typename T >
int searchSortedICD (const array_t< T > &vals, const ibis::qIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition on an array of values.
 
template<typename T >
int searchSortedICD (const array_t< T > &vals, const ibis::qUIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition on an array of values.
 
template<typename T >
int searchSortedOOCC (const char *fname, const ibis::qContinuousRange &rng, ibis::bitvector &hits) const
 Resolve a continuous range condition using file operations. More...
 
template<typename T >
int searchSortedOOCD (const char *fname, const ibis::qDiscreteRange &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition using file operations. More...
 
template<typename T >
int searchSortedOOCD (const char *fname, const ibis::qIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition using file operations. More...
 
template<typename T >
int searchSortedOOCD (const char *fname, const ibis::qUIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition using file operations. More...
 
template<typename T >
long selectToOpaques (const char *, const bitvector &, std::vector< ibis::opaque > &) const
 
template<typename T >
long selectToStrings (const char *, const bitvector &, std::vector< std::string > &) const
 Extract the values masked 1 and convert them to strings.
 
template<>
long selectToStrings (const char *, const bitvector &, std::vector< std::string > &) const
 
template<>
long selectToStrings (const char *, const bitvector &, std::vector< std::string > &) const
 
template<>
long selectToStrings (const char *dfn, const bitvector &mask, std::vector< std::string > &str) const
 
template<>
long selectToStrings (const char *dfn, const bitvector &mask, std::vector< std::string > &str) const
 
template<typename T >
long selectValuesT (const char *, const bitvector &, array_t< T > &) const
 Select values marked in the bitvector mask. More...
 
template<typename T >
long selectValuesT (const char *, const bitvector &mask, array_t< T > &vals, array_t< uint32_t > &inds) const
 Select the values marked in the bitvector mask. More...
 
long string2int (int fptr, dictionary &dic, uint32_t nbuf, char *buf, array_t< uint32_t > &out) const
 Convert strings in the opened file to a list of integers with the aid of a dictionary. More...
 

Additional Inherited Members

- Static Public Member Functions inherited from ibis::column
template<typename T >
static void actualMinMax (const array_t< T > &vals, const ibis::bitvector &mask, double &min, double &max, bool &asc)
 Compute the minimum and maximum of the values in the array.
 
template<typename T >
static T computeMax (const array_t< T > &vals, const ibis::bitvector &mask)
 Compute the maximum value in the array.
 
template<typename T >
static T computeMin (const array_t< T > &vals, const ibis::bitvector &mask)
 Compute the minimum value in the array.
 
template<typename T >
static double computeSum (const array_t< T > &vals, const ibis::bitvector &mask)
 Compute the sum of values in the array.
 
- Protected Attributes inherited from ibis::column
int dataflag
 Presence of the data file. More...
 
ibis::index * idx
 The index for this column. It is not considered as a must-have member.
 
ibis::util::sharedInt32 idxcnt
 The number of functions using the index.
 
double lower
 The minimum value.
 
std::string m_bins
 Index/binning specification.
 
std::string m_desc
 Free-form description of the column.
 
std::string m_name
 Name of the column.
 
bool m_sorted
 Are the column values in ascending order?
 
ibis::TYPE_T m_type
 Data type.
 
unixTimeScribe * m_utscribe
 
ibis::bitvector mask_
 The entries marked 1 are valid.
 
const part * thePart
 Data partition containing this column.
 
double upper
 The maximum value.
 

Detailed Description

A class that provides minimal support for byte arrays.

Because a byte array may contain arbitrary byte values, including zero bytes, we cannot rely on a null terminator, nor use std::string as the container for each array. The class is intended to store opaque data, which cannot be searched.
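The point about null terminators can be illustrated with a small standalone sketch. It does not use FastBit; ByteSpan and lengthUpToFirstNull are hypothetical names introduced only for this illustration. It shows that a routine that stops at the first zero byte sees fewer bytes than the array actually holds, which is why a byte array has to carry its length explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper type: a byte array carried as an explicit
// (pointer, length) pair instead of a null-terminated string.
struct ByteSpan {
    const char *data;
    std::size_t size;
};

// Number of bytes a null-terminated-string routine would see:
// everything up to, but not including, the first zero byte.
inline std::size_t lengthUpToFirstNull(const ByteSpan &b) {
    assert(b.data != nullptr || b.size == 0);
    if (b.size == 0) return 0;
    const void *hit = std::memchr(b.data, 0, b.size);
    if (hit == nullptr) return b.size;
    return static_cast<std::size_t>(static_cast<const char *>(hit) - b.data);
}
```

For a five-byte array whose third byte is zero, the explicit size is 5 while the string-style length is only 2.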

Member Function Documentation

virtual long ibis::blob::append ( const void *  vals,
const ibis::bitvector &  msk 
)
inline virtual

Append the records in vals to the current working dataset.

The 'void*' in this function follows the convention of the function getValuesArray (not writeData), i.e., for the ten fixed-size elementary data types, it is array_t<type>* and for string-valued columns it is std::vector<std::string>*.

Return the number of entries actually written to disk or a negative number to indicate error conditions.

Reimplemented from ibis::column.

long ibis::blob::append ( const char *  dt,
const char *  df,
const uint32_t  nold,
const uint32_t  nnew,
uint32_t  nbuf,
char *  buf 
)
virtual

Append the content in df to the end of files in dt.

It returns the number of rows appended or a negative number to indicate error conditions.

Reimplemented from ibis::column.

References ibis::fileManager::buffer< T >::address(), ibis::bitvector::adjustSize(), ibis::bitvector::cnt(), ibis::util::guardBase::dismiss(), ibis::util::logMessage(), ibis::bitvector::read(), ibis::bitvector::size(), ibis::fileManager::buffer< T >::size(), UnixOpen, and ibis::bitvector::write().

virtual void ibis::blob::computeMinMax ( )
inline virtual

Compute the actual min/max values.

It actually goes through all the values. This function reads the data in the active data directory and modifies the member variables to record the actual min/max.

Reimplemented from ibis::column.

virtual void ibis::blob::computeMinMax ( const char *  dir)
inline virtual

Compute the actual min/max values.

It actually goes through all the values. This function reads the data in the given directory and modifies the member variables to record the actual min/max.

Reimplemented from ibis::column.

virtual void ibis::blob::computeMinMax ( const char *  dir,
double &  min,
double &  max,
bool &  asc 
) const
inline virtual

Compute the actual min/max of the data in directory dir.

Report the actual min/max found back through output arguments min and max. This version does not modify the min/max recorded in this column object.

Reimplemented from ibis::column.

long ibis::blob::countRawBytes ( const bitvector &  mask) const

Count the number of bytes in the blobs selected by the mask.

This function can be used to compute the memory requirement before actually retrieving the blobs.

It returns a negative number in case of error.

References ibis::array_t< T >::clear(), ibis::bitvector::cnt(), ibis::fileManager::getFile(), ibis::fileManager::instance(), ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::size(), ibis::bitvector::size(), and UnixOpen.
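As a rough illustration of the idea, the sketch below sums the sizes of the selected blobs from an array of start positions. It is not the FastBit implementation: std::vector<bool> stands in for ibis::bitvector, countSelectedBytes is a hypothetical name, and the layout (one start offset per row plus a final end offset) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: blob i occupies bytes [starts[i], starts[i+1]), so
// starts holds one more entry than there are rows.
// Returns the total size of the selected blobs, or a negative number
// if the inputs are inconsistent.
inline long countSelectedBytes(const std::vector<bool> &mask,
                               const std::vector<std::int64_t> &starts) {
    if (starts.size() != mask.size() + 1) return -1;
    long total = 0;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (!mask[i]) continue;                    // row not selected
        if (starts[i + 1] < starts[i]) return -2;  // corrupt offsets
        total += static_cast<long>(starts[i + 1] - starts[i]);
    }
    return total;
}
```

A caller could use the result to reserve a buffer of the right size before retrieving the blobs themselves.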

int ibis::blob::extractAll ( const bitvector &  mask,
ibis::array_t< char > &  buffer,
ibis::array_t< uint64_t > &  positions,
const array_t< char > &  raw,
const array_t< int64_t > &  starts 
) const
protected

Extract entries marked 1 in mask from raw to buffer.

Fill positions to indicate the start and end positions of each raw binary object. The caller has determined that there is sufficient space to perform this operation and has reserved enough space for buffer. Even though that may not be a guarantee, we proceed as if it is.

References ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::array_t< T >::resize(), and ibis::array_t< T >::size().
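The in-memory extraction described above might look roughly like the following standalone sketch. It uses std::vector in place of array_t and bitvector, and extractMarked is a hypothetical name. The layout of positions (one leading 0, then the end offset of each extracted object) and the error codes are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy every object marked in mask from raw into buffer.
// Assumed layouts: object i occupies raw[starts[i], starts[i+1]);
// after the call, extracted object k occupies
// buffer[positions[k], positions[k+1]).
// Returns the number of objects extracted, or a negative number on error.
inline int extractMarked(const std::vector<bool> &mask,
                         std::vector<char> &buffer,
                         std::vector<std::uint64_t> &positions,
                         const std::vector<char> &raw,
                         const std::vector<std::int64_t> &starts) {
    buffer.clear();
    positions.clear();
    if (starts.size() != mask.size() + 1) return -1;
    positions.push_back(0);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (!mask[i]) continue;
        if (starts[i] < 0 || starts[i + 1] < starts[i] ||
            static_cast<std::uint64_t>(starts[i + 1]) > raw.size())
            return -2;  // offsets point outside raw
        buffer.insert(buffer.end(), raw.begin() + starts[i],
                      raw.begin() + starts[i + 1]);
        positions.push_back(buffer.size());
    }
    assert(positions.back() == buffer.size());
    return static_cast<int>(positions.size() - 1);
}
```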

int ibis::blob::extractAll ( const bitvector &  mask,
ibis::array_t< char > &  buffer,
ibis::array_t< uint64_t > &  positions,
const char *  rawfile,
const array_t< int64_t > &  starts 
) const
protected

Retrieve all binary objects marked 1 in the mask.

The caller has reserved enough space for buffer and positions. This function simply needs to open rawfile and read the content into buffer. It also assigns values in positions to mark the boundaries of the binary objects.

References ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and UnixOpen.

int ibis::blob::extractSome ( const bitvector &  mask,
ibis::array_t< char > &  buffer,
ibis::array_t< uint64_t > &  positions,
const array_t< char > &  raw,
const array_t< int64_t > &  starts,
const uint32_t  limit 
) const
protected

Extract entries marked 1 in mask from raw to buffer subject to a limit on the buffer size.

Fill positions to indicate the start and end positions of each raw binary object. The caller has determined that there is sufficient space to perform this operation and has reserved enough space for buffer. Even though that may not be a guarantee, we proceed as if it is.

References ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::array_t< T >::resize(), and ibis::array_t< T >::size().
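A size-limited variant can be sketched in the same standalone style. Here extractUpTo is a hypothetical name, std::vector replaces array_t and bitvector, and the policy of stopping at the first object that would push buffer past limit bytes is an assumption; the source does not state how the limit is enforced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy objects marked in mask from raw into buffer until the next
// object would make buffer larger than limit bytes (assumed policy).
// Extracted object k occupies buffer[positions[k], positions[k+1]).
// Returns the number of objects extracted, or a negative number on error.
inline int extractUpTo(const std::vector<bool> &mask,
                       std::vector<char> &buffer,
                       std::vector<std::uint64_t> &positions,
                       const std::vector<char> &raw,
                       const std::vector<std::int64_t> &starts,
                       std::uint32_t limit) {
    buffer.clear();
    positions.clear();
    if (starts.size() != mask.size() + 1) return -1;
    positions.push_back(0);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (!mask[i]) continue;
        if (starts[i] < 0 || starts[i + 1] < starts[i] ||
            static_cast<std::uint64_t>(starts[i + 1]) > raw.size())
            return -2;  // offsets point outside raw
        const std::uint64_t len =
            static_cast<std::uint64_t>(starts[i + 1] - starts[i]);
        if (buffer.size() + len > limit) break;  // would exceed the limit
        buffer.insert(buffer.end(), raw.begin() + starts[i],
                      raw.begin() + starts[i + 1]);
        positions.push_back(buffer.size());
    }
    assert(buffer.size() <= limit);
    return static_cast<int>(positions.size() - 1);
}
```

The return value tells the caller how many of the marked objects fit, so the remaining ones can be fetched in a later call.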

int ibis::blob::extractSome ( const bitvector &  mask,
ibis::array_t< char > &  buffer,
ibis::array_t< uint64_t > &  positions,
const char *  rawfile,
const array_t< int64_t > &  starts,
const uint32_t  limit 
) const
protected

Retrieve binary objects marked 1 in the mask subject to the specified limit on buffer size.

The caller has reserved enough space for buffer and positions. This function simply needs to open rawfile and read the content into buffer. It also assigns values in positions to mark the boundaries of the binary objects.

References ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and UnixOpen.

int ibis::blob::extractSome ( const bitvector &  mask,
ibis::array_t< char > &  buffer,
ibis::array_t< uint64_t > &  positions,
const char *  rawfile,
const char *  spfile,
const uint32_t  limit 
) const
protected

Retrieve binary objects marked 1 in the mask subject to the specified limit on buffer size.

The caller has reserved enough space for buffer and positions. This function needs to open both rawfile and spfile. It reads starting positions in spfile to determine where to read the content from rawfile into buffer. It also assigns values in positions to mark the boundaries of the binary objects in buffer.

References ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and UnixOpen.

virtual double ibis::blob::getActualMax ( ) const
inline virtual

Compute the actual maximum value by reading the data or examining the index.

It returns -DBL_MAX in case of error.

Reimplemented from ibis::column.

virtual double ibis::blob::getActualMin ( ) const
inline virtual

Compute the actual minimum value by reading the data or examining the index.

It returns DBL_MAX in case of error.

Reimplemented from ibis::column.

int ibis::blob::getBlob ( uint32_t  ind,
char *&  buf,
uint64_t &  size 
) const

Extract a single binary object.

This function is only defined for ibis::blob, so the caller must explicitly cast a column* to a blob*. It needs to access two files: one for start positions and another for raw binary data. It therefore has a large startup cost associated with opening the files and seeking to the right places on disk. If there is enough memory available, it will attempt to keep the content of these files available to later invocations of this function through array_t objects. If it fails to create the desired array_t objects, it will fall back to explicit I/O function calls.

References ibis::array_t< T >::clear(), ibis::util::copy(), ibis::fileManager::getFile(), ibis::fileManager::instance(), and ibis::array_t< T >::size().

Referenced by ibis::mensa::cursor::dumpIJ(), and ibis::mensa::cursor::getColumnAsString().
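The two-file access pattern can be sketched without FastBit as follows. The function readOneBlob is a hypothetical name, and the layout of the start-position stream (consecutive int64_t offsets in native byte order, one more than the number of rows) is an assumption made for this illustration. Taking std::istream arguments lets the same code run against std::ifstream objects on disk or std::istringstream objects in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read blob number ind. sp is the start-position stream and data is the
// raw-byte stream. Assumed layout: sp holds int64_t offsets such that
// blob i occupies data bytes [offset[i], offset[i+1]).
// Returns 0 on success and a negative number on error.
inline int readOneBlob(std::istream &sp, std::istream &data,
                       std::uint32_t ind, std::vector<char> &out) {
    std::int64_t range[2] = {0, 0};
    sp.clear();
    sp.seekg(static_cast<std::streamoff>(ind) *
                 static_cast<std::streamoff>(sizeof(std::int64_t)),
             std::ios::beg);
    sp.read(reinterpret_cast<char *>(range), sizeof(range));
    if (!sp || range[0] < 0 || range[1] < range[0]) return -1;

    out.resize(static_cast<std::size_t>(range[1] - range[0]));
    data.clear();
    data.seekg(static_cast<std::streamoff>(range[0]), std::ios::beg);
    if (!out.empty())
        data.read(out.data(), static_cast<std::streamsize>(out.size()));
    return data ? 0 : -2;
}
```

Each call performs two seeks and two reads, which mirrors the startup cost mentioned above; caching the offsets in memory would remove the first pair.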

int ibis::blob::getOpaque ( uint32_t  irow,
ibis::opaque &  opq 
) const
virtual

Return the raw binary value for the ith row.

This is primarily intended to retrieve values of blobs.

See also
ibis::blob

Reimplemented from ibis::column.

virtual int ibis::blob::getValuesArray ( void *  vals) const
inline virtual

Copy all rows of the column into an array_t object.

The incoming argument must be array_t<Type>*. This function explicitly casts vals into one of the ten supported numerical data types. If the incoming argument is not of the correct type, this cast operation will have unpredictable consequences.

It returns 0 to indicate success, and a negative number to indicate error. If vals is nil, no values are copied; the function then only tests whether the values are accessible: >= 0 yes, < 0 no.

Reimplemented from ibis::column.
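The void* convention described for getValuesArray can be mimicked in a few lines of standalone C++. Int32Column below is a hypothetical stand-in, not a FastBit class, and std::vector replaces array_t. The sketch shows why the caller alone is responsible for passing the right container type, and how a null pointer turns the call into a pure accessibility test.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column of 32-bit integers.
struct Int32Column {
    std::vector<std::int32_t> rows;
    bool readable;

    // vals must really point to a std::vector<std::int32_t>; nothing
    // can verify that through a void*, so a wrong type is undefined
    // behavior. A null vals copies nothing and only reports whether
    // the values are accessible.
    int getValuesArray(void *vals) const {
        if (!readable) return -1;
        if (vals == nullptr) return 0;
        *static_cast<std::vector<std::int32_t> *>(vals) = rows;
        return 0;
    }
};
```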

virtual long ibis::blob::indexSize ( ) const
inline virtual

Compute the index size (in bytes).

Return a negative value if the index is not in memory and the index file does not exist.

Reimplemented from ibis::column.

virtual void ibis::blob::loadIndex ( const char *  iopt,
int  ropt 
) const
throw (
)
inline virtual

Load the index associated with the column.

Parameters
iopt: This option is passed to ibis::index::create to be used if a new index is to be created.
ropt: This option is passed to ibis::index::create to control the reading operations for reconstituting the index object from an index file.
Note
Accesses to this function are serialized through a write lock on the column. It blocks while acquiring the write lock.

Reimplemented from ibis::column.

int ibis::blob::readBlob ( uint32_t  ind,
char *&  buf,
uint64_t &  size,
const array_t< int64_t > &  starts,
const char *  datafile 
) const
protected

Read a single binary object.

The starting position is available in an array_t object. It only needs to explicitly open the data file to read.

References UnixOpen.

int ibis::blob::readBlob ( uint32_t  ind,
char *&  buf,
uint64_t &  size,
const char *  spfile,
const char *  datafile 
) const
protected

Read a single binary object.

This function opens both the starting position file and the data file explicitly.

References UnixOpen.

virtual array_t<signed char>* ibis::blob::selectBytes ( const bitvector &  mask) const
inlinevirtual

Retrieve selected 1-byte integer values.

Note that unsigned integers are simply treated as signed integers.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual array_t<double>* ibis::blob::selectDoubles ( const bitvector &  mask) const
inlinevirtual

Put the selected values into an array as doubles.

Note
Any numerical value can be converted to a double; however, for 64-bit integers this conversion may cause loss of precision.
The caller is responsible for freeing the returned array from any of the selectTypes functions.
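
The precision caveat for 64-bit integers can be made concrete. A double carries a 53-bit significand, so not every 64-bit integer has an exact double representation. The helper below is a hypothetical illustration (fitsInDouble is not a FastBit function) that checks whether a value survives the round trip.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when `v` is exactly representable as a double,
// i.e. converting to double and back yields the same integer.
bool fitsInDouble(std::int64_t v) {
    const double d = static_cast<double>(v);
    // Values near the top of the range round up to 2^63, which cannot be
    // converted back to int64_t, so reject them before casting.
    if (d >= 9223372036854775808.0) return false;
    return static_cast<std::int64_t>(d) == v;
}
```

For example, 9007199254740992 (2^53) converts exactly, while 9007199254740993 (2^53 + 1) does not.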

Reimplemented from ibis::column.

virtual array_t<float>* ibis::blob::selectFloats ( const bitvector &  mask) const
inlinevirtual

Put selected values of a float column into an array.

Note
Only performs safe conversions. Conversions from 32-bit integers, 64-bit integers and 64-bit floating-point values are not allowed. A nil array will be returned if the current column cannot be converted.
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual array_t<int32_t>* ibis::blob::selectInts ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual array_t<int64_t>* ibis::blob::selectLongs ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Can be called on all integral types. Note that 64-bit unsigned integers are simply treated as signed integers. This may cause the values to be interpreted incorrectly. Shorter unsigned integer types are treated correctly as positive values.
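
The difference between the two cases can be shown with plain C++. The functions below are hypothetical illustrations, not FastBit code: bitsAsSigned reinterprets the bits of a 64-bit unsigned value as signed, and widen converts a 32-bit unsigned value to 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 64-bit unsigned value as a signed one. Values
// above INT64_MAX come out negative.
std::int64_t bitsAsSigned(std::uint64_t u) {
    std::int64_t s;
    std::memcpy(&s, &u, sizeof s);
    return s;
}

// Widening a shorter unsigned value to 64 bits always preserves it, because
// every uint32_t value fits in the positive range of int64_t.
std::int64_t widen(std::uint32_t u) {
    return static_cast<std::int64_t>(u);
}
```

A stored value of 18446744073709551615 therefore reads back as -1, whereas a 32-bit value such as 4000000000 is returned unchanged.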

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

std::vector< ibis::opaque > * ibis::blob::selectOpaques ( const bitvector &  mask) const
virtual

Extract the blobs from the rows marked 1 in the mask.

It returns a vector of opaque objects and internally uses selectRawBytes.

A negative value will be returned in case of error.

Reimplemented from ibis::column.

References ibis::bitvector::cnt(), ibis::array_t< T >::size(), and ibis::bitvector::size().

int ibis::blob::selectRawBytes ( const bitvector &  mask,
ibis::array_t< char > &  buffer,
ibis::array_t< uint64_t > &  positions 
) const

Extract the blobs from the rows marked 1 in the mask.

Upon successful completion, the buffer will contain all the raw bytes packed together, positions will contain the starting position of each blob, and the return value will be the number of blobs retrieved. Even though the positions are 64-bit integers, the buffer has to fit in memory, so it is not possible to retrieve very large objects this way. The number of bytes in buffer is limited to less than half of the free memory available, and this limit is hardcoded into this function. To determine how much memory the buffer would need to fully retrieve all blobs marked 1, use the function ibis::blob::countRawBytes.

A negative value will be returned in case of error.
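
As an illustration of the packed layout, the sketch below splits such a buffer back into individual blobs. It is self-contained and uses std::vector in place of array_t; unpackBlobs is a hypothetical helper, not a FastBit function. The text does not say whether positions carries a trailing end marker, so the sketch takes the end of blob i from positions[i+1] when that entry exists and from the end of the buffer otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: split a packed byte buffer into `nblobs` blobs using
// their starting positions. Returns an empty vector on inconsistent input.
std::vector<std::string> unpackBlobs(const std::vector<char>& buffer,
                                     const std::vector<std::uint64_t>& positions,
                                     std::size_t nblobs) {
    std::vector<std::string> out;
    if (nblobs > positions.size()) return out; // not enough positions
    out.reserve(nblobs);
    for (std::size_t i = 0; i < nblobs; ++i) {
        const std::uint64_t begin = positions[i];
        // Use the next position as the end if present, else the buffer end.
        const std::uint64_t end =
            (i + 1 < positions.size()) ? positions[i + 1] : buffer.size();
        if (begin > end || end > buffer.size()) {
            out.clear();
            return out;
        }
        out.emplace_back(buffer.begin() + static_cast<std::ptrdiff_t>(begin),
                         buffer.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return out;
}
```

std::string is used here only as a convenient byte container; blobs may contain embedded zero bytes, which the iterator-range constructor preserves.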

References ibis::fileManager::bytesFree(), ibis::array_t< T >::capacity(), ibis::array_t< T >::clear(), ibis::bitvector::cnt(), ibis::fileManager::getFile(), ibis::fileManager::instance(), ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::reserve(), ibis::array_t< T >::size(), and ibis::bitvector::size().

virtual array_t<int16_t>* ibis::blob::selectShorts ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Can convert all integers 2 bytes or less in length. Note that unsigned integers are simply treated as signed integers. Shorter unsigned integer types are treated correctly as positive values.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual std::vector<std::string>* ibis::blob::selectStrings ( const bitvector &  mask) const
inlinevirtual

Return the selected rows as strings.

This version returns a std::vector<std::string>, which provides wholly self-contained string values. It may take more memory than necessary, and the memory usage of std::string is not tracked by FastBit. The advantage is that it should work regardless of the actual data type of the column.

Reimplemented from ibis::column.

virtual array_t<unsigned char>* ibis::blob::selectUBytes ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual array_t<uint32_t>* ibis::blob::selectUInts ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Can be called on columns of unsigned integral types, UINT, CATEGORY, USHORT, and UBYTE.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual array_t<uint64_t>* ibis::blob::selectULongs ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Can be called on all unsigned integral types.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

virtual array_t<uint16_t>* ibis::blob::selectUShorts ( const bitvector &  mask) const
inlinevirtual

Return selected rows of the column in an array_t object.

Note
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

long ibis::blob::writeData ( const char *  dir,
uint32_t  nold,
uint32_t  nnew,
ibis::bitvector &  mask,
const void *  va1,
void *  va2 
)
virtual

Write the content of BLOBs packed into two arrays va1 and va2.

All BLOBs are packed together one after another in va1 and their starting positions are stored in va2. The last element of va2 is the total number of bytes in va1. The array va2 is expected to hold (nnew+1) 64-bit integers.

Note
The array va2 is modified in this function to have a starting position that is the end of the existing data file.
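
The layout expected in va1 and va2 can be sketched as follows. PackedBlobs, packBlobs and rebase are hypothetical names used only for illustration. The sketch stores positions as int64_t, borrowing the element type from the starts argument of readBlob; the text here says only "64-bit integers". The rebase step reflects one reading of the note above, namely that every position is shifted by the size of the existing data file.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container for the two arrays described above.
struct PackedBlobs {
    std::vector<char> bytes;          // plays the role of va1
    std::vector<std::int64_t> starts; // plays the role of va2 (nnew+1 entries)
};

// Pack the blobs one after another and record where each one starts.
PackedBlobs packBlobs(const std::vector<std::string>& blobs) {
    PackedBlobs p;
    p.starts.reserve(blobs.size() + 1);
    for (const std::string& b : blobs) {
        p.starts.push_back(static_cast<std::int64_t>(p.bytes.size()));
        p.bytes.insert(p.bytes.end(), b.begin(), b.end());
    }
    // The last element is the total number of bytes.
    p.starts.push_back(static_cast<std::int64_t>(p.bytes.size()));
    return p;
}

// Shift all positions so the first blob starts at the end of an existing
// data file that already holds `existingBytes` bytes (assumed behavior).
void rebase(std::vector<std::int64_t>& starts, std::int64_t existingBytes) {
    for (std::int64_t& s : starts) s += existingBytes;
}
```

For three blobs "abc", "" and "de", the positions are 0, 3, 3 and 5; after rebasing onto a 100-byte file they become 100, 103, 103 and 105.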

Reimplemented from ibis::column.

References ibis::bitvector::adjustSize(), ibis::util::guardBase::dismiss(), and UnixOpen.


The documentation for this class was generated from the following files:
