Classes | Typedefs | Functions | Variables
ibis::util Namespace Reference

Organize the miscellaneous functions under the name util. More...

Classes

class  counter
 A simple shared counter. More...
 
class  flock
 A simple wrapper on flock. More...
 
class  guardBase
 A class hierarchy for cleaning up after durable resources. More...
 
class  guardImpl0
 A concrete class for cleanup jobs that take a function without any argument. More...
 
class  guardImpl1
 A concrete class for cleanup jobs that take a function with one argument. More...
 
class  guardImpl2
 A concrete class for cleanup jobs that take a function with two arguments. More...
 
class  guardObj0
 A class to work with class member functions with no arguments. More...
 
struct  heap
 A simple heap based on std::push_heap and std::pop_heap. More...
 
class  ioLock
 A global I/O lock. More...
 
class  logger
 A class for logging messages. More...
 
class  mutexLock
 An wrapper class for perform pthread_mutex_lock/unlock. More...
 
class  quietLock
 An wrapper class for perform pthread_mutex_lock/unlock. More...
 
class  readLock
 An wrapper class for perform pthread_rwlock_rdlock/unlock. More...
 
class  refHolder
 A template to hold a reference to an object. More...
 
class  sharedInt32
 A shared unsigned 32-bit integer class. More...
 
class  sharedInt64
 A unsigned 64-bit shared integer class. More...
 
class  softLock
 An wrapper class for perform pthread_mutex_trylock/unlock. More...
 
class  timer
 Print simple timing information. More...
 
class  writeLock
 An wrapper class for perform pthread_rwlock_wrlock/unlock. More...
 

Typedefs

typedef const guardBaseguard
 The type to be used by client code. More...
 

Functions

void clear (ibis::array_t< ibis::bitvector * > &bv) throw ()
 Clear an array of bit vectors.
 
void clear (ibis::partList &pl) throw ()
 Deallocate the list of data partitions.
 
template<typename T >
void clearVec (std::vector< T * > &v)
 A template to clean up a vector of pointers.
 
void closeLogFile ()
 Close the log file. More...
 
int copy (const char *to, const char *from)
 Copy file named "from" to a file named "to". More...
 
void emptyCache (void)
 Attempt to remove all currently unused data from memory cache. More...
 
template<typename T >
size_t find (const std::vector< T > &, const T &, size_t)
 Find the first position where the value is no less than val. More...
 
template<typename T >
size_t find (const array_t< T > &, const T &, size_t)
 Find the first position where the value is no less than val. More...
 
template<typename T >
uint32_t find (const array_t< T > &, const array_t< uint32_t > &, const T &, uint32_t)
 Find the position of the first element that is no less than val. More...
 
unsigned int gatherParts (ibis::partList &parts, const char *adir, const char *bdir, bool ro=false)
 Look for data partitions in the given pair of directories. More...
 
unsigned int gatherParts (ibis::partList &parts, const char *adir, bool ro=false)
 Look for data partitions in the given directory. More...
 
unsigned int gatherParts (ibis::partList &parts, const ibis::resource &res, bool ro=false)
 Reconstruct partitions using data directories specified in the resource list. More...
 
off_t getFileSize (const char *name)
 Return size of the file in bytes. More...
 
void getGMTime (char *str)
 Return the current GMT time in string format. More...
 
void getLocalTime (char *str)
 Return the current time in string format as asctime_r. More...
 
FILE * getLogFile ()
 Retrieve the pointer to the log file. More...
 
const char * getLogFileName ()
 Return name the of the current log file. More...
 
char * getString (const char *buf)
 Extract a string from the given buf. More...
 
const char * getToken (char *&str, const char *tok_chrs)
 Return a null-terminated string from the beginning of input string str. More...
 
int getVersionNumber ()
 Return an integer designating the version of this software. More...
 
const char * getVersionString ()
 Return a pointer to the string designating the version of this software. More...
 
long intersect (const std::vector< ibis::bitvector > &bits1, const std::vector< ibis::bitvector > &bits2, std::vector< ibis::bitvector > &res)
 Intersect two sets of bit vectors. More...
 
long intersect (const std::vector< ibis::bitvector > &bits1, const std::vector< ibis::bitvector > &bits2, const std::vector< ibis::bitvector > &bits3, std::vector< ibis::bitvector > &res)
 Intersect three sets of bit vectors. More...
 
bool isFloatType (ibis::TYPE_T t)
 Is the type for floating-point values?
 
bool isIntegerType (ibis::TYPE_T t)
 Is the type for integer values?
 
bool isNumericType (ibis::TYPE_T t)
 Is the type for numberical values?
 
bool isSignedIntegerType (ibis::TYPE_T t)
 Is the type for signed integer values?
 
bool isStringType (ibis::TYPE_T t)
 Is the type for strings?
 
bool isUnsignedIntegerType (ibis::TYPE_T t)
 Is the type for unsigned integer values?
 
char * itoa (int value, char *str, int)
 
void logMessage (const char *event, const char *fmt,...)
 Print a message to standard output. More...
 
int makeDir (const char *dir)
 Recursivly create directory "dir". More...
 
template<typename F >
guardImpl0< F > makeGuard (F f)
 
template<typename F , typename A >
guardImpl1< F, A > makeGuard (F f, A a)
 
template<typename F , typename A1 , typename A2 >
guardImpl2< F, A1, A2 > makeGuard (F f, A1 a1, A2 a2)
 
bool nameMatch (const char *str, const char *pat)
 Match the string str against a simple pattern pat without considering cases. More...
 
template<class C , typename F >
guardObj0< C, F > objectGuard (C o, F f)
 
const ibis::bitvector64outerProduct (const ibis::bitvector &a, const ibis::bitvector &b, ibis::bitvector64 &c)
 Compute the outer product of a and b, add the result to c. More...
 
const ibis::bitvector64outerProductUpper (const ibis::bitvector &a, const ibis::bitvector &b, ibis::bitvector64 &c)
 Add the strict upper triangular portion of the outer production between a and b to c. More...
 
double rand ()
 A very simple pseudo-random number generator. More...
 
int64_t read (int, void *, int64_t)
 A wrapper over POSIX read function. More...
 
int readDouble (double &val, const char *&str, const char *del=ibis::util::delimiters)
 Attempt to convert the incoming string into a double. More...
 
int readInt (int64_t &val, const char *&str, const char *del=ibis::util::delimiters)
 Attempt to convert the incoming string into an integer. More...
 
int readString (std::string &str, const char *&buf, const char *delim=0)
 Copy the next string to the output variable str. More...
 
const char * readString (char *&buf, const char *delim=0)
 Attempt to extract a string token from the incoming buffer. More...
 
int readUInt (uint64_t &val, const char *&str, const char *del=ibis::util::delimiters)
 Attempt to convert the incoming string into a unsigned integer. More...
 
template<class T >
refHolder< T > ref (T &r)
 A function template to produce refHolder.
 
void removeDir (const char *name, bool leaveDir=false)
 Remove the content of named directory. More...
 
void removeTail (char *str, char tail)
 Remove trailing character 'tail' from str.
 
template<typename T >
void reorder (array_t< T > &arr, const array_t< uint32_t > &ind)
 Reorder the array arr according to the indices given in ind. More...
 
void reorder (std::vector< std::string > &arr, const array_t< uint32_t > &ind)
 Reorder string values. More...
 
template<typename T >
void reorder (array_t< T * > &arr, const array_t< uint32_t > &ind)
 Reorder the array arr according to the indices given in ind.
 
void secondsToString (const time_t, char *str)
 Converts the given time in seconds (as returned by function time) into the string (as from asctime_r). More...
 
uint32_t serialNumber ()
 Compute a serial number. More...
 
int setLogFileName (const char *filename)
 Change the current log file to the named file. More...
 
void setVerboseLevel (int v)
 Set the verboseness level. More...
 
template<typename T1 , typename T2 >
void sort_heap (array_t< T1 > &keys, array_t< T2 > &vals)
 Heapsort. More...
 
template<typename T1 , typename T2 >
void sort_insertion (array_t< T1 > &keys, array_t< T2 > &vals)
 Insertion sort. More...
 
template<typename T1 , typename T2 >
uint32_t sort_partition (array_t< T1 > &keys, array_t< T2 > &vals)
 Partition function for quicksort. More...
 
template<typename T1 , typename T2 >
void sort_partition3 (array_t< T1 > &keys, array_t< T2 > &vals, uint32_t &starteq, uint32_t &startgt)
 Three-way partitioning algorithm for quicksort. More...
 
template<typename T1 , typename T2 >
void sort_quick (array_t< T1 > &keys, array_t< T2 > &vals, uint32_t lvl)
 Quicksort. More...
 
template<typename T1 , typename T2 >
void sort_quick3 (array_t< T1 > &keys, array_t< T2 > &vals)
 Quicksort. More...
 
template<typename T1 , typename T2 >
void sort_shell (array_t< T1 > &keys, array_t< T2 > &vals)
 Shell sort. More...
 
template<typename T1 , typename T2 >
void sortAll (array_t< T1 > &arr1, array_t< T2 > &arr2)
 Sort two arrays together. More...
 
template<typename T1 , typename T2 >
void sortAll_quick (array_t< T1 > &arr1, array_t< T2 > &arr2)
 Quick sort. Uses both arrays as keys and Moves all records of both arrays.
 
template<typename T1 , typename T2 >
void sortAll_shell (array_t< T1 > &arr1, array_t< T2 > &arr2)
 Shell sort. Sort both arrays arr1 and arr2.
 
template<typename T1 , typename T2 >
uint32_t sortAll_split (array_t< T1 > &arr1, array_t< T2 > &arr2)
 The parititioning function for ibis::util::sortAll. More...
 
template<typename T1 , typename T2 >
void sortKeys (array_t< T1 > &keys, array_t< T2 > &vals)
 Sorting function with payload. More...
 
int64_t sortMerge (std::vector< std::string > &valR, array_t< uint32_t > &indR, std::vector< std::string > &valS, array_t< uint32_t > &indS)
 An in-memory sort merge join function with string values.
 
template<typename T >
int64_t sortMerge (array_t< T > &valR, array_t< uint32_t > &indR, array_t< T > &valS, array_t< uint32_t > &indS)
 An in-memory sort merge join function. More...
 
template<typename T >
int64_t sortMerge (array_t< T > &valR, array_t< uint32_t > &indR, array_t< T > &valS, array_t< uint32_t > &indS, double delta1, double delta2)
 An in-memory sort merge join function. More...
 
void sortStrings (std::vector< std::string > &keys, array_t< uint32_t > &vals)
 Sorting function with string as keys and uint32_t as payload. More...
 
void sortStrings (std::vector< std::string > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end)
 Quicksort for strings. More...
 
void sortStrings (array_t< const char * > &keys, array_t< uint32_t > &vals)
 Sorting function with string as keys and uint32_t as payload. More...
 
void sortStrings (array_t< const char * > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end)
 Quicksort for strings. More...
 
uint32_t sortStrings_partition (std::vector< std::string > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end)
 The partitioning procedure for quick sort. More...
 
uint32_t sortStrings_partition (array_t< const char * > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end)
 The partitioning procedure for quick sort. More...
 
void sortStrings_shell (std::vector< std::string > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end)
 Shell sorting procedure. More...
 
void sortStrings_shell (array_t< const char * > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end)
 Shell sorting procedure. More...
 
bool strMatch (const char *str, const char *pat)
 Match the string str against a simple pattern pat. More...
 
char * strnewdup (const char *s)
 Duplicate string content with C++ default new operator. More...
 
char * strnewdup (const char *s, const uint32_t n)
 Duplicate no more than n characters.
 
char * trim (char *str)
 Remove leading and trailing blank space.
 
void uniformFraction (const long unsigned idx, long unsigned &denominator, long unsigned &numerator)
 Compute a denominator and numerator pair. More...
 
void updateDatasets (void)
 Update the metadata about the data partitions. More...
 
const char * userName ()
 Return the user name. More...
 
int64_t write (int, const void *, int64_t)
 A wrapper over POSIX write function. More...
 
int writeLogFileHeader (FILE *fptr, const char *fname)
 Write a header to the log file. More...
 
uint32_t checksum (const char *str, uint32_t sz)
 Fletcher's arithmetic checksum with 32-bit result.
 
uint32_t checksum (uint32_t a, uint32_t b)
 Fletcher's checksum on two integers. Returns an integer.
 
std::string shortName (const std::string &longname)
 Use the Fletcher's checksum to produce a short string. More...
 
std::string randName (const std::string &longname)
 Generate a short string to be used as a table/partition name.
 
void int2string (std::string &str, unsigned val)
 Pack a 32-bit integer into six base-64 alphabets.
 
void int2string (std::string &str, unsigned v1, unsigned v2)
 Pack two 32-bit integers into 11 base-64 alphabets.
 
void int2string (std::string &str, unsigned v1, unsigned v2, unsigned v3)
 Pack three 32-bit integers into 16 base-64 alphabets.
 
void int2string (std::string &str, const std::vector< unsigned > &val)
 Convert the incoming numbers into a base-64 alpha-numeric representation. More...
 
void encode64 (uint64_t, std::string &)
 Turn the incoming integer into a 64-bit representation. More...
 
int decode64 (uint64_t &, const std::string &)
 Decode a number encoded using ibis::util::encode64.
 
int decode16 (uint64_t &, const char *)
 Convert a string of hexadecimal digits back to an integer. More...
 
std::string groupby1000 (uint64_t)
 Produce a string version of the unsigned integer value with the decimal digits grouped into 1000s. More...
 
double incrDouble (const double &)
 Functions to handle manipulation of floating-point numbers. More...
 
double decrDouble (const double &)
 Decrease the input value to the next smaller value. More...
 
void eq2range (const double &, double &, double &)
 Generate a range [left, right) that contains exactly the input value in. More...
 
double coarsen (const double in, unsigned prec=2)
 Reduce the decimal precision of the incoming floating-point value to specified precision. More...
 
double compactValue (double left, double right, double start=0.0)
 Compute a compact 64-bit floating-point value with a short decimal representation. More...
 
double compactValue2 (double left, double right, double start=0.0)
 Compute a compact 64-bit floating-point value with a short binary representation. More...
 
void setNaN (double &val)
 Set a double to NaN.
 
void setNaN (float &val)
 Functions to handle manipulation of floating-point numbers. More...
 
template<typename Tin , typename Tout >
void round_down (const Tin &inval, Tout &outval)
 Round the incoming value to the largest output value that is no more than the input. More...
 
template<typename Tin , typename Tout >
void round_up (const Tin &inval, Tout &outval)
 Round the incoming value to the smallest output value that is no less than the input. More...
 
template<typename Tin >
void round_up (const Tin &inval, float &)
 A specialization of round_up for the output type float. More...
 
template<typename Tin >
void round_up (const Tin &inval, double &outval)
 A specialization of round_up for the output in double.
 
int log2 (uint32_t x)
 Log_2 of a 32-bit integer.
 
int log2 (uint64_t x)
 Log_2 of a 64-bit integer.
 
template<typename T >
void sort_radix (array_t< char > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< signed char > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< unsigned char > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< uint16_t > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< int16_t > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< uint32_t > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< int32_t > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< uint64_t > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< int64_t > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< float > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
template<typename T >
void sort_radix (array_t< double > &keys, array_t< T > &vals)
 LSD Radix sort. More...
 
void sortRIDs (ibis::RIDSet &)
 Sort RID lists. More...
 
void sortRIDsq (ibis::RIDSet &, uint32_t, uint32_t)
 Sort a portion of the RIDSet with quick sort. More...
 
void sortRIDsi (ibis::RIDSet &, uint32_t, uint32_t)
 Sort a portion of the RIDset with insertion sort. More...
 

Variables

const short unsigned charIndex []
 charIndex maps the characters (ASCII) back to integer [0-64] More...
 
const char * charTable
 charTable lists the 64 printable characters to be used for names More...
 
const char * delimiters = ";, \v\b\f\r\t\n'\""
 Delimiters used to separate a string of names. More...
 
pthread_mutex_t envLock = PTHREAD_MUTEX_INITIALIZER
 A mutex for serialize operations FastBit wide. More...
 
const int log2table [256]
 log base 2 of an integer, the lookup table More...
 
const uint32_t shellgaps [8] = {1, 4, 10, 23, 57, 132, 301, 701}
 Gaps for Shell sort from http://en.wikipedia.org/wiki/Shell_sort by Ciura, 2001. More...
 

Detailed Description

Organize the miscellaneous functions under the name util.

Typedef Documentation

typedef const guardBase& ibis::util::guard

The type to be used by client code.

User code uses type ibis::util::guard along with the overloaded function ibis::util::makeGuard, as in

ibis::util::guard myguard = ibis::util::makeGuard...;
Note
The parameters passed to the function that does the actual clean up jobs are taken as they are at the construction time. If such parameters are modified, the caller needs to either create the guard variable after the parameters take on their final values, or dismiss the old gard and create another one.

Function Documentation

void ibis::util::closeLogFile ( )

Close the log file.

This function is registered with atexit through ibis::init. It will be automatically invoked during exit. However, if necessary, the log file will be reopened during later operations. In any case, the operating system will close the log file after the termination of the program.

Referenced by fastbit_cleanup(), and ibis::init().

double ibis::util::coarsen ( const double  in,
unsigned  prec = 2 
)
inline

Reduce the decimal precision of the incoming floating-point value to specified precision.

In an attempt to compute small values more consistently, small values are computed through division of integer values.

Since these integer values are computed through the function pow, the accuracy of the results depend on the implementation of the math library.

The value zero is always rounded to zero. Incoming value less than 1E-300 or greater than 1E300 is rounded to zero.

Referenced by ibis::fuzz::append(), ibis::bylt::append(), ibis::zona::append(), ibis::fuge::append(), ibis::bak2::mapValues(), ibis::tafel::preferredSize(), ibis::tafel::readCSV(), and ibis::tafel::readSQLDump().

double ibis::util::compactValue ( double  left,
double  right,
double  start = 0.0 
)

Compute a compact 64-bit floating-point value with a short decimal representation.

Compute a compact 64-bit floating-point value in the range (left, right].

The righ-end is inclusive because the computed value is used to define bins based 'x<compactValue' and those bins are interpreted as [low, upper) as in the typical c notion.

Note
The shortest number is considered to be zero (0.0).
If the given start is not in the valid range, the middle point of the range is used.
It returns 0.0 if either left or right is not-a-number.

References logMessage().

Referenced by ibis::bin::addBounds(), ibis::bin::bin(), ibis::range::binBoundaries(), ibis::bak::binBoundaries(), ibis::bin::binning(), ibis::bin::convertGranules(), ibis::part::equalWeightBins(), ibis::bin::expandRange(), ibis::range::expandRange(), ibis::bak::expandRange(), ibis::bak2::expandRange(), ibis::part::get1DDistribution(), ibis::relic::getCumulativeDistribution(), ibis::index::getCumulativeDistribution(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::relic::print(), ibis::direkte::print(), ibis::keywords::print(), ibis::fuzz::print(), ibis::bylt::print(), ibis::zona::print(), ibis::part::queryTest(), ibis::part::quickTest(), ibis::part::recursiveQuery(), ibis::bin::scanAndPartition(), and ibis::bin::setBoundaries().

double ibis::util::compactValue2 ( double  left,
double  right,
double  start = 0.0 
)

Compute a compact 64-bit floating-point value with a short binary representation.

This version round the number in binary and tries to use a few binary digits in the mantissa as possible.

The resulting number should have a better chance of producing exact results in simple arithmetic operations.

See also
ibis::util::compactValue

References logMessage().

Referenced by ibis::part::equalWeightBins(), and ibis::part::get2DDistributionU().

int ibis::util::copy ( const char *  to,
const char *  from 
)

Copy file named "from" to a file named "to".

It overwrites the content of "to". It returns a negative number if the source file does not existing can not be copied.

References UnixOpen.

Referenced by ibis::tafel::append(), ibis::column::append(), ibis::dictionary::appendOrdered(), ibis::tafel::appendString(), ibis::array_t< T >::array_t(), ibis::tafel::assignDefaultValue(), ibis::part::coarsenBins(), ibis::part::get2DDistributionI(), ibis::blob::getBlob(), ibis::mensa::getColumnAsDoubles(), ibis::bord::getColumnAsDoubles(), ibis::mensa::getColumnAsFloats(), ibis::bord::getColumnAsFloats(), ibis::mensa::getColumnAsInts(), ibis::bord::getColumnAsInts(), ibis::mensa::getColumnAsLongs(), ibis::bord::getColumnAsLongs(), ibis::mensa::getColumnAsOpaques(), ibis::bord::getColumnAsOpaques(), ibis::mensa::getColumnAsShorts(), ibis::bord::getColumnAsShorts(), ibis::bord::getColumnAsStrings(), ibis::mensa::getColumnAsUInts(), ibis::bord::getColumnAsUInts(), ibis::mensa::getColumnAsULongs(), ibis::bord::getColumnAsULongs(), ibis::mensa::getColumnAsUShorts(), ibis::bord::getColumnAsUShorts(), ibis::bord::getHistogram(), ibis::bord::getHistogram2D(), ibis::bord::getHistogram3D(), ibis::dictionary::insert(), ibis::part::numbersToBitvector(), ibis::bitvector64::operator-=(), ibis::bitvector::operator-=(), ibis::qDiscreteRange::qDiscreteRange(), ibis::qIntHod::qIntHod(), ibis::qUIntHod::qUIntHod(), ibis::colInts::reduce(), ibis::colUInts::reduce(), ibis::colLongs::reduce(), ibis::colULongs::reduce(), ibis::colShorts::reduce(), ibis::colUShorts::reduce(), ibis::colBytes::reduce(), ibis::colUBytes::reduce(), ibis::colFloats::reduce(), and ibis::colDoubles::reduce().

int ibis::util::decode16 ( uint64_t &  output,
const char *  buf 
)

Convert a string of hexadecimal digits back to an integer.

The return values are:

  • 0: successful completion of the this function.
  • -1: incoming string is empty or a nil pointer.
  • -2: incoming string contaings something not hexadecimal. The string may contain '0x' or '0X' as prefix and 'h' or 'H' as suffix.
  • -3: incoming string has more than 16 hexadecimal digits.

Referenced by ibis::selectClause::decodeAName(), and ibis::selectClause::fillNames().

double ibis::util::decrDouble ( const double &  in)
inline

Decrease the input value to the next smaller value.

See also
ibis::util::incrDouble

Referenced by ibis::quaere::create().

void ibis::util::emptyCache ( void  )
void ibis::util::encode64 ( uint64_t  input,
std::string &  buf 
)

Turn the incoming integer into a 64-bit representation.

The resulting string is stored in the output variable buf. The return value of this function is buf.c_str() if this function completes successfully, otherwise, a nil pointer is returned.

References charTable.

void ibis::util::eq2range ( const double &  in,
double &  left,
double &  right 
)
inline

Generate a range [left, right) that contains exactly the input value in.

This is used to transform an expression expression "A = in" into "left <= A < right".

See also
ibis::util::incrDouble

Referenced by ibis::ambit::estimate(), ibis::pale::estimate(), ibis::pack::estimate(), and ibis::zone::estimate().

template<typename T >
size_t ibis::util::find ( const std::vector< T > &  arr,
const T &  val,
size_t  i0 
)

Find the first position where the value is no less than val.

Start the searching operation with position start. Assuming the array is already sorted in the ascending order, it returns the smallest i such that operator[](i) >= val.

Referenced by ibis::roster::icSearch(), and ibis::column::searchSortedICD().

template<typename T >
size_t ibis::util::find ( const array_t< T > &  arr,
const T &  val,
size_t  i0 
)

Find the first position where the value is no less than val.

Start the searching operation with position start. Assuming the array is already sorted in the ascending order, it returns the smallest i such that operator[](i) >= val.

References ibis::array_t< T >::size().

template<typename T >
uint32_t ibis::util::find ( const array_t< T > &  arr,
const array_t< uint32_t > &  ind,
const T &  val,
uint32_t  i0 
)

Find the position of the first element that is no less than val.

The search starts with the given position i0. Assuming ind was produced by the sort function, it returns the smallest i such that operator[](ind[i]) >= val.

Note
Because the explicit use of uint32_t to denote the positions, that array can not have more than 2^32 elements.

References ibis::array_t< T >::size().

unsigned ibis::util::gatherParts ( ibis::partList &  tlist,
const char *  adir,
const char *  bdir,
bool  ro = false 
)

Look for data partitions in the given pair of directories.

Read the two directories, if there are matching subdirs, construct an ibis::part from them.

Will descend into the subdirectories when run on unix systems to look for matching subdirectories.

Returns the number of data partitions found.

References ibis::part::clear(), envLock, logMessage(), ibis::part::name(), ibis::part::nColumns(), ibis::part::nRows(), ibis::part::rename(), and ibis::part::timestamp().

Referenced by ibis::mensa::addPartition(), gatherParts(), ibis::init(), and ibis::mensa::mensa().

unsigned ibis::util::gatherParts ( ibis::partList &  tlist,
const char *  dir1,
bool  ro = false 
)

Look for data partitions in the given directory.

Examining the given directory to look for the metadata files and to construct ibis::part.

Can only descend into subdirectories through opendir family of functions.

Returns the number of data partitions found.

References ibis::part::clear(), envLock, gatherParts(), ibis::part::name(), ibis::part::nColumns(), ibis::part::nRows(), ibis::part::rename(), and ibis::part::timestamp().

unsigned ibis::util::gatherParts ( ibis::partList &  tables,
const ibis::resource res,
bool  ro = false 
)

Reconstruct partitions using data directories specified in the resource list.

Read the parameters dataDir1 and dataDir2 to build data partitions.

Returns the number of data partitions found.

References gatherParts(), and ibis::resource::getValue().

off_t ibis::util::getFileSize ( const char *  name)
void ibis::util::getGMTime ( char *  str)

Return the current GMT time in string format.

The argument str must have 26 or more bytes and is used to carry the time output.

void ibis::util::getLocalTime ( char *  str)

Return the current time in string format as asctime_r.

The argument str must have 26 or more bytes and is used to carry the time output.

The new line character in position 24 is turned into a null character. Therefore the returned string contains only 24 characters.

Referenced by ibis::util::logger::logger(), ibis::column::logMessage(), ibis::query::logMessage(), logMessage(), ibis::column::logWarning(), ibis::fileManager::printStatus(), ibis::part::reorder(), writeLogFileHeader(), and ibis::part::writeMetaData().

FILE * ibis::util::getLogFile ( )

Retrieve the pointer to the log file.

The value of FASTBIT_DEFAULT_LOG will be returned if no log file was specified. A log file name be specified through the following means (in the specified order),

– setLogFile – environment variable FASTBITLOGFILE – configuration parameter logfile

Note
The macro FASTBIT_DEFAULT_LOG is default to stderr. It could be directly set to another FILE* during compile time by changing FASTBIT_DEFAULT_LOG or changed at run time through ibis::util::setLogFileName. Instead of directly giving a FILE* to FASTBIT_DEFAULT_LOG, one may also set either FASTBIT_LOG_TO_STDERR or FASTBIT_LOG_TO_STDOUT at compile time to change the default log file to stderr or stdout.

References ibis::gParameters(), setLogFileName(), and writeLogFileHeader().

Referenced by fastbit_get_logfilepointer(), ibis::column::logMessage(), ibis::query::logMessage(), logMessage(), ibis::column::logWarning(), and ibis::util::logger::~logger().

const char * ibis::util::getLogFileName ( )

Return name the of the current log file.

A blank string indicates a standard output stream, either stderr or stdout.

Referenced by fastbit_get_logfile().

char * ibis::util::getString ( const char *  buf)

Extract a string from the given buf.

Remove leading and trailing spaces and surrounding quotes. Returns a copy of the string allocated with the new operator.

References strnewdup().

Referenced by ibis::column::column(), ibis::dictionary::fromASCII(), ibis::mensa::cursor::getColumnAsString(), ibis::mensa::getColumnAsStrings(), ibis::text::getOpaque(), ibis::text::IDColumnForKeywordIndex(), ibis::part::readMetaData(), and ibis::part::readMetaTags().

const char * ibis::util::getToken ( char *&  str,
const char *  tok_chrs 
)

Return a null-terminated string from the beginning of input string str.

The first apparence of any character from characters in tok_chars is turned into null. The incoming argument is modified to point to the first character that is not in tok_chrs. If no character in tok_chrs is found, str is returned and the first argument is changed to null.

int ibis::util::getVersionNumber ( )
inline

Return an integer designating the version of this software.

The version number is composed of four segments each with two decimal digits. For example, version 1.3.0.2 will be represented as 1030002. The stable releases typically have the last segment as zero, which is generally referred to without the last ".0".

Referenced by fastbit_get_version_number(), and writeLogFileHeader().

const char* ibis::util::getVersionString ( )
inline

Return a pointer to the string designating the version of this software.

Referenced by fastbit_get_version_string(), and writeLogFileHeader().

std::string ibis::util::groupby1000 ( uint64_t  val)

Produce a string version of the unsigned integer value with the decimal digits grouped into 1000s.

Referenced by ibis::fileManager::clear(), ibis::fileManager::getFile(), ibis::fileManager::getFileSegment(), ibis::bord::merge(), ibis::fileManager::printStatus(), and ibis::fileManager::storage::storage().

double ibis::util::incrDouble ( const double &  in)
inline

Functions to handle manipulation of floating-point numbers.

Increment the input value to the next larger value.

If the math library has nextafter, it will use nextafter, otherwise, it will use the unit round-off error to compute the next larger value. The success of this computation is highly sensitive to the definition of DBL_EPSILON, which should be the smallest value x such that (1+x) is different from x. For 64-bit IEEE floating-point number, it is approximately 2.2E-16 (2^{-52}).

Referenced by ibis::part::adaptive2DBins(), ibis::part::adaptive3DBins(), ibis::part::adaptiveFloats(), ibis::part::adaptiveFloatsDetailed(), ibis::bak::binBoundaries(), ibis::quaere::create(), ibis::ambit::estimate(), ibis::pale::estimate(), ibis::pack::estimate(), ibis::zone::estimate(), ibis::part::get1DDistribution(), ibis::part::get2DDistributionI(), ibis::part::get2DDistributionU(), ibis::qContinuousRange::qContinuousRange(), and ibis::bin::scanAndPartition().

void ibis::util::int2string ( std::string &  str,
const std::vector< unsigned > &  val 
)

Convert the incoming numbers into a base-64 alpha-numeric representation.

Three 32-bit integers can be packed int 16 base-64 alphabets.

References int2string().

long ibis::util::intersect ( const std::vector< ibis::bitvector > &  bits1,
const std::vector< ibis::bitvector > &  bits2,
std::vector< ibis::bitvector > &  res 
)

Intersect two sets of bit vectors.

res[jj*bits2.size()+ii] = bits1[jj] & bits2[ii]

References ibis::fileManager::bytesInUse(), ibis::bitvector::compress(), and ibis::fileManager::instance().

Referenced by ibis::part::get2DBins(), and ibis::part::get3DBins().

long ibis::util::intersect ( const std::vector< ibis::bitvector > &  bits1,
const std::vector< ibis::bitvector > &  bits2,
const std::vector< ibis::bitvector > &  bits3,
std::vector< ibis::bitvector > &  res 
)

Intersect three sets of bit vectors.

res[(kk*bits2.size()+jj)*bits3.size()+ii] = bits1[kk] & bits2[jj] & bits3[ii]

References ibis::fileManager::bytesInUse(), ibis::bitvector::compress(), and ibis::fileManager::instance().

void ibis::util::logMessage ( const char *  event,
const char *  fmt,
  ... 
)

Print a message to standard output.

The format string is same as printf. A global mutex lock is used to ensure printed messages are ordered. The message is preceeded with the current local time (as from function ctime) if the macro FASTBIT_TIMED_LOG is defined at the compile time.

References getLocalTime(), and getLogFile().

Referenced by ibis::part::accessHint(), ibis::index::addBits(), ibis::blob::append(), ibis::column::append(), ibis::part::append1(), ibis::part::append2(), ibis::part::appendToBackup(), ibis::bin::binOrder(), ibis::column::column(), compactValue(), compactValue2(), ibis::bitvector64::decompress(), ibis::part::digestMeshShape(), ibis::part::doBackup(), ibis::query::doEstimate(), ibis::query::doScan(), ibis::bin::estimate(), ibis::relic::estimate(), ibis::part::evaluateJoin(), ibis::bitvector64::flip(), gatherParts(), ibis::part::get1DBins(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::query::getBounds(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::part::getJointDistribution(), ibis::column::getNullMask(), ibis::index::isIndex(), ibis::query::isValidToken(), ibis::relic::locate(), ibis::bak::locate(), ibis::bak2::locate(), ibis::part::makeBackupCopy(), makeDir(), ibis::part::old2DDistribution(), ibis::bitvector64::operator&(), ibis::bitvector64::operator&=(), ibis::bitvector64::operator-(), ibis::bitvector64::operator-=(), ibis::bitvector64::operator^(), ibis::bitvector64::operator^=(), ibis::bitvector64::operator|(), ibis::bitvector64::operator|=(), ibis::query::orderPairs(), ibis::part::packCumulativeDistribution(), ibis::part::packDistribution(), ibis::part::part(), ibis::query::processJoin(), ibis::part::purgeInactive(), ibis::part::queryTest(), ibis::part::quickTest(), ibis::bitvector64::read(), ibis::part::readMetaData(), ibis::part::recursiveQuery(), removeDir(), ibis::query::removeFiles(), ibis::bord::reorderStrings(), ibis::part::rollback(), ibis::column::saveSelected(), ibis::column::selectBytes(), ibis::column::selectDoubles(), ibis::column::selectFloats(), ibis::column::selectInts(), ibis::column::selectLongs(), ibis::column::selectShorts(), ibis::column::selectUBytes(), ibis::column::selectUInts(), ibis::column::selectULongs(), ibis::column::selectUShorts(), ibis::column::selectValuesT(), ibis::part::selfTest(), ibis::bitvector64::setBit(), ibis::query::sortEquiJoin(), ibis::query::sortRangeJoin(), ibis::bord::sortStrings(), ibis::column::string2int(), ibis::index::sumBins(), ibis::index::sumBits(), ibis::column::truncateData(), and ibis::bitvector64::write().

int ibis::util::makeDir ( const char *  dir)

Recursivly create directory "dir".

Returns zero (0) to indicate success, a negative number to indicate error. If the directory already exists, it immediately returns 0.

References logMessage(), and strnewdup().

Referenced by ibis::bord::backup(), and ibis::part::part().

bool ibis::util::nameMatch ( const char *  str,
const char *  pat 
)

Match the string str against a simple pattern pat without considering cases.

This is to follow the SQL standard for comparing names.

See also
ibis::util::strMatch

Referenced by ibis::part::matchNameValuePair().

const ibis::bitvector64 & ibis::util::outerProduct ( const ibis::bitvector a,
const ibis::bitvector b,
ibis::bitvector64 c 
)

Compute the outer product of a and b, add the result to c.

This implementation only uses public functions of ibis::bitvector and ibis::bitvector64.

This should make it possible to use both WAH and BBC compress bitvector classes. It is not likely that we can gain much performance by directly using member variables of ibit::bitvector because the unit of compressed data from ibit::vector do not fit neatly into ibis::bitvector64::word_t.

Return a const reference of c. If the input does not have the correct size, it will be replaced by the outer product.

References ibis::bitvector64::adjustSize(), ibis::bitvector64::appendFill(), ibis::bitvector64::bytes(), ibis::bitvector64::cnt(), ibis::bitvector::cnt(), ibis::bitvector::indexSet::nIndices(), ibis::bitvector64::setBit(), ibis::bitvector::size(), ibis::bitvector64::size(), ibis::horometer::start(), and ibis::bitvector64::swap().

Referenced by ibis::index::estimate(), ibis::query::processJoin(), ibis::relic::speedTest(), and ibis::bin::speedTest().

const ibis::bitvector64 & ibis::util::outerProductUpper ( const ibis::bitvector a,
const ibis::bitvector b,
ibis::bitvector64 c 
)

Add the strict upper triangular portion of the outer production between a and b to c.

The result contains only the strict upper triangular portion of the full outer product.

References ibis::bitvector64::adjustSize(), ibis::bitvector64::appendFill(), ibis::bitvector64::bytes(), ibis::bitvector64::cnt(), ibis::bitvector::cnt(), ibis::bitvector::indexSet::nIndices(), ibis::bitvector64::setBit(), ibis::bitvector::size(), ibis::bitvector64::size(), ibis::horometer::start(), and ibis::bitvector64::swap().

double ibis::util::rand ( )
inline

A very simple pseudo-random number generator.

It produces a floating point number between 0 and 1.

The is a a Linear Congruential pseudo-random number generator. It produces a floating-point in the range of (0, 1). It is very simple and fast, however, it does not produce high-quality random numbers. It is not thread-safe. However, since the actual computation only involves two arithmetic operations, it is very unlikely to have thread-safty issues.

Referenced by ibis::part::buildQueryList(), ibis::index::mapValues(), ibis::part::queryTest(), ibis::part::quickTest(), ibis::part::selfTest(), and ibis::part::testRangeOperators().

int64_t ibis::util::read ( int  fdes,
void *  buf,
int64_t  nbytes 
)

A wrapper over POSIX read function.

When a large chunk is requested by the user, the read function may return one piece at a time, typically a piece is no larger than 2^31 bytes, and the size of this piece is implementation dependent. This function attempts use the return value from read when it is a posive value.

Referenced by ibis::column::append(), ibis::fileManager::roFile::doRead(), ibis::roster::icSearch(), ibis::roster::oocSearch(), ibis::fileManager::storage::read(), ibis::column::searchSortedOOCC(), ibis::column::searchSortedOOCD(), ibis::column::selectValuesT(), and ibis::column::string2int().

int ibis::util::readDouble ( double &  val,
const char *&  str,
const char *  del = ibis::util::delimiters 
)

Attempt to convert the incoming string into a double.

The format recodnized is the following

[+-]?\d*\.\d*[[eE][+-]?\d+]
Note
the decimal point is the period (.)! This function does not understand any other notation.
This function leaves str at the first character that does not follow the above pattern.

References readInt().

Referenced by ibis::math::fromUnixTime::eval(), ibis::resource::getNumber(), ibis::tafel::parseLine(), and ibis::whereClause::verifyExpr().

int ibis::util::readInt ( int64_t &  val,
const char *&  str,
const char *  del = ibis::util::delimiters 
)

Attempt to convert the incoming string into an integer.

It skips leading space and converts an optional +/- sign followed by a list of decimal digits to an integer.

On successful completion of this function, the argument val contains the converted value and str points to the next unused character in the input string. In this case, it returns 0. It returns -1 if the input string is a null string, is an empty string, has only blank spaces, or has a delimiter immediately following the leading blank spaces. It returns -2 to indicate overflow, in which case, it resets val to 0 and moves str to the next character that is not a decimal digit.

References readUInt().

Referenced by ibis::tafel::parseLine(), ibis::qIntHod::qIntHod(), readDouble(), and ibis::whereClause::verifyExpr().

int ibis::util::readString ( std::string &  str,
const char *&  buf,
const char *  delim = 0 
)

Copy the next string to the output variable str.

Leading blank spaces are skipped. The content of str will be empty if buf is nil or an empty string. If the string is quoted, only spaces before the quote are skipped, and the content of the string will be everything after the first quote to the last character before the matching quote or end of buffer. If delim is not provided (i.e., is 0) and the 1st nonblank character is not a quote, then string will terminate at the 1st space (or nonprintable) character following the nonblank character.

Note
A unquoted empty string is considered a null value which is indicated by a negative return value. A quoted empty string is a valid string, which is indicated by a return value of 0.
This function uses backslash as the escape character for allowing quotes to be inside the quoted strings. Unfortunately, this also means to have a single backslash in a string, one has to input two of them right next to each other.
Input strings starting with an apostrophe, such as "'twas", must be quoted as "\"'twas"" or the apostrophe must be escaped as "\'twas". Otherwise, the leading apostrophe would be interpreted as a unmatched quote which will cause the next string value to extend beyond the intended word.

Referenced by ibis::tafel::assignDefaultValue(), ibis::keywords::tokenizer::operator()(), ibis::tafel::parseLine(), ibis::tablex::readNamesAndTypes(), ibis::tafel::readSQLDump(), ibis::text::readStrings2(), ibis::text::selectStrings(), ibis::bin::setBoundaries(), and ibis::tafel::SQLCreateTable().

const char * ibis::util::readString ( char *&  buf,
const char *  delim = 0 
)

Attempt to extract a string token from the incoming buffer.

It uses the existing buffer without allocating additional space. It creates nil terminated strings by converting terminators or space characters into nil characters. If it finds a valid string, it will return a pointer to the string, otherwise it returns a nil pointer or an empty string.

  • If no delimiter is specified, it turns all non-printable characters into the null character and returns the starting positions of groups of alphanumeric characters as tokens.
  • If a list of delimiters are provided, any of the delimiters will terminate a token. Blank spaces surrounding the delimiters will be turned into null characters along with the delimiters.
Note
If a token starts with a quote (any of signle quote, double quote, or back quote), the token will end with the matching quote. Note that the back quote can be ended with either another back quote or a single quote.
All non-printable characters are treated as blank spaces.
int ibis::util::readUInt ( uint64_t &  val,
const char *&  str,
const char *  del = ibis::util::delimiters 
)

Attempt to convert the incoming string into a unsigned integer.

It skips leading space and converts a list of decimal digits to an integer. On successful completion of this function, the argument val contains the converted value and str points to the next unused character in the input string. In this case, it returns 0. It returns -1 if the input string is a null string, is an empty string, has only blank spaces or have one of the delimiters immediately following the leading blank spaces. It returns -2 to indicate overflow, in which case, it resets val to 0 and moves str to the next character that is not a decimal digit.

Referenced by ibis::dictionary::fromASCII(), ibis::tafel::parseLine(), ibis::qUIntHod::qUIntHod(), readInt(), and ibis::keywords::readTDLine().

void ibis::util::removeDir ( const char *  name,
bool  leaveDir = false 
)

Remove the content of named directory.

If this function is run on a unix-type system and the second argument is true, it will leave all the subdirectories intact as well.

References logMessage(), and strnewdup().

Referenced by ibis::part::append2(), ibis::query::clear(), ibis::part::doBackup(), ibis::part::part(), and ibis::part::rollback().

template<typename T >
void ibis::util::reorder ( array_t< T > &  arr,
const array_t< uint32_t > &  ind 
)
void ibis::util::reorder ( std::vector< std::string > &  arr,
const array_t< uint32_t > &  ind 
)

Reorder string values.

This function keeps the actual strings in their input positions by using the function swap. This procedure should avoid most of the memory allocations.

References ibis::array_t< T >::size().

template<typename Tin , typename Tout >
void ibis::util::round_down ( const Tin &  inval,
Tout &  outval 
)

Round the incoming value to the largest output value that is no more than the input.

Both Tin and Tout must be elementary data types, and Tout must be an elementary integral type.

Referenced by ibis::part::doCount(), and ibis::part::doScan().

template<typename Tin , typename Tout >
void ibis::util::round_up ( const Tin &  inval,
Tout &  outval 
)

Round the incoming value to the smallest output value that is no less than the input.

Both Tin and Tout must be elementary data types, and Tout must be an elementary integral type.

Referenced by ibis::column::searchSortedICC(), and ibis::column::searchSortedOOCC().

template<typename Tin >
void ibis::util::round_up ( const Tin &  inval,
float &  outval 
)
inline

A specialization of round_up for the output type float.

This function uses nextafterf if the macro HAVE_NEXTAFTER is defined, otherwise it uses FLT_EPSILON to compute outval as (float)(inval)*(1+FLT_EPSILON).

void ibis::util::secondsToString ( const time_t  sec,
char *  str 
)

Converts the given time in seconds (as returned by function time) into the string (as from asctime_r).

The argument str must contain at least 26 bytes. The new line character is turned into null.

Referenced by ibis::bord::backup(), ibis::bord::bord(), and ibis::tafel::writeMetaData().

uint32_t ibis::util::serialNumber ( )

Compute a serial number.

It is a unique number that is always increasing. Even time FastBit runs, it starts this number with the value 1. This implementation uses a 32-bit shared integer and therefore could wrap-around when exceeding 2^32.

Referenced by ibis::part::queryTest(), ibis::part::quickTest(), randName(), and ibis::part::selfTest().

int ibis::util::setLogFileName ( const char *  filename)

Change the current log file to the named file.

Log files are opened in append mode, so the existing content will be preserved. The log file will be changed only if the named file can be opened correctly.

Error code

  • 0: success;
  • -1: unable to open the specified file,
  • -2: unable to write to the specified file.
Note
The incoming argument is generally interpreted as a file name and passed to fopen, however, there are three special values: a blank string (or a nil pointer), "stdout" and "stderr." The blank string (or a null pointer) resets the log file to the default value FASTBIT_DEFAULT_LOG. The default value of FASTBIT_DEFAULT_LOG is explained in function ibis::util::getLogFile. The values of "stderr" and "stdout" refer to the stderr and stdout defined on most UNIX systems. Note that both "stderr" and "stdout" must be in lower case if the caller intends to use stderr or stdout.
See also
ibis::util::getLogFile

References writeLogFileHeader().

Referenced by fastbit_set_logfile(), getLogFile(), ibis::init(), and writeLogFileHeader().

void ibis::util::setNaN ( float &  val)

Functions to handle manipulation of floating-point numbers.

Increment the input value to the next larger value.

If the math library has nextafter, it will use nextafter, otherwise, it will use the unit round-off error to compute the next larger value. The success of this computation is highly sensitive to the definition of DBL_EPSILON, which should be the smallest value x such that (1+x) is different from x. For 64-bit IEEE floating-point number, it is approximately 2.2E-16 (2^{-52}).

void ibis::util::setVerboseLevel ( int  v)
inline

Set the verboseness level.

Unless the code is compiled with DEBUG macro set, the default verboseness level is 0, which will only print out information about major errors.

std::string ibis::util::shortName ( const std::string &  de)

Use the Fletcher's checksum to produce a short string.

The result is often used as name of temporary table objects. The name conforms to the SQL standard.

References checksum(), and int2string().

Referenced by ibis::bord::bord(), ibis::jNatural::fillResult(), ibis::jRange::fillResult(), ibis::part::rename(), ibis::jNatural::select(), ibis::jRange::select(), ibis::filter::sift0(), ibis::filter::sift0S(), ibis::filter::sift1(), ibis::filter::sift1S(), ibis::filter::sift2(), and ibis::filter::sift2S().

template<typename T1 , typename T2 >
void ibis::util::sort_heap ( array_t< T1 > &  keys,
array_t< T2 > &  vals 
)

Heapsort.

Sort the keys only. Move the vals along with the keys.

References ibis::array_t< T >::size().

Referenced by sort_quick().

template<typename T1 , typename T2 >
void ibis::util::sort_insertion ( array_t< T1 > &  keys,
array_t< T2 > &  vals 
)

Insertion sort.

It has relatively straightforward memory access pattern and may be useful to sort a few numbers at the end of a recursive procedure.

References ibis::array_t< T >::size().

template<typename T1 , typename T2 >
uint32_t ibis::util::sort_partition ( array_t< T1 > &  keys,
array_t< T2 > &  vals 
)

Partition function for quicksort.

The return value p separates keys into two parts, keys[..:p-1] < keys[p:..]. A return value equal to the size of keys indicates all keys are sorted.

References ibis::array_t< T >::size(), and sort_shell().

Referenced by sort_quick().

template<typename T1 , typename T2 >
void ibis::util::sort_partition3 ( array_t< T1 > &  keys,
array_t< T2 > &  vals,
uint32_t &  starteq,
uint32_t &  startgt 
)

Three-way partitioning algorithm for quicksort.

Upon return from this function, keys satisfying the following order keys[0:starteq] < keys[starteq:stargt-1] < keys[startgt:..]. The keys are ordered if starteq = startgt = keys.size().

References ibis::array_t< T >::size(), and sort_shell().

Referenced by sort_quick3().

template<typename T1 , typename T2 >
void ibis::util::sort_quick ( array_t< T1 > &  keys,
array_t< T2 > &  vals,
uint32_t  lvl 
)

Quicksort.

Quick sort with introspection.

Sort the keys only. Use the standard two-way partitioning.

It will switch to heap sort after FASTBIT_QSORT_MAX_DEPTH levels of recursion. Performs recursive call only on the smaller half, while iterate over the larger half.

References ibis::array_t< T >::size(), sort_heap(), sort_partition(), sort_quick(), and sort_shell().

Referenced by sort_quick(), and sortKeys().

template<typename T1 , typename T2 >
void ibis::util::sort_quick3 ( array_t< T1 > &  keys,
array_t< T2 > &  vals 
)

Quicksort.

Sort the keys only. Use a nonstandard three-way partitioning.

References ibis::array_t< T >::size(), sort_partition3(), sort_quick3(), and sort_shell().

Referenced by sort_quick3().

template<typename T >
void ibis::util::sort_radix ( array_t< char > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

Referenced by sortKeys().

template<typename T >
void ibis::util::sort_radix ( array_t< signed char > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< unsigned char > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< uint16_t > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< int16_t > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< uint32_t > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< int32_t > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< uint64_t > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< int64_t > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< float > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T >
void ibis::util::sort_radix ( array_t< double > &  keys,
array_t< T > &  vals 
)

LSD Radix sort.

Allocates buffers needed for copying data.

References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().

template<typename T1 , typename T2 >
void ibis::util::sort_shell ( array_t< T1 > &  keys,
array_t< T2 > &  vals 
)

Shell sort.

It has relatively straightforward memory access pattern and may be useful to sort a few numbers at the end of a recursive sorting function.

References shellgaps, and ibis::array_t< T >::size().

Referenced by sort_partition(), sort_partition3(), sort_quick(), and sort_quick3().

template<typename T1 , typename T2 >
void ibis::util::sortAll ( array_t< T1 > &  arr1,
array_t< T2 > &  arr2 
)

Sort two arrays together.

Order arr1 in ascending order first, then when arr1 has the same value, order arr2 in ascending order as well.

References ibis::array_t< T >::nosharing(), ibis::array_t< T >::size(), sortAll_quick(), and sortAll_shell().

template<typename T1 , typename T2 >
uint32_t ibis::util::sortAll_split ( array_t< T1 > &  arr1,
array_t< T2 > &  arr2 
)

The parititioning function for ibis::util::sortAll.

Uses the standard two-way partitioning.

References ibis::array_t< T >::size().

Referenced by sortAll_quick().

template<typename T1 , typename T2 >
void ibis::util::sortKeys ( array_t< T1 > &  keys,
array_t< T2 > &  vals 
)

Sorting function with payload.

Sort keys in ascending order, move the vals accordingly.

References ibis::array_t< T >::nosharing(), ibis::array_t< T >::size(), sort_quick(), and sort_radix().

Referenced by ibis::direkte::keys(), sortMerge(), and ibis::bord::sortValues().

template<typename T >
int64_t ibis::util::sortMerge ( array_t< T > &  valR,
array_t< uint32_t > &  indR,
array_t< T > &  valS,
array_t< uint32_t > &  indS 
)

An in-memory sort merge join function.

Sort the input arrays, valR and valS. Count the number of results from join.

Note
This implementation is for elementary numberical data types only.
On input, if the size of indR is the same as that of valR, its content is preserved, otherwise it is reset to 0..valR.size()-1. The array indS is similarly set to 0..valS.size()-1 if its size is different from that of valS.

References ibis::array_t< T >::nosharing(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and sortKeys().

template<typename T >
int64_t ibis::util::sortMerge ( array_t< T > &  valR,
array_t< uint32_t > &  indR,
array_t< T > &  valS,
array_t< uint32_t > &  indS,
double  delta1,
double  delta2 
)

An in-memory sort merge join function.

Sort the input arrays, valR and valS. Count the number of results satisfying ValR-Vals between delta1 and delta2.

Note
This implementation is for elementary numberical data types only.
On input, if the size of indR is the same as that of valR, its content is preserved, otherwise it is reset to 0..valR.size()-1. The array indS is similarly set to 0..valS.size()-1 if its size is different from that of valS.

References ibis::array_t< T >::nosharing(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and sortKeys().

void ibis::util::sortRIDs ( ibis::RIDSet rids)

Sort RID lists.

None of them are stable.

Sort the given list of RIDs with quick sort.

References ibis::array_t< T >::nosharing(), ibis::array_t< T >::size(), sortRIDsi(), and sortRIDsq().

Referenced by ibis::part::evaluateRIDSet(), and ibis::bundle::sortRIDs().

void ibis::util::sortRIDsi ( ibis::RIDSet rids,
uint32_t  i,
uint32_t  j 
)

Sort a portion of the RIDset with insertion sort.

Sort RIDs in the range of [i, j).

Referenced by sortRIDs(), and sortRIDsq().

void ibis::util::sortRIDsq ( ibis::RIDSet rids,
uint32_t  i,
uint32_t  j 
)

Sort a portion of the RIDSet with quick sort.

Sort RIDs in the range of [i, j).

References sortRIDsi().

Referenced by sortRIDs().

void ibis::util::sortStrings ( std::vector< std::string > &  keys,
array_t< uint32_t > &  vals 
)

Sorting function with string as keys and uint32_t as payload.

It uses quick sort if there are more than FASTBIT_QSORT_MIN elements to sort, otherwise, it uses shell sort.

Note
This function operates completely in-memory; all arrays and whatever auxiliary data must fit in memory. Furthermore, FastBit does not track the memory usage of std::vector nor std::string. In case this function runs out of memory, the two input arrays are left in undefined states. Normally, one should not have lost any data values, but the values are in undetermined orders. However, this is not guaranteed to be the case.

References ibis::array_t< T >::size(), and sortStrings_shell().

Referenced by ibis::dictionary::dictionary(), ibis::dictionary::readRaw(), ibis::bord::reorder(), sortMerge(), sortStrings(), and ibis::bord::sortStrings().

void ibis::util::sortStrings ( std::vector< std::string > &  keys,
array_t< uint32_t > &  vals,
uint32_t  begin,
uint32_t  end 
)

Quicksort for strings.

Quick-sort for strings with shell sort as clean-up procedure.

References ibis::array_t< T >::size(), sortStrings(), sortStrings_partition(), and sortStrings_shell().

void ibis::util::sortStrings ( ibis::array_t< const char * > &  keys,
array_t< uint32_t > &  vals 
)

Sorting function with string as keys and uint32_t as payload.

It uses quick sort if there are more than FASTBIT_QSORT_MIN elements to sort, otherwise, it uses shell sort.

Note
This function operates completely in-memory; all arrays and whatever auxiliary data must fit in memory.

References ibis::array_t< T >::size(), sortStrings(), and sortStrings_shell().

void ibis::util::sortStrings ( ibis::array_t< const char * > &  keys,
ibis::array_t< uint32_t > &  vals,
uint32_t  begin,
uint32_t  end 
)

Quicksort for strings.

Quick-sort for strings with shell sort as clean-up procedure.

References ibis::array_t< T >::size(), sortStrings(), sortStrings_partition(), and sortStrings_shell().

uint32_t ibis::util::sortStrings_partition ( std::vector< std::string > &  keys,
array_t< uint32_t > &  vals,
uint32_t  begin,
uint32_t  end 
)

The partitioning procedure for quick sort.

Median-of-3 partitioning algorithm.

It implements the standard two-way partitioning with the median-of-three pivot.

References ibis::array_t< T >::size(), and sortStrings_shell().

Referenced by sortStrings().

uint32_t ibis::util::sortStrings_partition ( ibis::array_t< const char * > &  keys,
array_t< uint32_t > &  vals,
uint32_t  begin,
uint32_t  end 
)

The partitioning procedure for quick sort.

Median-of-3 partitioning algorithm.

It implements the standard two-way partitioning with the median-of-three pivot.

The string values are represented as "const char*". This function uses std::strcmp to perform the comparisons of string. Since strcmp can not accept nil pointers, this function has to check for nil pointers, which effectively treats a nil pointer as a special value of string that is less than all other string values.

References ibis::array_t< T >::size(), and sortStrings_shell().

void ibis::util::sortStrings_shell ( std::vector< std::string > &  keys,
ibis::array_t< uint32_t > &  vals,
uint32_t  begin,
uint32_t  end 
)

Shell sorting procedure.

To clean up after the quick sort procedure.

References shellgaps, and ibis::array_t< T >::size().

Referenced by sortStrings(), and sortStrings_partition().

void ibis::util::sortStrings_shell ( ibis::array_t< const char * > &  keys,
ibis::array_t< uint32_t > &  vals,
uint32_t  begin,
uint32_t  end 
)

Shell sorting procedure.

Shell sort of string values.

To clean up after the quick sort procedure.

The string values are represented as "const char*". This function uses std::strcmp to perform the comparisons of string. Since strcmp can not accept nil pointers, this function has to check for nil pointers, which effectively treats a nil pointer as a special value of string that is less than all other string values.

References shellgaps, and ibis::array_t< T >::size().

bool ibis::util::strMatch ( const char *  str,
const char *  pat 
)

Match the string str against a simple pattern pat.

If the whole string matches the pattern, this function returns true, otherwise, it returns false. The special cases are (1) if the two pointers are the same, it returns true; (2) if both arguments point to an empty string, it returns true; (3) if one of the two argument is a nil pointer, but the other is not, it returns false; (4) if str is an empty string, it matches a pattern containing only '' and '*'.

The pattern may contain five special characters, two matches any number of characters, STRMATCH_META_CSH_ANY and STRMATCH_META_SQL_ANY, two matches exactly one character, STRMATCH_META_CSH_ONE and STRMATCH_META_SQL_ONE, and STRMATCH_META_ESCAPE. This combines the meta characters used in C-shell file name substitution and SQL LIKE clause.

Note
The strings are matched with the same case, i.e., the match is case sensitive, unless the macro FASTBIT_CASE_SENSITIVE_COMPARE is defined to be 0 at compile time.

Referenced by ibis::dictionary::patternSearch(), and ibis::mensa::select2().

char * ibis::util::strnewdup ( const char *  s)
void ibis::util::uniformFraction ( const long unsigned  idx,
long unsigned &  denominator,
long unsigned &  numerator 
)

Compute a denominator and numerator pair.

The fraction (denominator / numerator) is uniformly distributed in [0, 1).

void ibis::util::updateDatasets ( void  )

Update the metadata about the data partitions.

Loop through all known data partitions and check for any update in the metadata files.

References ibis::datasets, and ibis::part::updateData().

const char * ibis::util::userName ( )

Return the user name.

It attempts to retrieve the user name from the system and store it locally.

A global mutex lock is used to ensure that only one thread can access the locally stored user name. If it fails to determine the user name, it will return '<(-_-)>', which is a long form of the robot emoticon.

References envLock.

Referenced by ibis::bord::computeHits(), ibis::part::genName(), ibis::part::get1DBins_(), and ibis::part::stringToBitvector().

int64_t ibis::util::write ( int  fdes,
const void *  buf,
int64_t  nbytes 
)

A wrapper over POSIX write function.

When a large chunk is requested by the user, the write function may return one piece at a time, typically a piece is no larger than 2^31 bytes depending implementation details. This function attempts use the return value from write when it is a posive value.

Referenced by ibis::column::append(), ibis::column::appendValues(), ibis::bin::binningT(), ibis::bin::binOrderT(), ibis::tafel::readCSV(), ibis::tafel::readSQLDump(), ibis::direkte::remapKeys(), ibis::resource::write(), ibis::bin::write32(), ibis::relic::write64(), ibis::bin::write64(), ibis::fade::write64(), and ibis::egale::write64().

int ibis::util::writeLogFileHeader ( FILE *  fptr,
const char *  fname 
)

Write a header to the log file.

The caller has opened the file named fname, this function attempts to write a simple message to verify that the file is writable. It returns zero (0) upon successful completion of the write operation, it returns a non-zero number to indicate error.

If fptr is 0, it passes fname to ibis::util::setLogFileName to open the appropriate file. Otherwise, this function attempt to write to fptr.

References getLocalTime(), getVersionNumber(), getVersionString(), and setLogFileName().

Referenced by getLogFile(), and setLogFileName().

Variable Documentation

const short unsigned ibis::util::charIndex
Initial value:
= {
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 62, 64, 0, 64, 64,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 64, 64, 64, 63, 64, 64,
64, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 64, 64, 64, 64, 37,
64, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 64, 64, 64, 64,
}

charIndex maps the characters (ASCII) back to integer [0-64]

Maps back from ASCII to positions of the characters in ibis::util::charTable.

Referenced by decode64(), and ibis::query::isValidToken().

const char * ibis::util::charTable
Initial value:
=
"-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~"

charTable lists the 64 printable characters to be used for names

A list of 65 printable ASCII characters that are not special to most of the command interpreters.

The first 64 of them are basically the same as specified in RFC 3548 for base-64 numbers, but appear in different order. A set of numbers represented in this base-64 representation will be sorted in the same order as their decimal representations.

Referenced by encode64(), and int2string().

const char * ibis::util::delimiters = ";, \v\b\f\r\t\n'\""
pthread_mutex_t ibis::util::envLock = PTHREAD_MUTEX_INITIALIZER

A mutex for serialize operations FastBit wide.

Currently it is used by the functions that generating user name, asking for password, backing up active tables, cleaning up the list of tables. It is also used extensively in the implementation of C API functions to ensure the cache maintained for C users are manipulated by one user at a time.

Referenced by ibis::fileManager::adjustCacheSize(), ibis::part::doBackup(), ibis::mensa::dropPartition(), ibis::query::evaluate(), ibis::column::evaluateRange(), ibis::findDataset(), gatherParts(), ibis::init(), ibis::column::mutexLock::mutexLock(), ibis::part::part(), and userName().

const int ibis::util::log2table
Initial value:
= {
#define X(i)
-1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
X(4), X(5), X(5), X(6), X(6), X(6), X(6),
X(7), X(7), X(7), X(7), X(7), X(7), X(7), X(7)
}

log base 2 of an integer, the lookup table

const uint32_t ibis::util::shellgaps[8] = {1, 4, 10, 23, 57, 132, 301, 701}

Gaps for Shell sort from http://en.wikipedia.org/wiki/Shell_sort by Ciura, 2001.

Referenced by sort_shell(), sortAll_shell(), and sortStrings_shell().

Make It A Bit Faster
Contact us
Disclaimers
FastBit source code
FastBit mailing list archive