Organize the miscellaneous functions under the name util. More...
Classes | |
class | counter |
A simple shared counter. More... | |
class | flock |
A simple wrapper on flock. More... | |
class | guardBase |
A class hierarchy for cleaning up after durable resources. More... | |
class | guardImpl0 |
A concrete class for cleanup jobs that take a function without any argument. More... | |
class | guardImpl1 |
A concrete class for cleanup jobs that take a function with one argument. More... | |
class | guardImpl2 |
A concrete class for cleanup jobs that take a function with two arguments. More... | |
class | guardObj0 |
A class to work with class member functions with no arguments. More... | |
struct | heap |
A simple heap based on std::push_heap and std::pop_heap. More... | |
class | ioLock |
A global I/O lock. More... | |
class | logger |
A class for logging messages. More... | |
class | mutexLock |
An wrapper class for perform pthread_mutex_lock/unlock. More... | |
class | quietLock |
An wrapper class for perform pthread_mutex_lock/unlock. More... | |
class | readLock |
An wrapper class for perform pthread_rwlock_rdlock/unlock. More... | |
class | refHolder |
A template to hold a reference to an object. More... | |
class | sharedInt32 |
A shared unsigned 32-bit integer class. More... | |
class | sharedInt64 |
A unsigned 64-bit shared integer class. More... | |
class | softLock |
An wrapper class for perform pthread_mutex_trylock/unlock. More... | |
class | timer |
Print simple timing information. More... | |
class | writeLock |
An wrapper class for perform pthread_rwlock_wrlock/unlock. More... | |
Typedefs | |
typedef const guardBase & | guard |
The type to be used by client code. More... | |
Functions | |
void | clear (ibis::array_t< ibis::bitvector * > &bv) throw () |
Clear an array of bit vectors. | |
void | clear (ibis::partList &pl) throw () |
Deallocate the list of data partitions. | |
template<typename T > | |
void | clearVec (std::vector< T * > &v) |
A template to clean up a vector of pointers. | |
void | closeLogFile () |
Close the log file. More... | |
int | copy (const char *to, const char *from) |
Copy file named "from" to a file named "to". More... | |
void | emptyCache (void) |
Attempt to remove all currently unused data from memory cache. More... | |
template<typename T > | |
size_t | find (const std::vector< T > &, const T &, size_t) |
Find the first position where the value is no less than val . More... | |
template<typename T > | |
size_t | find (const array_t< T > &, const T &, size_t) |
Find the first position where the value is no less than val . More... | |
template<typename T > | |
uint32_t | find (const array_t< T > &, const array_t< uint32_t > &, const T &, uint32_t) |
Find the position of the first element that is no less than val . More... | |
unsigned int | gatherParts (ibis::partList &parts, const char *adir, const char *bdir, bool ro=false) |
Look for data partitions in the given pair of directories. More... | |
unsigned int | gatherParts (ibis::partList &parts, const char *adir, bool ro=false) |
Look for data partitions in the given directory. More... | |
unsigned int | gatherParts (ibis::partList &parts, const ibis::resource &res, bool ro=false) |
Reconstruct partitions using data directories specified in the resource list. More... | |
off_t | getFileSize (const char *name) |
Return size of the file in bytes. More... | |
void | getGMTime (char *str) |
Return the current GMT time in string format. More... | |
void | getLocalTime (char *str) |
Return the current time in string format as asctime_r . More... | |
FILE * | getLogFile () |
Retrieve the pointer to the log file. More... | |
const char * | getLogFileName () |
Return name the of the current log file. More... | |
char * | getString (const char *buf) |
Extract a string from the given buf. More... | |
const char * | getToken (char *&str, const char *tok_chrs) |
Return a null-terminated string from the beginning of input string str. More... | |
int | getVersionNumber () |
Return an integer designating the version of this software. More... | |
const char * | getVersionString () |
Return a pointer to the string designating the version of this software. More... | |
long | intersect (const std::vector< ibis::bitvector > &bits1, const std::vector< ibis::bitvector > &bits2, std::vector< ibis::bitvector > &res) |
Intersect two sets of bit vectors. More... | |
long | intersect (const std::vector< ibis::bitvector > &bits1, const std::vector< ibis::bitvector > &bits2, const std::vector< ibis::bitvector > &bits3, std::vector< ibis::bitvector > &res) |
Intersect three sets of bit vectors. More... | |
bool | isFloatType (ibis::TYPE_T t) |
Is the type for floating-point values? | |
bool | isIntegerType (ibis::TYPE_T t) |
Is the type for integer values? | |
bool | isNumericType (ibis::TYPE_T t) |
Is the type for numberical values? | |
bool | isSignedIntegerType (ibis::TYPE_T t) |
Is the type for signed integer values? | |
bool | isStringType (ibis::TYPE_T t) |
Is the type for strings? | |
bool | isUnsignedIntegerType (ibis::TYPE_T t) |
Is the type for unsigned integer values? | |
char * | itoa (int value, char *str, int) |
void | logMessage (const char *event, const char *fmt,...) |
Print a message to standard output. More... | |
int | makeDir (const char *dir) |
Recursivly create directory "dir". More... | |
template<typename F > | |
guardImpl0< F > | makeGuard (F f) |
template<typename F , typename A > | |
guardImpl1< F, A > | makeGuard (F f, A a) |
template<typename F , typename A1 , typename A2 > | |
guardImpl2< F, A1, A2 > | makeGuard (F f, A1 a1, A2 a2) |
bool | nameMatch (const char *str, const char *pat) |
Match the string str against a simple pattern pat without considering cases. More... | |
template<class C , typename F > | |
guardObj0< C, F > | objectGuard (C o, F f) |
const ibis::bitvector64 & | outerProduct (const ibis::bitvector &a, const ibis::bitvector &b, ibis::bitvector64 &c) |
Compute the outer product of a and b , add the result to c . More... | |
const ibis::bitvector64 & | outerProductUpper (const ibis::bitvector &a, const ibis::bitvector &b, ibis::bitvector64 &c) |
Add the strict upper triangular portion of the outer production between a and b to c . More... | |
double | rand () |
A very simple pseudo-random number generator. More... | |
int64_t | read (int, void *, int64_t) |
A wrapper over POSIX read function. More... | |
int | readDouble (double &val, const char *&str, const char *del=ibis::util::delimiters) |
Attempt to convert the incoming string into a double. More... | |
int | readInt (int64_t &val, const char *&str, const char *del=ibis::util::delimiters) |
Attempt to convert the incoming string into an integer. More... | |
int | readString (std::string &str, const char *&buf, const char *delim=0) |
Copy the next string to the output variable str. More... | |
const char * | readString (char *&buf, const char *delim=0) |
Attempt to extract a string token from the incoming buffer. More... | |
int | readUInt (uint64_t &val, const char *&str, const char *del=ibis::util::delimiters) |
Attempt to convert the incoming string into a unsigned integer. More... | |
template<class T > | |
refHolder< T > | ref (T &r) |
A function template to produce refHolder. | |
void | removeDir (const char *name, bool leaveDir=false) |
Remove the content of named directory. More... | |
void | removeTail (char *str, char tail) |
Remove trailing character 'tail' from str. | |
template<typename T > | |
void | reorder (array_t< T > &arr, const array_t< uint32_t > &ind) |
Reorder the array arr according to the indices given in ind. More... | |
void | reorder (std::vector< std::string > &arr, const array_t< uint32_t > &ind) |
Reorder string values. More... | |
template<typename T > | |
void | reorder (array_t< T * > &arr, const array_t< uint32_t > &ind) |
Reorder the array arr according to the indices given in ind. | |
void | secondsToString (const time_t, char *str) |
Converts the given time in seconds (as returned by function time) into the string (as from asctime_r). More... | |
uint32_t | serialNumber () |
Compute a serial number. More... | |
int | setLogFileName (const char *filename) |
Change the current log file to the named file. More... | |
void | setVerboseLevel (int v) |
Set the verboseness level. More... | |
template<typename T1 , typename T2 > | |
void | sort_heap (array_t< T1 > &keys, array_t< T2 > &vals) |
Heapsort. More... | |
template<typename T1 , typename T2 > | |
void | sort_insertion (array_t< T1 > &keys, array_t< T2 > &vals) |
Insertion sort. More... | |
template<typename T1 , typename T2 > | |
uint32_t | sort_partition (array_t< T1 > &keys, array_t< T2 > &vals) |
Partition function for quicksort. More... | |
template<typename T1 , typename T2 > | |
void | sort_partition3 (array_t< T1 > &keys, array_t< T2 > &vals, uint32_t &starteq, uint32_t &startgt) |
Three-way partitioning algorithm for quicksort. More... | |
template<typename T1 , typename T2 > | |
void | sort_quick (array_t< T1 > &keys, array_t< T2 > &vals, uint32_t lvl) |
Quicksort. More... | |
template<typename T1 , typename T2 > | |
void | sort_quick3 (array_t< T1 > &keys, array_t< T2 > &vals) |
Quicksort. More... | |
template<typename T1 , typename T2 > | |
void | sort_shell (array_t< T1 > &keys, array_t< T2 > &vals) |
Shell sort. More... | |
template<typename T1 , typename T2 > | |
void | sortAll (array_t< T1 > &arr1, array_t< T2 > &arr2) |
Sort two arrays together. More... | |
template<typename T1 , typename T2 > | |
void | sortAll_quick (array_t< T1 > &arr1, array_t< T2 > &arr2) |
Quick sort. Uses both arrays as keys and Moves all records of both arrays. | |
template<typename T1 , typename T2 > | |
void | sortAll_shell (array_t< T1 > &arr1, array_t< T2 > &arr2) |
Shell sort. Sort both arrays arr1 and arr2. | |
template<typename T1 , typename T2 > | |
uint32_t | sortAll_split (array_t< T1 > &arr1, array_t< T2 > &arr2) |
The parititioning function for ibis::util::sortAll. More... | |
template<typename T1 , typename T2 > | |
void | sortKeys (array_t< T1 > &keys, array_t< T2 > &vals) |
Sorting function with payload. More... | |
int64_t | sortMerge (std::vector< std::string > &valR, array_t< uint32_t > &indR, std::vector< std::string > &valS, array_t< uint32_t > &indS) |
An in-memory sort merge join function with string values. | |
template<typename T > | |
int64_t | sortMerge (array_t< T > &valR, array_t< uint32_t > &indR, array_t< T > &valS, array_t< uint32_t > &indS) |
An in-memory sort merge join function. More... | |
template<typename T > | |
int64_t | sortMerge (array_t< T > &valR, array_t< uint32_t > &indR, array_t< T > &valS, array_t< uint32_t > &indS, double delta1, double delta2) |
An in-memory sort merge join function. More... | |
void | sortStrings (std::vector< std::string > &keys, array_t< uint32_t > &vals) |
Sorting function with string as keys and uint32_t as payload. More... | |
void | sortStrings (std::vector< std::string > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end) |
Quicksort for strings. More... | |
void | sortStrings (array_t< const char * > &keys, array_t< uint32_t > &vals) |
Sorting function with string as keys and uint32_t as payload. More... | |
void | sortStrings (array_t< const char * > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end) |
Quicksort for strings. More... | |
uint32_t | sortStrings_partition (std::vector< std::string > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end) |
The partitioning procedure for quick sort. More... | |
uint32_t | sortStrings_partition (array_t< const char * > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end) |
The partitioning procedure for quick sort. More... | |
void | sortStrings_shell (std::vector< std::string > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end) |
Shell sorting procedure. More... | |
void | sortStrings_shell (array_t< const char * > &keys, array_t< uint32_t > &vals, uint32_t begin, uint32_t end) |
Shell sorting procedure. More... | |
bool | strMatch (const char *str, const char *pat) |
Match the string str against a simple pattern pat . More... | |
char * | strnewdup (const char *s) |
Duplicate string content with C++ default new operator. More... | |
char * | strnewdup (const char *s, const uint32_t n) |
Duplicate no more than n characters. | |
char * | trim (char *str) |
Remove leading and trailing blank space. | |
void | uniformFraction (const long unsigned idx, long unsigned &denominator, long unsigned &numerator) |
Compute a denominator and numerator pair. More... | |
void | updateDatasets (void) |
Update the metadata about the data partitions. More... | |
const char * | userName () |
Return the user name. More... | |
int64_t | write (int, const void *, int64_t) |
A wrapper over POSIX write function. More... | |
int | writeLogFileHeader (FILE *fptr, const char *fname) |
Write a header to the log file. More... | |
uint32_t | checksum (const char *str, uint32_t sz) |
Fletcher's arithmetic checksum with 32-bit result. | |
uint32_t | checksum (uint32_t a, uint32_t b) |
Fletcher's checksum on two integers. Returns an integer. | |
std::string | shortName (const std::string &longname) |
Use the Fletcher's checksum to produce a short string. More... | |
std::string | randName (const std::string &longname) |
Generate a short string to be used as a table/partition name. | |
void | int2string (std::string &str, unsigned val) |
Pack a 32-bit integer into six base-64 alphabets. | |
void | int2string (std::string &str, unsigned v1, unsigned v2) |
Pack two 32-bit integers into 11 base-64 alphabets. | |
void | int2string (std::string &str, unsigned v1, unsigned v2, unsigned v3) |
Pack three 32-bit integers into 16 base-64 alphabets. | |
void | int2string (std::string &str, const std::vector< unsigned > &val) |
Convert the incoming numbers into a base-64 alpha-numeric representation. More... | |
void | encode64 (uint64_t, std::string &) |
Turn the incoming integer into a 64-bit representation. More... | |
int | decode64 (uint64_t &, const std::string &) |
Decode a number encoded using ibis::util::encode64. | |
int | decode16 (uint64_t &, const char *) |
Convert a string of hexadecimal digits back to an integer. More... | |
std::string | groupby1000 (uint64_t) |
Produce a string version of the unsigned integer value with the decimal digits grouped into 1000s. More... | |
double | incrDouble (const double &) |
Functions to handle manipulation of floating-point numbers. More... | |
double | decrDouble (const double &) |
Decrease the input value to the next smaller value. More... | |
void | eq2range (const double &, double &, double &) |
Generate a range [left, right) that contains exactly the input value in. More... | |
double | coarsen (const double in, unsigned prec=2) |
Reduce the decimal precision of the incoming floating-point value to specified precision. More... | |
double | compactValue (double left, double right, double start=0.0) |
Compute a compact 64-bit floating-point value with a short decimal representation. More... | |
double | compactValue2 (double left, double right, double start=0.0) |
Compute a compact 64-bit floating-point value with a short binary representation. More... | |
void | setNaN (double &val) |
Set a double to NaN. | |
void | setNaN (float &val) |
Functions to handle manipulation of floating-point numbers. More... | |
template<typename Tin , typename Tout > | |
void | round_down (const Tin &inval, Tout &outval) |
Round the incoming value to the largest output value that is no more than the input. More... | |
template<typename Tin , typename Tout > | |
void | round_up (const Tin &inval, Tout &outval) |
Round the incoming value to the smallest output value that is no less than the input. More... | |
template<typename Tin > | |
void | round_up (const Tin &inval, float &) |
A specialization of round_up for the output type float. More... | |
template<typename Tin > | |
void | round_up (const Tin &inval, double &outval) |
A specialization of round_up for the output in double. | |
int | log2 (uint32_t x) |
Log_2 of a 32-bit integer. | |
int | log2 (uint64_t x) |
Log_2 of a 64-bit integer. | |
template<typename T > | |
void | sort_radix (array_t< char > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< signed char > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< unsigned char > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< uint16_t > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< int16_t > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< uint32_t > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< int32_t > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< uint64_t > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< int64_t > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< float > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
template<typename T > | |
void | sort_radix (array_t< double > &keys, array_t< T > &vals) |
LSD Radix sort. More... | |
void | sortRIDs (ibis::RIDSet &) |
Sort RID lists. More... | |
void | sortRIDsq (ibis::RIDSet &, uint32_t, uint32_t) |
Sort a portion of the RIDSet with quick sort. More... | |
void | sortRIDsi (ibis::RIDSet &, uint32_t, uint32_t) |
Sort a portion of the RIDset with insertion sort. More... | |
Variables | |
const short unsigned | charIndex [] |
charIndex maps the characters (ASCII) back to integer [0-64] More... | |
const char * | charTable |
charTable lists the 64 printable characters to be used for names More... | |
const char * | delimiters = ";, \v\b\f\r\t\n'\"" |
Delimiters used to separate a string of names. More... | |
pthread_mutex_t | envLock = PTHREAD_MUTEX_INITIALIZER |
A mutex for serialize operations FastBit wide. More... | |
const int | log2table [256] |
log base 2 of an integer, the lookup table More... | |
const uint32_t | shellgaps [8] = {1, 4, 10, 23, 57, 132, 301, 701} |
Gaps for Shell sort from http://en.wikipedia.org/wiki/Shell_sort by Ciura, 2001. More... | |
Organize the miscellaneous functions under the name util.
typedef const guardBase& ibis::util::guard |
The type to be used by client code.
User code uses type ibis::util::guard along with the overloaded function ibis::util::makeGuard, as in
void ibis::util::closeLogFile | ( | ) |
Close the log file.
This function is registered with atexit through ibis::init. It will be automatically invoked during exit. However, if necessary, the log file will be reopened during later operations. In any case, the operating system will close the log file after the termination of the program.
Referenced by fastbit_cleanup(), and ibis::init().
|
inline |
Reduce the decimal precision of the incoming floating-point value to specified precision.
In an attempt to compute small values more consistently, small values are computed through division of integer values.
Since these integer values are computed through the function pow, the accuracy of the results depend on the implementation of the math library.
The value zero is always rounded to zero. Incoming value less than 1E-300 or greater than 1E300 is rounded to zero.
Referenced by ibis::fuzz::append(), ibis::bylt::append(), ibis::zona::append(), ibis::fuge::append(), ibis::bak2::mapValues(), ibis::tafel::preferredSize(), ibis::tafel::readCSV(), and ibis::tafel::readSQLDump().
double ibis::util::compactValue | ( | double | left, |
double | right, | ||
double | start = 0.0 |
||
) |
Compute a compact 64-bit floating-point value with a short decimal representation.
Compute a compact 64-bit floating-point value in the range (left, right].
The righ-end is inclusive because the computed value is used to define bins based 'x<compactValue' and those bins are interpreted as [low, upper) as in the typical c notion.
References logMessage().
Referenced by ibis::bin::addBounds(), ibis::bin::bin(), ibis::range::binBoundaries(), ibis::bak::binBoundaries(), ibis::bin::binning(), ibis::bin::convertGranules(), ibis::part::equalWeightBins(), ibis::bin::expandRange(), ibis::range::expandRange(), ibis::bak::expandRange(), ibis::bak2::expandRange(), ibis::part::get1DDistribution(), ibis::relic::getCumulativeDistribution(), ibis::index::getCumulativeDistribution(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::relic::print(), ibis::direkte::print(), ibis::keywords::print(), ibis::fuzz::print(), ibis::bylt::print(), ibis::zona::print(), ibis::part::queryTest(), ibis::part::quickTest(), ibis::part::recursiveQuery(), ibis::bin::scanAndPartition(), and ibis::bin::setBoundaries().
double ibis::util::compactValue2 | ( | double | left, |
double | right, | ||
double | start = 0.0 |
||
) |
Compute a compact 64-bit floating-point value with a short binary representation.
This version round the number in binary and tries to use a few binary digits in the mantissa as possible.
The resulting number should have a better chance of producing exact results in simple arithmetic operations.
References logMessage().
Referenced by ibis::part::equalWeightBins(), and ibis::part::get2DDistributionU().
int ibis::util::copy | ( | const char * | to, |
const char * | from | ||
) |
Copy file named "from" to a file named "to".
It overwrites the content of "to". It returns a negative number if the source file does not existing can not be copied.
References UnixOpen.
Referenced by ibis::tafel::append(), ibis::column::append(), ibis::dictionary::appendOrdered(), ibis::tafel::appendString(), ibis::array_t< T >::array_t(), ibis::tafel::assignDefaultValue(), ibis::part::coarsenBins(), ibis::part::get2DDistributionI(), ibis::blob::getBlob(), ibis::mensa::getColumnAsDoubles(), ibis::bord::getColumnAsDoubles(), ibis::mensa::getColumnAsFloats(), ibis::bord::getColumnAsFloats(), ibis::mensa::getColumnAsInts(), ibis::bord::getColumnAsInts(), ibis::mensa::getColumnAsLongs(), ibis::bord::getColumnAsLongs(), ibis::mensa::getColumnAsOpaques(), ibis::bord::getColumnAsOpaques(), ibis::mensa::getColumnAsShorts(), ibis::bord::getColumnAsShorts(), ibis::bord::getColumnAsStrings(), ibis::mensa::getColumnAsUInts(), ibis::bord::getColumnAsUInts(), ibis::mensa::getColumnAsULongs(), ibis::bord::getColumnAsULongs(), ibis::mensa::getColumnAsUShorts(), ibis::bord::getColumnAsUShorts(), ibis::bord::getHistogram(), ibis::bord::getHistogram2D(), ibis::bord::getHistogram3D(), ibis::dictionary::insert(), ibis::part::numbersToBitvector(), ibis::bitvector64::operator-=(), ibis::bitvector::operator-=(), ibis::qDiscreteRange::qDiscreteRange(), ibis::qIntHod::qIntHod(), ibis::qUIntHod::qUIntHod(), ibis::colInts::reduce(), ibis::colUInts::reduce(), ibis::colLongs::reduce(), ibis::colULongs::reduce(), ibis::colShorts::reduce(), ibis::colUShorts::reduce(), ibis::colBytes::reduce(), ibis::colUBytes::reduce(), ibis::colFloats::reduce(), and ibis::colDoubles::reduce().
int ibis::util::decode16 | ( | uint64_t & | output, |
const char * | buf | ||
) |
Convert a string of hexadecimal digits back to an integer.
The return values are:
Referenced by ibis::selectClause::decodeAName(), and ibis::selectClause::fillNames().
|
inline |
Decrease the input value to the next smaller value.
Referenced by ibis::quaere::create().
void ibis::util::emptyCache | ( | void | ) |
Attempt to remove all currently unused data from memory cache.
References ibis::datasets.
Referenced by ibis::part::clear(), ibis::part::doScan(), ibis::query::estimate(), ibis::query::evaluate(), ibis::filter::select(), ibis::table::select(), ibis::part::selectBytes(), ibis::part::selectDoubles(), ibis::part::selectFloats(), ibis::part::selectInts(), ibis::part::selectLongs(), ibis::part::selectShorts(), ibis::part::selectStrings(), ibis::part::selectUBytes(), ibis::part::selectUInts(), ibis::part::selectULongs(), ibis::part::selectUShorts(), ibis::part::selfTest(), ibis::query::sequentialScan(), and ibis::part::updateData().
void ibis::util::encode64 | ( | uint64_t | input, |
std::string & | buf | ||
) |
Turn the incoming integer into a 64-bit representation.
The resulting string is stored in the output variable buf. The return value of this function is buf.c_str() if this function completes successfully, otherwise, a nil pointer is returned.
References charTable.
|
inline |
Generate a range [left, right) that contains exactly the input value in.
This is used to transform an expression expression "A = in" into "left <= A < right".
Referenced by ibis::ambit::estimate(), ibis::pale::estimate(), ibis::pack::estimate(), and ibis::zone::estimate().
size_t ibis::util::find | ( | const std::vector< T > & | arr, |
const T & | val, | ||
size_t | i0 | ||
) |
Find the first position where the value is no less than val
.
Start the searching operation with position start. Assuming the array is already sorted in the ascending order, it returns the smallest i such that operator[](i)
>= val
.
Referenced by ibis::roster::icSearch(), and ibis::column::searchSortedICD().
size_t ibis::util::find | ( | const array_t< T > & | arr, |
const T & | val, | ||
size_t | i0 | ||
) |
Find the first position where the value is no less than val
.
Start the searching operation with position start. Assuming the array is already sorted in the ascending order, it returns the smallest i such that operator[](i)
>= val
.
References ibis::array_t< T >::size().
uint32_t ibis::util::find | ( | const array_t< T > & | arr, |
const array_t< uint32_t > & | ind, | ||
const T & | val, | ||
uint32_t | i0 | ||
) |
Find the position of the first element that is no less than val
.
The search starts with the given position i0
. Assuming ind
was produced by the sort function, it returns the smallest i such that operator[](ind[i])
>= val
.
References ibis::array_t< T >::size().
unsigned ibis::util::gatherParts | ( | ibis::partList & | tlist, |
const char * | adir, | ||
const char * | bdir, | ||
bool | ro = false |
||
) |
Look for data partitions in the given pair of directories.
Read the two directories, if there are matching subdirs, construct an ibis::part from them.
Will descend into the subdirectories when run on unix systems to look for matching subdirectories.
Returns the number of data partitions found.
References ibis::part::clear(), envLock, logMessage(), ibis::part::name(), ibis::part::nColumns(), ibis::part::nRows(), ibis::part::rename(), and ibis::part::timestamp().
Referenced by ibis::mensa::addPartition(), gatherParts(), ibis::init(), and ibis::mensa::mensa().
unsigned ibis::util::gatherParts | ( | ibis::partList & | tlist, |
const char * | dir1, | ||
bool | ro = false |
||
) |
Look for data partitions in the given directory.
Examining the given directory to look for the metadata files and to construct ibis::part.
Can only descend into subdirectories through opendir family of functions.
Returns the number of data partitions found.
References ibis::part::clear(), envLock, gatherParts(), ibis::part::name(), ibis::part::nColumns(), ibis::part::nRows(), ibis::part::rename(), and ibis::part::timestamp().
unsigned ibis::util::gatherParts | ( | ibis::partList & | tables, |
const ibis::resource & | res, | ||
bool | ro = false |
||
) |
Reconstruct partitions using data directories specified in the resource list.
Read the parameters dataDir1 and dataDir2 to build data partitions.
Returns the number of data partitions found.
References gatherParts(), and ibis::resource::getValue().
off_t ibis::util::getFileSize | ( | const char * | name | ) |
Return size of the file in bytes.
The value 0 is returned if file does not exist.
Referenced by ibis::relic::append(), ibis::bin::append(), ibis::column::append(), ibis::egale::append(), ibis::moins::append(), ibis::entre::append(), ibis::part::append2(), ibis::part::appendToBackup(), ibis::bundle1::bundle1(), ibis::bundles::bundles(), ibis::column::evaluateRange(), ibis::query::getRIDsInBundle(), ibis::column::hasRawData(), ibis::column::indexSize(), ibis::keywords::keywords(), ibis::index::mapValues(), ibis::query::orderPairs(), ibis::resource::read(), ibis::text::selectLongs(), ibis::text::selectStrings(), ibis::category::selectStrings(), ibis::category::selectUInts(), ibis::column::selectValuesT(), ibis::part::selfTest(), ibis::index::sizeInBytes(), ibis::column::truncateData(), ibis::column::writeData(), ibis::tafel::writeMetaData(), ibis::roster::writeSorted(), and ibis::category::~category().
void ibis::util::getGMTime | ( | char * | str | ) |
Return the current GMT time in string format.
The argument str
must have 26 or more bytes and is used to carry the time output.
void ibis::util::getLocalTime | ( | char * | str | ) |
Return the current time in string format as asctime_r
.
The argument str
must have 26 or more bytes and is used to carry the time output.
The new line character in position 24 is turned into a null character. Therefore the returned string contains only 24 characters.
Referenced by ibis::util::logger::logger(), ibis::column::logMessage(), ibis::query::logMessage(), logMessage(), ibis::column::logWarning(), ibis::fileManager::printStatus(), ibis::part::reorder(), writeLogFileHeader(), and ibis::part::writeMetaData().
FILE * ibis::util::getLogFile | ( | ) |
Retrieve the pointer to the log file.
The value of FASTBIT_DEFAULT_LOG will be returned if no log file was specified. A log file name be specified through the following means (in the specified order),
– setLogFile – environment variable FASTBITLOGFILE – configuration parameter logfile
References ibis::gParameters(), setLogFileName(), and writeLogFileHeader().
Referenced by fastbit_get_logfilepointer(), ibis::column::logMessage(), ibis::query::logMessage(), logMessage(), ibis::column::logWarning(), and ibis::util::logger::~logger().
const char * ibis::util::getLogFileName | ( | ) |
Return name the of the current log file.
A blank string indicates a standard output stream, either stderr or stdout.
Referenced by fastbit_get_logfile().
char * ibis::util::getString | ( | const char * | buf | ) |
Extract a string from the given buf.
Remove leading and trailing spaces and surrounding quotes. Returns a copy of the string allocated with the new
operator.
References strnewdup().
Referenced by ibis::column::column(), ibis::dictionary::fromASCII(), ibis::mensa::cursor::getColumnAsString(), ibis::mensa::getColumnAsStrings(), ibis::text::getOpaque(), ibis::text::IDColumnForKeywordIndex(), ibis::part::readMetaData(), and ibis::part::readMetaTags().
const char * ibis::util::getToken | ( | char *& | str, |
const char * | tok_chrs | ||
) |
Return a null-terminated string from the beginning of input string str.
The first apparence of any character from characters in tok_chars is turned into null. The incoming argument is modified to point to the first character that is not in tok_chrs. If no character in tok_chrs is found, str is returned and the first argument is changed to null.
|
inline |
Return an integer designating the version of this software.
The version number is composed of four segments each with two decimal digits. For example, version 1.3.0.2 will be represented as 1030002. The stable releases typically have the last segment as zero, which is generally referred to without the last ".0".
Referenced by fastbit_get_version_number(), and writeLogFileHeader().
|
inline |
Return a pointer to the string designating the version of this software.
Referenced by fastbit_get_version_string(), and writeLogFileHeader().
std::string ibis::util::groupby1000 | ( | uint64_t | val | ) |
Produce a string version of the unsigned integer value with the decimal digits grouped into 1000s.
Referenced by ibis::fileManager::clear(), ibis::fileManager::getFile(), ibis::fileManager::getFileSegment(), ibis::bord::merge(), ibis::fileManager::printStatus(), and ibis::fileManager::storage::storage().
|
inline |
Functions to handle manipulation of floating-point numbers.
Increment the input value to the next larger value.
If the math library has nextafter, it will use nextafter, otherwise, it will use the unit round-off error to compute the next larger value. The success of this computation is highly sensitive to the definition of DBL_EPSILON, which should be the smallest value x such that (1+x) is different from x. For 64-bit IEEE floating-point number, it is approximately 2.2E-16 (2^{-52}).
Referenced by ibis::part::adaptive2DBins(), ibis::part::adaptive3DBins(), ibis::part::adaptiveFloats(), ibis::part::adaptiveFloatsDetailed(), ibis::bak::binBoundaries(), ibis::quaere::create(), ibis::ambit::estimate(), ibis::pale::estimate(), ibis::pack::estimate(), ibis::zone::estimate(), ibis::part::get1DDistribution(), ibis::part::get2DDistributionI(), ibis::part::get2DDistributionU(), ibis::qContinuousRange::qContinuousRange(), and ibis::bin::scanAndPartition().
void ibis::util::int2string | ( | std::string & | str, |
const std::vector< unsigned > & | val | ||
) |
Convert the incoming numbers into a base-64 alpha-numeric representation.
Three 32-bit integers can be packed int 16 base-64 alphabets.
References int2string().
long ibis::util::intersect | ( | const std::vector< ibis::bitvector > & | bits1, |
const std::vector< ibis::bitvector > & | bits2, | ||
std::vector< ibis::bitvector > & | res | ||
) |
Intersect two sets of bit vectors.
References ibis::fileManager::bytesInUse(), ibis::bitvector::compress(), and ibis::fileManager::instance().
Referenced by ibis::part::get2DBins(), and ibis::part::get3DBins().
long ibis::util::intersect | ( | const std::vector< ibis::bitvector > & | bits1, |
const std::vector< ibis::bitvector > & | bits2, | ||
const std::vector< ibis::bitvector > & | bits3, | ||
std::vector< ibis::bitvector > & | res | ||
) |
Intersect three sets of bit vectors.
References ibis::fileManager::bytesInUse(), ibis::bitvector::compress(), and ibis::fileManager::instance().
void ibis::util::logMessage | ( | const char * | event, |
const char * | fmt, | ||
... | |||
) |
Print a message to standard output.
The format string is same as printf. A global mutex lock is used to ensure printed messages are ordered. The message is preceeded with the current local time (as from function ctime) if the macro FASTBIT_TIMED_LOG is defined at the compile time.
References getLocalTime(), and getLogFile().
Referenced by ibis::part::accessHint(), ibis::index::addBits(), ibis::blob::append(), ibis::column::append(), ibis::part::append1(), ibis::part::append2(), ibis::part::appendToBackup(), ibis::bin::binOrder(), ibis::column::column(), compactValue(), compactValue2(), ibis::bitvector64::decompress(), ibis::part::digestMeshShape(), ibis::part::doBackup(), ibis::query::doEstimate(), ibis::query::doScan(), ibis::bin::estimate(), ibis::relic::estimate(), ibis::part::evaluateJoin(), ibis::bitvector64::flip(), gatherParts(), ibis::part::get1DBins(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::query::getBounds(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::part::getJointDistribution(), ibis::column::getNullMask(), ibis::index::isIndex(), ibis::query::isValidToken(), ibis::relic::locate(), ibis::bak::locate(), ibis::bak2::locate(), ibis::part::makeBackupCopy(), makeDir(), ibis::part::old2DDistribution(), ibis::bitvector64::operator&(), ibis::bitvector64::operator&=(), ibis::bitvector64::operator-(), ibis::bitvector64::operator-=(), ibis::bitvector64::operator^(), ibis::bitvector64::operator^=(), ibis::bitvector64::operator|(), ibis::bitvector64::operator|=(), ibis::query::orderPairs(), ibis::part::packCumulativeDistribution(), ibis::part::packDistribution(), ibis::part::part(), ibis::query::processJoin(), ibis::part::purgeInactive(), ibis::part::queryTest(), ibis::part::quickTest(), ibis::bitvector64::read(), ibis::part::readMetaData(), ibis::part::recursiveQuery(), removeDir(), ibis::query::removeFiles(), ibis::bord::reorderStrings(), ibis::part::rollback(), ibis::column::saveSelected(), ibis::column::selectBytes(), ibis::column::selectDoubles(), ibis::column::selectFloats(), ibis::column::selectInts(), ibis::column::selectLongs(), ibis::column::selectShorts(), ibis::column::selectUBytes(), ibis::column::selectUInts(), ibis::column::selectULongs(), ibis::column::selectUShorts(), ibis::column::selectValuesT(), ibis::part::selfTest(), ibis::bitvector64::setBit(), ibis::query::sortEquiJoin(), ibis::query::sortRangeJoin(), ibis::bord::sortStrings(), ibis::column::string2int(), ibis::index::sumBins(), ibis::index::sumBits(), ibis::column::truncateData(), and ibis::bitvector64::write().
int ibis::util::makeDir | ( | const char * | dir | ) |
Recursivly create directory "dir".
Returns zero (0) to indicate success, a negative number to indicate error. If the directory already exists, it immediately returns 0.
References logMessage(), and strnewdup().
Referenced by ibis::bord::backup(), and ibis::part::part().
bool ibis::util::nameMatch | ( | const char * | str, |
const char * | pat | ||
) |
Match the string str
against a simple pattern pat
without considering cases.
This is to follow the SQL standard for comparing names.
Referenced by ibis::part::matchNameValuePair().
const ibis::bitvector64 & ibis::util::outerProduct | ( | const ibis::bitvector & | a, |
const ibis::bitvector & | b, | ||
ibis::bitvector64 & | c | ||
) |
Compute the outer product of a
and b
, add the result to c
.
This implementation only uses public functions of ibis::bitvector and ibis::bitvector64.
This should make it possible to use both WAH and BBC compress bitvector classes. It is not likely that we can gain much performance by directly using member variables of ibit::bitvector because the unit of compressed data from ibit::vector do not fit neatly into ibis::bitvector64::word_t.
Return a const reference of c
. If the input does
not have the correct size, it will be replaced by the outer product.
References ibis::bitvector64::adjustSize(), ibis::bitvector64::appendFill(), ibis::bitvector64::bytes(), ibis::bitvector64::cnt(), ibis::bitvector::cnt(), ibis::bitvector::indexSet::nIndices(), ibis::bitvector64::setBit(), ibis::bitvector::size(), ibis::bitvector64::size(), ibis::horometer::start(), and ibis::bitvector64::swap().
Referenced by ibis::index::estimate(), ibis::query::processJoin(), ibis::relic::speedTest(), and ibis::bin::speedTest().
const ibis::bitvector64 & ibis::util::outerProductUpper | ( | const ibis::bitvector & | a, |
const ibis::bitvector & | b, | ||
ibis::bitvector64 & | c | ||
) |
Add the strict upper triangular portion of the outer production between a
and b
to c
.
The result contains
only the strict upper triangular portion of the full outer product.
References ibis::bitvector64::adjustSize(), ibis::bitvector64::appendFill(), ibis::bitvector64::bytes(), ibis::bitvector64::cnt(), ibis::bitvector::cnt(), ibis::bitvector::indexSet::nIndices(), ibis::bitvector64::setBit(), ibis::bitvector::size(), ibis::bitvector64::size(), ibis::horometer::start(), and ibis::bitvector64::swap().
|
inline |
A very simple pseudo-random number generator.
It produces a floating point number between 0 and 1.
The is a a Linear Congruential pseudo-random number generator. It produces a floating-point in the range of (0, 1). It is very simple and fast, however, it does not produce high-quality random numbers. It is not thread-safe. However, since the actual computation only involves two arithmetic operations, it is very unlikely to have thread-safty issues.
Referenced by ibis::part::buildQueryList(), ibis::index::mapValues(), ibis::part::queryTest(), ibis::part::quickTest(), ibis::part::selfTest(), and ibis::part::testRangeOperators().
int64_t ibis::util::read | ( | int | fdes, |
void * | buf, | ||
int64_t | nbytes | ||
) |
A wrapper over POSIX read function.
When a large chunk is requested by the user, the read function may return one piece at a time, typically a piece is no larger than 2^31 bytes, and the size of this piece is implementation dependent. This function attempts use the return value from read when it is a posive value.
Referenced by ibis::column::append(), ibis::fileManager::roFile::doRead(), ibis::roster::icSearch(), ibis::roster::oocSearch(), ibis::fileManager::storage::read(), ibis::column::searchSortedOOCC(), ibis::column::searchSortedOOCD(), ibis::column::selectValuesT(), and ibis::column::string2int().
int ibis::util::readDouble | ( | double & | val, |
const char *& | str, | ||
const char * | del = ibis::util::delimiters |
||
) |
Attempt to convert the incoming string into a double.
The format recodnized is the following
References readInt().
Referenced by ibis::math::fromUnixTime::eval(), ibis::resource::getNumber(), ibis::tafel::parseLine(), and ibis::whereClause::verifyExpr().
int ibis::util::readInt | ( | int64_t & | val, |
const char *& | str, | ||
const char * | del = ibis::util::delimiters |
||
) |
Attempt to convert the incoming string into an integer.
It skips leading space and converts an optional +/- sign followed by a list of decimal digits to an integer.
On successful completion of this function, the argument val contains the converted value and str points to the next unused character in the input string. In this case, it returns 0. It returns -1 if the input string is a null string, is an empty string, has only blank spaces, or has a delimiter immediately following the leading blank spaces. It returns -2 to indicate overflow, in which case, it resets val to 0 and moves str to the next character that is not a decimal digit.
References readUInt().
Referenced by ibis::tafel::parseLine(), ibis::qIntHod::qIntHod(), readDouble(), and ibis::whereClause::verifyExpr().
int ibis::util::readString | ( | std::string & | str, |
const char *& | buf, | ||
const char * | delim = 0 |
||
) |
Copy the next string to the output variable str.
Leading blank spaces are skipped. The content of str will be empty if buf is nil or an empty string. If the string is quoted, only spaces before the quote are skipped, and the content of the string will be everything after the first quote to the last character before the matching quote or end of buffer. If delim is not provided (i.e., is 0) and the 1st nonblank character is not a quote, then string will terminate at the 1st space (or nonprintable) character following the nonblank character.
Referenced by ibis::tafel::assignDefaultValue(), ibis::keywords::tokenizer::operator()(), ibis::tafel::parseLine(), ibis::tablex::readNamesAndTypes(), ibis::tafel::readSQLDump(), ibis::text::readStrings2(), ibis::text::selectStrings(), ibis::bin::setBoundaries(), and ibis::tafel::SQLCreateTable().
const char * ibis::util::readString | ( | char *& | buf, |
const char * | delim = 0 |
||
) |
Attempt to extract a string token from the incoming buffer.
It uses the existing buffer without allocating additional space. It creates nil terminated strings by converting terminators or space characters into nil characters. If it finds a valid string, it will return a pointer to the string, otherwise it returns a nil pointer or an empty string.
int ibis::util::readUInt | ( | uint64_t & | val, |
const char *& | str, | ||
const char * | del = ibis::util::delimiters |
||
) |
Attempt to convert the incoming string into a unsigned integer.
It skips leading space and converts a list of decimal digits to an integer. On successful completion of this function, the argument val contains the converted value and str points to the next unused character in the input string. In this case, it returns 0. It returns -1 if the input string is a null string, is an empty string, has only blank spaces or have one of the delimiters immediately following the leading blank spaces. It returns -2 to indicate overflow, in which case, it resets val to 0 and moves str to the next character that is not a decimal digit.
Referenced by ibis::dictionary::fromASCII(), ibis::tafel::parseLine(), ibis::qUIntHod::qUIntHod(), readInt(), and ibis::keywords::readTDLine().
void ibis::util::removeDir | ( | const char * | name, |
bool | leaveDir = false |
||
) |
Remove the content of named directory.
If this function is run on a unix-type system and the second argument is true, it will leave all the subdirectories intact as well.
References logMessage(), and strnewdup().
Referenced by ibis::part::append2(), ibis::query::clear(), ibis::part::doBackup(), ibis::part::part(), and ibis::part::rollback().
void ibis::util::reorder | ( | array_t< T > & | arr, |
const array_t< uint32_t > & | ind | ||
) |
Reorder the array arr according to the indices given in ind.
This implementation uses copy-and-swap algorithm.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
Referenced by ibis::bin::bin(), ibis::bin::construct(), ibis::bord::orderby(), ibis::math::bediener::reduce(), ibis::colInts::reorder(), ibis::bord::reorder(), ibis::colUInts::reorder(), ibis::colLongs::reorder(), ibis::bundles::reorder(), ibis::colULongs::reorder(), ibis::colShorts::reorder(), ibis::colUShorts::reorder(), ibis::colBytes::reorder(), ibis::part::reorder(), ibis::colUBytes::reorder(), ibis::colFloats::reorder(), ibis::colDoubles::reorder(), ibis::jNatural::select(), ibis::jRange::select(), and ibis::bundles::truncate().
void ibis::util::reorder | ( | std::vector< std::string > & | arr, |
const array_t< uint32_t > & | ind | ||
) |
Reorder string values.
This function keeps the actual strings in their input positions by using the function swap. This procedure should avoid most of the memory allocations.
References ibis::array_t< T >::size().
void ibis::util::round_down | ( | const Tin & | inval, |
Tout & | outval | ||
) |
Round the incoming value to the largest output value that is no more than the input.
Both Tin and Tout must be elementary data types, and Tout must be an elementary integral type.
Referenced by ibis::part::doCount(), and ibis::part::doScan().
void ibis::util::round_up | ( | const Tin & | inval, |
Tout & | outval | ||
) |
Round the incoming value to the smallest output value that is no less than the input.
Both Tin and Tout must be elementary data types, and Tout must be an elementary integral type.
Referenced by ibis::column::searchSortedICC(), and ibis::column::searchSortedOOCC().
|
inline |
A specialization of round_up for the output type float.
This function uses nextafterf if the macro HAVE_NEXTAFTER is defined, otherwise it uses FLT_EPSILON to compute outval as (float)(inval)*(1+FLT_EPSILON).
void ibis::util::secondsToString | ( | const time_t | sec, |
char * | str | ||
) |
Converts the given time in seconds (as returned by function time) into the string (as from asctime_r).
The argument str
must contain at least 26 bytes. The new line character is turned into null.
Referenced by ibis::bord::backup(), ibis::bord::bord(), and ibis::tafel::writeMetaData().
uint32_t ibis::util::serialNumber | ( | ) |
Compute a serial number.
It is a unique number that is always increasing. Even time FastBit runs, it starts this number with the value 1. This implementation uses a 32-bit shared integer and therefore could wrap-around when exceeding 2^32.
Referenced by ibis::part::queryTest(), ibis::part::quickTest(), randName(), and ibis::part::selfTest().
int ibis::util::setLogFileName | ( | const char * | filename | ) |
Change the current log file to the named file.
Log files are opened in append mode, so the existing content will be preserved. The log file will be changed only if the named file can be opened correctly.
Error code
References writeLogFileHeader().
Referenced by fastbit_set_logfile(), getLogFile(), ibis::init(), and writeLogFileHeader().
void ibis::util::setNaN | ( | float & | val | ) |
Functions to handle manipulation of floating-point numbers.
Increment the input value to the next larger value.
If the math library has nextafter, it will use nextafter, otherwise, it will use the unit round-off error to compute the next larger value. The success of this computation is highly sensitive to the definition of DBL_EPSILON, which should be the smallest value x such that (1+x) is different from x. For 64-bit IEEE floating-point number, it is approximately 2.2E-16 (2^{-52}).
|
inline |
Set the verboseness level.
Unless the code is compiled with DEBUG macro set, the default verboseness level is 0, which will only print out information about major errors.
std::string ibis::util::shortName | ( | const std::string & | de | ) |
Use the Fletcher's checksum to produce a short string.
The result is often used as name of temporary table objects. The name conforms to the SQL standard.
References checksum(), and int2string().
Referenced by ibis::bord::bord(), ibis::jNatural::fillResult(), ibis::jRange::fillResult(), ibis::part::rename(), ibis::jNatural::select(), ibis::jRange::select(), ibis::filter::sift0(), ibis::filter::sift0S(), ibis::filter::sift1(), ibis::filter::sift1S(), ibis::filter::sift2(), and ibis::filter::sift2S().
void ibis::util::sort_heap | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals | ||
) |
Heapsort.
Sort the keys only. Move the vals along with the keys.
References ibis::array_t< T >::size().
Referenced by sort_quick().
void ibis::util::sort_insertion | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals | ||
) |
Insertion sort.
It has relatively straightforward memory access pattern and may be useful to sort a few numbers at the end of a recursive procedure.
References ibis::array_t< T >::size().
uint32_t ibis::util::sort_partition | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals | ||
) |
Partition function for quicksort.
The return value p separates keys into two parts, keys[..:p-1] < keys[p:..]. A return value equal to the size of keys indicates all keys are sorted.
References ibis::array_t< T >::size(), and sort_shell().
Referenced by sort_quick().
void ibis::util::sort_partition3 | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals, | ||
uint32_t & | starteq, | ||
uint32_t & | startgt | ||
) |
Three-way partitioning algorithm for quicksort.
Upon return from this function, keys satisfying the following order keys[0:starteq] < keys[starteq:stargt-1] < keys[startgt:..]. The keys are ordered if starteq = startgt = keys.size().
References ibis::array_t< T >::size(), and sort_shell().
Referenced by sort_quick3().
void ibis::util::sort_quick | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals, | ||
uint32_t | lvl | ||
) |
Quicksort.
Quick sort with introspection.
Sort the keys only. Use the standard two-way partitioning.
It will switch to heap sort after FASTBIT_QSORT_MAX_DEPTH levels of recursion. Performs recursive call only on the smaller half, while iterate over the larger half.
References ibis::array_t< T >::size(), sort_heap(), sort_partition(), sort_quick(), and sort_shell().
Referenced by sort_quick(), and sortKeys().
void ibis::util::sort_quick3 | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals | ||
) |
Quicksort.
Sort the keys only. Use a nonstandard three-way partitioning.
References ibis::array_t< T >::size(), sort_partition3(), sort_quick3(), and sort_shell().
Referenced by sort_quick3().
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
Referenced by sortKeys().
void ibis::util::sort_radix | ( | array_t< signed char > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< unsigned char > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< uint16_t > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< int16_t > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< uint32_t > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< int32_t > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< uint64_t > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_radix | ( | array_t< int64_t > & | keys, |
array_t< T > & | vals | ||
) |
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
LSD Radix sort.
Allocates buffers needed for copying data.
References ibis::array_t< T >::size(), and ibis::array_t< T >::swap().
void ibis::util::sort_shell | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals | ||
) |
Shell sort.
It has relatively straightforward memory access pattern and may be useful to sort a few numbers at the end of a recursive sorting function.
References shellgaps, and ibis::array_t< T >::size().
Referenced by sort_partition(), sort_partition3(), sort_quick(), and sort_quick3().
void ibis::util::sortAll | ( | array_t< T1 > & | arr1, |
array_t< T2 > & | arr2 | ||
) |
Sort two arrays together.
Order arr1 in ascending order first, then when arr1 has the same value, order arr2 in ascending order as well.
References ibis::array_t< T >::nosharing(), ibis::array_t< T >::size(), sortAll_quick(), and sortAll_shell().
uint32_t ibis::util::sortAll_split | ( | array_t< T1 > & | arr1, |
array_t< T2 > & | arr2 | ||
) |
The parititioning function for ibis::util::sortAll.
Uses the standard two-way partitioning.
References ibis::array_t< T >::size().
Referenced by sortAll_quick().
void ibis::util::sortKeys | ( | array_t< T1 > & | keys, |
array_t< T2 > & | vals | ||
) |
Sorting function with payload.
Sort keys in ascending order, move the vals accordingly.
References ibis::array_t< T >::nosharing(), ibis::array_t< T >::size(), sort_quick(), and sort_radix().
Referenced by ibis::direkte::keys(), sortMerge(), and ibis::bord::sortValues().
int64_t ibis::util::sortMerge | ( | array_t< T > & | valR, |
array_t< uint32_t > & | indR, | ||
array_t< T > & | valS, | ||
array_t< uint32_t > & | indS | ||
) |
An in-memory sort merge join function.
Sort the input arrays, valR and valS. Count the number of results from join.
References ibis::array_t< T >::nosharing(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and sortKeys().
int64_t ibis::util::sortMerge | ( | array_t< T > & | valR, |
array_t< uint32_t > & | indR, | ||
array_t< T > & | valS, | ||
array_t< uint32_t > & | indS, | ||
double | delta1, | ||
double | delta2 | ||
) |
An in-memory sort merge join function.
Sort the input arrays, valR and valS. Count the number of results satisfying ValR-Vals between delta1 and delta2.
References ibis::array_t< T >::nosharing(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and sortKeys().
void ibis::util::sortRIDs | ( | ibis::RIDSet & | rids | ) |
Sort RID lists.
None of them are stable.
Sort the given list of RIDs with quick sort.
References ibis::array_t< T >::nosharing(), ibis::array_t< T >::size(), sortRIDsi(), and sortRIDsq().
Referenced by ibis::part::evaluateRIDSet(), and ibis::bundle::sortRIDs().
void ibis::util::sortRIDsi | ( | ibis::RIDSet & | rids, |
uint32_t | i, | ||
uint32_t | j | ||
) |
Sort a portion of the RIDset with insertion sort.
Sort RIDs in the range of [i, j).
Referenced by sortRIDs(), and sortRIDsq().
void ibis::util::sortRIDsq | ( | ibis::RIDSet & | rids, |
uint32_t | i, | ||
uint32_t | j | ||
) |
Sort a portion of the RIDSet with quick sort.
Sort RIDs in the range of [i, j).
References sortRIDsi().
Referenced by sortRIDs().
void ibis::util::sortStrings | ( | std::vector< std::string > & | keys, |
array_t< uint32_t > & | vals | ||
) |
Sorting function with string as keys and uint32_t as payload.
It uses quick sort if there are more than FASTBIT_QSORT_MIN elements to sort, otherwise, it uses shell sort.
References ibis::array_t< T >::size(), and sortStrings_shell().
Referenced by ibis::dictionary::dictionary(), ibis::dictionary::readRaw(), ibis::bord::reorder(), sortMerge(), sortStrings(), and ibis::bord::sortStrings().
void ibis::util::sortStrings | ( | std::vector< std::string > & | keys, |
array_t< uint32_t > & | vals, | ||
uint32_t | begin, | ||
uint32_t | end | ||
) |
Quicksort for strings.
Quick-sort for strings with shell sort as clean-up procedure.
References ibis::array_t< T >::size(), sortStrings(), sortStrings_partition(), and sortStrings_shell().
void ibis::util::sortStrings | ( | ibis::array_t< const char * > & | keys, |
array_t< uint32_t > & | vals | ||
) |
Sorting function with string as keys and uint32_t as payload.
It uses quick sort if there are more than FASTBIT_QSORT_MIN elements to sort, otherwise, it uses shell sort.
References ibis::array_t< T >::size(), sortStrings(), and sortStrings_shell().
void ibis::util::sortStrings | ( | ibis::array_t< const char * > & | keys, |
ibis::array_t< uint32_t > & | vals, | ||
uint32_t | begin, | ||
uint32_t | end | ||
) |
Quicksort for strings.
Quick-sort for strings with shell sort as clean-up procedure.
References ibis::array_t< T >::size(), sortStrings(), sortStrings_partition(), and sortStrings_shell().
uint32_t ibis::util::sortStrings_partition | ( | std::vector< std::string > & | keys, |
array_t< uint32_t > & | vals, | ||
uint32_t | begin, | ||
uint32_t | end | ||
) |
The partitioning procedure for quick sort.
Median-of-3 partitioning algorithm.
It implements the standard two-way partitioning with the median-of-three pivot.
References ibis::array_t< T >::size(), and sortStrings_shell().
Referenced by sortStrings().
uint32_t ibis::util::sortStrings_partition | ( | ibis::array_t< const char * > & | keys, |
array_t< uint32_t > & | vals, | ||
uint32_t | begin, | ||
uint32_t | end | ||
) |
The partitioning procedure for quick sort.
Median-of-3 partitioning algorithm.
It implements the standard two-way partitioning with the median-of-three pivot.
The string values are represented as "const char*". This function uses std::strcmp to perform the comparisons of string. Since strcmp can not accept nil pointers, this function has to check for nil pointers, which effectively treats a nil pointer as a special value of string that is less than all other string values.
References ibis::array_t< T >::size(), and sortStrings_shell().
void ibis::util::sortStrings_shell | ( | std::vector< std::string > & | keys, |
ibis::array_t< uint32_t > & | vals, | ||
uint32_t | begin, | ||
uint32_t | end | ||
) |
Shell sorting procedure.
To clean up after the quick sort procedure.
References shellgaps, and ibis::array_t< T >::size().
Referenced by sortStrings(), and sortStrings_partition().
void ibis::util::sortStrings_shell | ( | ibis::array_t< const char * > & | keys, |
ibis::array_t< uint32_t > & | vals, | ||
uint32_t | begin, | ||
uint32_t | end | ||
) |
Shell sorting procedure.
Shell sort of string values.
To clean up after the quick sort procedure.
The string values are represented as "const char*". This function uses std::strcmp to perform the comparisons of string. Since strcmp can not accept nil pointers, this function has to check for nil pointers, which effectively treats a nil pointer as a special value of string that is less than all other string values.
References shellgaps, and ibis::array_t< T >::size().
bool ibis::util::strMatch | ( | const char * | str, |
const char * | pat | ||
) |
Match the string str
against a simple pattern pat
.
If the whole string matches the pattern, this function returns true, otherwise, it returns false. The special cases are (1) if the two pointers are the same, it returns true; (2) if both arguments point to an empty string, it returns true; (3) if one of the two argument is a nil pointer, but the other is not, it returns false; (4) if str is an empty string, it matches a pattern containing only '' and '*'.
The pattern may contain five special characters, two matches any number of characters, STRMATCH_META_CSH_ANY and STRMATCH_META_SQL_ANY, two matches exactly one character, STRMATCH_META_CSH_ONE and STRMATCH_META_SQL_ONE, and STRMATCH_META_ESCAPE. This combines the meta characters used in C-shell file name substitution and SQL LIKE clause.
Referenced by ibis::dictionary::patternSearch(), and ibis::mensa::select2().
char * ibis::util::strnewdup | ( | const char * | s | ) |
Duplicate string content with C++ default new operator.
Same as strdup() but uses 'new' instead of 'malloc'.
If s == 0, then it returns 0.
Referenced by ibis::resource::add(), ibis::dictionary::appendOrdered(), ibis::bord::bord(), ibis::fileManager::roFile::doRead(), getString(), ibis::part::indexSpec(), ibis::dictionary::insert(), makeDir(), ibis::dictionary::merge(), ibis::resource::operator=(), ibis::index::operator=(), ibis::resource::operator[](), ibis::resource::parseNameValuePairs(), ibis::part::part(), ibis::query::query(), ibis::relic::read(), ibis::bin::read(), ibis::direkte::read(), ibis::keywords::read(), ibis::fade::read(), ibis::range::read(), ibis::fuzz::read(), ibis::bylt::read(), ibis::zona::read(), ibis::ambit::read(), ibis::pale::read(), ibis::pack::read(), ibis::zone::read(), ibis::fuge::read(), ibis::egale::read(), ibis::part::readMetaData(), ibis::query::readQuery(), ibis::keywords::readTermDocFile(), removeDir(), ibis::part::rename(), ibis::part::setMetaTags(), and ibis::ridHandler::write().
void ibis::util::uniformFraction | ( | const long unsigned | idx, |
long unsigned & | denominator, | ||
long unsigned & | numerator | ||
) |
Compute a denominator and numerator pair.
The fraction (denominator / numerator) is uniformly distributed in [0, 1).
void ibis::util::updateDatasets | ( | void | ) |
Update the metadata about the data partitions.
Loop through all known data partitions and check for any update in the metadata files.
References ibis::datasets, and ibis::part::updateData().
const char * ibis::util::userName | ( | ) |
Return the user name.
It attempts to retrieve the user name from the system and store it locally.
A global mutex lock is used to ensure that only one thread can access the locally stored user name. If it fails to determine the user name, it will return '<(-_-)>', which is a long form of the robot emoticon.
References envLock.
Referenced by ibis::bord::computeHits(), ibis::part::genName(), ibis::part::get1DBins_(), and ibis::part::stringToBitvector().
int64_t ibis::util::write | ( | int | fdes, |
const void * | buf, | ||
int64_t | nbytes | ||
) |
A wrapper over POSIX write function.
When a large chunk is requested by the user, the write function may return one piece at a time, typically a piece is no larger than 2^31 bytes depending implementation details. This function attempts use the return value from write when it is a posive value.
Referenced by ibis::column::append(), ibis::column::appendValues(), ibis::bin::binningT(), ibis::bin::binOrderT(), ibis::tafel::readCSV(), ibis::tafel::readSQLDump(), ibis::direkte::remapKeys(), ibis::resource::write(), ibis::bin::write32(), ibis::relic::write64(), ibis::bin::write64(), ibis::fade::write64(), and ibis::egale::write64().
int ibis::util::writeLogFileHeader | ( | FILE * | fptr, |
const char * | fname | ||
) |
Write a header to the log file.
The caller has opened the file named fname, this function attempts to write a simple message to verify that the file is writable. It returns zero (0) upon successful completion of the write operation, it returns a non-zero number to indicate error.
If fptr is 0, it passes fname to ibis::util::setLogFileName to open the appropriate file. Otherwise, this function attempt to write to fptr.
References getLocalTime(), getVersionNumber(), getVersionString(), and setLogFileName().
Referenced by getLogFile(), and setLogFileName().
const short unsigned ibis::util::charIndex |
charIndex maps the characters (ASCII) back to integer [0-64]
Maps back from ASCII to positions of the characters in ibis::util::charTable.
Referenced by decode64(), and ibis::query::isValidToken().
const char * ibis::util::charTable |
charTable lists the 64 printable characters to be used for names
A list of 65 printable ASCII characters that are not special to most of the command interpreters.
The first 64 of them are basically the same as specified in RFC 3548 for base-64 numbers, but appear in different order. A set of numbers represented in this base-64 representation will be sorted in the same order as their decimal representations.
Referenced by encode64(), and int2string().
const char * ibis::util::delimiters = ";, \v\b\f\r\t\n'\"" |
Delimiters used to separate a string of names.
Referenced by ibis::resource::add(), ibis::nameList::add(), ibis::resource::operator[](), ibis::qIntHod::qIntHod(), ibis::qUIntHod::qUIntHod(), ibis::tafel::readSQLDump(), and ibis::nameList::select().
pthread_mutex_t ibis::util::envLock = PTHREAD_MUTEX_INITIALIZER |
A mutex for serialize operations FastBit wide.
Currently it is used by the functions that generating user name, asking for password, backing up active tables, cleaning up the list of tables. It is also used extensively in the implementation of C API functions to ensure the cache maintained for C users are manipulated by one user at a time.
Referenced by ibis::fileManager::adjustCacheSize(), ibis::part::doBackup(), ibis::mensa::dropPartition(), ibis::query::evaluate(), ibis::column::evaluateRange(), ibis::findDataset(), gatherParts(), ibis::init(), ibis::column::mutexLock::mutexLock(), ibis::part::part(), and userName().
const int ibis::util::log2table |
log base 2 of an integer, the lookup table
const uint32_t ibis::util::shellgaps[8] = {1, 4, 10, 23, 57, 132, 301, 701} |
Gaps for Shell sort from http://en.wikipedia.org/wiki/Shell_sort by Ciura, 2001.
Referenced by sort_shell(), sortAll_shell(), and sortStrings_shell().