A data structure to represent a sequence of bits. More...
#include <bitvector64.h>
Classes | |
class | const_iterator |
The const_iterator class. It iterates on the individual bits. More... | |
class | indexSet |
The indexSet stores positions of bits that are one. More... | |
class | iterator |
The iterator that allows modification of bits. More... | |
Public Types | |
typedef uint64_t | word_t |
Public Member Functions | |
void | adjustSize (word_t nv, word_t nt) |
Adjust the size of the bit sequence. More... | |
void | appendByte (unsigned char) |
!< Append a single bit. More... | |
void | appendFill (int val, word_t n) |
!< Append a WAH word. More... | |
void | appendWord (word_t w) |
Append a WAH compressed word. More... | |
iterator | begin () |
const_iterator | begin () const |
bitvector64 () | |
!< The basic unit of data storage is 64-bit. | |
bitvector64 (const bitvector64 &bv) | |
bitvector64 (const array_t< word_t > &arr) | |
bitvector64 (const char *file) | |
word_t | bytes () const throw () |
Return the number of bytes used by the bitvector object in memory. | |
void | clear () |
Remove the existing content of a bitvector64. | |
word_t | cnt () const |
Return the number of bits that are one. | |
void | compress () |
word_t | compressible () const |
!< Turn all fill words into literal words. More... | |
bitvector64 & | copy (const bitvector64 &bv) |
!< More... | |
void | decompress () |
!< Merge fills into fill words. | |
iterator | end () |
const_iterator | end () const |
void | erase (word_t i, word_t j) |
Remove the bits in the range of [i, j). | |
indexSet | firstIndexSet () const |
void | flip () |
Complement all bits of a bit sequence. | |
int | getBit (const word_t i) const |
Return the value of a bit. More... | |
word_t | getSerialSize () const throw () |
Compute the number of words in serialized version of the bitvector object. More... | |
bool | isCompressed () const |
Does this bit vector use less space than the maximum? Return true if yes, otherwise false. More... | |
word_t | numFillWords () const |
Return the number of fill words. | |
bitvector64 * | operator& (const bitvector64 &) const |
Perform bitwise AND, return the pointer to the result. More... | |
void | operator&= (const bitvector64 &rhs) |
Perform bitwise AND between this bitvector64 and rhs . More... | |
bitvector64 & | operator+= (const bitvector64 &bv) |
bitvector64 & | operator+= (int b) |
!< Append a bitvector64. More... | |
bitvector64 * | operator- (const bitvector64 &) const |
Perform bitwise subtraction and return the pointer to the result. More... | |
void | operator-= (const bitvector64 &rhs) |
Perform bitwise subtraction (a & !b). More... | |
bitvector64 & | operator= (const bitvector64 &bv) |
!< Read the content of the named file. | |
int | operator== (const bitvector64 &rhs) const |
Return 1 if two bit sequences have the same content, 0 otherwise. | |
bitvector64 * | operator^ (const bitvector64 &) const |
Perform bitwise XOR, return the pointer to the result. More... | |
void | operator^= (const bitvector64 &rhs) |
Perform bitwise exclusive or (XOR). More... | |
bitvector64 * | operator| (const bitvector64 &) const |
Perform bitwise OR, return the pointer to the result. More... | |
void | operator|= (const bitvector64 &rhs) |
Perform bitwise OR. More... | |
std::ostream & | print (std::ostream &) const |
void | read (const char *fn) |
Read a bit vector from the file. More... | |
void | set (int val, word_t n) |
Create a vector with n bits of value val (cf. More... | |
void | setBit (const word_t i, int val) |
Replace a single bit at position i . More... | |
word_t | size () const throw () |
Return the total number of bits in the bit sequence. | |
bitvector64 & | swap (bitvector64 &bv) |
!< More... | |
void | write (const char *fn) const |
Write the bit vector to a file. | |
void | write (FILE *fptr) const |
void | write (array_t< word_t > &arr) const |
Write the bit vector to an array_t<word_t>. | |
Static Public Member Functions | |
static unsigned | bitsPerLiteral () |
Return the number of bits in a literal word. | |
static double | clusteringFactor (word_t nb, word_t nc, word_t nw) |
Estimate clustering factor based on the size. More... | |
static double | markovSize (word_t nb, word_t nc, double f) |
Compute the expected size (bytes) of a random sequence generated from a Markov process. More... | |
static double | randomSize (word_t nb, word_t nc) |
Compute the expected number of bytes required to store a random sequence. More... | |
Protected Member Functions | |
bool | all0s () const |
Are all bits in regular words 0? | |
bool | all1s () const |
Are all bits in regular words 1? | |
Friends | |
struct | active_word |
class | const_iterator |
class | indexSet |
class | iterator |
struct | run |
A data structure to represent a sequence of bits.
The 64-bit version.
Key features
A bitvector object stores a sequence of bits and provides fast bitwise logical operations. In addition, it supports operations to append new bits from the end, read bits at arbitrary location and set bits at arbitrary location. It also supports an iterator, a const_iterator and an indexSet.
Encoding format
Incoming bits are organized into words (bitvector::word_t). A word is a literal word if its Most Significant Bit (MSB) is 0, it is a fill word if its MSB is 1. A literal word stores literal bit values in the bit position following the MSB and a fill word stores a sequence of consecutive bits that are of the same value, i.e., a fill. The second most significant bit of the fill word is the bit value, the remaining bits of the word is a unsigned integer that stores the length of the fill as number of equivalent literal words, i.e., how many literal words it will take if the fill is stored in literal words.
Restrictions
The number of bits must be expressible by one single bitvector::word_t. This ensure that a fill word can store a fill of any valid length without performing a bound check. In this 64-bit version, the maximum number of bits that can be represented by a bitvector object is 16 quintillion (16x10^{18}).
void ibis::bitvector64::adjustSize | ( | word_t | nv, |
word_t | nt | ||
) |
Adjust the size of the bit sequence.
If current size is less than nv
, append enough 1 bits so that it has nv
bits. If the resulting total number of bits is less than nt
, append 0 bits so that there are nt
total bits. The final result always contains nt
bits.
Referenced by ibis::part::evaluateJoin(), ibis::util::outerProduct(), and ibis::util::outerProductUpper().
|
inline |
!< Append a single bit.
Append all 8 bits of the incoming bytes as literal bits.
|
inline |
!< Append a WAH word.
Append n bits of val.
Append n
bits of val
.
The value of n may be arbitrary integer, but the value of val must be either 0 or 1.
Referenced by ibis::util::outerProduct(), and ibis::util::outerProductUpper().
void ibis::bitvector64::appendWord | ( | word_t | w | ) |
Append a WAH compressed word.
The general case, active word may not be empty.
Referenced by erase().
|
static |
Estimate clustering factor based on the size.
ibis::bitvector64::word_t ibis::bitvector64::compressible | ( | ) | const |
!< Turn all fill words into literal words.
Return the number of word saved if the function compress is called.
|
inline |
!<
Referenced by operator&(), operator&=(), operator-(), operator|(), and operator|=().
int ibis::bitvector64::getBit | ( | const word_t | i | ) | const |
Return the value of a bit.
|
inline |
Compute the number of words in serialized version of the bitvector object.
References ibis::array_t< T >::size().
|
inline |
Does this bit vector use less space than the maximum? Return true if yes, otherwise false.
References ibis::array_t< T >::size().
|
inlinestatic |
Compute the expected size (bytes) of a random sequence generated from a Markov process.
The bit sequence is to have nb
total bits, nc
bits of one, and f
consecutive ones on the average. The argument f
is known as the clustering factor.
Referenced by ibis::query::processJoin().
ibis::bitvector64 * ibis::bitvector64::operator& | ( | const bitvector64 & | rhs | ) | const |
Perform bitwise AND, return the pointer to the result.
References all0s(), all1s(), copy(), ibis::util::logMessage(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and size().
void ibis::bitvector64::operator&= | ( | const bitvector64 & | rhs | ) |
Perform bitwise AND between this bitvector64 and rhs
.
References all0s(), all1s(), copy(), ibis::util::logMessage(), ibis::array_t< T >::size(), and size().
|
inline |
!< Append a bitvector64.
Append a single bit.
ibis::bitvector64 * ibis::bitvector64::operator- | ( | const bitvector64 & | rhs | ) | const |
Perform bitwise subtraction and return the pointer to the result.
References all0s(), all1s(), copy(), flip(), ibis::util::logMessage(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and size().
void ibis::bitvector64::operator-= | ( | const bitvector64 & | rhs | ) |
Perform bitwise subtraction (a & !b).
References all0s(), all1s(), ibis::util::copy(), ibis::util::logMessage(), ibis::array_t< T >::size(), and size().
ibis::bitvector64 * ibis::bitvector64::operator^ | ( | const bitvector64 & | rhs | ) | const |
Perform bitwise XOR, return the pointer to the result.
References ibis::util::logMessage(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and size().
void ibis::bitvector64::operator^= | ( | const bitvector64 & | rhs | ) |
Perform bitwise exclusive or (XOR).
References ibis::util::logMessage(), ibis::array_t< T >::size(), and size().
ibis::bitvector64 * ibis::bitvector64::operator| | ( | const bitvector64 & | rhs | ) | const |
Perform bitwise OR, return the pointer to the result.
References all0s(), all1s(), copy(), ibis::util::logMessage(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), and size().
void ibis::bitvector64::operator|= | ( | const bitvector64 & | rhs | ) |
Perform bitwise OR.
References all0s(), all1s(), copy(), ibis::util::logMessage(), ibis::array_t< T >::size(), and size().
|
inlinestatic |
Compute the expected number of bytes required to store a random sequence.
The random bit sequence is to have nb
total bits and nc
bits of one.
void ibis::bitvector64::read | ( | const char * | fn | ) |
Read a bit vector from the file.
Purge current contents before read.
References ibis::fileManager::getFile(), ibis::fileManager::instance(), and ibis::util::logMessage().
void ibis::bitvector64::set | ( | int | val, |
word_t | n | ||
) |
Create a vector with n
bits of value val
(cf.
memset()).
val
must be either 0 or 1. References ibis::util::clear().
Referenced by ibis::bin::estimate(), ibis::relic::estimate(), ibis::index::estimate(), ibis::part::evaluateJoin(), and ibis::query::processJoin().
void ibis::bitvector64::setBit | ( | const word_t | ind, |
int | val | ||
) |
Replace a single bit at position i
.
replace the ind'th bit with val.
val
must be either 0 or 1.val is assumed to be either 0 or 1. If val is not 0 or 1, it could cause serious problems. This function can be used to extend the length of the bit sequence. When the given index (ind) is beyond the end of the current sequence, the unspecified bits in the range of [size(), ind) are assumed to be 0.
References ibis::util::logMessage().
Referenced by ibis::part::evaluateJoin(), ibis::util::outerProduct(), and ibis::util::outerProductUpper().
|
inline |
!<
Referenced by ibis::util::outerProduct(), ibis::util::outerProductUpper(), and ibis::query::processJoin().