ibis::selectClause Class Reference

A class to represent the select clause. More...

#include <selectClause.h>

Classes

class  variable
 A specialization of ibis::math::variable. More...
 

Public Member Functions

void clear ()
 
bool empty () const
 Returns true if this select clause is empty.
 
int find (const char *) const
 Locate the position of the string. More...
 
void getNullMask (const ibis::part &, ibis::bitvector &) const
 
const char * getString (void) const
 Return a pointer to the string form of the select clause.
 
const char * operator* (void) const
 Dereferences to the string form of the select clause.
 
selectClause & operator= (const selectClause &rhs)
 Assignment operator.
 
int parse (const char *cl)
 Parse a new string.
 
void print (std::ostream &) const
 Write a string version of the select clause to the specified output stream.
 
void printDetails (std::ostream &) const
 
 selectClause (const char *cl=0)
 Parse a new string as a select clause.
 
 selectClause (const ibis::table::stringArray &)
 Parse a list of strings.
 
 selectClause (const selectClause &)
 Copy constructor. Deep copy.
 
void swap (selectClause &rhs)
 Swap the content of two select clauses.
 
int verify (const ibis::part &) const
 Verify the select clause is valid against the given data partition. More...
 
int verifySome (const std::vector< uint32_t > &, const ibis::part &) const
 Verify the selected terms. More...
 

Static Public Member Functions

static int verifyTerm (const ibis::math::term &, const ibis::part &, const ibis::selectClause *=0)
 Verify the specified term has valid column names. More...
 

Protected Types

typedef std::map< const char *, ibis::selectClause::variable *, ibis::lessi > varMap
 

Protected Member Functions

ibis::math::variable * addAgregado (ibis::selectClause::AGREGADO, ibis::math::term *)
 Record an aggregation function. More...
 
ibis::math::term * addRecursive (ibis::math::term *&)
 
void addTerm (ibis::math::term *, const std::string *)
 Add a top-level term. More...
 
uint64_t decodeAName (const char *) const
 Determine if the name refers to a term in the list of aggregation functions. More...
 
void fillNames ()
 Fill array names_ and xnames_. More...
 
void gatherVariables (varMap &vmap, ibis::math::term *t) const
 
bool hasAggregation (const ibis::math::term *tm) const
 Does the math expression contain any aggregation operations?
 

Protected Attributes

std::vector< AGREGADO > aggr_
 Aggregators.
 
mathTerms atms_
 Arithmetic expressions used by aggregators.
 
std::string clause_
 String version of the select clause.
 
ibis::selectLexer * lexer
 A pointer for the parser.
 
std::vector< std::string > names_
 Names of the variables inside the aggregation functions.
 
StringToInt ordered_
 An ordered version of names_.
 
StringToInt xalias_
 Aliases.
 
std::vector< std::string > xnames_
 Names of the top-level terms.
 
mathTerms xtms_
 Top-level terms. Externally visible arithmetic expressions.
 

Friends

class ibis::selectParser
 
class variable
 
typedef std::vector< ibis::math::term * > mathTerms
 Functions related to externally visible portion of the select clause. More...
 
typedef std::map< const char *, const char *, ibis::lessi > nameMap
 Functions related to externally visible portion of the select clause. More...
 
const mathTerms & getTerms () const
 Retrieve all top-level arithmetic expressions.
 
const ibis::math::term * termExpr (unsigned i) const
 Fetch the ith term visible to the outside. No array bound checking.
 
uint32_t numTerms () const
 Number of terms visible to the outside.
 
const char * termName (unsigned i) const
 Name given to the top-level function. More...
 
std::string termDescription (unsigned i) const
 Produce a string for the jth term of the select clause. More...
 
int getAliases (nameMap &) const
 Map internal column names to external column names. More...
 
uint32_t numGroupbyKeys () const
 Number of terms without aggregation functions. More...
 
int getGroupbyKeys (std::vector< std::string > &keys) const
 Gather the implicit group-by keys into a vector. More...
 
enum  AGREGADO {
  NIL_AGGR, AVG, CNT, MAX,
  MIN, SUM, DISTINCT, VARPOP,
  VARSAMP, STDPOP, STDSAMP, MEDIAN,
  CONCAT
}
 Functions related to internal aggregation operations. More...
 
typedef std::map< std::string, unsigned > StringToInt
 Functions related to internal aggregation operations. More...
 
uint32_t aggSize () const
 The number of arithmetic expressions inside the select clause.
 
AGREGADO getAggregator (uint32_t i) const
 Return the aggregation function used for the ith term.
 
const ibis::math::term * aggExpr (unsigned i) const
 Fetch the ith term inside the select clause. More...
 
const char * aggName (unsigned i) const
 Name inside the aggregation function. More...
 
std::string aggDescription (unsigned i) const
 Produce a string description for the ith aggregation expression. More...
 
std::string aggDescription (AGREGADO, const ibis::math::term *) const
 Write the string form of an aggregator and arithmetic expression combination. More...
 
bool needsEval (const ibis::part &) const
 Does the data partition need additional processing to process the select clause? If any of the (lower-level) names is not present in the incoming data partition, then we presume additional evaluation is needed. More...
 
bool isSeparable () const
 Can the select clause be evaluated in separate parts? Return true if there is at least one aggregator and all aggregation operations are separable operations. More...
 
const char * isUnivariate () const
 Is the select clause univariate? If yes, return the pointer to the string value, otherwise return a nil pointer. More...
 
const StringToInt & getOrdered () const
 Functions related to internal aggregation operations. More...
 

Detailed Description

A class to represent the select clause.

It parses a string into a list of arithmetic expressions and aggregation functions.

The terms in a select clause must be separated by commas ',' and each term may be an arithmetic expression or an aggregation function over an arithmetic expression, e.g., "age, avg(income)" and "temperature, sqrt(vx*vx+vy*vy+vz*vz) as speed, max(duration * speed)". An arithmetic expression may contain any valid combination of numbers and column names connected with the operators +, -, *, /, %, ^, ** and the standard functions defined in math.h that take only one or two arguments. A handful of functions requested by various users have also been added to the collection of 1- and 2-argument functions. These extra 1-argument functions include: IS_ZERO, IS_NONZERO, TRUNC (for truncating floating-point values to whole numbers), and INT_FROM_DICT (meant for reporting the integer values representing the categorical values, but implemented as ROUND). The extra 2-argument functions include: IS_EQL, IS_GTE (>=), IS_LTE (<=).
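
The comma rule above has one subtlety: a comma inside the parentheses of a 2-argument function does not end a term. The following is a minimal sketch of that rule only. It is not the lexer or parser used by this class, and the helper names trim and splitTopLevelTerms are hypothetical. It also ignores quoted strings, which a real parser would have to handle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: strip leading and trailing white space.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: split a select clause at commas that are not
// nested inside parentheses.
std::vector<std::string> splitTopLevelTerms(const std::string& clause) {
    std::vector<std::string> terms;
    std::string current;
    int depth = 0;  // current parenthesis nesting level
    for (char c : clause) {
        if (c == '(') {
            ++depth;
        } else if (c == ')' && depth > 0) {
            --depth;
        }
        if (c == ',' && depth == 0) {
            terms.push_back(trim(current));  // a top-level comma ends the term
            current.clear();
        } else {
            current += c;
        }
    }
    const std::string last = trim(current);
    if (!last.empty()) terms.push_back(last);
    return terms;
}
```

Applied to the second example clause above, this sketch yields three terms, and a clause such as "age, pow(x, 2)" yields two.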

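The supported aggregation functions correspond to the values of the AGREGADO enumeration: AVG, CNT, MAX, MIN, SUM, DISTINCT, VARPOP, VARSAMP, STDPOP, STDSAMP, MEDIAN, and CONCAT.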

Each term may optionally be followed by an alias for the term. The alias must be a valid SQL name. The alias may optionally be preceded by the keyword 'AS'. The aliases can be used in other parts of the query.

Note
All select operations exclude null values! In most SQL implementations, the function 'count(*)' includes the null values. However, in FastBit, null values are always excluded. For example, the return value for 'count(*)' in the following two select statements may be different if there are any null values in column A,
select count(*) from ...;
select avg(A), count(*) from ...;
In the first case, the number reported is purely determined by the where clause. However, in the second case, because the select clause also involves the column A, all null values of A are excluded, therefore 'count(*)' in the second example may be smaller than that of the first example.
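
The difference between the two counts can be mimicked with a small stand-alone sketch. The Row type and both functions are hypothetical and only model the rule described above. They do not call into FastBit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row that already satisfies the where clause.
// isNull marks a null value in column A.
struct Row {
    double A;
    bool isNull;
};

// count(*) when the select clause mentions no column:
// every row selected by the where clause is counted.
std::size_t countStar(const std::vector<Row>& hits) {
    return hits.size();
}

// count(*) when the select clause also involves column A:
// rows with a null A are excluded before counting.
std::size_t countStarWithA(const std::vector<Row>& hits) {
    std::size_t n = 0;
    for (const Row& r : hits) {
        if (!r.isNull) ++n;
    }
    return n;
}
```

With three matching rows of which one has a null A, the first function reports 3 and the second reports 2.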

In cases where an integer-valued column is actually storing unix time stamps, it might be useful to print out the integer values in the usual date/time format. In this case, the following pseudo-function notation could be used.

select FORMAT_UNIXTIME_LOCAL(colname, "formatstring") from ...;
select FORMAT_UNIXTIME_GMT(colname, "formatstring") from ...;

Note that the format string is passed to strftime. Please refer to the documentation about strftime for details about the format. Please also note that the format string must be quoted; both single quotes and double quotes are accepted.
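
To preview what a given format string produces, the standard library can be called directly. The sketch below shows the GMT variant. The function name formatUnixTimeGMT is hypothetical, and the sketch makes no claim about how FastBit implements the pseudo-function beyond the use of strftime stated above.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Hypothetical helper: format a unix time stamp in GMT with a
// strftime format string. Returns an empty string on failure.
std::string formatUnixTimeGMT(std::time_t ts, const char* fmt) {
    const std::tm* p = std::gmtime(&ts);  // not thread-safe, fine for a sketch
    if (p == nullptr) return std::string();
    const std::tm parts = *p;             // copy out of the static buffer
    char buf[128];
    const std::size_t n = std::strftime(buf, sizeof(buf), fmt, &parts);
    return std::string(buf, n);
}
```

For example, a time stamp of 0 with the format "%Y-%m-%d %H:%M:%S" gives "1970-01-01 00:00:00".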

Member Typedef Documentation

typedef std::vector<ibis::math::term*> ibis::selectClause::mathTerms

Functions related to externally visible portion of the select clause.

A vector of arithmetic expressions.

typedef std::map<const char*, const char*, ibis::lessi> ibis::selectClause::nameMap

Functions related to externally visible portion of the select clause.

A vector of arithmetic expressions.

typedef std::map<std::string, unsigned> ibis::selectClause::StringToInt

Functions related to internal aggregation operations.

Aggregation functions.

Note
"Agregado" is Spanish for aggregate.

Member Enumeration Documentation

enum ibis::selectClause::AGREGADO

Functions related to internal aggregation operations.

Aggregation functions.

Note
"Agregado" is Spanish for aggregate.

Member Function Documentation

ibis::math::variable * ibis::selectClause::addAgregado ( ibis::selectClause::AGREGADO  agr,
ibis::math::term *  expr 
)
protected

Record an aggregation function.

Return a math term of the type variable to the caller so the caller can continue to build up a larger expression. For simplicity, the variable name is simply "__hhh", where "hhh" is the size of aggr_ in hexadecimal.

Note
This function takes charge of expr. It will free the object if the object is not passed on to other operations. This can happen when the particular variable appeared already in the select clause.
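
The naming scheme described above can be sketched as follows. The function makeAggName is hypothetical, and the absence of zero padding in the hexadecimal part is an assumption; the text only states that the name is "__" followed by the size of aggr_ in hexadecimal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: build a placeholder name "__hhh" from the
// number of aggregators recorded so far, written in hexadecimal.
// Assumption: no zero padding.
std::string makeAggName(std::size_t aggrSize) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "__%zx", aggrSize);
    return std::string(buf);
}
```

Under these assumptions the first aggregator would be named "__0" and the 27th "__1a".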

References ibis::math::variable::dup().

void ibis::selectClause::addTerm ( ibis::math::term *  tm,
const std::string *  al 
)
protected

Add a top-level term.

It invokes ibis::selectClause::addRecursive to do the actual work. The final expression returned by addRecursive is added to xtms_.

std::string ibis::selectClause::aggDescription ( unsigned  i) const
inline

Produce a string description for the ith aggregation expression.

Warning
No array bound checking!

Referenced by ibis::bord::evaluateTerms(), ibis::bord::groupbya(), and ibis::bord::xgroupby().

std::string ibis::selectClause::aggDescription ( AGREGADO  ag,
const ibis::math::term *  tm 
) const

Write the string form of an aggregator and arithmetic expression combination.

const ibis::math::term* ibis::selectClause::aggExpr ( unsigned  i) const
inline

Fetch the ith term inside the select clause.

const char* ibis::selectClause::aggName ( unsigned  i) const
inline

Name inside the aggregation function.

To be used together with aggSize() and aggExpr().

Warning
No array bound checking!

Referenced by ibis::bord::bord(), ibis::bundle1::bundle1(), ibis::bundles::bundles(), ibis::bord::evaluateTerms(), ibis::bord::groupbya(), ibis::bord::merge(), verifyTerm(), and ibis::bord::xgroupby().

uint64_t ibis::selectClause::decodeAName ( const char *  nm) const
protected

Determine if the name refers to a term in the list of aggregation functions.

The name of an aggregation function is generated by ibis::selectClause::addAgregado. If the return value is less than the size of atms_, then the name is considered to refer to an aggregation function; otherwise, it is a literal name from the user.
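
The decision rule above can be sketched without the library. Both functions below are hypothetical stand-ins: decodeNameSketch assumes the generated names have the form "__" followed by hexadecimal digits and does not reproduce ibis::util::decode16, whose exact behavior is not described here.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <limits>

// Hypothetical decoder: return the number encoded in a name of the
// assumed form "__hhh", or the largest uint64_t for any other name.
std::uint64_t decodeNameSketch(const char* nm) {
    const std::uint64_t notGenerated =
        std::numeric_limits<std::uint64_t>::max();
    if (nm == nullptr || nm[0] != '_' || nm[1] != '_' || nm[2] == '\0')
        return notGenerated;
    std::uint64_t value = 0;
    int digits = 0;
    for (const char* p = nm + 2; *p != '\0'; ++p) {
        const unsigned char c = static_cast<unsigned char>(*p);
        // Reject non-hex characters and names too long to decode safely.
        if (!std::isxdigit(c) || ++digits > 15) return notGenerated;
        const int d = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = (value << 4) | static_cast<std::uint64_t>(d);
    }
    return value;
}

// A name refers to an aggregation term only if the decoded number is
// a valid index into the list of aggregation terms.
bool refersToAggregation(const char* nm, std::uint64_t numAggTerms) {
    return decodeNameSketch(nm) < numAggTerms;
}
```

A user-supplied name such as "income" decodes to the sentinel value, which is never smaller than the number of aggregation terms, so it is treated as a literal name.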

References ibis::util::decode16().

void ibis::selectClause::fillNames ( )
protected

Fill array names_ and xnames_.

An alias for an aggregation operation is used as the external name for the whole term. This function resolves all external names first to establish all aliases, and then resolves the names of the arguments to the aggregation functions. The arithmetic expressions without external names are given names of the form "_hhh", where "hhh" is a hexadecimal number.

References ibis::util::decode16().

int ibis::selectClause::find ( const char *  key) const

Locate the position of the string.

Upon successful completion, it returns the position of the term with the matching name, otherwise, it returns -1. The incoming argument may be an alias, a column name, or the exact form of the arithmetic expression. In case it is an arithmetic expression, it must be exactly the same as the original term passed to the constructor of this class including spaces. The comparison is done with case-insensitive string comparison.
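
The lookup contract (position on success, -1 otherwise, case-insensitive but otherwise exact comparison) can be illustrated with a stand-alone sketch. The names equalsIgnoreCase and findTerm are hypothetical, and the flat vector of names is a simplification of whatever the class stores internally.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: case-insensitive equality of two strings.
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Hypothetical lookup: position of the first matching name, or -1.
int findTerm(const std::vector<std::string>& names, const char* key) {
    if (key == nullptr) return -1;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (equalsIgnoreCase(names[i], key)) return static_cast<int>(i);
    }
    return -1;
}
```

Note that "SPEED" matches "speed", while "avg( income )" does not match "avg(income)" because the spaces differ.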

Referenced by ibis::whereClause::verifyExpr(), and verifyTerm().

int ibis::selectClause::getAliases ( nameMap &  nmap) const

Map internal column names to external column names.

The keys of the map are the internal column names, i.e., the column names used by the ibis::bord object generated with a select clause. If that ibis::bord object does not go through any aggregation operation, then the columns need to be renamed using this information.

It returns the number of changes needed. A negative number is used to indicate error.

int ibis::selectClause::getGroupbyKeys ( std::vector< std::string > &  keys) const

Gather the implicit group-by keys into a vector.

Note
Uses std::vector<std::string> because the string values may not exist inside the select clause, such as the string representations of arithmetic expressions.

const StringToInt& ibis::selectClause::getOrdered ( ) const
inline

Functions related to internal aggregation operations.

Aggregation functions.

Note
"Agregado" is Spanish for aggregate.

References ordered_.

Referenced by ibis::bord::append().

bool ibis::selectClause::isSeparable ( ) const

Can the select clause be evaluated in separate parts? Return true if there is at least one aggregator and all aggregation operations are separable operations.

Otherwise return false.

Referenced by ibis::filter::select(), and ibis::filter::sift().

const char * ibis::selectClause::isUnivariate ( ) const

Is the select clause univariate? If yes, return the pointer to the string value, otherwise return a nil pointer.

References ibis::array_t< T >::push_back(), ibis::math::barrel::recordVariable(), and ibis::array_t< T >::size().

Referenced by ibis::filter::sift(), ibis::filter::sift1(), and ibis::filter::sift1S().

bool ibis::selectClause::needsEval ( const ibis::part &  prt) const

Does the data partition need additional processing to process the select clause? If any of the (lower-level) names is not present in the incoming data partition, then we presume additional evaluation is needed.

That is, this function will return true. If all the names are present in the data partition, then this function returns false.
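
The rule amounts to a membership test over the lower-level names. The sketch below is a simplified model: needsEvalSketch is a hypothetical name, the partition is represented as a plain set of column names, and matching is exact, whereas the real function looks columns up through ibis::part::getColumn.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: additional evaluation is needed as soon as one
// name used by the select clause is missing from the partition.
bool needsEvalSketch(const std::vector<std::string>& names,
                     const std::set<std::string>& partitionColumns) {
    for (const std::string& name : names) {
        if (partitionColumns.count(name) == 0) return true;
    }
    return false;
}
```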

References ibis::part::getColumn().

uint32_t ibis::selectClause::numGroupbyKeys ( ) const

Number of terms without aggregation functions.

They are implicitly used as sort keys for group by operations. However, if the select clause does not contain any aggregation function, the sorting operation might be skipped.
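
A small stand-alone sketch of how the implicit keys fall out of a select clause. The Term type and groupbyKeysSketch are hypothetical; the flag hasAggregation stands in for the check the class performs on each math expression.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical view of one top-level term of a select clause.
struct Term {
    std::string text;     // e.g. "age" or "avg(income)"
    bool hasAggregation;  // true if the term contains an aggregator
};

// Terms without aggregation functions act as implicit group-by keys.
std::vector<std::string> groupbyKeysSketch(const std::vector<Term>& terms) {
    std::vector<std::string> keys;
    for (const Term& t : terms) {
        if (!t.hasAggregation) keys.push_back(t.text);
    }
    return keys;
}
```

For the clause "age, avg(income), region" this model gives the keys "age" and "region", so the number of group-by keys is 2.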

Referenced by ibis::filter::sift0(), ibis::filter::sift0S(), ibis::filter::sift1(), ibis::filter::sift1S(), ibis::filter::sift2(), and ibis::filter::sift2S().

std::string ibis::selectClause::termDescription ( unsigned  j) const

Produce a string for the jth term of the select clause.

The string shows the actual expression, not the alias. To see the final name to be used, call ibis::selectClause::termName(j).

const char* ibis::selectClause::termName ( unsigned  i) const
inline

Name given to the top-level function.

This is the external name assigned to termExpr(i) (which is also getTerms()[i]). To produce a string version of the term use termDescription.

Referenced by ibis::bord::groupbyc(), ibis::filter::sift0(), ibis::filter::sift0S(), ibis::filter::sift1(), ibis::filter::sift1S(), ibis::filter::sift2(), ibis::filter::sift2S(), and ibis::bord::xgroupby().

int ibis::selectClause::verify ( const ibis::part &  part0) const

Verify the select clause is valid against the given data partition.

Returns the number of variables that are not in the data partition. This function also simplifies the arithmetic expressions if ibis::math::preserveInputExpressions is not set.

Note
Simplifying the arithmetic expressions typically reduces the time needed for evaluation, but may introduce a different set of round-off errors in the evaluation process than the original expression. Set the variable ibis::math::preserveInputExpressions to true to avoid this change in round-off behavior.
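
Why a simplification can change the numerical result is easiest to see with a cancellation. The example below is purely illustrative: the two functions are hypothetical, and it is an assumption that a reduction of this particular kind would be performed. It only shows that two algebraically equal expressions need not agree in floating-point arithmetic.

```cpp
#include <cassert>

// The expression as the user wrote it: (x + big) - big.
// For a large enough big, x is lost when it is added to big.
double asWritten(double x, double big) {
    return (x + big) - big;
}

// The algebraically simplified form of the same expression.
double simplified(double x, double /*big*/) {
    return x;
}
```

With x = 1.0 and big = 1e20, the expression as written evaluates to 0.0 in IEEE double precision while the simplified form returns 1.0.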

References ibis::math::preserveInputExpressions, and ibis::math::term::reduce().

Referenced by ibis::query::setPartition(), ibis::query::setSelectClause(), ibis::filter::sift0(), ibis::filter::sift0S(), ibis::filter::sift1(), ibis::filter::sift1S(), ibis::filter::sift2(), and ibis::filter::sift2S().

int ibis::selectClause::verifySome ( const std::vector< uint32_t > &  touse,
const ibis::part &  part0 
) const

Verify the selected terms.

Return the number of terms containing unknown names.

References ibis::math::preserveInputExpressions, and ibis::math::term::reduce().

int ibis::selectClause::verifyTerm ( const ibis::math::term &  xp0,
const ibis::part &  part0,
const ibis::selectClause *  sel0 = 0 
)
static

Verify the specified term has valid column names.

It returns the number of terms not in the given data partition.

References aggName(), aggSize(), find(), ibis::part::getColumn(), ibis::qExpr::getLeft(), ibis::qExpr::getRight(), and ibis::part::name().


