A simple tokenizer used to extract keywords.
#include <ikeywords.h>
Public Member Functions | |
virtual int | operator() (std::vector< const char * > &tkns, char *buf) |
Tokenizer. | |
tokenizer (const char *d=ibis::util::delimiters) | |
Constructor. | |
virtual | ~tokenizer () |
Destructor. | |
Protected Attributes | |
std::string | delim_ |
The list of delimiters. May be empty. | |
A simple tokenizer used to extract keywords.
A text field (i.e., a row of a text column) is split into a list of null-terminated tokens, and each of these tokens is a keyword that can be searched.
ibis::keywords::tokenizer::tokenizer | ( | const char * | d = ibis::util::delimiters | ) |
Constructor.
The constructor takes a list of delimiters. Any character in the list terminates a token. If no delimiter is given, any non-alphanumeric character terminates a token. By default, the delimiters defined in ibis::util::delimiters are used.
References delim_.
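virtual int ibis::keywords::tokenizer::operator() | ( | std::vector< const char * > & | tkns, char * | buf | ) |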
Tokenizer.
Turns the buffer buf into a list of tokens using the function ibis::util::readString.
This function returns a negative value to indicate an error, 0 to indicate success, and a positive number to indicate completion with some potential issues.
Implements ibis::text::tokenizer.
References ibis::util::readString().
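A caller might handle the three-way return convention of the tokenizer along the following lines. This is a sketch only: `TokenizeStatus`, `classify` and `keywords_or_empty` are hypothetical names that are not part of the library. The template accepts any callable with the shape `int(std::vector<const char*>&, char*)`, so it can be tried without the FastBit headers. Copying the tokens into strings assumes the returned pointers refer into buf.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: interpret the tokenizer's return value following the
// documented convention (negative = error, 0 = success, positive = completed
// with potential issues).
enum class TokenizeStatus { Error, Ok, Warning };

inline TokenizeStatus classify(int rc) {
    if (rc < 0) return TokenizeStatus::Error;
    if (rc == 0) return TokenizeStatus::Ok;
    return TokenizeStatus::Warning;
}

// Tok is any callable with the shape int(std::vector<const char*>&, char*),
// for example an ibis::keywords::tokenizer instance (assumed, not verified
// against the library here).
template <typename Tok>
std::vector<std::string> keywords_or_empty(Tok& tok, char* buf) {
    std::vector<const char*> tkns;
    std::vector<std::string> out;
    if (classify(tok(tkns, buf)) == TokenizeStatus::Error) {
        return out;  // error: report no keywords
    }
    // Copy the tokens, on the assumption that the pointers refer into buf.
    for (const char* t : tkns) out.emplace_back(t);
    return out;
}
```

A positive return value is treated here as usable output. A stricter caller could log it or reject the tokens instead.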