A tokenizer class to turn a string buffer into tokens.
#include <category.h>
Public Member Functions
virtual int operator() (std::vector< const char * > &tkns, char *buf)=0
    A tokenizer must implement a two-argument operator().
virtual ~tokenizer ()
    Destructor.
A tokenizer class to turn a string buffer into tokens.
Used by ibis::keywords to build a term-document index.
virtual int operator() (std::vector< const char * > &tkns, char *buf) [pure virtual]
A tokenizer must implement a two-argument operator().
It takes an input string in buf and produces a list of tokens in tkns. The function may modify the input buffer. The return value shall be zero (0) to indicate success, a positive value to carry a warning message, and a negative value to indicate a fatal error.
Implemented in ibis::keywords::tokenizer.
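The contract described above can be illustrated with a minimal sketch of a conforming tokenizer. To keep the example self-contained, it does not include category.h; instead, tokenizer_base is a hypothetical stand-in that mirrors the documented signature. The class whitespace_tokenizer, the specific return codes (1 when no tokens are found, -1 for a null buffer), and the choice to clear tkns before filling it are all assumptions made for illustration, not behavior confirmed by the library.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the abstract interface documented above.
// The real class is declared in category.h; only the signature is mirrored.
struct tokenizer_base {
    virtual int operator()(std::vector<const char*>& tkns, char* buf) = 0;
    virtual ~tokenizer_base() {}
};

// Hypothetical implementation: splits buf on whitespace, in place.
struct whitespace_tokenizer : tokenizer_base {
    int operator()(std::vector<const char*>& tkns, char* buf) override {
        if (buf == nullptr) return -1;  // assumed fatal error: no input buffer
        tkns.clear();                   // assumption: output list starts empty
        char* p = buf;
        while (*p != '\0') {
            // Skip leading whitespace.
            while (*p != '\0' && std::isspace(static_cast<unsigned char>(*p))) ++p;
            if (*p == '\0') break;
            tkns.push_back(p);          // token starts here, inside buf
            // Advance to the end of the token.
            while (*p != '\0' && !std::isspace(static_cast<unsigned char>(*p))) ++p;
            // Terminate the token by overwriting the separator (modifies buf).
            if (*p != '\0') { *p = '\0'; ++p; }
        }
        return tkns.empty() ? 1 : 0;    // assumed warning: nothing to index
    }
};

// Usage sketch: the buffer must be writable and must outlive the tokens.
void example_usage() {
    char text[] = "term document  index";
    std::vector<const char*> tokens;
    whitespace_tokenizer tok;
    int rc = tok(tokens, text);
    assert(rc == 0);
    assert(tokens.size() == 3);
    assert(std::strcmp(tokens[1], "document") == 0);
}
```

Because the tokens are pointers into buf rather than copies, the caller has to keep the buffer alive and unmodified for as long as the entries in tkns are in use. This is also why the interface takes a non-const char *.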