Grammatica
PerCederberg.Grammatica.Runtime.Tokenizer Class Reference

A character stream tokenizer.

Public Member Functions

 Tokenizer (TextReader input)
 Creates a new case-sensitive tokenizer for the specified input stream.

 Tokenizer (TextReader input, bool ignoreCase)
 Creates a new tokenizer for the specified input stream.

bool GetUseTokenList ()
 Checks if the token list feature is used.

void SetUseTokenList (bool useTokenList)
 Sets the token list feature flag.

string GetPatternDescription (int id)
 Returns a description of the token pattern with the specified id.

int GetCurrentLine ()
 Returns the current line number.

int GetCurrentColumn ()
 Returns the current column number.

void AddPattern (TokenPattern pattern)
 Adds a new token pattern to the tokenizer.

void Reset (TextReader input)
 Resets this tokenizer for usage with another input stream.

Token Next ()
 Finds the next token on the stream.

override string ToString ()
 Returns a string representation of this object.
 

Protected Member Functions

virtual Token NewToken (TokenPattern pattern, string image, int line, int column)
 Factory method for creating a new token.
 

Properties

bool UseTokenList [get, set]
 The token list flag property.
 

Detailed Description

A character stream tokenizer.

This class groups the characters read from the stream into tokens ("words"). The grouping is controlled by token patterns, each of which contains either a fixed string to search for or a regular expression. If the stream of characters doesn't match any of the token patterns, a parse exception is thrown.
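
As an illustration of the description above, the following is a minimal sketch of setting up a tokenizer by hand and reading tokens from it. The token ids, the pattern names and the TokenizerSketch class are hypothetical and exist only for this example. The TokenPattern constructor, the TokenPattern.Ignore property and the Token.Image property are not documented on this page; they are assumed to be available as in the Grammatica 1.5 runtime.

```csharp
using System.Collections.Generic;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class TokenizerSketch
{
    // Hypothetical token ids, chosen for this example only.
    private const int NUMBER = 1001;
    private const int PLUS = 1002;
    private const int WHITESPACE = 1003;

    // Splits the text into token images, skipping whitespace.
    // Throws ParseException if some input matches no pattern.
    public static List<string> Tokenize(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));

        // A regular expression pattern and a fixed string pattern.
        tokenizer.AddPattern(new TokenPattern(NUMBER, "NUMBER",
            TokenPattern.PatternType.REGEXP, "[0-9]+"));
        tokenizer.AddPattern(new TokenPattern(PLUS, "PLUS",
            TokenPattern.PatternType.STRING, "+"));

        // Whitespace is matched but flagged as ignored (assumed property).
        TokenPattern whitespace = new TokenPattern(WHITESPACE, "WHITESPACE",
            TokenPattern.PatternType.REGEXP, "[ \\t\\n\\r]+");
        whitespace.Ignore = true;
        tokenizer.AddPattern(whitespace);

        // Next() returns null once the end of the input is reached.
        List<string> images = new List<string>();
        Token token;
        while ((token = tokenizer.Next()) != null)
        {
            images.Add(token.Image);
        }
        return images;
    }
}
```

Calling TokenizerSketch.Tokenize("1 + 23") is expected to yield the three images 1, + and 23, since the whitespace pattern is flagged as ignored.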

Author
Per Cederberg
Version
1.5

Constructor & Destructor Documentation

PerCederberg.Grammatica.Runtime.Tokenizer.Tokenizer ( TextReader  input)
inline

Creates a new case-sensitive tokenizer for the specified input stream.

Parameters
input  the input stream to read
PerCederberg.Grammatica.Runtime.Tokenizer.Tokenizer ( TextReader  input,
bool  ignoreCase 
)
inline

Creates a new tokenizer for the specified input stream.

The tokenizer can be set to process tokens either in case-sensitive or case-insensitive mode.

Parameters
input  the input stream to read
ignoreCase  the character case ignore flag
Since
1.5

Member Function Documentation

void PerCederberg.Grammatica.Runtime.Tokenizer.AddPattern ( TokenPattern  pattern)
inline

Adds a new token pattern to the tokenizer.

The pattern is added last in the list, so an earlier token pattern is chosen when two patterns match the same string.

Parameters
pattern  the pattern to add
Exceptions
ParserCreationException  if the pattern couldn't be added to the tokenizer
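
The ordering rule can be illustrated with a keyword pattern added before a more general identifier pattern. This is a sketch under stated assumptions: the ids, the names and the PatternOrderSketch class are hypothetical, and the TokenPattern constructor and the Token.Id property are assumed from the Grammatica 1.5 runtime rather than taken from this page.

```csharp
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class PatternOrderSketch
{
    // Hypothetical token ids, chosen for this example only.
    public const int KEYWORD_IF = 2001;
    public const int IDENTIFIER = 2002;

    // Returns the id of the first token in the text, or -1 if the text is empty.
    public static int FirstTokenId(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));

        // Added first: the fixed string "if".
        tokenizer.AddPattern(new TokenPattern(KEYWORD_IF, "IF",
            TokenPattern.PatternType.STRING, "if"));

        // Added last: a regular expression that also matches "if".
        tokenizer.AddPattern(new TokenPattern(IDENTIFIER, "IDENTIFIER",
            TokenPattern.PatternType.REGEXP, "[a-z]+"));

        Token token = tokenizer.Next();
        return token == null ? -1 : token.Id;
    }
}
```

For the input "if" both patterns match the same string, so the earlier keyword pattern should be chosen and FirstTokenId("if") should return KEYWORD_IF.
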
int PerCederberg.Grammatica.Runtime.Tokenizer.GetCurrentColumn ( )
inline

Returns the current column number.

This number will be the column number of the next token returned.

Returns
the current column number
int PerCederberg.Grammatica.Runtime.Tokenizer.GetCurrentLine ( )
inline

Returns the current line number.

This number will be the line number of the next token returned.

Returns
the current line number
string PerCederberg.Grammatica.Runtime.Tokenizer.GetPatternDescription ( int  id)
inline

Returns a description of the token pattern with the specified id.

Parameters
id  the token pattern id
Returns
the token pattern description, or null if not present
bool PerCederberg.Grammatica.Runtime.Tokenizer.GetUseTokenList ( )
inline

Checks if the token list feature is used.

The token list feature makes all tokens (including ignored tokens) link to each other in a linked list. By default the token list feature is not used.

Returns
true if the token list feature is used, or false otherwise
See also
UseTokenList
SetUseTokenList
Token::GetPreviousToken
Token::GetNextToken
Since
1.4
Deprecated:
Use the UseTokenList property instead.
virtual Token PerCederberg.Grammatica.Runtime.Tokenizer.NewToken ( TokenPattern  pattern,
string  image,
int  line,
int  column 
)
inline protected virtual

Factory method for creating a new token.

This method can be overridden to provide other token implementations than the default one.

Parameters
pattern  the token pattern
image  the token image (i.e. characters)
line  the line number of the first character
column  the column number of the first character
Returns
the token created
Since
1.5
Token PerCederberg.Grammatica.Runtime.Tokenizer.Next ( )
inline

Finds the next token on the stream.

This method will return null when the end of the file has been reached. It will throw a parse exception if no token pattern matched the input stream, or if a token pattern with the error flag set matched. Any tokens matching a token pattern with the ignore flag set will be silently ignored and the next token will be returned.

Returns
the next token found, or null if end of file was encountered
Exceptions
ParseException  if the input stream couldn't be read or parsed correctly
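
The two outcomes described above, null at end of file and an exception on unmatched input, can be handled in a read loop such as the following sketch. The ids, the names and the NextLoopSketch class are hypothetical, and the TokenPattern constructor and Ignore property are assumed from the Grammatica 1.5 runtime.

```csharp
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class NextLoopSketch
{
    // Hypothetical token ids, chosen for this example only.
    private const int WORD = 3001;
    private const int WHITESPACE = 3002;

    // Counts the non-ignored tokens in the text.
    // Returns -1 if the tokenizer reports a ParseException.
    public static int CountTokens(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));
        tokenizer.AddPattern(new TokenPattern(WORD, "WORD",
            TokenPattern.PatternType.REGEXP, "[a-z]+"));
        TokenPattern whitespace = new TokenPattern(WHITESPACE, "WHITESPACE",
            TokenPattern.PatternType.REGEXP, "[ \\t\\n\\r]+");
        whitespace.Ignore = true;   // assumed property, see lead-in
        tokenizer.AddPattern(whitespace);

        int count = 0;
        try
        {
            // Ignored tokens never reach the caller; null signals end of file.
            while (tokenizer.Next() != null)
            {
                count++;
            }
        }
        catch (ParseException)
        {
            // No pattern matched at the current position.
            return -1;
        }
        return count;
    }
}
```

With these patterns, CountTokens("ab cd") should return 2, while CountTokens("ab 12") should return -1 because no pattern matches the digits.
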
void PerCederberg.Grammatica.Runtime.Tokenizer.Reset ( TextReader  input)
inline

Resets this tokenizer for usage with another input stream.

This method will clear all the internal state in the tokenizer as well as close the previous input stream. It is normally called in order to reuse a parser and tokenizer pair with multiple input streams, thereby avoiding the cost of re-analyzing the grammar structures.

Parameters
input  the new input stream to read
See also
Parser::Reset(TextReader)
Since
1.5
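
A minimal sketch of reusing one tokenizer for two inputs follows. It assumes that the patterns added before Reset remain registered afterwards, which is what the description above implies. The id, the names and the ResetSketch class are hypothetical, and the TokenPattern constructor and Ignore property are assumed from the Grammatica 1.5 runtime.

```csharp
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class ResetSketch
{
    // Hypothetical token ids, chosen for this example only.
    private const int WORD = 4001;
    private const int WHITESPACE = 4002;

    // Reads all remaining tokens and returns how many were found.
    private static int Drain(Tokenizer tokenizer)
    {
        int count = 0;
        while (tokenizer.Next() != null)
        {
            count++;
        }
        return count;
    }

    // Tokenizes two texts with a single tokenizer instance.
    public static int[] CountBoth(string first, string second)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(first));
        tokenizer.AddPattern(new TokenPattern(WORD, "WORD",
            TokenPattern.PatternType.REGEXP, "[a-z]+"));
        TokenPattern whitespace = new TokenPattern(WHITESPACE, "WHITESPACE",
            TokenPattern.PatternType.REGEXP, "[ \\t\\n\\r]+");
        whitespace.Ignore = true;   // assumed property, see lead-in
        tokenizer.AddPattern(whitespace);

        int firstCount = Drain(tokenizer);

        // Switch to the second input without adding the patterns again.
        tokenizer.Reset(new StringReader(second));
        int secondCount = Drain(tokenizer);

        return new int[] { firstCount, secondCount };
    }
}
```

Because Reset also closes the previous input stream, the first reader should not be used again after the call.
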
void PerCederberg.Grammatica.Runtime.Tokenizer.SetUseTokenList ( bool  useTokenList)
inline

Sets the token list feature flag.

The token list feature makes all tokens (including ignored tokens) link to each other in a linked list when active. By default the token list feature is not used.

Parameters
useTokenList  the token list feature flag
See also
UseTokenList
GetUseTokenList
Token::GetPreviousToken
Token::GetNextToken
Since
1.4
Deprecated:
Use the UseTokenList property instead.
override string PerCederberg.Grammatica.Runtime.Tokenizer.ToString ( )
inline

Returns a string representation of this object.

The returned string will contain the details of all the token patterns contained in this tokenizer.

Returns
a detailed string representation

Property Documentation

bool PerCederberg.Grammatica.Runtime.Tokenizer.UseTokenList
[get, set]

The token list flag property.

If the token list flag is set, all tokens (including ignored tokens) link to each other in a doubly linked list. By default the token list flag is set to false.

See also
Token::Previous
Token::Next
Since
1.5
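
The following sketch shows one way the token list could be walked once the flag is set. It follows the Token::Previous reference listed above. The ids, the names and the TokenListSketch class are hypothetical, and the TokenPattern constructor, the Ignore property and the Token.Image property are assumed from the Grammatica 1.5 runtime.

```csharp
using System.Collections.Generic;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class TokenListSketch
{
    // Hypothetical token ids, chosen for this example only.
    private const int WORD = 5001;
    private const int WHITESPACE = 5002;

    // Returns the images of all linked tokens, last token first.
    public static List<string> ImagesInReverse(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));
        tokenizer.UseTokenList = true;

        tokenizer.AddPattern(new TokenPattern(WORD, "WORD",
            TokenPattern.PatternType.REGEXP, "[a-z]+"));
        TokenPattern whitespace = new TokenPattern(WHITESPACE, "WHITESPACE",
            TokenPattern.PatternType.REGEXP, "[ \\t\\n\\r]+");
        whitespace.Ignore = true;   // assumed property, see lead-in
        tokenizer.AddPattern(whitespace);

        // Remember the last token returned before end of file.
        Token last = null;
        Token token;
        while ((token = tokenizer.Next()) != null)
        {
            last = token;
        }

        // Walk the list backwards; ignored tokens are part of the chain.
        List<string> images = new List<string>();
        for (Token t = last; t != null; t = t.Previous)
        {
            images.Add(t.Image);
        }
        return images;
    }
}
```

For the input "a b" the walk is expected to visit three tokens, including the ignored whitespace token that Next never returned to the caller.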
