Grammatica 1.5 Documentation

PerCederberg.Grammatica.Runtime
Class Tokenizer

System.Object
   |
   +--Tokenizer

   in Tokenizer.cs

public class Tokenizer
extends System.Object

A character stream tokenizer. This class groups the characters read from the stream into tokens ("words"). The grouping is controlled by token patterns, each of which contains either a fixed string to search for or a regular expression. If the stream of characters doesn't match any of the token patterns, a parse exception is thrown.
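
As an illustration of the description above, the following is a minimal sketch of a hand-built tokenizer with one regular expression pattern, one fixed string pattern, and one ignored whitespace pattern. The TokenPattern constructor, the TokenPattern.PatternType enumeration, the Ignore property, and the Token.Image property are not documented on this page and are assumed here; the token ids and the class name TokenizerSketch are made up for the example.

```csharp
using System;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class TokenizerSketch
{
    // Hypothetical token ids, chosen only for this example.
    private const int NUMBER = 1001;
    private const int PLUS = 1002;
    private const int WHITESPACE = 1003;

    public static Tokenizer Create(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));

        // Regular expression pattern (assumed TokenPattern constructor).
        tokenizer.AddPattern(new TokenPattern(
            NUMBER, "NUMBER", TokenPattern.PatternType.REGEXP, "[0-9]+"));

        // Fixed string pattern.
        tokenizer.AddPattern(new TokenPattern(
            PLUS, "PLUS", TokenPattern.PatternType.STRING, "+"));

        // Whitespace is matched but skipped (assumed Ignore property).
        TokenPattern whitespace = new TokenPattern(
            WHITESPACE, "WHITESPACE", TokenPattern.PatternType.REGEXP, "[ \\t\\r\\n]+");
        whitespace.Ignore = true;
        tokenizer.AddPattern(whitespace);

        return tokenizer;
    }

    public static void Main()
    {
        Tokenizer tokenizer = Create("12 + 34");
        Token token;

        // Next() returns null once the end of the input is reached.
        while ((token = tokenizer.Next()) != null)
        {
            Console.WriteLine(token.Image);
        }
    }
}
```

In most projects the patterns are added by a tokenizer subclass generated from a grammar file, so building one by hand like this is mainly useful for experiments.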


Field Summary
 bool UseTokenList
          The token list flag property.
 
Constructor Summary
Tokenizer( TextReader input )
          Creates a new case-sensitive tokenizer for the specified input stream.
Tokenizer( TextReader input, bool ignoreCase )
          Creates a new tokenizer for the specified input stream.
 
Method Summary
 void AddPattern( TokenPattern pattern )
          Adds a new token pattern to the tokenizer.
 int GetCurrentColumn()
          Returns the current column number.
 int GetCurrentLine()
          Returns the current line number.
 string GetPatternDescription( int id )
          Returns a description of the token pattern with the specified id.
 bool GetUseTokenList()
          Deprecated. Use the UseTokenList property instead.
protected virtual Token NewToken( TokenPattern pattern, string image, int line, int column )
          Factory method for creating a new token.
 Token Next()
          Finds the next token on the stream.
 void Reset( TextReader input )
          Resets this tokenizer for usage with another input stream.
 void SetUseTokenList( bool useTokenList )
          Deprecated. Use the UseTokenList property instead.
 override string ToString()
          Returns a string representation of this object.
 

Field Detail

UseTokenList

public bool UseTokenList;
The token list flag property. If the token list flag is set, all tokens (including ignored tokens) link to each other in a doubly-linked list. By default the token list flag is set to false.
Since:
1.5
See Also:
Token.Previous, Token.Next
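
A minimal sketch of how the token list might be used: the flag is set before any token is read, and the list is then walked backwards from the last token through Token.Previous. The TokenPattern constructor, the Ignore property, and Token.Image are assumptions not documented on this page, and the ids and class name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class TokenListSketch
{
    public static string WalkBackwards(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));

        // Enable linking before reading the first token.
        tokenizer.UseTokenList = true;

        // Hypothetical ids and assumed TokenPattern constructor.
        tokenizer.AddPattern(new TokenPattern(
            1001, "WORD", TokenPattern.PatternType.REGEXP, "[a-z]+"));
        TokenPattern space = new TokenPattern(
            1002, "SPACE", TokenPattern.PatternType.REGEXP, " +");
        space.Ignore = true;
        tokenizer.AddPattern(space);

        // Read everything, remembering the last token returned.
        Token last = null;
        Token token;
        while ((token = tokenizer.Next()) != null)
        {
            last = token;
        }

        // Follow the Previous links; ignored tokens are part of the chain.
        List<string> images = new List<string>();
        for (Token t = last; t != null; t = t.Previous)
        {
            images.Add("[" + t.Image + "]");
        }
        return string.Join("", images.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(WalkBackwards("a b"));
    }
}
```

Because ignored tokens stay in the list, this is a way to recover whitespace or comments that Next() never returns directly.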


Constructor Detail

Tokenizer

public Tokenizer( TextReader input );
Creates a new case-sensitive tokenizer for the specified input stream.
Parameters:
input - the input stream to read

Tokenizer

public Tokenizer( TextReader input, bool ignoreCase );
Creates a new tokenizer for the specified input stream. The tokenizer can be set to process tokens either in case-sensitive or case-insensitive mode.
Parameters:
input - the input stream to read
ignoreCase - the character case ignore flag
Since:
1.5
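
A short sketch of the case-insensitive mode, assuming that a fixed string pattern written in lower case then also matches upper-case input. The TokenPattern constructor and the Token.Name property are assumptions not documented on this page; the id and class name are hypothetical.

```csharp
using System;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class IgnoreCaseSketch
{
    public static string FirstTokenName(string text)
    {
        // The second argument switches the tokenizer to case-insensitive mode.
        Tokenizer tokenizer = new Tokenizer(new StringReader(text), true);

        // Hypothetical id and assumed TokenPattern constructor.
        tokenizer.AddPattern(new TokenPattern(
            1001, "SELECT", TokenPattern.PatternType.STRING, "select"));

        Token token = tokenizer.Next();
        return token == null ? null : token.Name;
    }

    public static void Main()
    {
        Console.WriteLine(FirstTokenName("SELECT"));
    }
}
```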


Method Detail

AddPattern

public void AddPattern( TokenPattern pattern );
Adds a new token pattern to the tokenizer. The pattern is added last in the list, so if two patterns match the same string, the one added earlier is chosen.
Parameters:
pattern - the pattern to add
Throws:
ParserCreationException - if the pattern couldn't be added to the tokenizer
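
The ordering rule can be sketched with a keyword and an identifier pattern that both match the input "if". The keyword is added first, so it should be the pattern chosen. The TokenPattern constructor and the Token.Name property are assumptions not documented on this page; the ids and class name are hypothetical.

```csharp
using System;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class PatternOrderSketch
{
    public static string FirstTokenName(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));

        // Added first: wins when both patterns match the same string.
        tokenizer.AddPattern(new TokenPattern(
            1001, "IF", TokenPattern.PatternType.STRING, "if"));

        // Added second: also matches "if", but only as a fallback here.
        tokenizer.AddPattern(new TokenPattern(
            1002, "IDENTIFIER", TokenPattern.PatternType.REGEXP, "[a-z]+"));

        Token token = tokenizer.Next();
        return token == null ? null : token.Name;
    }

    public static void Main()
    {
        Console.WriteLine(FirstTokenName("if"));
        Console.WriteLine(FirstTokenName("x"));
    }
}
```

A practical consequence is that specific patterns such as keywords are best added before general ones such as identifiers.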

GetCurrentColumn

public int GetCurrentColumn();
Returns the current column number. This number will be the column number of the next token returned.
Returns:
the current column number

GetCurrentLine

public int GetCurrentLine();
Returns the current line number. This number will be the line number of the next token returned.
Returns:
the current line number

GetPatternDescription

public string GetPatternDescription( int id );
Returns a description of the token pattern with the specified id.
Parameters:
id - the token pattern id
Returns:
the token pattern description, or null if not present

GetUseTokenList

public bool GetUseTokenList();
Deprecated. Use the UseTokenList property instead.

Checks if the token list feature is used. The token list feature makes all tokens (including ignored tokens) link to each other in a linked list. By default the token list feature is not used.

Returns:
true if the token list feature is used, or false otherwise
Since:
1.4
See Also:
UseTokenList, SetUseTokenList, Token.GetPreviousToken, Token.GetNextToken

NewToken

protected virtual Token NewToken( TokenPattern pattern, string image, int line, int column );
Factory method for creating a new token. This method can be overridden to provide token implementations other than the default one.
Parameters:
pattern - the token pattern
image - the token image (i.e. characters)
line - the line number of the first character
column - the column number of the first character
Returns:
the token created
Since:
1.5
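
One way to use this factory method is sketched below: a tokenizer subclass that returns a Token subclass carrying the name of the source being read. The Token constructor taking (pattern, image, line, column) and the TokenPattern constructor are assumptions not documented on this page; SourceToken, SourceTokenizer, and the id are hypothetical names introduced for the example.

```csharp
using System;
using System.IO;
using PerCederberg.Grammatica.Runtime;

// Hypothetical token subclass that remembers where it came from.
public class SourceToken : Token
{
    public readonly string Source;

    public SourceToken(TokenPattern pattern, string image,
                       int line, int column, string source)
        : base(pattern, image, line, column)   // assumed Token constructor
    {
        Source = source;
    }
}

// Hypothetical tokenizer subclass that creates SourceToken instances.
public class SourceTokenizer : Tokenizer
{
    private readonly string source;

    public SourceTokenizer(TextReader input, string source) : base(input)
    {
        this.source = source;
    }

    protected override Token NewToken(TokenPattern pattern, string image,
                                      int line, int column)
    {
        return new SourceToken(pattern, image, line, column, source);
    }
}

public static class NewTokenSketch
{
    public static void Main()
    {
        Tokenizer tokenizer =
            new SourceTokenizer(new StringReader("abc"), "demo.txt");
        tokenizer.AddPattern(new TokenPattern(
            1001, "WORD", TokenPattern.PatternType.REGEXP, "[a-z]+"));

        SourceToken token = (SourceToken) tokenizer.Next();
        Console.WriteLine(token.Source);
    }
}
```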

Next

public Token Next();
Finds the next token on the stream. This method returns null when end of file has been reached. It throws a parse exception if no token matched the input stream, or if a token pattern with the error flag set matched. Any tokens matching a token pattern with the ignore flag set will be silently ignored and the next token will be returned.
Returns:
the next token found, or null if end of file was encountered
Throws:
ParseException - if the input stream couldn't be read or parsed correctly
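
The three outcomes described above (a token, null at end of input, or a ParseException) can be handled as in the following sketch. The TokenPattern constructor is an assumption not documented on this page, and the id and class name are hypothetical. Only Exception.Message is read from the exception, since the members of ParseException are not listed here.

```csharp
using System;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class NextErrorSketch
{
    // Reads all tokens; returns null on success or the error message
    // if the input could not be tokenized.
    public static string ReadAll(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));

        // Only digits are recognized, so any other character is an error.
        tokenizer.AddPattern(new TokenPattern(
            1001, "NUMBER", TokenPattern.PatternType.REGEXP, "[0-9]+"));

        try
        {
            while (tokenizer.Next() != null)
            {
                // A real caller would process each token here.
            }
            return null;
        }
        catch (ParseException e)
        {
            return e.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadAll("12") == null ? "ok" : "error");
        Console.WriteLine(ReadAll("12?") == null ? "ok" : "error");
    }
}
```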

Reset

public void Reset( TextReader input );
Resets this tokenizer for usage with another input stream. This method will clear all the internal state in the tokenizer as well as close the previous input stream. It is normally called in order to reuse a parser and tokenizer pair with multiple input streams, thereby avoiding the cost of re-analyzing the grammar structures.
Parameters:
input - the new input stream to read
Since:
1.5
See Also:
Parser.Reset(TextReader)
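
A minimal sketch of reusing one tokenizer for two inputs. It assumes, as the description suggests, that the patterns added earlier are kept across Reset and only the reading state is cleared. The TokenPattern constructor and the Ignore property are assumptions not documented on this page; the ids and class name are hypothetical.

```csharp
using System;
using System.IO;
using PerCederberg.Grammatica.Runtime;

public static class ResetSketch
{
    public static Tokenizer Create(string text)
    {
        Tokenizer tokenizer = new Tokenizer(new StringReader(text));
        tokenizer.AddPattern(new TokenPattern(
            1001, "NUMBER", TokenPattern.PatternType.REGEXP, "[0-9]+"));
        TokenPattern space = new TokenPattern(
            1002, "SPACE", TokenPattern.PatternType.REGEXP, " +");
        space.Ignore = true;
        tokenizer.AddPattern(space);
        return tokenizer;
    }

    // Counts the tokens remaining on the current input.
    public static int Count(Tokenizer tokenizer)
    {
        int count = 0;
        while (tokenizer.Next() != null)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Tokenizer tokenizer = Create("1 2");
        Console.WriteLine(Count(tokenizer));

        // Reuse the same tokenizer (and its patterns) for a new input.
        tokenizer.Reset(new StringReader("3 4 5"));
        Console.WriteLine(Count(tokenizer));
    }
}
```

Since Reset closes the previous input stream, the old reader should not be used again afterwards.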

SetUseTokenList

public void SetUseTokenList( bool useTokenList );
Deprecated. Use the UseTokenList property instead.

Sets the token list feature flag. The token list feature makes all tokens (including ignored tokens) link to each other in a linked list when active. By default the token list feature is not used.

Parameters:
useTokenList - the token list feature flag
Since:
1.4
See Also:
UseTokenList, GetUseTokenList, Token.GetPreviousToken, Token.GetNextToken

ToString

public override string ToString();
Returns a string representation of this object. The returned string will contain the details of all the token patterns contained in this tokenizer.
Returns:
a detailed string representation
