Package com.topologi.diffx.load.text
Class TokenizerByWord
java.lang.Object
com.topologi.diffx.load.text.TokenizerByWord
- All Implemented Interfaces:
TextTokenizer
A tokenizer that turns text into word events and space events.
This class is not synchronized.
- Version:
- 11 May 2010
-
Field Summary
Fields
- recycling: Maps characters to events in order to recycle events as they are created.
- private final WhiteSpaceProcessing whitespace: Defines the whitespace processing.
Constructor Summary
Constructors -
Method Summary
Methods
- private TextEvent getSpaceEvent(String space): Returns the space event corresponding to the specified characters.
- private TextEvent getWordEvent(String word): Returns the word event corresponding to the specified characters.
- granurality(): Always TextGranularity.WORD.
- tokenize(CharSequence seq): Returns the list of TextEvent corresponding to the specified character sequence.
-
Field Details
-
recycling
Maps characters to events in order to recycle events as they are created.
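The recycling idea described for this field can be illustrated with a small cache. This is a minimal sketch only: WordToken and RecyclingSketch are hypothetical stand-ins, not DiffX types, and the real field may use a different key or map type. It assumes events are immutable, so one instance can safely be shared by every occurrence of the same characters.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an immutable text event; not the DiffX TextEvent type. */
final class WordToken {
  private final String text;

  WordToken(String text) {
    this.text = text;
  }

  String text() {
    return text;
  }
}

/** Sketch of a recycling cache: one shared token instance per distinct string. */
final class RecyclingSketch {
  // Plain HashMap: like the documented class, this sketch is not synchronized.
  private final Map<String, WordToken> recycling = new HashMap<>();

  WordToken getWordToken(String word) {
    // Create the token the first time the characters are seen,
    // then return the same instance on every later request.
    return recycling.computeIfAbsent(word, WordToken::new);
  }

  int cacheSize() {
    return recycling.size();
  }

  public static void main(String[] args) {
    RecyclingSketch cache = new RecyclingSketch();
    WordToken first = cache.getWordToken("diff");
    WordToken second = cache.getWordToken("diff");
    System.out.println(first == second); // prints "true": the instance was reused
  }
}
```

Reusing instances this way keeps memory use proportional to the number of distinct words rather than the length of the text, at the cost of holding the map for the lifetime of the tokenizer.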
whitespace
Defines the whitespace processing.
-
-
Constructor Details
-
TokenizerByWord
Creates a new tokenizer.
- Parameters:
whitespace - the whitespace processing for this tokenizer.
- Throws:
NullPointerException - if the whitespace processing is not specified.
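The null check described for the constructor could be written as follows. This is a sketch under assumptions: SpaceMode and NullCheckSketch are hypothetical stand-ins for WhiteSpaceProcessing and the tokenizer, and the actual constructor may perform the check differently.

```java
import java.util.Objects;

/** Hypothetical stand-in for the WhiteSpaceProcessing setting; values are illustrative. */
enum SpaceMode { KEEP, DROP }

/** Sketch of a constructor that rejects a missing whitespace setting. */
final class NullCheckSketch {
  private final SpaceMode whitespace;

  NullCheckSketch(SpaceMode whitespace) {
    // Fail fast with a NullPointerException instead of failing later during tokenizing.
    this.whitespace = Objects.requireNonNull(whitespace,
        "the white space processing must be specified");
  }

  SpaceMode whitespace() {
    return whitespace;
  }
}
```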
-
-
Method Details
-
tokenize
Returns the list of TextEvent corresponding to the specified character sequence.
- Specified by:
tokenize in interface TextTokenizer
- Parameters:
seq - the character sequence to tokenize.
- Returns:
- the corresponding list.
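To make the word-level splitting concrete, here is a minimal sketch of one way to cut a character sequence into alternating word and whitespace runs. It assumes a word is any run of non-whitespace characters; the real TokenizerByWord may define word boundaries differently, and it also applies its WhiteSpaceProcessing setting, which this sketch ignores. WordSplitSketch and the plain String tokens are stand-ins for illustration, not DiffX types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of word-level tokenizing: words and whitespace runs, in document order. */
final class WordSplitSketch {
  // A token is either a run of whitespace or a run of non-whitespace characters.
  private static final Pattern TOKEN = Pattern.compile("\\s+|\\S+");

  static List<String> split(CharSequence seq) {
    List<String> tokens = new ArrayList<>();
    Matcher m = TOKEN.matcher(seq);
    // Each match consumes one whole run, so nothing in the input is dropped.
    while (m.find()) {
      tokens.add(m.group());
    }
    return tokens;
  }

  public static void main(String[] args) {
    // Print each token in brackets so whitespace runs stay visible.
    for (String token : split("a  quick fox")) {
      System.out.println("[" + token + "]");
    }
  }
}
```

Keeping whitespace runs as tokens of their own means the original text can be rebuilt by concatenating the list, which is what lets a later step decide whether whitespace differences matter.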
-
granurality
Always TextGranularity.WORD. Returns the text granularity of this tokenizer.
- Specified by:
granurality in interface TextTokenizer
- Returns:
- the text granularity of this tokenizer.
-
getWordEvent
Returns the word event corresponding to the specified characters.
- Parameters:
word - the characters of the word
- Returns:
- the corresponding word event
-
getSpaceEvent
Returns the space event corresponding to the specified characters.
- Parameters:
space - the characters of the space
- Returns:
- the corresponding space event
-