...
Readers read text streams and create text blocks to process.
- SimpleReader
Tokenizers
Tokenizers read text blocks and divide them into individual tokens for processing.
- WhitespaceTokenizer - Divides text blocks into tokens based on whitespace.
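As an illustration of whitespace tokenization, here is a minimal Python sketch. It is not the WhitespaceTokenizer stage's actual implementation; the function name `whitespace_tokenize` and the (text, start, end) token shape are assumptions made for this example.

```python
import re


def whitespace_tokenize(text_block):
    """Split a text block into tokens on runs of whitespace.

    Returns (token_text, start_offset, end_offset) triples so that each
    token can be traced back to its position in the original block.
    The triple layout is a hypothetical choice for this sketch.
    """
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\S+", text_block)]


# Consecutive spaces and tabs are treated as a single separator.
tokens = whitespace_tokenize("Hello  big\tworld")
print(tokens)
```

Keeping the character offsets is one possible design choice: later stages can then attach flags or tags to a span of the original text rather than to a detached string.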
Splitters
Splitters split tokens into multiple smaller tokens, which are added as an alternative interpretation.
Normalizers
Normalizers create alternative normalized interpretations of tokens from original tokens.
- CaseAnalysis - Analyzes and flags the case of each token and then (optionally) normalizes the token to lowercase.
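The case analysis described above can be sketched in Python as follows. This is an illustrative approximation only: the flag names (`ALL_UPPER`, `ALL_LOWER`, `TITLE`, `MIXED`, `NO_CASE`), the function name `analyze_case`, and the `normalize` parameter are hypothetical and are not taken from the CaseAnalysis stage itself.

```python
def analyze_case(token, normalize=True):
    """Flag the case of a token and optionally produce a lowercase form.

    Returns (flag, normalized), where normalized is the lowercase
    alternative, or None when normalization is disabled or would not
    change the token. Flag names are hypothetical.
    """
    if token.isupper():
        flag = "ALL_UPPER"
    elif token.islower():
        flag = "ALL_LOWER"
    elif token.istitle():
        flag = "TITLE"
    elif any(ch.isalpha() for ch in token):
        flag = "MIXED"
    else:
        flag = "NO_CASE"  # digits, punctuation, or empty text

    lowered = token.lower()
    normalized = lowered if normalize and lowered != token else None
    return flag, normalized


for word in ["NASA", "paris", "Paris", "iPhone", "123"]:
    print(word, analyze_case(word))
```

Returning the lowercase form as a separate value, rather than overwriting the token, mirrors the idea that a normalizer adds an alternative interpretation while the original token is kept.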
Recognizers
Recognizers identify and flag tokens based on their character patterns.
- NumberRecognizer - Identifies tokens that look like numbers and flags them with the "NUMBER" flag.
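A pattern-based recognizer of this kind can be sketched in Python with a regular expression. The pattern below (optional sign, digits, optional decimal part) is an assumption about what "looks like a number"; the NumberRecognizer stage may accept other forms. The function name `recognize_number` is hypothetical; only the "NUMBER" flag comes from the description above.

```python
import re

# Assumed notion of "number-like": optional sign, then either
# digits with an optional fraction, or a bare fraction such as ".5".
NUMBER_PATTERN = re.compile(r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)")


def recognize_number(token):
    """Return the set of flags for a token: {"NUMBER"} if the whole
    token matches the number pattern, otherwise an empty set."""
    return {"NUMBER"} if NUMBER_PATTERN.fullmatch(token) else set()


for tok in ["42", "-3.14", ".5", "12a", "abc"]:
    print(tok, recognize_number(tok))
```

Using `fullmatch` rather than `search` ensures that a token such as `12a` is not flagged merely because it contains digits.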
Taggers
Taggers create semantic tags which are added to the interpretation graph as alternative interpretations.
- RegexPattern
- DictionaryTagger - Looks up all combinations of tokens in a dictionary and tags any that are found.
- AdvancedPattern
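To illustrate the dictionary lookup performed by a tagger such as DictionaryTagger, here is a minimal Python sketch. It assumes that "all combinations of tokens" means every contiguous run of tokens up to a maximum length, and that matching is case-insensitive; both are assumptions for this example. The function name `tag_with_dictionary`, the `max_len` parameter, and the sample dictionary are hypothetical.

```python
def tag_with_dictionary(tokens, dictionary, max_len=4):
    """Look up every contiguous token span (up to max_len tokens)
    in the dictionary and return a tag for each span that is found.

    Returns (start_index, end_index, tag) triples, with end_index
    exclusive. Overlapping matches are all kept, so that they can
    coexist as alternative interpretations.
    """
    matches = []
    for start in range(len(tokens)):
        last = min(start + max_len, len(tokens))
        for end in range(start + 1, last + 1):
            phrase = " ".join(tokens[start:end]).lower()
            tag = dictionary.get(phrase)
            if tag is not None:
                matches.append((start, end, tag))
    return matches


# Hypothetical dictionary mapping phrases to semantic tags.
dictionary = {
    "new york": "city",
    "york": "city",
    "new york times": "newspaper",
}
tokens = ["the", "New", "York", "Times"]
print(tag_with_dictionary(tokens, dictionary))
```

The sketch deliberately keeps overlapping matches ("New York", "York", and "New York Times") instead of picking one, leaving the choice between competing interpretations to later stages.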