These stages are contained in the Saga Core library and are available at all times.
Text Block Readers
Readers read text streams and create text blocks to process.
- SimpleReader
Text Block Breakers
Breakers read text blocks and break them into individual text blocks.
- QuotationBreaker - Breaks TEXT_BLOCK tokens into other TEXT_BLOCK tokens, separating non-quoted text from quoted text. This breaker respects the grammatical rules of quotes.
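The quoted/non-quoted separation described above can be illustrated with a minimal Python sketch. This is not Saga's implementation or API: the function name `break_on_quotes` and the `plain`/`quoted` labels are hypothetical, and the sketch only handles straight double quotes, without the grammatical rules the real breaker respects.

```python
import re

def break_on_quotes(text):
    """Split text into (kind, segment) pairs; kind is 'plain' or 'quoted'.

    Illustrative only: handles straight double quotes and ignores
    nested or unbalanced quotation marks.
    """
    blocks = []
    pos = 0
    for match in re.finditer(r'"([^"]*)"', text):
        # Text between the previous quote and this one is a plain block.
        plain = text[pos:match.start()].strip()
        if plain:
            blocks.append(("plain", plain))
        blocks.append(("quoted", match.group(1)))
        pos = match.end()
    # Anything after the last quote is a trailing plain block.
    tail = text[pos:].strip()
    if tail:
        blocks.append(("plain", tail))
    return blocks
```

Each returned pair stands in for one of the smaller text blocks that a breaker would emit for later stages.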
Tokenizers
Tokenizers read text blocks and divide them up into individual tokens to be processed.
- WhitespaceTokenizer
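As a rough illustration of dividing a text block into tokens on whitespace, the following Python sketch returns each token with its character offsets. The function name `whitespace_tokenize` and the tuple layout are assumptions made for this example; they are not taken from the Saga library.

```python
import re

def whitespace_tokenize(text_block):
    """Return (token, start, end) for every run of non-whitespace characters."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\S+", text_block)]
```

Keeping the offsets lets later stages refer back to the position of each token in the original text block.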
Splitters
Splitters split up tokens into multiple smaller tokens as an alternative interpretation.
- CharacterSplitter
- CharChangeSplitter
Normalizers
Normalizers create alternative normalized interpretations of tokens from original tokens.
- CaseAnalysis
Recognizers
Recognizers identify and flag tokens based on their character patterns.
- NumberRecognizer
- StopWords
- Lemmatize - Matches tokens to words in a dictionary, then creates new LexItems for the token's lemma and/or synonyms, if configured.
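The Lemmatize behavior described above (dictionary lookup, then extra interpretations for the lemma and optionally the synonyms) can be sketched in Python as follows. The dictionary layout, the `include_synonyms` flag, and the function name are hypothetical and chosen only for illustration; Saga's actual dictionary format and configuration options may differ.

```python
def lemmatize(tokens, dictionary, include_synonyms=False):
    """Pair each token with alternative interpretations from a dictionary.

    `dictionary` maps a lowercase word to {"lemma": str, "synonyms": [str]}.
    This structure is an assumption made for the example.
    """
    interpretations = []
    for token in tokens:
        alternatives = []
        entry = dictionary.get(token.lower())
        if entry is not None:
            # The lemma is always added when the word is known.
            alternatives.append(entry["lemma"])
            if include_synonyms:
                alternatives.extend(entry.get("synonyms", []))
        interpretations.append((token, alternatives))
    return interpretations
```

Tokens that are not in the dictionary keep an empty list of alternatives, so the original token remains the only interpretation.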
Taggers
Taggers create semantic tags which are added to the interpretation graph as alternative interpretations.
- RegexPattern - Looks up matches to regular expressions from a dictionary across multiple tokens, then tags each match with one or more semantic tags as an alternative representation. For a simple regular expression that only needs to match a single token, the Simple Regex Stage is recommended.
- DictionaryTagger
- AdvancedPattern
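To illustrate the idea behind the RegexPattern entry, matching a regular expression across several tokens and tagging the matched span, here is a minimal Python sketch. It is not the Saga stage: the function name `tag_patterns`, the pattern dictionary layout, and the choice to join tokens with single spaces are all assumptions for this example.

```python
import re

def tag_patterns(tokens, patterns):
    """Tag token spans whose space-joined text matches a regular expression.

    `patterns` maps a regex string to a list of tags (hypothetical layout).
    Returns (first_token_index, last_token_index, tag) tuples.
    Assumes every token is a non-empty string.
    """
    # Character offsets of each token in the space-joined text.
    starts, pos = [], 0
    for tok in tokens:
        starts.append(pos)
        pos += len(tok) + 1
    ends = [s + len(t) for s, t in zip(starts, tokens)]
    text = " ".join(tokens)

    tagged = []
    for pattern, tags in patterns.items():
        for m in re.finditer(pattern, text):
            # Keep only matches that begin and end on token boundaries.
            if m.start() in starts and m.end() in ends:
                first = starts.index(m.start())
                last = ends.index(m.end())
                tagged.extend((first, last, tag) for tag in tags)
    return tagged
```

For example, `tag_patterns(["call", "555", "1234", "now"], {r"\d{3} \d{4}": ["phone"]})` tags the span covering the second and third tokens, leaving the original tokens in place as the primary interpretation.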