Excerpt |
---|
This stage flags tokens that match stop words. Flagged tokens are skipped in subsequent stages, if the configuration indicates so. |
Operates On: Lexical Items with TOKEN and possibly other flags as specified below.
Recognizer: false
Parameter: caseInsensitive (type: boolean, default: true). If true, all stop words and tokens are processed case-insensitively.
Parameter: stopWords. Either the resource containing the list of stop words, or the list of stop words given directly.
Info | ||
---|---|---|
| ||
a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with |
Saga_config_stage | ||||
---|---|---|---|---|
| ||||
"caseInsensitive" : true, "stopWords" : "words-provider:stop_words" |
Saga_config_stage | ||||
---|---|---|---|---|
| ||||
"caseInsensitive" : true, "stopWords" : ["a", "about", "above", "after", "again", "all", "am", "an", "and", "the", "i", "who", ...] |
Saga_graph | ||||
---|---|---|---|---|
Code Block | ||||
| ||||
V--------------[A test to be skipped]--------------V ^--[A]--V--[test]--V--[to]--V--[be]--V--[skipped]--^ ^--[a]--^ Item [A] - [TOKEN, STOP_WORD ] Item [to] - [TOKEN, STOP_WORD ] Item [be] - [TOKEN, STOP_WORD ] Item [a] - [TOKEN, STOP_WORD ] |
Info |
---|
No vertices are created in this stage |
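The flagging behavior shown in the graph above can be sketched in a few lines of Python. This is only an illustration of the idea, not Saga's implementation: the function name `flag_stop_words`, the tuple representation of items, and the `case_insensitive` argument (mirroring the `caseInsensitive` parameter) are all assumptions made for the example.

```python
def flag_stop_words(tokens, stop_words, case_insensitive=True):
    """Return (token, flags) pairs; tokens found in stop_words gain STOP_WORD.

    Hypothetical helper for illustration only; it does not reflect Saga's API.
    """
    if case_insensitive:
        # Normalize the stop word list once so lookups are cheap.
        lookup = {word.lower() for word in stop_words}
    else:
        lookup = set(stop_words)

    flagged = []
    for token in tokens:
        key = token.lower() if case_insensitive else token
        flags = ["TOKEN"]
        if key in lookup:
            flags.append("STOP_WORD")
        # No new items are created; existing items only receive a flag.
        flagged.append((token, flags))
    return flagged


result = flag_stop_words(["A", "test", "to", "be", "skipped"], ["a", "to", "be"])
for token, flags in result:
    print(f"Item [{token}] - {flags}")
```

With `case_insensitive=False`, the token `A` would no longer match the stop word `a` and would keep only its `TOKEN` flag.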
The resource data is a JSON file containing an array of words in a field named stopWords.
Saga_json |
---|
"stopWords": ["a", "about", "above", "after", "again", "all", "am", "an", "and", "the", "i", "who", ...] |
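As a rough sketch of how such a resource could be read, the following Python snippet parses the JSON and extracts the `stopWords` array. It assumes the field sits in a top-level JSON object; the helper name `load_stop_words` and the sample text are hypothetical and are not part of Saga.

```python
import json


def load_stop_words(resource_text):
    """Parse resource JSON and return the list stored under "stopWords".

    Hypothetical helper; assumes "stopWords" is a field of a top-level object.
    """
    data = json.loads(resource_text)
    words = data["stopWords"]
    # Reject anything that is not a plain array of strings.
    if not isinstance(words, list) or not all(isinstance(w, str) for w in words):
        raise ValueError('"stopWords" must be an array of strings')
    return words


# Shortened sample resource, used here only for illustration.
sample_resource = '{"stopWords": ["a", "about", "above", "after"]}'
stop_words = load_stop_words(sample_resource)
print(stop_words)
```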