This stage flags tokens that match stop words. Flagged tokens are skipped in subsequent stages if the configuration indicates so.
...
Operates On: Lexical Items with TOKEN and possibly other flags as specified below.
Recognizer: false
...
caseInsensitive (type: boolean, default: true): If true, all stop words and tokens are processed as case insensitive.
stopWords: The resource containing the list of stop words, or the list of stop words given directly.
Info | ||
---|---|---|
| ||
a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with |
Code Block | ||||
---|---|---|---|---|
|
"caseInsensitive" : true,
"stopWords" : "words-provider:stop_words" |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
"caseInsensitive" : true,
"stopWords" : ["a", "about", "above", "after", "again", "all",
"am", "an", "and", "the", "i", "who", ...] |
Code Block | ||
---|---|---|
|
V--------------[A test to be skipped]--------------V
^--[A]--V--[test]--V--[to]--V--[be]--V--[skipped]--^
^--[a]--^
Item [A] - [TOKEN, STOP_WORD]
Item [to] - [TOKEN, STOP_WORD]
Item [be] - [TOKEN, STOP_WORD]
Item [a] - [TOKEN, STOP_WORD] |
...
...
...
Info |
---|
No vertices are created in this stage |
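The flagging behavior in the example can be approximated with a short Python sketch. This is not the stage's actual implementation: the function name `flag_stop_words`, the tuple-based token representation, and the plain whitespace split are assumptions made for illustration only. It shows how the `caseInsensitive` setting changes which tokens receive the STOP_WORD flag, and it does not model the lower-cased [a] item shown in the example.

```python
from typing import Iterable, List, Set, Tuple

# Flag names taken from the example output; everything else is hypothetical.
TOKEN = "TOKEN"
STOP_WORD = "STOP_WORD"


def flag_stop_words(
    tokens: Iterable[str],
    stop_words: Iterable[str],
    case_insensitive: bool = True,
) -> List[Tuple[str, List[str]]]:
    """Return (token, flags) pairs, adding STOP_WORD to tokens found in stop_words."""

    def normalize(word: str) -> str:
        # With case-insensitive matching, both sides are compared in lower case.
        return word.lower() if case_insensitive else word

    lookup: Set[str] = {normalize(w) for w in stop_words}
    flagged: List[Tuple[str, List[str]]] = []
    for token in tokens:
        flags = [TOKEN]
        if normalize(token) in lookup:
            flags.append(STOP_WORD)
        flagged.append((token, flags))
    return flagged


# Assumed tokenization: a plain whitespace split of the example sentence.
result = flag_stop_words("A test to be skipped".split(), ["a", "to", "be"])
for token, flags in result:
    print(f"Item [{token}] - [{', '.join(flags)}]")
```

A later stage that honors the flag could simply ignore every pair whose flag list contains `STOP_WORD`.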
The resource data is a JSON file with an array of words in a field named stopWords.
Code Block | ||
---|---|---|
| ||
"stopWords": ["a", "about", "above", "after", "again", "all", "am", "an", "and", "the", "i", "who", ...] |
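As a rough illustration of reading such a resource, the sketch below parses a JSON document and builds a lookup set from its `stopWords` field. It assumes the field sits inside a top-level JSON object (the fragment above shows only the field itself), and the helper name `parse_stop_words` is hypothetical, not part of the product.

```python
import json
from typing import FrozenSet


def parse_stop_words(resource_text: str, case_insensitive: bool = True) -> FrozenSet[str]:
    """Parse a JSON resource and return the words in its "stopWords" array."""
    data = json.loads(resource_text)
    # Assumption: the array is a field of a top-level JSON object.
    words = data["stopWords"]
    if not isinstance(words, list) or not all(isinstance(w, str) for w in words):
        raise ValueError('"stopWords" must be an array of strings')
    # Lower-case the entries up front when matching is case insensitive.
    return frozenset(w.lower() if case_insensitive else w for w in words)


# A small sample resource written inline for illustration.
sample = '{"stopWords": ["a", "About", "the"]}'
stop_words = parse_stop_words(sample)
print(sorted(stop_words))
```

Normalizing the list once at load time keeps the per-token check down to a single set lookup.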