Flags tokens that match stop words so that subsequent stages skip them.

Operates On: Lexical Items with TOKEN

...

caseInsensitive (string, optional) - If true, all stop words and tokens are processed as case insensitive (default = true).
stopWords (string, optional) - The resource containing the list of stop words.
- See below for the format. If no resource is provided, the stage uses the default list of stop words.

...

Example Configuration

{
  "type": "StopWords",
  "caseInsensitive" : true,
  "stopWords" : "words-provider:stop_words"
}
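
As a rough illustration of how the two optional parameters combine with the documented default, the following Python sketch reads a configuration like the one above. The function name `read_stop_words_config` and the shape of the returned dictionary are hypothetical and not part of the stage's API; using `None` to mean "fall back to the default stop word list" is also an assumption made for the sketch.

```python
import json


def read_stop_words_config(raw: str) -> dict:
    """Parse a stage configuration and fill in the documented default."""
    config = json.loads(raw)
    if config.get("type") != "StopWords":
        raise ValueError("not a StopWords stage configuration")
    return {
        # Documented default: matching is case insensitive unless disabled.
        "caseInsensitive": config.get("caseInsensitive", True),
        # None stands for "use the default stop word list" (assumption).
        "stopWords": config.get("stopWords"),
    }


# caseInsensitive is omitted here, so the default (true) applies.
example = '{"type": "StopWords", "stopWords": "words-provider:stop_words"}'
settings = read_stop_words_config(example)
```
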

Example Output

V--------------[A test to be skipped]--------------V  
  ^--[A]--V--[test]--V--[to]--V--[be]--V--[skipped]--^  
  ^--[a]--^  

Item [A] - [TOKEN, SKIP]
Item [to] - [TOKEN, SKIP]
Item [be] - [TOKEN, SKIP]
Item [a] - [TOKEN, SKIP]
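
The flagging shown above can be approximated with a short Python sketch. It is not the stage's implementation: `flag_stop_words` is a hypothetical helper, the three-word stop list is chosen only to match the example, and the sketch does not model the separate lower-cased [a] item that appears in the output above.

```python
def flag_stop_words(tokens, stop_words, case_insensitive=True):
    """Return (token, flags) pairs; tokens matching a stop word also get SKIP."""
    if case_insensitive:
        lookup = {word.lower() for word in stop_words}
    else:
        lookup = set(stop_words)

    flagged = []
    for token in tokens:
        key = token.lower() if case_insensitive else token
        flags = ["TOKEN"]
        if key in lookup:
            flags.append("SKIP")
        flagged.append((token, flags))
    return flagged


# Tokens of the sentence used in the example output.
result = flag_stop_words(["A", "test", "to", "be", "skipped"], ["a", "to", "be"])
```

With case-insensitive matching, "A", "to", and "be" carry both TOKEN and SKIP, while "test" and "skipped" keep only TOKEN. Passing `case_insensitive=False` would leave "A" unflagged, because the list contains only the lower-case "a".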

Output Flags

Lex-Item Flags

...

SKIP - All matched stop words will be marked as SKIP.

Resource Data

The resource data is a JSON file with an array of words in a field named stopWords.

{
  "stopWords": ["a", "about", "above", "after", "again", "all", "am", "an", "and", "the", "i", "who", ...]
}
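
A minimal Python sketch of reading a resource in this format follows. The helper name `load_stop_words` is hypothetical, and the validation it performs is an assumption about what a consumer might reasonably check, not documented behavior of the stage. The sample resource is shortened to four words from the list above.

```python
import json


def load_stop_words(resource_text: str) -> list:
    """Parse resource data and return the array stored under "stopWords"."""
    data = json.loads(resource_text)
    words = data.get("stopWords") if isinstance(data, dict) else None
    # Reject a missing or misspelled field name instead of silently using an empty list.
    if not isinstance(words, list) or not all(isinstance(w, str) for w in words):
        raise ValueError('expected a "stopWords" field holding an array of strings')
    return words


resource = '{"stopWords": ["a", "about", "above", "after"]}'
words = load_stop_words(resource)
```
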
