This stage reviews tokens using Elasticsearch's suggestion functionality and creates a new token with a "suggestion" for each word it does not recognize. The process takes all the available tokens for the stage (usually already tokenized by the "WhitespaceTokenizerStage"), using the highest confidence route. Flags such as "STOP_WORD" or "ALL_UPPER_CASE" can be used as filters by including them in the "Skip Flags" list, so that tokens carrying those flags are skipped.
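The filtering and lookup described above can be sketched roughly as follows. This is an illustration only, not the stage's actual implementation: the `Token` class, the helper names, and the `word` field name are hypothetical, and the request body assumes Elasticsearch's term suggester, which this page does not confirm as the suggester in use.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical stand-in for a pipeline token: its text plus its flags."""
    text: str
    flags: set = field(default_factory=set)


def tokens_to_check(tokens, skip_flags):
    """Keep only tokens that carry none of the flags in the Skip Flags list."""
    skip = set(skip_flags)
    return [t for t in tokens if not (t.flags & skip)]


def build_suggest_body(tokens, field_name="word"):
    """Build a term-suggester request body with one named entry per token.

    field_name is an assumption; the real dictionary index may use another field.
    """
    return {
        "suggest": {
            f"s{i}": {"text": t.text, "term": {"field": field_name}}
            for i, t in enumerate(tokens)
        }
    }


tokens = [
    Token("teh"),
    Token("the", {"STOP_WORD"}),
    Token("NASA", {"ALL_UPPER_CASE"}),
]
kept = tokens_to_check(tokens, ["STOP_WORD", "ALL_UPPER_CASE"])
body = build_suggest_body(kept)
```

In this sketch only "teh" reaches the suggester; the other two tokens are dropped because they carry flags listed in "Skip Flags". A body like this would typically be sent to the `_search` endpoint of the dictionary index.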

Uses Spellchecker Stage

Configuration

  • Load from Dataset - Load dictionary from pre-loaded dictionaries.
  • Load from File - Load a dictionary from your local machine.
  • Delete Dictionary - Delete a dictionary.

General Settings

See Generic Processor Config.

Elasticsearch Connection Settings

  • Schema - Schema used by the Elasticsearch connection. Default: http
  • Port (integer) - Elasticsearch connection port. Default: 9200
  • Host - Host used by the Elasticsearch connection. Default: localhost
  • Index Name - Index used by the stage to store dictionary data. Default: saga_spellcheck_dictionary
    • This is an Elasticsearch index.
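
As a rough illustration of how these four settings combine, the sketch below assembles the address of the dictionary index from the defaults listed above. The `ElasticsearchSettings` class and its method names are hypothetical and are not part of the stage's configuration API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElasticsearchSettings:
    """Hypothetical container mirroring the documented defaults."""
    schema: str = "http"
    host: str = "localhost"
    port: int = 9200
    index_name: str = "saga_spellcheck_dictionary"

    def base_url(self) -> str:
        # Schema, host and port identify the Elasticsearch node.
        return f"{self.schema}://{self.host}:{self.port}"

    def index_url(self) -> str:
        # The index name is appended as the first path segment.
        return f"{self.base_url()}/{self.index_name}"


defaults = ElasticsearchSettings()
secured = ElasticsearchSettings(schema="https", host="es.example.internal")
```

With the defaults, `index_url()` resolves to `http://localhost:9200/saga_spellcheck_dictionary`. The `es.example.internal` host is a placeholder, not a real endpoint.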