Warning |
---|
This is a work in progress; expect things to break while using this stage. |
Excerpt |
---|
This stage reviews tokens using Elasticsearch's suggestion functionality and creates a new token with a "suggestion" for each word it does not recognize. |
The process takes all the tokens available to the stage (usually already tokenized by the "WhitespaceTokenizerStage"), using the highest confidence route. Flags such as "STOP_WORD" or "ALL_UPPER_CASE" can be used as filters by including them in the "Skip Flags" list.
Operates On: Lexical Items with TOKEN and possibly other flags as specified below.
Saga_is_recognizer
Note |
---|
This recognizer requires a dictionary to work, so one must be loaded from either a dataset or a file before the stage is used. Check your Elasticsearch version to ensure it is compatible with this stage. |
Saga_config_stage | ||
---|---|---|
| ||
"index": "saga_spellchecker_dictionary", "schema": "http", "host": "localhost", "port": "9200" |
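As a rough illustration of how the settings above could map onto an Elasticsearch request, the sketch below builds a `_search` URL and a term-suggester body from them. It is only a sketch under assumptions: this page does not say which Elasticsearch suggester the stage uses, and the field name `term` and the helper names `suggest_url` and `suggest_body` are hypothetical. Nothing is sent over the network.

```python
import json

# Connection settings copied from the stage configuration above.
config = {
    "index": "saga_spellchecker_dictionary",
    "schema": "http",
    "host": "localhost",
    "port": "9200",
}


def suggest_url(cfg):
    """Build the _search URL for the dictionary index (hypothetical helper)."""
    return f"{cfg['schema']}://{cfg['host']}:{cfg['port']}/{cfg['index']}/_search"


def suggest_body(text, field="term"):
    """Build an Elasticsearch term-suggester request body.

    ASSUMPTION: "term" is a made-up field name; the real field used by the
    dictionary index is not documented on this page.
    """
    return {"suggest": {"spelling": {"text": text, "term": {"field": field}}}}


url = suggest_url(config)
body = json.dumps(suggest_body("makaroni"))
print(url)
print(body)
```

The body could then be sent as a POST request to the URL with any HTTP client. The stage presumably does the equivalent internally.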
Saga_graph |
---|
V--------------[abraham lincoln likes makaroni and cheese]--------------------V
^--[abraham]--V--[lincoln]--V--[likes]--V--[makaroni]--V--[and]--V--[cheese]--^
                                        ^--[macaroni]--^ |
Info |
---|
No vertices are created in this stage |
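The behaviour shown in the graph can be imitated in a few lines of Python. This is a minimal sketch and not the stage's implementation: `difflib.get_close_matches` from the standard library stands in for the Elasticsearch suggestion lookup, and the `Token` class, the `spellcheck` function, and the 0.8 similarity cutoff are all hypothetical. Reading "Skip Flags" as "tokens carrying any listed flag are not checked" is also an assumption.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches


@dataclass
class Token:
    """Hypothetical stand-in for a lexical item: its text plus its flags."""
    text: str
    flags: set = field(default_factory=set)


def spellcheck(tokens, dictionary, skip_flags):
    """Return (original, suggestion) pairs for tokens not in the dictionary."""
    suggestions = []
    for token in tokens:
        if token.flags & skip_flags:
            continue  # token carries at least one skip flag
        if token.text in dictionary:
            continue  # recognized word, nothing to suggest
        # Stand-in for the Elasticsearch lookup: closest dictionary term.
        matches = get_close_matches(token.text, dictionary, n=1, cutoff=0.8)
        if matches:
            suggestions.append((token.text, matches[0]))
    return suggestions


dictionary = ["abraham", "lincoln", "likes", "macaroni", "and", "cheese"]
tokens = [
    Token("abraham", {"TOKEN"}),
    Token("lincoln", {"TOKEN"}),
    Token("likes", {"TOKEN"}),
    Token("makaroni", {"TOKEN"}),
    Token("and", {"TOKEN", "STOP_WORD"}),
    Token("cheese", {"TOKEN"}),
]
skip_flags = {"STOP_WORD", "ALL_UPPER_CASE"}

print(spellcheck(tokens, dictionary, skip_flags))
# prints [('makaroni', 'macaroni')]
```

Each returned pair corresponds to one added token in the graph: the original `makaroni` token is kept, and `macaroni` is added as an alternative between the same two vertices.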
The data used by the dictionary may come from two sources: a dataset or a plain text file.
Both options are accessed through Saga Server or the service's endpoints. To create a dictionary from a dataset, select the dataset you are interested in and the pipeline that will process it; the pipeline must end with a Spellchecker Stage. To create a dictionary from a file, you only need a plain text file with one term per line.
Code Block | |||
---|---|---|---|
| |||
Saga_json | |||
| |||
abraham
lincoln
likes
macaroni
and
cheese
|
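Before uploading a dictionary file, it can help to check that it really contains one term per line. The sketch below writes the example terms to a temporary file and reads them back. The helper name `load_terms` is hypothetical, and skipping blank lines and trimming whitespace is an assumption about what a tolerant loader might do, not documented Saga behaviour.

```python
import os
import tempfile


def load_terms(path):
    """Read one term per line, ignoring blank lines and surrounding whitespace."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Write the example dictionary to a temporary file, one term per line.
terms = ["abraham", "lincoln", "likes", "macaroni", "and", "cheese"]
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("\n".join(terms) + "\n")
    dictionary_path = tmp.name

# Read the file back and clean up.
loaded = load_terms(dictionary_path)
os.remove(dictionary_path)
print(loaded)
```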