...
Library: saga-lang-detector-stage
Saga_is_recognizer Recognizer false
Tip |
---|
The stage can detect 103 languages and outputs ISO 639-3 language codes (see https://opennlp.apache.org/news/model-langdetect-183.html). |
...
No configuration parameters are needed.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
"atLeastOneFlag": [],
"boundaryFlags": ["SENTENCE_SPLIT", "TEXT_BLOCK_SPLIT"],
"confidenceAdjustment": 1,
"debug": false,
"langModel": "langdetect-183.bin",
"dontProcessFlags": [],
"requiredFlags": [],
"skipFlags": [] | ||||||
Code Block | ||||||
| ||||||
{
  "type": "LangDetectorStage"
} |
As you can see, the first sentence is tagged with "LANG_ENG" and the second sentence with "LANG_SPA".
In this case, a sentence breaker stage was configured before the language detector stage. As a result, language identification can occur at the sentence level.
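The sentence-level behavior described above can be sketched in a few lines of Python. This is only an illustration and not the stage's implementation: `detect_language` is a hypothetical stand-in that counts English and Spanish stopwords, whereas the real stage relies on the OpenNLP language model, and `split_sentences` is a naive substitute for a sentence breaker stage. All function names here are assumptions.

```python
import re

# Hypothetical stand-in for the real model: a tiny stopword vote between
# English and Spanish. The actual stage uses the OpenNLP langdetect model.
STOPWORDS = {
    "eng": {"the", "is", "and", "this", "a", "of", "in"},
    "spa": {"el", "la", "es", "y", "esta", "una", "de", "en"},
}


def detect_language(text):
    """Return the ISO 639-3 code with the most stopword hits, or None."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def split_sentences(text):
    """Naive sentence breaker: split after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tag_sentences(text):
    """Pair each sentence with a LANG_??? flag (None if nothing was detected)."""
    tagged = []
    for sentence in split_sentences(text):
        code = detect_language(sentence)
        tagged.append((sentence, f"LANG_{code.upper()}" if code else None))
    return tagged
```

Because detection runs once per sentence, a mixed-language text block can end up with a different flag on each sentence, which would not be possible if the whole block were classified at once.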
LANG_??? - Flags all text blocks where a language was identified.
Note |
---|
Note the '???' at the end of the flag. It is replaced by the three-letter ISO 639-3 language code. For example, if Spanish is detected, the three-letter code is SPA and the flag is "LANG_SPA". |
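The naming rule can be illustrated with a short Python sketch. The helper names `language_flag` and `flag_language` are hypothetical and are not part of the Saga API; the sketch only assumes that the flag is the prefix "LANG_" followed by the upper-cased three-letter code.

```python
import re


def language_flag(iso_code):
    """Build a LANG_??? flag from a three-letter ISO 639-3 code."""
    code = iso_code.strip()
    if not re.fullmatch(r"[A-Za-z]{3}", code):
        raise ValueError(f"expected a three-letter code, got {iso_code!r}")
    return "LANG_" + code.upper()


def flag_language(flag):
    """Recover the lowercase language code from a LANG_??? flag, or None."""
    match = re.fullmatch(r"LANG_([A-Z]{3})", flag)
    return match.group(1).lower() if match else None
```

A downstream consumer could use a helper like `flag_language` to separate language flags from other flags, such as "SENTENCE_SPLIT", that may be present on the same text block.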
Vertex Flags
Info |
---|
No vertices are created in this stage. |