...
Operates On: Lexical Items with TEXT_BLOCK flag.
Library: saga-lang-detector-stage
Tip |
---|
It can detect 103 languages and outputs ISO 639-3 language codes (see https://opennlp.apache.org/news/model-langdetect-183.html). |
Note |
---|
The model works better on longer texts of at least two sentences, so configure this stage early in the pipeline, before the text is tokenized. |
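
The ordering advice in the note can be illustrated with a small sketch. This is not the stage's implementation and does not use the OpenNLP model: `detect` is a hypothetical toy detector that counts matches against tiny hand-picked stopword lists, used only to show why detection on a whole text block has more evidence than detection on individual tokens.

```python
# Toy stand-in for a language detector. This is NOT the OpenNLP model used by
# the stage; the stopword lists below are hypothetical, hand-picked data.
STOPWORDS = {
    "eng": {"the", "is", "and", "of", "to", "in", "it", "a"},
    "spa": {"el", "la", "es", "y", "de", "en", "que", "un", "a"},
}


def detect(text):
    """Return (ISO 639-3 code, hit count) for the best match, or (None, 0)."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    top = scores[best]
    # Report no language when nothing matched or the top score is tied.
    if top == 0 or list(scores.values()).count(top) > 1:
        return None, 0
    return best, top


block = "The stage is placed early in the pipeline. It reads the whole block of text."

# Detection on the full text block: many words contribute evidence.
whole = detect(block)

# Detection after tokenizing: each token is judged alone, and most carry no signal.
per_token = [detect(token)[0] for token in block.split()]
undecided = per_token.count(None)

print(whole)                                      # best language for the whole block
print(undecided, "of", len(per_token), "tokens undecided")
```

In this sketch the whole block yields a single confident answer, while more than half of the individual tokens yield none, and a word shared by both lists (such as "a") is ambiguous on its own. The real model is statistical rather than list-based, but the same reasoning motivates running the stage on TEXT_BLOCK items before tokenization.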
...
Include Page Generic Configuration Parameters Generic Configuration Parameters
...