...

Operates On: Lexical Items with the TEXT_BLOCK flag.

Library: saga-lang-detector-stage

Tip

The stage can detect 103 languages and outputs ISO 639-3 language codes (https://opennlp.apache.org/news/model-langdetect-183.html).

Note

The model works better with longer texts of at least two sentences. Configure this stage early in the pipeline, before the text is tokenized.
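The ordering advice above can be illustrated with a minimal Python sketch. It does not use the stage or the OpenNLP model: the stopword lists, detect_language, tokenize and run_pipeline are hypothetical stand-ins, and the stopword-overlap scoring is a toy substitute for the real detector. It only shows why detection should see the whole text block before tokenization: a single token carries too little evidence, while two sentences give a clear signal.

```python
import re

# Hypothetical, tiny stopword lists keyed by ISO 639-3 code.
# A toy stand-in for the real language model, for illustration only.
STOPWORDS = {
    "eng": {"the", "is", "and", "of", "to", "in", "it", "this"},
    "spa": {"el", "la", "es", "y", "de", "en", "que", "los"},
    "fra": {"le", "la", "est", "et", "de", "en", "que", "les"},
}


def tokenize(text):
    """Lowercase word tokenizer (hypothetical stand-in for a tokenizer stage)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def detect_language(text):
    """Return (code, score) for the best-matching language.

    score is the share of tokens found in that language's stopword list.
    Returns ("und", 0.0) when there is no evidence or the top two languages tie.
    """
    tokens = tokenize(text)
    if not tokens:
        return "und", 0.0
    hits = {code: sum(t in words for t in tokens) for code, words in STOPWORDS.items()}
    ranked = sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
    best_code, best_hits = ranked[0]
    if best_hits == 0 or best_hits == ranked[1][1]:
        return "und", 0.0
    return best_code, best_hits / len(tokens)


def run_pipeline(text_block):
    """Detect the language on the whole block first, then tokenize."""
    lang, _ = detect_language(text_block)
    return [(token, lang) for token in tokenize(text_block)]


if __name__ == "__main__":
    block = "La casa es grande y el jardín es pequeño. Los niños juegan en la calle."
    print(detect_language(block))  # whole block: enough evidence to decide
    print(detect_language("la"))   # single token: ambiguous between languages
    print(run_pipeline(block)[:3])
```

In this sketch the token "la" alone matches both the Spanish and the French list, so the detector reports "und" (undetermined), whereas the two-sentence block resolves cleanly. Running detection after tokenization would put every token in that single-word situation.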

...


Generic Configuration Parameters

...