This recognizer uses OpenNLP's DocumentCategorizer to load a binary classification model (a sentence either is or is not in a given category) and tags the sentences whose probability of belonging to the category meets a configured minimum threshold.
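
The tagging rule described above can be sketched as follows. This is a minimal illustration in Python, not the recognizer's actual code: the function `tag_sentences`, the tag name `in-category`, and the hard-coded scores are all hypothetical, and the scores stand in for the positive-class probabilities a trained categorizer would return. The 0.95 threshold mirrors the Minimum Probability default listed under Configuration.

```python
from typing import Dict, List, Tuple


def tag_sentences(
    scored: List[Tuple[str, float]],
    min_probability: float = 0.95,
    tag: str = "in-category",  # hypothetical tag name
) -> Dict[str, List[str]]:
    """Tag each sentence whose positive-class probability is >= the threshold."""
    result: Dict[str, List[str]] = {}
    for sentence, probability in scored:
        result[sentence] = [tag] if probability >= min_probability else []
    return result


# Hypothetical scores standing in for a categorizer's output.
scored = [
    ("Please reset my password.", 0.97),
    ("The weather is nice today.", 0.12),
    ("I cannot log in to my account.", 0.95),
]
tags = tag_sentences(scored)
```

A sentence scoring exactly at the threshold is tagged, since the comparison is "greater than or equal to".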

Note

This is a plugin recognizer. It uses the Classification Stage.

Configuration

  • Model
    summary: Model to use for classification. By default the latest model is always used, but a specific model can be configured instead.
    default: --LATEST--
    required: true
  • Minimum Probability
    summary: Probability threshold. Only sentences whose match probability is greater than or equal to this value are tagged.
    type: double
    default: 0.95
    required: true
  • Normalize Tags
    summary: Tags that must be recognized first, before classification and training are performed.
    type: string array

    • This helps reduce noise in the text, for example by reducing every set of numbers to {number}.
  • Positive Sample Ratio
    summary: Fraction of the training data made up of positive samples.
    type: double
    default: 0.5

    • For example, with a Positive Sample Ratio (PSR) of 0.6 and 5000 positive samples, those 5000 samples are 60% of the total training data. The remaining 40% is negative samples, about 3333 of them.
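
The arithmetic behind the Positive Sample Ratio example, and the kind of normalization described for Normalize Tags, can be sketched in Python. Both helpers are hypothetical illustrations: `negative_sample_count` assumes the result is rounded to the nearest whole sample, and `normalize_numbers` uses a regular expression as a stand-in for tags that the pipeline would recognize in earlier stages.

```python
import re


def negative_sample_count(positive_samples: int, positive_ratio: float = 0.5) -> int:
    """Negative samples needed so positives make up `positive_ratio` of the total."""
    if not 0.0 < positive_ratio <= 1.0:
        raise ValueError("positive_ratio must be in (0, 1]")
    total = positive_samples / positive_ratio
    # Assumption: round to the nearest whole sample.
    return round(total - positive_samples)


def normalize_numbers(text: str) -> str:
    """Replace every run of digits with the placeholder {number}."""
    return re.sub(r"\d+", "{number}", text)


negatives = negative_sample_count(5000, 0.6)  # 5000 / 0.6 is about 8333 total
normalized = normalize_numbers("Order 12345 shipped in 3 days")
```

With the default ratio of 0.5, the same helper returns as many negative samples as there are positive ones.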

Training a Model

Click the button to open the "Start Training Run" dialog, then set the following parameters:

  • Datasets
    summary: Select the dataset to use as training data.
    type: boolean
    required: true
  • Iterations
    type: integer
    default: 200
    required: true
  • Cut Off
    type: integer
    default: 5
    required: true
  • Threads
    type: integer
    default: 2
    required: true
  • Feature Selection
    default: BoW
    required: true
    • BoW (Bag of Words)
    • N-Gram
  • Algorithm
    default: MAXENT_QN
    • Available algorithms
      • MAXENT_QN
      • MAXENT
      • NAIVEBAYES
      • PERCEPTRON
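
To make the Feature Selection and Cut Off options more concrete, here is a rough Python sketch of the two feature styles and of a frequency cut-off. It is an illustration under assumptions, not the recognizer's implementation: the function names are hypothetical, tokens are assumed to be pre-split, and Cut Off is assumed to carry its usual OpenNLP meaning of the minimum number of times a feature must be seen in the training data to be kept.

```python
from collections import Counter
from typing import Dict, List


def bag_of_words(tokens: List[str]) -> List[str]:
    """One feature per token; the model sees no word-order information."""
    return list(tokens)


def n_grams(tokens: List[str], n: int = 2) -> List[str]:
    """One feature per run of n consecutive tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def apply_cut_off(samples: List[List[str]], cut_off: int = 5) -> Dict[str, int]:
    """Keep only features seen at least `cut_off` times across all samples."""
    counts = Counter(feature for sample in samples for feature in sample)
    return {feature: count for feature, count in counts.items() if count >= cut_off}


tokens = "reset my password".split()
unigram_features = bag_of_words(tokens)
bigram_features = n_grams(tokens, 2)
kept = apply_cut_off([["reset", "password"], ["reset", "account"], ["reset"]], cut_off=2)
```

N-gram features preserve some local word order at the cost of a larger, sparser feature set, which is one reason a cut-off matters more when N-Gram is selected.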

General Settings

See the Generic Recognizer Config page.