The Name Predictor stage uses OpenNLP's NameFinder to load Named Entity Recognition (NER) models. It tags tokens that match entities known to the model, provided the match meets a configured probability threshold.
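
The thresholding step can be illustrated with a minimal sketch. It is not the stage's actual implementation: the `CandidateSpan` structure, the `filter_by_threshold` helper, and the sample probabilities are all hypothetical, and the 0.7 default is taken from the Minimum Probability parameter described under Configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSpan:
    """A candidate entity proposed by a model (hypothetical structure)."""
    start: int          # index of the first token in the span
    end: int            # index one past the last token in the span
    label: str          # entity type reported by the model
    probability: float  # model confidence for this span


def filter_by_threshold(spans, minimum_probability=0.7):
    """Keep only spans whose probability is >= the threshold."""
    return [s for s in spans if s.probability >= minimum_probability]


tokens = ["Maria", "Lopez", "visited", "Paris"]
# Illustrative values only; a real model would supply these.
candidates = [
    CandidateSpan(0, 2, "person", 0.91),
    CandidateSpan(3, 4, "location", 0.55),
]

for span in filter_by_threshold(candidates):
    print(span.label, " ".join(tokens[span.start:span.end]))
```

With these sample values, only the first candidate passes the default threshold, so only that span would be tagged.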

Note

This is a plugin recognizer. It uses the Name Predictor Stage.

Configuration

  • Model (required; default: --LATEST--)
    Model to use for recognition. By default the latest model is used, but the recognizer can be configured to use a specific one.
  • Minimum Probability (double; required; default: .7)
    Probability threshold. Only names whose match probability is greater than or equal to this value are tagged.
  • Normalize Tags (string array)
    Tags that must be recognized first, before recognition and training are performed.

    • This helps to reduce noise in the text, for example by reducing every set of numbers to {number}.
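
The idea behind Normalize Tags can be sketched as follows. This is an illustration only: in the product the tags would come from other recognizers, whereas here a regular expression stands in for a number recognizer, and `normalize_tokens` and `NUMBER_PATTERN` are hypothetical names.

```python
import re

# Assumption: a "number" is a run of digits, optionally with
# "." or "," separated digit groups (e.g. 42, 3.14, 1,000.50).
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*")


def normalize_tokens(tokens, placeholder="{number}"):
    """Replace every token that is entirely a number with a placeholder."""
    return [placeholder if NUMBER_PATTERN.fullmatch(t) else t for t in tokens]


print(normalize_tokens(["Invoice", "4821", "was", "paid", "on", "12", "May"]))
```

After normalization, "4821" and "12" both appear as the same placeholder, so the model sees a single recurring pattern instead of many distinct numeric tokens.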

Training a Model

Click the button to open the "Start Training Run" dialog.

  • Datasets (boolean; required)
    Select the dataset to use as training data.
  • Iterations (integer; required; default: 200)
  • Cut Off (integer; required; default: 5)
  • Threads (integer; required; default: 2)
  • Error Threshold (integer; required; default: 10)
    Percentage of failed files tolerated during model training.
  • Algorithm (default: MAXENT_QN)
    • Available algorithms
      • MAXENT_QN
      • MAXENT
      • NAIVEBAYES
      • PERCEPTRON
  • LLThreshold (double; required; default: 0.00001)
  • Smoothing (boolean; default: false)
  • Smoothing Observation (double; default: 0.00001)
  • Gaussian Smoothing (boolean; default: false)
  • Gaussian Smoothing Sigma (integer; default: 2)
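
For reference, the training defaults above can be collected into one structure, as in the sketch below. The `TrainingRunSettings` class, its field names, and both helper functions are hypothetical and not part of the product. The error-threshold check also assumes one particular reading of that parameter: a run counts as failed once the percentage of failed files rises above the threshold.

```python
from dataclasses import dataclass

ALGORITHMS = ("MAXENT_QN", "MAXENT", "NAIVEBAYES", "PERCEPTRON")


@dataclass
class TrainingRunSettings:
    """Defaults copied from the parameter list; the class is hypothetical."""
    iterations: int = 200
    cut_off: int = 5
    threads: int = 2
    error_threshold: int = 10  # percent of failed files
    algorithm: str = "MAXENT_QN"
    ll_threshold: float = 0.00001
    smoothing: bool = False
    smoothing_observation: float = 0.00001
    gaussian_smoothing: bool = False
    gaussian_smoothing_sigma: int = 2

    def validate(self):
        """Reject values outside the ranges assumed in this sketch."""
        if self.algorithm not in ALGORITHMS:
            raise ValueError(f"unknown algorithm: {self.algorithm}")
        if self.iterations < 1 or self.threads < 1:
            raise ValueError("iterations and threads must be at least 1")
        if not 0 <= self.error_threshold <= 100:
            raise ValueError("error threshold must be between 0 and 100")


def exceeds_error_threshold(failed_files, total_files, settings):
    """Assumed reading: True when the failed percentage is above the threshold."""
    if total_files == 0:
        return False
    return 100.0 * failed_files / total_files > settings.error_threshold


settings = TrainingRunSettings()
settings.validate()
# 3 of 20 files failed, i.e. 15 percent, compared with the default of 10.
print(exceeds_error_threshold(3, 20, settings))
```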

General Settings

Generic Recognizer Config