
The sentence classifier stage uses OpenNLP's DocumentCategorizer to load a classification model and tag sentences. The model is a binary classifier (a sentence is or isn't in a certain category), and a sentence is tagged only when its match probability is greater than or equal to a specified threshold.
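
The threshold decision can be sketched as below. This is a minimal illustration in Python, not Saga's or OpenNLP's actual code: `should_tag` is a hypothetical helper, and the scores are made-up stand-ins for the probability a binary model assigns to the "in category" outcome.

```python
def should_tag(in_category_prob: float, prob: float = 0.95) -> bool:
    """Return True when the probability of the positive category
    is greater than or equal to the configured threshold."""
    if not 0.0 <= in_category_prob <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return in_category_prob >= prob


# Hypothetical scores for three sentences (invented for illustration).
scores = {"sentence A": 0.97, "sentence B": 0.95, "sentence C": 0.60}

# Keep only the sentences that meet the default threshold of 0.95.
tagged = [name for name, p in scores.items() if should_tag(p)]
```

A score exactly equal to the threshold still counts as a match, which follows the "better or equal" wording of the prob parameter.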


Operates On: Lexical items with the TOKEN flag, located between vertices flagged TEXT_BLOCK_SPLIT or SENTENCE_SPLIT.

Saga_is_recognizer

Library: saga-classification-trainer-stage

Tip

Models can be trained directly with OpenNLP tools or from the Saga UI. See Classifier Recognizer in the User Manual for more information on how to create a model using Saga.

Note

The stage uses the specified boundaryFlags to split the input text into sentences. All text between two boundaries is considered one sentence and is evaluated separately by the classifier.
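
The boundary-based splitting can be pictured with the following sketch. It is an assumption-laden illustration rather than Saga's implementation: `BOUNDARY`, `split_on_boundaries`, `classify_each`, and `toy_classifier` are hypothetical names, and the toy classifier only stands in for a trained model.

```python
from typing import Callable, Iterable, List

# Hypothetical marker standing in for a vertex that carries a boundary flag.
BOUNDARY = object()


def split_on_boundaries(items: Iterable[object]) -> List[List[str]]:
    """Group tokens into sentences; every BOUNDARY closes the current one."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for item in items:
        if item is BOUNDARY:
            if current:
                sentences.append(current)
                current = []
        else:
            current.append(str(item))
    if current:
        sentences.append(current)
    return sentences


def classify_each(sentences: List[List[str]],
                  classifier: Callable[[List[str]], float],
                  prob: float = 0.95) -> List[List[str]]:
    """Score each sentence on its own and keep those at or above prob."""
    return [s for s in sentences if classifier(s) >= prob]


def toy_classifier(tokens: List[str]) -> float:
    """Toy stand-in for a model: high score only when BIRD is present."""
    return 0.99 if "BIRD" in tokens else 0.10


stream = ["BIRD", "STRIKE", "DURING", "DESCENT", BOUNDARY,
          "DAMAGE", "LIMITED", "TO", "18", "INCH", "DENT", BOUNDARY]
sentences = split_on_boundaries(stream)
matches = classify_each(sentences, toy_classifier)
```

Because each sentence is scored independently, a match in one sentence has no effect on the sentences around it.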


Include Page
Generic Configuration Parameters

Configuration Parameters

  • tagWith (default: match) - If used with automatic pipeline creation, assigns the tag to which the recognizer belongs.
  • prob (type: double, default: 0.95) - Probability threshold. Only sentences whose match probability is greater than or equal to prob are tagged.
  • model - File location of the model.
  • normalize (type: string array) - List of tags used to normalize the text.
    • For example, suppose you want to normalize all the different numbers in the text. You can create a "Numeric" tag using the numeric recognizer, so that each number is normalized to "{Numeric}".
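
The effect of the normalize parameter can be sketched as follows. This is a hypothetical illustration, not Saga's code: `Token` and `normalize_tokens` are invented names, and the assumption is that a token carrying one of the listed tags is replaced by the tag name in braces before the text reaches the classifier.

```python
from typing import List, Optional, Sequence, Tuple

# Each token is (text, tag); tag is None when no recognizer tagged the token.
Token = Tuple[str, Optional[str]]


def normalize_tokens(tokens: Sequence[Token],
                     normalize: Sequence[str]) -> List[str]:
    """Replace tokens whose tag is listed in `normalize` with "{tag}"."""
    wanted = set(normalize)
    return ["{" + tag + "}" if tag in wanted else text
            for text, tag in tokens]


tokens = [("DAMAGE", None), ("LIMITED", None), ("TO", None),
          ("18", "Numeric"), ("INCH", None), ("DENT", None)]

normalized = normalize_tokens(tokens, ["Numeric"])
unchanged = normalize_tokens(tokens, [])  # empty list: nothing is replaced
```

With this kind of normalization, "18 INCH DENT" and "24 INCH DENT" look the same to the model, so it does not need to learn each number separately.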


Saga_config_stage
boundaryFlags: text block split, sentence split
stage: ClassificationStage
requiredFlags: token
"tagWith": "NAME-OF-OUTPUT-TAG",
"prob": "0.95", 
"model": ".\model-file.bin",
"normalize": []


Example Output

In this example, a sentence breaker stage was configured before the classifier stage, and the tagWith value is animal-incident.

Saga_graph
V-------------------------------------[BIRD STRIKE DURING DESCENT. DAMAGE LIMITED TO 18 INCH DENT BOTTOM OUTBOARD SIDE OF ENGINE COWLING.]--------------------------------------V 
^-----------[BIRD STRIKE DURING DESCENT]-----------V------------------------[DAMAGE LIMITED TO 18 INCH DENT BOTTOM OUTBOARD SIDE OF ENGINE COWLING]------------------------V-[]-^ 
^---[BIRD]---V---[STRIKE]---V-[DURING]-V-[DESCENT]-^--[DAMAGE]--V-[LIMITED]-V-[TO]-V-[18]-V-[INCH]-V-[DENT]-V-[BOTTOM]-V-[OUTBOARD]-V-[SIDE]-V-[OF]-V-[ENGINE]-V-[COWLING]-^      
^---[bird]---^---[strike]---^-[during]-^-[descent]-^--[damage]--^-[limited]-^-[to]-^      ^-[inch]-^-[dent]-^-[bottom]-^-[outboard]-^-[side]-^-[of]-^-[engine]-^-[cowling]-^      
^--[{bird}]--^-[{physical}]-^--[dure]--^           ^-[{damage}]-^--[limit]--^                                                                                                     
^-[{animal}]-^--[{damage}]--^                                                                                                                                                     
^---------------[{animal-incident}]----------------^  

Output Flags

Lex-Item Flags

  • CLASSIFICATION - Flags the sentence that matched the model's criteria.
  • SEMANTIC_TAG - Flags the sentence that matched as a semantic tag.

Vertex Flags

Info

No vertices are created in this stage.