This stage flags the first vertex of a sentence with "SKIP_SENTENCE". A later stage can use this flag to ignore the complete sentence.

The conditions evaluated by the processor are:

  • Sentence length, measured as the number of tokens, not vertices.
  • A list of tags that act as exceptions to the length limit: if one of these tags is found within the sentence, the token count is ignored and the sentence is not flagged (whitelisting).
  • A list of tags that cause the sentence to be flagged if one of them is found in it (blacklisting).

Blacklisting always takes precedence over the other conditions, so any sentence that contains a blacklisted tag is flagged as "SKIP_SENTENCE". Whitelisted tags take precedence over the token limit. The token limit is applied last, only when the sentence contains neither a blacklisted nor a whitelisted tag.
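The precedence order can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the stage's actual implementation; the function name `should_flag` and its parameter names are hypothetical.

```python
def should_flag(token_count, tags, min_tokens=None,
                keep_tags=frozenset(), mark_tags=frozenset()):
    """Return True if the sentence should receive SKIP_SENTENCE.

    Hypothetical helper: names and signature are assumptions made
    for illustration, not part of the product.
    """
    tags = set(tags)
    # 1. Blacklist wins over everything else.
    if tags & set(mark_tags):
        return True
    # 2. Whitelist overrides the length limit.
    if tags & set(keep_tags):
        return False
    # 3. Length limit: flag when the token count is equal to or
    #    less than the configured value (None disables the check).
    return min_tokens is not None and token_count <= min_tokens
```

For instance, under these assumptions a two-token sentence carrying a whitelisted tag is kept even with a limit of three tokens, while the same sentence is flagged as soon as it also carries a blacklisted tag.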

Sentence Filter only flags the initial vertex of the sentence with a "SKIP_SENTENCE" flag; it does not remove the sentence from the interpretation graph.
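As a rough illustration of "flag, do not remove", the sketch below models a sentence as a plain list of vertices. The `Vertex` class and both helper functions are hypothetical simplifications and do not reflect the real interpretation graph API.

```python
from dataclasses import dataclass, field

SKIP_SENTENCE = "SKIP_SENTENCE"


@dataclass
class Vertex:
    # Hypothetical stand-in for a graph vertex: some text plus a set of flags.
    text: str
    flags: set = field(default_factory=set)


def flag_sentence(sentence):
    """Flag only the first vertex of the sentence; nothing is removed."""
    if sentence:
        sentence[0].flags.add(SKIP_SENTENCE)


def sentences_to_process(sentences):
    """What a later stage might do: ignore sentences whose first vertex is flagged."""
    return [s for s in sentences if s and SKIP_SENTENCE not in s[0].flags]
```

After `flag_sentence` runs, the sentence still has all of its vertices; it is the later stage that decides to pass over it.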

Settings and Configuration

  • Remove Short Sentences ( optional ) - Enables flagging of sentences based on the length limit.

  • Minimum tokens for valid sentence ( optional ) - Token limit. A sentence with this number of tokens or fewer is flagged.

  • Keep Sentence with Semantic Tags ( optional ) - Enables the list of tag exceptions to the length limit.

  • Keep Sentence with these tags ( optional ) - List of tags used to keep the sentence. At least one of the tags must be present in the sentence. Comma separated.

  • Mark Sentence with these tags ( optional ) - List of tags used to mark the sentence. At least one of the tags must be present in the sentence.

General Settings

The general settings can be accessed by clicking on 


  • Enable - Enables the processor to be used in the pipeline.
  • Skip Flags ( optional ) - Lexical item flags to be ignored by this processor.
  • Boundary Flags ( optional ) - List of vertex flags that indicate the beginning and end of a text block.
  • Required Flags ( optional ) - Lexical item flags that every token must have in order to be processed.
  • At Least One Flags ( optional ) - List of lexical item flags where at least one of them needs to be present for the token to be processed.
  • Don't Process Flags ( optional ) - List of lexical item flags that are not processed. The difference from "Skip Flags" is that this drops the path in the Saga graph, whereas skip only skips the token and continues on the same path.
  • Confidence Adjustment - Adjustment factor, from 0.0 to 2.0, applied to the confidence value (applies to every match).
    • 0.0 to < 1.0 decreases the confidence value
    • 1.0 leaves the confidence value unchanged
    • > 1.0 to 2.0 increases the confidence value
  • Debug - Enable debug logging.
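
The ranges listed for Confidence Adjustment suggest a multiplicative factor, although the page does not say so explicitly. The sketch below assumes plain multiplication; the function name `adjust_confidence` is hypothetical and the real processor may combine the values differently.

```python
def adjust_confidence(confidence, factor):
    """Scale a match confidence by an adjustment factor in [0.0, 2.0].

    Assumption: the factor is applied by multiplication. This is
    inferred from the documented ranges, not confirmed by the page.
    """
    if not 0.0 <= factor <= 2.0:
        raise ValueError("factor must be between 0.0 and 2.0")
    return confidence * factor
```

Under this assumption a factor of 0.5 halves the confidence, 1.0 leaves it unchanged, and 1.5 raises it by half.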
