Looks up matches to regular expressions in a dictionary across multiple tokens, then tags each match with one or more semantic tags as an alternative representation. For a simple regular expression, where a match only needs to occur against a single token, Simple Regex is recommended.

Info

Uses Regex Pattern Stage

Warning

This stage requires a lot of processing time. Please follow these recommendations:

  • Keep the number of regex patterns to a minimum.
  • Use non-greedy regex where possible.
  • Set the maximum length to the bare minimum necessary for the expected matches.
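
The non-greedy recommendation can be illustrated with a short sketch. It uses Python's `re` module purely for illustration; the stage's own regex engine is not specified here, and the sample text is hypothetical. The lazy quantifier `*?` stops at the first possible end of a match, so it tends to produce shorter matches than the greedy `*`.

```python
import re

# Hypothetical sample text, used only to contrast the two quantifiers.
text = "<b>first</b> and <b>second</b>"

# Greedy: ".*" consumes as much as possible, so one long match is returned.
greedy = re.findall(r"<b>.*</b>", text)

# Non-greedy: ".*?" stops at the first closing tag, giving two short matches.
non_greedy = re.findall(r"<b>.*?</b>", text)

print(greedy)      # ['<b>first</b> and <b>second</b>']
print(non_greedy)  # ['<b>first</b>', '<b>second</b>']
```

Shorter matches are also more likely to fit within a small maximum length, which is in line with the last recommendation above.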

Configuration

  • Parameter
    name: Max Length
    type: integer
    default: 25
    required: true
    summary: The maximum length of text to test for regex.
    • For each token, the stage expands the text under test by adding tokens before and after it, until a match is found or the Max Length limit (25 characters by default) is reached.
  • Parameter
    name: Case Insensitive
    type: boolean
    default: checked
    summary: Indicates whether matching against the regex is case insensitive.
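
The Max Length behavior described above can be sketched as follows. This is a minimal illustration and not the stage's actual implementation: the function name `match_around_token`, joining tokens with a single space, and testing with `search` rather than a full match are all assumptions.

```python
import re
from typing import List, Optional


def match_around_token(tokens: List[str], index: int, regex: str,
                       max_length: int = 25,
                       case_insensitive: bool = True) -> Optional[str]:
    """Grow a window around tokens[index] until the regex matches
    or the window text exceeds max_length characters (hypothetical helper)."""
    flags = re.IGNORECASE if case_insensitive else 0
    pattern = re.compile(regex, flags)
    start, end = index, index  # inclusive window bounds
    while True:
        # Assumption: tokens are joined with a single space.
        window = " ".join(tokens[start:end + 1])
        if len(window) > max_length:
            return None  # limit reached without a match
        found = pattern.search(window)
        if found:
            return found.group(0)
        if start == 0 and end == len(tokens) - 1:
            return None  # no more tokens to add
        # Add one token before and one after, staying inside the token list.
        start = max(0, start - 1)
        end = min(len(tokens) - 1, end + 1)


tokens = ["please", "ship", "Order", "12345", "today"]
print(match_around_token(tokens, 2, r"order \d+"))                 # Order 12345
print(match_around_token(tokens, 2, r"order \d+", max_length=10))  # None
```

In this sketch a smaller `max_length` ends the expansion earlier, which is why keeping the limit low reduces the number of regex tests per token.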

Adding a Pattern

Click the button to open the "Add new Pattern" dialog.


  • Parameter
    name: Regex
    required: true
    summary: Regex pattern to apply to the tokens.
  • Options
    • Parameter
      name: Split Match
      type: boolean
      default: unchecked
      required: true
      summary: Creates a new tag when the regex matches only a section of one or more tokens.
    • Parameter
      name: Case Insensitive
      type: boolean
      default: unchecked
      summary: Indicates whether matching against the regex is case insensitive.
    • Parameter
      name: Literal
      type: boolean
      default: unchecked
      summary: Indicates whether the match to the regex must be a literal. (The Entity Recognizer is a better choice for literal matches.)
    • Parameter
      name: Max Length
      type: integer
      default: 5
      required: true
      summary: The maximum length of text to test for regex.
      • For each token, the stage expands the text under test by adding tokens before and after it, until a match is found or the Max Length limit is reached.
  • Parameter
    name: Confidence Adjustment
    type: double
    default: 1
    required: true
    summary: Adjustment factor, from 0.0 to 2.0, to apply to the confidence value (applies to every pattern match).
    • 0.0 to < 1.0 decreases the confidence value
    • 1.0 leaves the confidence value unchanged
    • > 1.0 to 2.0 increases the confidence value
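
The pattern options above can be sketched as a small data structure. This is an illustrative sketch only: the class `PatternEntry` and its methods are hypothetical, and it assumes that Literal means the pattern text is matched verbatim and that the confidence adjustment is multiplicative, which is consistent with the three ranges listed above but is not confirmed here.

```python
import re
from dataclasses import dataclass


@dataclass
class PatternEntry:
    """Hypothetical model of one entry in the pattern dialog."""
    regex: str
    case_insensitive: bool = False
    literal: bool = False
    confidence_adjustment: float = 1.0

    def __post_init__(self) -> None:
        # The documented range for the adjustment factor is 0.0 to 2.0.
        if not 0.0 <= self.confidence_adjustment <= 2.0:
            raise ValueError("confidence_adjustment must be between 0.0 and 2.0")

    def compile(self) -> re.Pattern:
        # Assumption: Literal means regex metacharacters are matched verbatim.
        source = re.escape(self.regex) if self.literal else self.regex
        flags = re.IGNORECASE if self.case_insensitive else 0
        return re.compile(source, flags)

    def adjust(self, confidence: float) -> float:
        # Assumption: the factor multiplies the match confidence.
        return confidence * self.confidence_adjustment


entry = PatternEntry(regex="a.b", literal=True, case_insensitive=True)
print(bool(entry.compile().search("xA.Bx")))  # True
print(bool(entry.compile().search("aXb")))    # False

print(PatternEntry(regex=r"\d+", confidence_adjustment=0.5).adjust(0.8))  # 0.4
```

Under these assumptions a factor of 1.0 returns the confidence unchanged, a factor below 1.0 lowers it, and a factor above 1.0 raises it.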

General Settings

Generic Recognizer Config