...

  • dictionary (string, required) - The dictionary resource that holds the names to be located in the text.
    • This is specified as "provider:name" in the standard resource format (INSERT LINK HERE).
  • boundaryFlags (string, optional)
    • Tokens to be processed must lie between two vertices marked with these flags (e.g. ["TEXT_BLOCK_SPLIT"]).
  • ignoreTags (string array, optional) - Ignores matches whose tags appear in the ignoreTags list.
  • skipFlags (string array, optional) - Flags to be skipped by this stage.
    • Tokens marked with these flags are ignored by this stage, and no processing is performed on them.
  • requireFlags (string array, optional)
    • Tokens must have all of the specified flags in order to be processed.
  • debug (boolean, optional)
    • Enables all debug logging for the stage, if any.
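
The way skipFlags and requireFlags gate a token can be sketched in Python. This is a minimal illustration, not the stage's actual implementation: the function name should_process, the flag values "SKIP", "TOKEN" and "ALL_LOWER_CASE", and the reading that a single matching skip flag is enough to exclude a token are all assumptions made for the example.

```python
from typing import Iterable

# Hypothetical configuration using the parameter names listed above.
# The values are illustrative assumptions, not documented defaults.
config = {
    "dictionary": "provider:name",
    "boundaryFlags": ["TEXT_BLOCK_SPLIT"],
    "skipFlags": ["SKIP"],
    "requireFlags": ["TOKEN", "ALL_LOWER_CASE"],
    "debug": False,
}


def should_process(token_flags: Iterable[str], config: dict) -> bool:
    """Return True if a token with these flags would be processed.

    Assumption: one matching skip flag excludes the token, while
    every flag in requireFlags must be present on the token.
    """
    flags = set(token_flags)
    skip = set(config.get("skipFlags", []))
    require = set(config.get("requireFlags", []))
    if flags & skip:
        return False  # token carries at least one skip flag
    return require <= flags  # all required flags are present
```

With this sketch, a token flagged TOKEN and ALL_LOWER_CASE is processed, while the same token with an additional SKIP flag, or one missing a required flag, is left alone.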

...

  • SEMANTIC_TAG - Identifies all lexical items which are semantic tags.
  • PROCESSED - Placed on all the tokens that compose the semantic tag.

Resource Data

The dictionary tagger requires an "entity dictionary" (a string-to-JSON map): a collection of JSON records indexed by entity ID. A pattern map and a token index may also be present.
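
As a rough picture of a string-to-JSON map indexed by entity ID, the following Python sketch loads such a map and looks up one record. The entity ID "e1" and the record fields ("id", "display", "tags") are hypothetical placeholders chosen for illustration; the actual record layout is defined by the resource format and is not reproduced here.

```python
import json
from typing import Optional

# Hypothetical entity dictionary: keys are entity IDs, values are JSON records.
# Field names below are assumptions made only for this example.
raw = """
{
  "e1": {"id": "e1", "display": "Example Entity", "tags": ["example"]},
  "e2": {"id": "e2", "display": "Another Entity", "tags": ["example", "other"]}
}
"""

entity_dictionary = json.loads(raw)


def lookup_entity(entity_id: str) -> Optional[dict]:
    """Return the JSON record for an entity ID, or None if it is unknown."""
    return entity_dictionary.get(entity_id)
```

Indexing by entity ID means a record is retrieved with a single key lookup, without scanning the whole collection.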

...