...

  • id (required, string) - Identifies the entity by unique ID. This identifier must be unique across all entities (across all dictionaries) regardless of the type.
    • Typically this is an identifier with meaning to the larger application which is using the Language Processing Toolkit.
  • tags (required, array of string) - The list of semantic tags which will be added to the interpretation graph whenever any of the patterns are matched.
    • These will all be added to the interpretation graph with the SEMANTIC_TAG flag.
    • Typically, multiple tags are hierarchical representations of the same intent. For example, {city} → {administrative-area} → {geographical-area}
  • patterns (required, array of string) - A list of patterns to match in the content.
    • Patterns will be tokenized and there may be multiple variations which can match.
      • NOTE:  Currently, tokens are split on simple whitespace and punctuation, and then reduced to lowercase.
      • TODO:  This will need to be improved in the future, perhaps by specifying a pipeline to perform the tokenization and to allow for multiple variations.
  • confidence (optional, float) - Specifies the confidence level of the entity, independent of any patterns matched.
    • This is the confidence of the entity relative to all of the other entities. Essentially, it is the likelihood that this entity will be randomly encountered.
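
The fields above can be illustrated with a minimal Python sketch. It is not the toolkit's implementation: the sample values (including the id "Q90"), the validate_entity helper, and the tokenize helper are all hypothetical. The tokenizer assumes that punctuation is discarded rather than kept as separate tokens, which the description above does not specify.

```python
import re

# Hypothetical entity record using the four fields described above.
entity = {
    "id": "Q90",  # illustrative ID; must be unique across all dictionaries
    "tags": ["city", "administrative-area", "geographical-area"],
    "patterns": ["Paris", "City of Paris"],
    "confidence": 0.95,  # optional
}

# Required fields and the Python type each is expected to have.
REQUIRED = {"id": str, "tags": list, "patterns": list}


def validate_entity(record):
    """Return a list of problems found in the record (empty if none)."""
    problems = []
    for field, expected in REQUIRED.items():
        if field not in record:
            problems.append(f"missing required field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")
    for field in ("tags", "patterns"):
        value = record.get(field)
        if isinstance(value, list) and not all(isinstance(v, str) for v in value):
            problems.append(f"{field} must contain only strings")
    conf = record.get("confidence")
    if conf is not None and (isinstance(conf, bool) or not isinstance(conf, (int, float))):
        problems.append("confidence must be a number")
    return problems


def tokenize(pattern):
    """Split on whitespace and punctuation, then lowercase each token.

    Assumption: punctuation is dropped, not emitted as its own token.
    """
    return [token.lower() for token in re.split(r"[\W_]+", pattern) if token]


print(validate_entity(entity))
print([tokenize(p) for p in entity["patterns"]])
```

Under these assumptions, a pattern such as "City of Paris" and content such as "city of PARIS!" reduce to the same token sequence, which is what allows several surface variations to match one pattern.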

...