...

Entity JSON Format
{
    "_id" : "ca84",
    "tags" : [ 
        "number"
    ],
    "patterns" : [ 
        "[0-9]+", 
        "[0-9]+\\.[0-9]+"
    ],
    "confidence" : 0.95,
    . . . additional fields as needed go here . . .
    "splitMatch" : false,
    "literal" : false,
    "caseInsensitive" : true
}

Notes

  1. Multiple patterns can belong to the same entry.
  2. Additional fields can be added to the record as needed by downstream processes.
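
The notes above can be sketched in JavaScript. This is only an illustration, not the toolkit's implementation: the `unit` field is an invented example of additional data, `tagsForToken` is a hypothetical helper, and anchoring each pattern to the whole token is an assumption. JavaScript's `RegExp` stands in for whatever regex engine the toolkit actually uses.

```javascript
// Hypothetical entry modeled on the sample record. "unit" is an invented
// extra field standing in for data a downstream process might need.
const entry = {
  _id: "ca84",
  tags: ["number"],
  patterns: ["[0-9]+", "[0-9]+\\.[0-9]+"],
  confidence: 0.95,
  unit: "dimensionless"
};

// Returns the entry's tags if any of its patterns matches the whole token,
// otherwise an empty array. Whole-token anchoring is an assumption here.
function tagsForToken(entry, token) {
  const hit = entry.patterns.some((p) => new RegExp(`^(?:${p})$`).test(token));
  return hit ? entry.tags : [];
}

tagsForToken(entry, "42");   // matched by the first pattern
tagsForToken(entry, "3.14"); // matched by the second pattern
tagsForToken(entry, "abc");  // no pattern matches, so no tags
```

Either pattern is enough to produce the entry's tags, and the extra `unit` field simply travels with the record without affecting matching.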

Fields

  • id (required, string) - Identifies the entity by unique ID. This identifier must be unique across all entries (across all dictionaries).
    • Typically, this is an identifier with meaning to the larger application that is using the Language Processing Toolkit.
  • tags (required, array of string) - The list of semantic tags that will be added to the interpretation graph whenever any of the patterns are matched.
    • These will be added to the interpretation graph with the SEMANTIC_TAG flag.
  • patterns (required, array of string) - A list of patterns to match in the content.

  • literal (optional, boolean) - When this flag is set, the string that specifies the pattern is treated as a sequence of literal characters. Metacharacters and escape sequences in it are given no special meaning. Defaults to false.
  • caseInsensitive (optional, boolean) - Set to true if the pattern is not case sensitive. Defaults to true.
  • splitMatch (optional, boolean) - Indicates whether a partial match will create a regex tag even if a full match was not found. Defaults to false.
  • confidence (optional, float) - Specifies the confidence level of the entity, independent of any patterns matched.
    • This is the confidence of the entry, in comparison to all of the other entries. Essentially, the likelihood that this entry will be encountered randomly.
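
As a rough illustration of the field rules above, the following JavaScript sketch checks the required fields, applies the documented defaults, and compiles the patterns while honoring `literal` and `caseInsensitive`. All function names (`escapeRegExp`, `withDefaults`, `validateEntry`, `compilePatterns`) are hypothetical and not part of the toolkit. The sketch accepts either `id` or `_id`, since the sample record uses `_id` while the field list names `id`. JavaScript's `RegExp` is a stand-in, so the toolkit's regex dialect may differ.

```javascript
// Escape regex metacharacters so that a string matches only itself.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Fill in the documented defaults for the optional flags.
function withDefaults(entry) {
  return { literal: false, caseInsensitive: true, splitMatch: false, ...entry };
}

// Check the required fields and return a list of problems (empty if none).
// Accepting "_id" as well as "id" is an assumption of this sketch.
function validateEntry(entry) {
  const problems = [];
  const id = entry.id !== undefined ? entry.id : entry._id;
  if (typeof id !== "string" || id.length === 0) problems.push("missing id");
  for (const field of ["tags", "patterns"]) {
    const value = entry[field];
    const ok = Array.isArray(value) && value.every((item) => typeof item === "string");
    if (!ok) problems.push(`${field} must be an array of strings`);
  }
  return problems;
}

// Build one RegExp per pattern, honoring literal and caseInsensitive.
function compilePatterns(entry) {
  const e = withDefaults(entry);
  const flags = e.caseInsensitive ? "i" : "";
  return e.patterns.map((p) => new RegExp(e.literal ? escapeRegExp(p) : p, flags));
}

// Usage: with literal set, "$" and "." lose their special meaning.
const literalEntry = { _id: "x1", tags: ["price"], patterns: ["$5.00"], literal: true };
const [re] = compilePatterns(literalEntry);
re.test("costs $5.00"); // true: the text contains the literal string
re.test("costs 5x00");  // false: "." no longer matches any character
```

Spreading the entry over the defaults means that any flag the record sets explicitly wins, while omitted flags fall back to the defaults listed above.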

...