...
...
...
...
...
...
...
...
...
...
|
...
Code Block | ||||
---|---|---|---|---|
|
...
|
...
|
...
| |
"patterns":"saga_provider: |
...
saga_advanced",
"maxRepeats": 5 |
The following shows sample output from the advanced pattern matcher, which has multiple patterns for the {product} and {person-product-preference} semantic tags.
Code Block |
---|
...
language | text |
---|---|
theme | FadeToGrey |
...
V--------------------[Abe Lincoln likes the iPhone-7]-------------------- |
...
V |
...
^---[Abe]----V--[Lincoln]--V--[likes]--V--[the]--V------[iPhone-7]-------^
|
...
^---[iPhone]----V--[7]--^
|
...
^---[abe]----^--[lincoln]--^ ^---[iphone]----^
|
...
^--[{name}]--^--[{place}]--^ ^-----------[{product}]-----------^
|
...
^-------[{product}]-------^
|
...
^---------[{name}]---------^ ^--[{product}]--^
|
...
^--------[{place}]---------^ ^------[iphone-7]-------^
|
...
^------[{product}]------^
|
...
^-----------------[{person-product-preference}]------------------^
|
...
^---------------------[{person-product-preference}]----------------------^ |
...
Info |
---|
No vertices are created in this stage. |
The Advanced Pattern Stage accepts patterns that span any number of tokens/items. It is not a character-matching or regex engine: patterns operate on tokens, not characters, and are expanded using precedence rules and a small set of symbols that define repetitions and optional tokens.
Warning |
---|
An advanced pattern always requires a tag to work (e.g., "token {tag}"). |
These are the symbols supported for patterns:
Description | Symbol | Usage |
---|---|---|
Zero or more of preceding | * | {tag}* token* |
One or more of preceding | + | {tag}+ token+ |
Zero or one of preceding | ? or [ ] | {tag}? [{tag}] token? [token] |
Grouping and escaping | ( ) " " | ({tag} token) "some tokens" "what?" |
Alternations | | | token1|{tag}|token2 |
Note |
---|
Whitespace is not allowed between tokens and symbols. For example, “token1 | token2” (with whitespace on either side of the pipe symbol) is not valid; write “token1|token2” instead. Likewise, in “{tag}?”, adding a space between the closing curly bracket and the question mark will not yield the expected result. |
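The spacing rule above can be checked mechanically before a pattern is saved. The following is a minimal sketch, not part of the stage itself: `find_spacing_problems` is a hypothetical helper name, and it assumes straight double quotes (`"`) are used for escaping.

```python
import re


def find_spacing_problems(pattern: str) -> list[str]:
    """Return the fragments where whitespace touches a pattern symbol."""
    # Symbols are turned off inside quotes, so blank out quoted sections first.
    unquoted = re.sub(r'"[^"]*"', '""', pattern)
    # Flag whitespace before | ? + * and whitespace after a pipe.
    return [m.group(0) for m in re.finditer(r"\s+[|?+*]|\|\s+", unquoted)]


print(find_spacing_problems("token1 | token2"))  # one problem: ' |'
print(find_spacing_problems("token1|token2"))    # no problems
```

An empty list means no suspicious spacing was found; it does not prove the pattern is otherwise valid.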
This is the precedence order used by the pattern expansion process (from highest to lowest):
"" | escaping | All other symbols are turned off inside of quotes |
( ) [ ] | groupings and optional | All other symbols act on groups |
? + * | wildcards | Apply to tokens, tags, or groups |
| | alternations | Apply to tokens, tags, or groups |
Note |
---|
Wildcards such as + and * are limited to the number of repetitions set by the "Maximum Repeats" value in the configuration. |
Example patterns:
this that | Matches the literal token "this" followed by the literal token "that". |
this|that | Matches "this" or "that". |
{animal} | Matches any token that has been tagged as an {animal}. |
this {tag}? that | Matches "this" followed by "that", optionally with a {tag} token in between. |
this {tag}* that | As above, but with any number (zero or more) of {tag} tokens in between. |
this {tag}+ that | As above, but with at least one {tag} token in between. |
(this that)+ | Matches one or more repetitions of "this that", e.g. "this that this that". |
(left handed)|(right handed) tool | Matches "left handed tool" or "right handed tool". |
Phrase “here is optional”? | Matches “Phrase”, optionally followed by “here is optional”. |
Phrase [here is optional] | Matches the same as above. |
Is the last phrase “here optional?” | Matches the whole phrase, because the ? is inside the quotes and is therefore escaped. |
One|two|three|more alternatives | Matches "One", "two", "three", or "more", followed by "alternatives". |
In a galaxy far+ away | Matches one to five repetitions of “far” (the upper limit depends on the Maximum Repeats value), e.g. “In a galaxy far far far far far away”. |
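To make the precedence rules and the examples above concrete, here is a small sketch of a pattern expander. It is an illustration only, not the stage's actual implementation: the `expand` function and its `max_repeats` argument are hypothetical names, it only understands straight double quotes, and it rejects whitespace next to a symbol rather than guessing at the intent.

```python
SPECIAL = set('|?+*()[]"')


def expand(pattern: str, max_repeats: int = 5) -> list[tuple[str, ...]]:
    """Expand a pattern into every token sequence it can match."""
    pos = 0

    def peek() -> str:
        return pattern[pos] if pos < len(pattern) else ""

    def concat(left, right):
        out = []
        for a in left:
            for b in right:
                if a + b not in out:
                    out.append(a + b)
        return out

    def sequence():
        # Whitespace-separated items; stops at end of input or a closing bracket.
        nonlocal pos
        result = [()]
        while True:
            while peek().isspace():
                pos += 1
            if peek() in ("", ")", "]"):
                return result
            result = concat(result, alternation())

    def alternation():
        # Lowest precedence symbol: item|item|item with no surrounding whitespace.
        nonlocal pos
        options = quantified()
        while peek() == "|":
            pos += 1
            options = options + [o for o in quantified() if o not in options]
        return options

    def quantified():
        # ? + * apply to the token, tag, or group immediately before them.
        nonlocal pos
        base = atom()
        symbol = peek()
        if symbol not in ("?", "+", "*"):
            return base
        pos += 1
        low = 1 if symbol == "+" else 0
        high = 1 if symbol == "?" else max_repeats
        out = []
        for count in range(low, high + 1):
            repeated = [()]
            for _ in range(count):
                repeated = concat(repeated, base)
            out += [r for r in repeated if r not in out]
        return out

    def atom():
        nonlocal pos
        ch = peek()
        if ch in ("(", "["):
            close = ")" if ch == "(" else "]"
            pos += 1
            inner = sequence()
            if peek() != close:
                raise ValueError(f"expected {close!r} at position {pos}")
            pos += 1
            # Square brackets make the whole group optional.
            return inner if ch == "(" else [()] + [i for i in inner if i != ()]
        if ch == '"':
            end = pattern.find('"', pos + 1)
            if end < 0:
                raise ValueError("unterminated quote")
            tokens = tuple(pattern[pos + 1:end].split())
            pos = end + 1
            return [tokens]  # symbols inside quotes stay literal
        start = pos
        while peek() and not peek().isspace() and peek() not in SPECIAL:
            pos += 1
        if pos == start:
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        return [(pattern[start:pos],)]

    result = sequence()
    if pos != len(pattern):
        raise ValueError(f"unexpected {peek()!r} at position {pos}")
    return result


for tokens in expand("(left handed)|(right handed) tool"):
    print(" ".join(tokens))
```

Expanding every alternative up front keeps the sketch short, but the number of expansions grows quickly with nested repetitions, which is one reason a repetition limit such as Maximum Repeats is useful.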
The resource data is a database of advanced patterns and the resulting semantic tags that they produce.
Resource Format
The pattern database is a series of JSON records, typically indexed by "pattern block ID". Each JSON record represents a block of patterns (one or more) that all produce the same semantic tag. The format is as follows:
...
Code Block | ||
---|---|---|
|
...
| |||
"_id": "KGAAJGsBemSwA0nZTLXA",
"tag": "recipe",
"pattern": "(\"how many\"|\"how much\") {ingredient}",
"confAdjust": 0.95
. . . additional fields as needed go here . . . |
...
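As a quick sanity check of the record format, the sketch below parses one record and reads back its fields. The field names come from the sample above; treating `tag` and `pattern` as the required fields is an assumption made for illustration, and `check_record` is a hypothetical helper. Note that the double quotes inside the pattern must be escaped for the record to be valid JSON.

```python
import json

# Raw string so that \" reaches the JSON parser unchanged.
record_text = r'''
{
  "_id": "KGAAJGsBemSwA0nZTLXA",
  "tag": "recipe",
  "pattern": "(\"how many\"|\"how much\") {ingredient}",
  "confAdjust": 0.95
}
'''


def check_record(record: dict) -> list[str]:
    """Return the names of assumed-required fields that are missing."""
    required = ("tag", "pattern")  # assumption, not confirmed by the source
    return [name for name in required if name not in record]


record = json.loads(record_text)
print(record["tag"], "<-", record["pattern"])
print("missing fields:", check_record(record))
```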
Note |
---|
|
...
|
...
|
...
Parameter | |
---|---|
|
...
...
|
These will all be added to the interpretation graph with the SEMANTIC_TAG flag.
Tip |
---|
Tags are hierarchical representations of the same intent. For example, {city} → {administrative-area} → {geographical-area} |
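One way to picture the hierarchy in the tip above is as a chain of parent links. This is only an illustrative sketch: the `PARENT` map and `tag_lineage` function are hypothetical, and the page does not say how the hierarchy is represented internally.

```python
# Hypothetical parent links taken from the example in the tip.
PARENT = {
    "city": "administrative-area",
    "administrative-area": "geographical-area",
}


def tag_lineage(tag: str) -> list[str]:
    """Return the tag followed by each progressively broader tag."""
    chain = [tag]
    while chain[-1] in PARENT:
        broader = PARENT[chain[-1]]
        if broader in chain:  # guard against accidental cycles
            break
        chain.append(broader)
    return chain


print(" -> ".join("{" + t + "}" for t in tag_lineage("city")))
```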
Note |
---|
Currently, tokens are split on whitespace and punctuation and then converted to lowercase. |
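The tokenization described in the note can be approximated as follows. This is a rough sketch under stated assumptions: `simple_tokenize` is a hypothetical name, and "punctuation" is taken to mean the ASCII punctuation characters in Python's `string.punctuation`, which may differ from what the stage actually uses.

```python
import re
import string

# One or more whitespace or ASCII punctuation characters act as a separator.
_SEPARATORS = re.compile("[" + re.escape(string.punctuation) + r"\s]+")


def simple_tokenize(text: str) -> list[str]:
    """Split on whitespace and punctuation, then lowercase each token."""
    return [part.lower() for part in _SEPARATORS.split(text) if part]


print(simple_tokenize("Abe Lincoln likes the iPhone-7"))
```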
...
...
...
...