Identifies geographic locations based on the loaded patterns.
Operates On: Lexical Items with TOKEN and possibly other flags as specified below.
"parameter": "something something"
Description
    V--------------[abraham lincoln likes macaroni and cheese]--------------------V
    ^--[abraham]--V--[lincoln]--V--[likes]--V--[macaroni]--V--[and]--V--[cheese]--^
                  ^---{place}---^           ^----{food}----^         ^---{food}---^
    ^----------{person}---------^           ^-----------------{food}--------------^
Info |
---|
No vertices are created in this stage |
Description of resource.
The dictionary tagger requires an "entity dictionary" (a string-to-JSON map): a list of JSON records, typically indexed by entity ID. A pattern map and a token index may also be present, but the entity dictionary is the only file that is strictly required.
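A string-to-JSON map indexed by entity ID can be pictured as follows. This is a minimal Python sketch, not Saga's implementation: the one-record-per-line input and the `load_entity_dictionary` helper are assumptions made for illustration, and the sample values are taken from the example record in this section.

```python
import json

# Hypothetical input: one JSON record per line (an assumption of this sketch).
RAW_RECORDS = """
{"id": "Q28260", "tag": "{city}", "fields": {"coord": [40.813639, -96.702611]}, "confAdjust": 0.95}
"""


def load_entity_dictionary(text):
    """Parse one JSON record per line and index the records by their "id"."""
    entities = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        entities[record["id"]] = record
    return entities


entities = load_entity_dictionary(RAW_RECORDS)
print(entities["Q28260"]["tag"])
```

Looking up a record by its ID then returns the full JSON object, including any additional fields it carries.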
Each JSON record represents an entity. The format is as follows:
    {
      "_id": "KGAAJGsBemSwA0nZTLXA",
      "id": "Q28260",
      "tag": "{city}",
      "display": "Lincoln",
      "patterns": [ "Lincoln", "Lincoln, Nebraska", "Lincoln, NE" ],
      "fields": { "coord": [40.813639, -96.702611] },
      "confAdjust": 0.95
      . . . additional fields as needed go here . . .
    }
The tags of a matched entity are all added to the interpretation graph with the SEMANTIC_TAG flag.
Tip |
---|
Tags are hierarchical representations of the same intent. For example, {city} → {administrative-area} → {geographical-area} |
Patterns are tokenized, and multiple variations of a pattern may match.
Note |
---|
Currently, tokens are separated on simple white-space and punctuation, and then reduced to lowercase. |
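The tokenization rule in the note above can be approximated with a short Python sketch. It is an illustration only, not the stage's actual tokenizer: the `tokenize` name is hypothetical, and discarding punctuation (rather than keeping it as separate tokens) is an assumption the source does not confirm.

```python
import re


def tokenize(pattern):
    """Split on whitespace and punctuation, then lowercase each token."""
    # \w+ matches runs of letters, digits and underscores; anything else
    # (spaces, commas, periods, ...) acts as a separator and is dropped.
    return [token.lower() for token in re.findall(r"\w+", pattern)]


print(tokenize("Lincoln, NE"))  # ['lincoln', 'ne']
```

Under these assumptions, "Lincoln, NE" and "lincoln ne" produce the same token sequence, which is one way several surface variations can match a single pattern.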
To improve performance, especially for very large entity databases, the entity dictionary is inverted and indexed.
This currently happens in RAM inside the GeoName stage. An off-line option for pre-inverting the dictionary will be provided in the future.
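One way to picture the inversion is a token-to-entity index built in memory. The sketch below is an assumption-laden illustration rather than the stage's real data structure: `build_token_index` and `tokenize` are hypothetical helpers, the `patterns` field name is assumed, and the index maps each lowercase token to the IDs of entities whose patterns contain it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split on whitespace and punctuation, then lowercase (assumed rule)."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_token_index(entities):
    """Invert the entity dictionary: token -> set of entity IDs."""
    index = defaultdict(set)
    for entity_id, record in entities.items():
        for pattern in record.get("patterns", []):
            for token in tokenize(pattern):
                index[token].add(entity_id)
    return index


# Sample entity dictionary, indexed by entity ID.
entities = {
    "Q28260": {"tag": "{city}", "patterns": ["Lincoln, Nebraska", "Lincoln, NE"]},
}

index = build_token_index(entities)
print(sorted(index))     # ['lincoln', 'ne', 'nebraska']
print(index["lincoln"])  # {'Q28260'}
```

With such an index, a token from the input text narrows the search to the entities that mention it, instead of scanning every record's patterns.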