V---------------[this is COSTA RICA !!! and Costa Verda]---------------V
^-[this]-V-[is]-V-[COSTA]-V-[RICA]-V-[!!!]-V-[and]-V-[Costa]-V-[Verda]-^
                ^-[costa]-^-[rica]-^               ^-[costa]-^-[verda]-^
                ^--[{_geoname_}]---^               ^---[{_geoname_}]---^
Identifies geo locations, based on the patterns loaded.
Operates On: Lexical Items with TOKEN and possibly other flags as specified below.
Saga_config_stage |
---|
"parameter": "something something" |
Info |
---|
No vertices are created in this stage |
The only file that is absolutely required is the entity geonames dictionary. It is a series of JSON records, typically indexed by entity ID.
Each JSON record represents an entity. The format is as follows:
    {
      "_id" : "KGAAJGsBemSwA0nZTLXA",
      "id" : 3621815,
      "display" : "San Juan",
      "patterns" : [ "San Juan" ],
      "tag" : "DDfO1HABPr3bu3tFxDT4",
      "fields" : {
        "feature class" : "P",
        "feature code" : "PPL",
        "admin3 code" : "20203",
        "timezone" : "America/Costa_Rica",
        "country code" : "CR",
        "admin1 code" : "01",
        "location" : { "lon" : -84.4654, "lat" : 10.10676 },
        "modification date" : "2016-09-07",
        "admin2 code" : "202",
        "dem" : 1093,
        "population" : 0
      },
      "confAdjust" : 1.0
    }
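As a rough illustration of how a dictionary of such records might be read and indexed by entity ID, here is a minimal Python sketch. It assumes one JSON record per line, which the description above does not confirm, and the helper name `load_entities` and the second sample record are hypothetical.

```python
import json

# Hypothetical sample data. One JSON object per line is an assumed layout,
# and the second record is invented purely for illustration.
SAMPLE = """\
{"id": 3621815, "display": "San Juan", "patterns": ["San Juan"], "confAdjust": 1.0}
{"id": 1, "display": "Costa Verda", "patterns": ["Costa Verda"], "confAdjust": 1.0}
"""

def load_entities(text):
    """Parse newline-delimited JSON records and index them by their "id" field."""
    entities = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        entities[record["id"]] = record
    return entities

entities = load_entities(SAMPLE)
print(entities[3621815]["display"])
```

Keying the result by `id` mirrors the statement that records are typically indexed by entity ID. A real dictionary file may be laid out differently.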
These tags will all be added to the interpretation graph with the SEMANTIC_TAG flag.
Tip |
---|
Tags are hierarchical representations of the same intent. For example, {city} → {administrative-area} → {geographical-area} |
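The hierarchy in the tip can be pictured as a chain of parent links. The Python sketch below is only an illustration: the `TAG_PARENTS` mapping and the `tag_chain` helper are hypothetical and do not reflect how the stage stores tags internally.

```python
# Hypothetical parent links, taken from the example chain in the tip.
TAG_PARENTS = {
    "{city}": "{administrative-area}",
    "{administrative-area}": "{geographical-area}",
}

def tag_chain(tag, parents=TAG_PARENTS):
    """Return the tag followed by each of its ancestors, most specific first."""
    chain = [tag]
    seen = {tag}
    while chain[-1] in parents:
        parent = parents[chain[-1]]
        if parent in seen:
            break  # guard against accidental cycles in the mapping
        chain.append(parent)
        seen.add(parent)
    return chain

print(tag_chain("{city}"))
# ['{city}', '{administrative-area}', '{geographical-area}']
```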
Patterns will be tokenized and there may be multiple variations which can match.
Note |
---|
Currently, tokens are separated on simple white-space and punctuation, and then reduced to lowercase. |
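The tokenization described in the note could be approximated as follows. This is a sketch under assumptions, not the stage's actual tokenizer: it treats every non-word character as a separator and discards the punctuation itself. The function name `tokenize` is hypothetical.

```python
import re

def tokenize(pattern):
    """Split on whitespace and punctuation, then lowercase each token."""
    # \w+ keeps runs of letters, digits and underscores;
    # any other character acts as a separator and is dropped (an assumption).
    return [token.lower() for token in re.findall(r"\w+", pattern)]

print(tokenize("Lincoln, NE"))  # ['lincoln', 'ne']
print(tokenize("San Juan"))     # ['san', 'juan']
```

Under this scheme "Lincoln, NE" and "lincoln ne" reduce to the same token sequence, which is one way several surface variations of a pattern can match.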
To improve performance, especially for very large databases of entities, the entity dictionary is inverted and indexed.
This currently happens in RAM inside the GeoNames stage. An offline option for pre-inverting the dictionary will be provided in the future.
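To make the idea of inverting the dictionary concrete, here is a small in-memory Python sketch. It is an illustration only: the functions `build_index` and `candidates`, the simplified tokenizer, and the second sample entity are all hypothetical, and the real stage may index patterns quite differently.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Simplified tokenizer: split on non-word characters and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def build_index(entities):
    """Invert the dictionary: map each pattern token to the IDs of entities using it."""
    index = defaultdict(set)
    for entity in entities:
        for pattern in entity["patterns"]:
            for token in tokenize(pattern):
                index[token].add(entity["id"])
    return index

def candidates(index, tokens):
    """Return IDs of entities whose patterns mention every given token.

    This is only a candidate set; a full matcher would still need to verify
    that the tokens occur together and in order within a single pattern.
    """
    id_sets = [index.get(token, set()) for token in tokens]
    return set.intersection(*id_sets) if id_sets else set()

# Hypothetical sample entities; the second record is invented for illustration.
ENTITIES = [
    {"id": 3621815, "patterns": ["San Juan"]},
    {"id": 2, "patterns": ["San Jose"]},
]

index = build_index(ENTITIES)
print(candidates(index, ["san", "juan"]))  # {3621815}
```

Looking up tokens in the index avoids scanning every entity for every input token, which is where the performance gain for large dictionaries comes from.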