Identifies geographic locations based on the patterns loaded.
Operates On: Lexical Items with TOKEN and possibly other flags as specified below.
Saga_config_stage | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
"parameter": "something something" |
Description
Code Block | ||||
---|---|---|---|---|
| ||||
"charList": "_-‿⁀⁔︳︴﹍﹎﹏_",
"dictionary": "saga-provider:saga_geonames",
"lowercase": true,
"minimum": 3,
"normalizeAccents": true,
"removeChars": false |
Code Block | ||
---|---|---|
| ||
V---------------[this is COSTA RICA !!! and Costa Verda]---------------V
^-[this]-V-[is]-V-[COSTA]-V-[RICA]-V-[!!!]-V-[and]-V-[Costa]-V-[Verda]-^
                ^-[costa]-^-[rica]-^               ^-[costa]-^-[verda]-^
                ^--[{_geoname_}]---^               ^---[{_geoname_}]---^ |
Info |
---|
No vertices are created in this stage |
Description of resource.
The only file that is absolutely required is the geonames dictionary. It is a series of JSON records, typically indexed by entity ID.
Each JSON record represents an entity. The format is as follows:
Code Blocksaga_json | ||||
---|---|---|---|---|
| ||||
{
  "_id" : "KGAAJGsBemSwA0nZTLXA",
  "id" : 3621815,
  "display" : "San Juan",
  "patterns" : [ "San Juan" ],
  "tag" : "DDfO1HABPr3bu3tFxDT4",
  "fields" : {
    "feature class" : "P",
    "feature code" : "PPL",
    "admin3 code" : "20203",
    "timezone" : "America/Costa_Rica",
    "country code" : "CR",
    "admin1 code" : "01",
    "location" : { "lon" : -84.4654, "lat" : 10.10676 },
    "modification date" : "2016-09-07",
    "admin2 code" : "202",
    "dem" : 1093,
    "population" : 0
  },
  "confAdjust" : 1.0
} |
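As a rough illustration of "a series of JSON records, typically indexed by entity ID", the sketch below parses one record and builds an in-memory lookup keyed by `id`. It assumes each record is a complete JSON object; the helper name `index_by_id` is hypothetical and not part of Saga, and only a subset of the fields is shown.

```python
import json

# One entity record, abbreviated to a few of the fields shown above.
RAW_RECORD = """
{
  "_id": "KGAAJGsBemSwA0nZTLXA",
  "id": 3621815,
  "display": "San Juan",
  "patterns": ["San Juan"],
  "tag": "DDfO1HABPr3bu3tFxDT4",
  "fields": {
    "country code": "CR",
    "location": {"lon": -84.4654, "lat": 10.10676},
    "population": 0
  },
  "confAdjust": 1.0
}
"""


def index_by_id(records):
    """Hypothetical helper: key each entity record by its numeric "id"."""
    return {record["id"]: record for record in records}


record = json.loads(RAW_RECORD)
by_id = index_by_id([record])

# Look the entity up by ID and read its coordinates.
location = by_id[3621815]["fields"]["location"]
print(by_id[3621815]["display"], location["lat"], location["lon"])
```

Keying on `id` rather than `_id` is a choice made for this example only; the source does not say which of the two the stage uses internally.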
The tags of matched entities are all added to the interpretation graph with the SEMANTIC_TAG flag.
Patterns are tokenized, and multiple variations of a pattern may match.
Note |
---|
Currently, tokens are split on whitespace and punctuation and then reduced to lowercase. |
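The tokenization rule described in the note can be approximated as follows. This is a minimal sketch, not the stage's actual tokenizer: the function name `tokenize_pattern` is hypothetical, and the regular expression treats underscores as part of a token, which may differ from the real behavior.

```python
import re


def tokenize_pattern(pattern: str) -> list[str]:
    """Split on whitespace and punctuation, then lowercase each token.

    Assumption: a token is any run of word characters (letters, digits,
    underscore); every other character acts as a separator.
    """
    return [token.lower() for token in re.findall(r"\w+", pattern)]


print(tokenize_pattern("San Juan"))
print(tokenize_pattern("Costa-Rica, C.R."))
```

Under these assumptions, "San Juan" and "san juan" produce the same token sequence, so patterns match regardless of case.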
To improve performance, especially for very large databases of entities, the entity dictionary is inverted and indexed.
This currently happens in RAM inside the GeoNames stage. An offline option for pre-inverting the dictionary will be provided in the future.
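One way to picture inverting the dictionary is sketched below: instead of scanning every entity for each piece of text, the patterns are indexed by their first token so that only candidate entities are checked. This illustrates the general technique under stated assumptions and is not the stage's actual data structure; the function names and the second entity (ID 1) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Assumed tokenization: runs of word characters, lowercased."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_inverted_index(entities):
    """Map the first token of each pattern to (entity id, pattern tokens)."""
    index = defaultdict(list)
    for entity in entities:
        for pattern in entity["patterns"]:
            tokens = tuple(tokenize(pattern))
            if tokens:
                index[tokens[0]].append((entity["id"], tokens))
    return index


def find_matches(index, text):
    """Return (entity id, start token, end token) for every pattern hit."""
    tokens = tokenize(text)
    matches = []
    for start, token in enumerate(tokens):
        # Only entities whose pattern starts with this token are examined.
        for entity_id, pattern_tokens in index.get(token, []):
            end = start + len(pattern_tokens)
            if tuple(tokens[start:end]) == pattern_tokens:
                matches.append((entity_id, start, end))
    return matches


entities = [
    {"id": 3621815, "patterns": ["San Juan"]},
    {"id": 1, "patterns": ["Costa Rica"]},  # hypothetical entity for the demo
]
index = build_inverted_index(entities)
print(find_matches(index, "Flights to San Juan, Costa Rica"))
```

Building the index costs one pass over all patterns, which is the work that an offline pre-inversion step would move out of stage start-up.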