Operates On: Lexical Items with TOKEN
Note |
---|
This lemmatization does not use rules. |
...
dictionary (string, optional) - The resource containing the list of words and relationships.
...
...
...
include (list, optional) - A list of the relationships to include.
...
exclude (list, optional) - A list of the relationships to exclude.
...
Note |
---|
...
A default dictionary is available in English. Spanish is supported when the languageISO3 parameter is configured properly. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
{
"type": "Lemmatize",
"include" : ["pl", "vf"],
"exclude" : ["ob"],
"dictionary" : "lemmatize-provider:lemmatize_words",
"languageISO3":"SPA"
} |
Code Block | ||||
---|---|---|---|---|
| ||||
V--------------------[I am liking this projects very much]--------------------V
^--[I]--V--[am]--V--[liking]--V--[this]--V--[projects]--V--[very]--V--[much]--^
^--[be]--^---[like]---^ ^--[project]---^
am - {"confidence":0.0084,"rel":["vf","wnm"],"to":"be"}
liking - {"confidence":0.0084,"rel":["vf","wnm"],"to":"like"}
projects - {"confidence":0.012,"rel":["vf","wnm","pl"],"to":"project"} |
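The effect of include and exclude on matches like the ones above can be sketched in Python. This is only an illustration: the function name `filter_matches` is hypothetical, and the semantics shown here (drop a match carrying any excluded relationship; when include is non-empty, require at least one included relationship) are an assumption, not the stage's confirmed behavior.

```python
def filter_matches(matches, include=None, exclude=None):
    """Keep matches allowed by the include/exclude relationship lists.

    Assumed semantics (not confirmed by the stage documentation):
    - a match is dropped if any of its relationships is in `exclude`;
    - if `include` is non-empty, a match must carry at least one of them.
    """
    include = set(include or [])
    exclude = set(exclude or [])
    kept = []
    for match in matches:
        rels = set(match["rel"])
        if rels & exclude:
            continue
        if include and not (rels & include):
            continue
        kept.append(match)
    return kept


# Sample matches copied from the examples on this page.
matches = [
    {"confidence": 0.0084, "rel": ["vf", "wnm"], "to": "be"},
    {"confidence": 0.012, "rel": ["vf", "wnm", "pl"], "to": "project"},
    {"confidence": 0.0049, "rel": ["wnm", "sp"], "to": "encyclopedia"},
]
kept = filter_matches(matches, include=["pl", "vf"], exclude=["ob"])
print([m["to"] for m in kept])
```

Under these assumed rules, the "sp"-only entry is filtered out because it carries neither "pl" nor "vf".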
...
...
When the 'dictionary' parameter is used, the resource data must be a JSON file containing an array of entries in a field named words.
Code Block | ||||
---|---|---|---|---|
| ||||
{
"words": [
{
"confidence": 0.0049,
"rel": [
"wnm",
"sp"
],
"from": "encyclopaedia",
"to": "encyclopedia"
},
{
"confidence": 0.0752,
"rel": [
"wnm",
"sp"
],
"from": "word",
"to": "worth"
}
]
}
|
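A file in this shape can be generated with a short script. The following is a minimal sketch, assuming the four fields shown in the example above; the helper names `make_entry` and `build_dictionary` are hypothetical and not part of the product.

```python
import json


def make_entry(source, target, rel, confidence):
    """Build one dictionary entry using the field names from the example above."""
    return {"confidence": confidence, "rel": list(rel), "from": source, "to": target}


def build_dictionary(entries):
    """Wrap the entries in the top-level object with a `words` array."""
    return {"words": list(entries)}


dictionary = build_dictionary([
    make_entry("encyclopaedia", "encyclopedia", ["wnm", "sp"], 0.0049),
])
# ensure_ascii=False keeps accented characters readable in the output.
text = json.dumps(dictionary, indent=2, ensure_ascii=False)
print(text)
```

The resulting text can then be saved as the resource referenced by the 'dictionary' parameter.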
When the 'dictionary' parameter is not used, an embedded Wiktionary file is used instead. This file contains one JSON entry per line:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
{"confidence":0,"rel":["syn"],"from":"japonés","to":"nipón"}
{"confidence":0,"rel":["syn"],"from":"alemán","to":"germano"}
{"confidence":0,"rel":["syn"],"from":"alemán","to":"tedesco"}
{"confidence":0,"rel":["syn"],"from":"alemán","to":"teutón"}
{"confidence":0,"rel":["syn"],"from":"alemán","to":"gringo"}
{"confidence":0,"rel":["syn"],"from":"mayo","to":"guainica"}
{"confidence":0,"rel":["syn"],"from":"mayo","to":"maisito"}
{"confidence":0,"rel":["syn"],"from":"mayo","to":"mayito"}
{"confidence":0,"rel":["syn"],"from":"mayo","to":"turpial de sureste"}
{"confidence":0,"rel":["syn"],"from":"domingo","to":"paga"} |
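The one-entry-per-line layout can be read line by line rather than as a single JSON document. A minimal sketch, assuming each non-empty line is a complete JSON object with "from" and "to" fields as in the sample above; the function name `load_json_lines` is hypothetical.

```python
import json
from collections import defaultdict


def load_json_lines(lines):
    """Parse one JSON object per non-empty line and group targets by source word."""
    lookup = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        lookup[entry["from"]].append(entry["to"])
    return dict(lookup)


# A few lines copied from the sample above.
sample = [
    '{"confidence":0,"rel":["syn"],"from":"alemán","to":"germano"}',
    '{"confidence":0,"rel":["syn"],"from":"alemán","to":"tedesco"}',
    '{"confidence":0,"rel":["syn"],"from":"domingo","to":"paga"}',
]
lookup = load_json_lines(sample)
print(lookup["alemán"])
```

Because a word can appear on several lines, the sketch collects every target for the same source word in a list.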
The required fields for each entry are:
...
...
...
...
Tip |
---|
Any other field will be included in the entities of the LexItem. |