Lemmatizes tokens by matching them against words in a dictionary.
Operates On: Lexical Items with TOKEN
This lemmatization is purely dictionary-based; it does not apply morphological rules.
dictionary (string, optional) - The resource containing the list of words and their relationships
include (list, optional) - A list of the relationship tags to include
exclude (list, optional) - A list of the relationship tags to exclude
The default dictionary is only available for English.
{
  "type": "LemmatizeStage",
  "include": ["pl", "vf"],
  "exclude": ["ob"],
  "dictionary": "lemmatize-provider:lemmatize_words"
}
V--------------------[I am liking this projects very much]--------------------V
^--[I]--V--[am]--V--[liking]--V--[this]--V--[projects]--V--[very]--V--[much]--^
        ^--[be]--^---[like]---^          ^--[project]---^

am       - {"confidence":0.0084,"rel":["vf","wnm"],"to":"be"}
liking   - {"confidence":0.0084,"rel":["vf","wnm"],"to":"like"}
projects - {"confidence":0.012,"rel":["vf","wnm","pl"],"to":"project"}
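The include/exclude behavior shown above can be sketched as follows. This is an illustrative Python sketch, not the stage's actual implementation: the in-memory DICTIONARY, the lemmatize helper, and the filtering semantics (an entry is kept when any of its relationship tags is in the include list and none is in the exclude list) are assumptions drawn from the example.

```python
# Hypothetical dictionary, using the entries from the example above.
DICTIONARY = {
    "am":       {"confidence": 0.0084, "rel": ["vf", "wnm"], "to": "be"},
    "liking":   {"confidence": 0.0084, "rel": ["vf", "wnm"], "to": "like"},
    "projects": {"confidence": 0.012,  "rel": ["vf", "wnm", "pl"], "to": "project"},
}

def lemmatize(tokens, include=None, exclude=None):
    """Return (token, lemma_or_None) pairs; an entry passes when at least one
    of its relationship tags is included and none is excluded (assumed)."""
    result = []
    for token in tokens:
        entry = DICTIONARY.get(token)
        if entry is None:
            result.append((token, None))       # token not in dictionary
            continue
        rels = entry["rel"]
        if include is not None and not any(r in include for r in rels):
            result.append((token, None))       # no included relationship
            continue
        if exclude is not None and any(r in exclude for r in rels):
            result.append((token, None))       # excluded relationship present
            continue
        result.append((token, entry["to"]))
    return result

tokens = ["I", "am", "liking", "this", "projects", "very", "much"]
print(lemmatize(tokens, include=["pl", "vf"], exclude=["ob"]))
# → [('I', None), ('am', 'be'), ('liking', 'like'), ('this', None),
#    ('projects', 'project'), ('very', None), ('much', None)]
```

With the example configuration, "am", "liking", and "projects" are lemmatized because each carries the included "vf" (or "pl") tag, matching the diagram above.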
The resource data must be a JSON file containing an array of word entries in a field named "words":
{
  "words": [
    {
      "confidence": 0.0049,
      "rel": ["wnm", "sp"],
      "from": "encyclopaedia",
      "to": "encyclopedia"
    },
    {
      "confidence": 0.0752,
      "rel": ["wnm", "sp"],
      "from": "word",
      "to": "worth"
    }
  ]
}
The required fields for each entry are "from" and "to". Any other field (such as "confidence" and "rel") will be included in the entities of the LexItem.
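A sketch of loading such a resource file and building a lookup table is shown below. This is an assumption about how a consumer might process the file, not the stage's actual code; the idea is that "from" keys the lookup and every remaining field is carried along so it can be attached to the LexItem's entities.

```python
import json

# Resource content following the example format above (inlined here so the
# sketch is self-contained; normally this would be read from the file).
resource = json.loads("""
{ "words": [
    { "confidence": 0.0049, "rel": ["wnm", "sp"],
      "from": "encyclopaedia", "to": "encyclopedia" }
] }
""")

lookup = {}
for entry in resource["words"]:
    # "from" and "to" drive the mapping; all other fields (confidence,
    # rel, ...) are kept so they can become entity attributes.
    lookup[entry["from"]] = {k: v for k, v in entry.items() if k != "from"}

print(lookup["encyclopaedia"]["to"])          # → encyclopedia
print(lookup["encyclopaedia"]["confidence"])  # → 0.0049
```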