Creates a bag-of-words (BOW) or TF-IDF tag that holds the vector information for the document, text block, or sentence. The stage keeps accumulating the vector until the engine cannot read any further.
Operates On: all lexical Items.
This stage has been disabled since version 1.2.2.
"vocabulary": "saga-provider:saga_vocabulary",
"vectorType": "BOW",
"datasetId": "dataset-234ifgbqafgoail3",
"min": 1,
"max": 3,
In this example the stage loads a predefined vocabulary and generates a vector for the sentence using BOW. The second diagram shows the same sentence processed with TF_IDF.
V---------------------------[The pilot landed safely the aircraft after gear failed when approaching the runaway.]----------------------------V
^-[The]-V-[pilot]-V-[landed]-V-[safely]-V-[the]-V-[aircraft]-V-[after]-V-[gear]-V-[failed]-V-[when]-V-[approaching]-V-[the]-V---[runaway.]----^
^-[the]-^ ^---[landed safely]---^---[the aircraft]---^---[after gear]---^---[failed when]---^---[approaching the]---^-[runaway]-V-[.]-^
^---[pilot landed]---^---[safely the]---^---[aircraft after]---^---[gear failed]---^---[when approaching]---^ ^---[runaway .]---^
^---[The pilot]---^ ^-----[the runaway.]------^
^---[the pilot]---^ ^---[the runaway]---^
^-------------------------------------------------------------------[{BOW}]-------------------------------------------------------------------^

V---------------------------[The pilot landed safely the aircraft after gear failed when approaching the runaway.]----------------------------V
^-[The]-V-[pilot]-V-[landed]-V-[safely]-V-[the]-V-[aircraft]-V-[after]-V-[gear]-V-[failed]-V-[when]-V-[approaching]-V-[the]-V---[runaway.]----^
^-[the]-^ ^---[landed safely]---^---[the aircraft]---^---[after gear]---^---[failed when]---^---[approaching the]---^-[runaway]-V-[.]-^
^---[pilot landed]---^---[safely the]---^---[aircraft after]---^---[gear failed]---^---[when approaching]---^ ^---[runaway .]---^
^---[The pilot]---^ ^-----[the runaway.]------^
^---[the pilot]---^ ^---[the runaway]---^
^-----------------------------------------------------------------[{TF_IDF}]------------------------------------------------------------------^
No vertices are created in this stage
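The following Python sketch illustrates the general idea behind a bag-of-words vector built over word n-grams, as in the example above. It is not the stage's actual implementation: the function names, the sample vocabulary, and the reading of "min" and "max" as the smallest and largest n-gram size are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, min_n=1, max_n=3):
    """Yield every contiguous n-gram (joined by spaces) with min_n <= n <= max_n."""
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def bow_vector(tokens, vocabulary, min_n=1, max_n=3):
    """Count how often each vocabulary entry occurs among the n-grams.

    The result has one position per vocabulary entry, in vocabulary order.
    """
    known = set(vocabulary)
    counts = Counter(g for g in ngrams(tokens, min_n, max_n) if g in known)
    return [counts[term] for term in vocabulary]


# Hypothetical vocabulary; a real one would be loaded from the configured resource.
vocabulary = ["pilot", "aircraft", "gear failed", "the"]
tokens = "the pilot landed safely the aircraft after gear failed".split()

vector = bow_vector(tokens, vocabulary)
print(vector)
```

Each position in the resulting list corresponds to one vocabulary entry, so vectors from different sentences can be compared as long as they share the same vocabulary.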
Description of resource.
"count" : 15,
"docsPerTerm" : 15,
"datasetId" : "f92e1394-5f52-3331-aa6a-9c510ad31da5",
"tokenCount" : 1,
"docCount" : 204021,
"word" : "depict"
word ( type=string | required ) - word of the vocabulary
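As a rough illustration of how the fields of a vocabulary entry could feed a TF-IDF weight, the sketch below applies the textbook formula tf * ln(docCount / docsPerTerm) to the sample entry above. The source does not say which TF-IDF variant the stage uses, so the formula, the function names, and the interpretation of "docsPerTerm" as the number of documents containing the word are assumptions.

```python
import json
import math

# Sample vocabulary entry, copied from the resource example above.
entry = json.loads("""
{
  "count": 15,
  "docsPerTerm": 15,
  "datasetId": "f92e1394-5f52-3331-aa6a-9c510ad31da5",
  "tokenCount": 1,
  "docCount": 204021,
  "word": "depict"
}
""")


def idf(entry):
    """Inverse document frequency, assuming the plain ln(N / df) variant."""
    return math.log(entry["docCount"] / entry["docsPerTerm"])


def tf_idf(term_frequency, entry):
    """Weight of a term that occurs term_frequency times in the current text."""
    return term_frequency * idf(entry)


# Weight for a text in which "depict" appears twice.
weight = tf_idf(2, entry)
print(entry["word"], weight)
```

Under this formula, a word that appears in few documents of the dataset, like the sample entry, receives a high weight, while a word present in nearly every document would score close to zero.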