
This stage maintains a list of tokens used to identify possible subjects of interest and to suggest a URL reference along with a "title" and a "description". The title and description fields are used as display data.


Operates On: Lexical Items with the TOKEN flag and possibly other flags as specified below.

Saga_is_recognizer

Note

This stage extends from the Dictionary Tagger Stage.

...

  • Parameter
    name: partialMatchPercent
    type: integer
    default: 50
    summary: Matches only a percentage of the words present in the pattern. Applies only if partial matching is active on the pattern.
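
To make the threshold concrete, here is a minimal sketch of how a percentage-based partial match could be evaluated. It is an illustration only, not the stage's actual implementation: the function name `partial_match`, the case-insensitive comparison, the inclusive threshold, and the choice to ignore word order are all assumptions.

```python
def partial_match(pattern_words, token_words, partial_match_percent=50):
    """Return True when enough of the pattern's words appear among the tokens.

    Hypothetical helper: the real stage may count, order, or round differently.
    """
    if not pattern_words:
        return False
    # Assumption: comparison is case-insensitive and ignores word order.
    tokens = {word.lower() for word in token_words}
    matched = sum(1 for word in pattern_words if word.lower() in tokens)
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return matched * 100 >= partial_match_percent * len(pattern_words)


pattern = ["enterprise", "search", "platform", "guide"]
print(partial_match(pattern, ["Enterprise", "search"]))  # 2 of 4 words matched
print(partial_match(pattern, ["enterprise"]))            # 1 of 4 words matched
```

With the default of 50, the first call meets the threshold (2 of 4 words) and the second does not (1 of 4 words).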
boundaryFlags: text block split
requiredFlags: token, semantic tag
all_lower_case
skipFlags: stop word
"partialMatchPercent": 50,
"dictionary": "saga-provider:saga_bestbets",
"dontProcessFlags": []

Example Output

Description

V-------[Welcome to Accenture.]--------V 
^-----[Welcome to Accenture]------V-[]-^ 
^-[Welcome]-V-[to]-V-[Accenture]--^      
^-[welcome]-^      ^-[accenture]--^      
                   ^-[{bestbets}]-^      

...

  • BESTBET - Identifies the token as a possible reference to a subject for which Saga has a link.
  • SEMANTIC_TAG - Identifies all lexical items which are semantic tags.
  • MISSPELL - Identifies tokens with errors or misspellings.

Vertex Flags:

Info

No vertices are created in this stage.

...

The pattern database is a series of JSON records, typically indexed by "pattern block ID".  Each JSON record represents a block of patterns (one or more) that all produce the same semantic tag.  The format is as follows:

Entity JSON Format

"usePartialMatch": true,
"patterns": "something1, something2, something3",
"description": "Description of the bestbets",
"tag": "search-bet",
"title": "the best bet title",
"url": "http://accenture.enterpricesearch.org",
"confAdjust": 1
. . . additional fields as needed go here . . . 
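
As an illustration of how a client might read one of these records, the sketch below parses a pattern block and separates the match patterns from the display data. It assumes the fields shown above are wrapped in a single JSON object and that "patterns" is a comma-separated string, as in the example. The helper name `load_pattern_block` is hypothetical and not part of Saga.

```python
import json

# Sample record built from the fields documented above (wrapped in braces
# here so it parses as a standalone JSON object).
RECORD_TEXT = """
{
  "usePartialMatch": true,
  "patterns": "something1, something2, something3",
  "description": "Description of the bestbets",
  "tag": "search-bet",
  "title": "the best bet title",
  "url": "http://accenture.enterpricesearch.org",
  "confAdjust": 1
}
"""


def load_pattern_block(text):
    """Hypothetical helper: return (patterns, display data) for one record."""
    record = json.loads(text)
    # Split the comma-separated pattern string and drop empty entries.
    patterns = [p.strip() for p in record["patterns"].split(",") if p.strip()]
    # Title, description, and URL are the fields used as display data.
    display = {key: record[key] for key in ("title", "description", "url")}
    return patterns, display


patterns, display = load_pattern_block(RECORD_TEXT)
print(patterns)
print(display["title"], "->", display["url"])
```

Keeping the display fields separate from the patterns mirrors the split described above: patterns drive matching, while the title, description, and URL are only shown to the user.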

...