Identifies accented characters in tokens and, for each such token, creates a new token without the accents, replacing every accented character with its most similar unaccented letter.

Operates On: Lexical Items with TOKEN and possibly other flags as specified below.

Saga_is_recognizer

Recognizer	false

Note

Currently, the only Unicode scripts supported are:

Latin
Greek

Tokens in any other Unicode script remain untouched.
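
The replacement rule described above can be illustrated with a short Python sketch. It is not the stage's actual implementation: it assumes that "most similar letter" corresponds to the base letter of the character's canonical (NFD) decomposition, and it approximates the script check by looking at the Unicode character name, since Python's standard library does not expose the script property. The names `strip_accents` and `SUPPORTED_SCRIPTS` are hypothetical.

```python
import unicodedata

# Assumption: a character belongs to a supported script if its Unicode
# name starts with one of these prefixes (an approximation of the
# Unicode script property).
SUPPORTED_SCRIPTS = ("LATIN", "GREEK")


def strip_accents(token: str) -> str:
    """Return the token with accents removed from Latin and Greek letters.

    Assumes precomposed (NFC) input; standalone combining marks are
    left as they are.
    """
    out = []
    for ch in token:
        name = unicodedata.name(ch, "")
        if not name.startswith(SUPPORTED_SCRIPTS):
            # Other scripts, digits, punctuation: keep untouched.
            out.append(ch)
            continue
        # Split the character into base letter + combining marks,
        # then keep only the non-combining part.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


print(strip_accents("ÀÁÂÃÄÅ"))  # AAAAAA
print(strip_accents("ÌÍÎÑÏ"))   # IIINI
print(strip_accents("ÛÜÝÞß!"))  # UUYÞß!
```

Letters without a canonical decomposition, such as Þ, ß, Æ, Œ, Ð and Ø, pass through unchanged under this assumption, which agrees with the example output shown on this page.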

Include Page

	Generic Configuration Parameters

Configuration Parameters

Saga_config_stage

boundaryFlags	text block split
requiredFlags	token

Example Output

Saga_graph

V-----------[ÀÁÂÃÄÅ ÈÉÊË ÌÍÎÑÏ ÒÓÔÕÖ ÙÚÛÜ ÝŸ]-----------V 
^-[ÀÁÂÃÄÅ]-V-[ÈÉÊË]-V-[ÌÍÎÑÏ]-V-[ÒÓÔÕÖ]-V-[ÙÚÛÜ]-V-[ÝŸ]-^ 
^-[àáâãäå]-^-[èéêë]-^-[ìíîñï]-^-[òóôõö]-^-[ùúûü]-^-[ýÿ]-^ 
^-[AAAAAA]-^-[EEEE]-^-[IIINI]-^-[OOOOO]-^-[UUUU]-^-[YY]-^ 
^-[aaaaaa]-^-[eeee]-^-[iiini]-^-[ooooo]-^-[uuuu]-^-[yy]-^ 


V---------[ÏÐÑÒ ÓÔÕÖ × ØÙÑÚ ÛÜÝÞß!]---------V 
^-[ÏÐÑÒ]-V-[ÓÔÕÖ]-V-[×]-V-[ØÙÑÚ]-V-[ÛÜÝÞß!]-^ 
^-[ïðñò]-^-[óôõö]-^     ^-[øùñú]-^-[ûüýþß!]-^ 
^-[IÐNO]-^-[OOOO]-^     ^-[ØUNU]-^-[UUYÞß!]-^ 
^-[iðno]-^-[oooo]-^     ^-[øunu]-^-[uuyþß!]-^ 


V-------------[Ç Š Ž Œ Æ Þ Ð]-------------V 
^-[Ç]-V-[Š]-V-[Ž]-V-[Œ]-V-[Æ]-V-[Þ]-V-[Ð]-^ 
^-[ç]-^-[š]-^-[ž]-^-[œ]-^-[æ]-^-[þ]-^-[ð]-^ 
^-[C]-^-[S]-^-[Z]-^ 
^-[c]-^-[s]-^-[z]-^ 

V----------[Star, Inc., Lighting the Way...]----------V 
^-[Star,]-V-[Inc.,]-V-[Lighting]-V-[the]-V--[Way...]--^ 
^-[star,]-^-[inc.,]-^-[lighting]-^       ^--[way...]--^ 
                                         ^-[Way...]-^ 
                                         ^-[way...]-^

Output Flags

Lex-Item Flags:

HAS_ACCENT - Marks every lexical item that contains an accent and belongs to a supported Unicode script.
ACCENT_STRIPPED - All tokens created without the accents carry this flag.
TOKEN - All tokens produced are tagged as TOKEN.
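
How these flags could be assigned is sketched below in Python. This is an illustration under stated assumptions, not the Saga API: the `LexItem` class, the `process` function and the `_strip` helper are hypothetical, flags are modeled as plain strings, and items lacking the required TOKEN flag are assumed to be skipped.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class LexItem:
    """Hypothetical stand-in for a lexical item: text plus a set of flags."""
    text: str
    flags: Set[str] = field(default_factory=set)


def _strip(text: str) -> str:
    """Drop combining marks from Latin and Greek characters only."""
    out = []
    for ch in text:
        if unicodedata.name(ch, "").startswith(("LATIN", "GREEK")):
            ch = "".join(
                c for c in unicodedata.normalize("NFD", ch)
                if not unicodedata.combining(c)
            )
        out.append(ch)
    return "".join(out)


def process(item: LexItem) -> List[LexItem]:
    """Return the input item plus, when it has accents, a stripped copy."""
    if "TOKEN" not in item.flags:
        return [item]  # assumed: only TOKEN items are processed
    stripped = _strip(item.text)
    if stripped == item.text:
        return [item]  # nothing to strip, no new token
    item.flags.add("HAS_ACCENT")
    return [item, LexItem(stripped, {"TOKEN", "ACCENT_STRIPPED"})]


for result in process(LexItem("Çà", {"TOKEN"})):
    print(result.text, sorted(result.flags))
```

In this sketch the original item is kept alongside the new one, so both the accented and the unaccented form stay available to later stages.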

Vertex Flags:

Info
No vertices are created in this stage.
