A key innovation of the Language Processing Toolkit is that the output of language processing is a graph of alternative representations.

(more, with examples)


Flags Only Describe the Lexical Item Itself

It may seem obvious, but flags describe the Lexical Item itself, and do not describe any items from which it was derived.


For example, if you have the following graph:

[v1]----President----[v2]


And then you apply the CaseAnalysis Stage to this graph, you will get:

[v1]----President----[v2]
 ^------president-----^


In this example, the original "President" token will have the TITLE_CASE flag, and the derived (normalized) "president" token will have the ALL_LOWER_CASE flag. There is no flag that says "I was derived from some other token which was TITLE_CASE".

Note that you can traverse the component links from the derived item ("president") to the original item ("President") to determine whether a token was originally TITLE_CASE.
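
The behavior described above can be illustrated with a minimal sketch. The names here (LexItem, case_analysis, derived_from_flag) are hypothetical stand-ins invented for this example and are not the toolkit's actual classes or methods; the sketch only assumes that each item carries its own set of flags plus links to the components it was derived from.

```python
from dataclasses import dataclass, field


@dataclass
class LexItem:
    """Hypothetical stand-in for a Lexical Item (not the toolkit's real class)."""
    text: str
    flags: set = field(default_factory=set)
    # Component links: the items this one was derived from (assumed structure).
    components: list = field(default_factory=list)


def case_analysis(item):
    """Toy version of a case-normalizing step.

    Returns a lower-cased alternative linked back to the original,
    or None if the text is already lower case.
    """
    lowered = item.text.lower()
    if lowered == item.text:
        return None
    # The derived item gets flags describing only itself.
    return LexItem(lowered, {"ALL_LOWER_CASE"}, [item])


def derived_from_flag(item, flag):
    """Follow component links to check whether any ancestor carries `flag`."""
    stack = list(item.components)
    while stack:
        current = stack.pop()
        if flag in current.flags:
            return True
        stack.extend(current.components)
    return False


original = LexItem("President", {"TITLE_CASE"})
normalized = case_analysis(original)

# The derived item does not inherit TITLE_CASE ...
print("TITLE_CASE" in normalized.flags)
# ... but the original's flag is reachable through the component links.
print(derived_from_flag(normalized, "TITLE_CASE"))
```

In this sketch the ancestry check is a separate traversal rather than an extra flag on the derived item, which mirrors the rule that flags describe only the item they are attached to.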


