...

  • TOKEN - Identifies that the Lex-Items produced by this stage are tokens and not text blocks.
  • ORIGINAL - Identifies that the Lex-Items produced by this stage keep an original, as-written representation of every token (e.g. before normalization).

Vertex Flags:

  • ALL_WHITESPACE - Identifies that the vertex contains all white-space.
  • TEXT_BLOCK_SPLIT - Identifies the vertex as a split between text blocks.
  • OVERFLOW_SPLIT - Identifies that an entire buffer was read without finding a split between text blocks.
      • The current maximum size of a text block is 64K characters.
      • Text blocks larger than this will be arbitrarily split, and the vertex will be marked with "OVERFLOW_SPLIT".
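
The overflow behaviour can be illustrated with a minimal Python sketch. It is not the actual implementation. It assumes that "64K" means 65,536 characters and that a blank line separates text blocks. The names `VertexFlag`, `split_text_blocks`, and `MAX_BLOCK_CHARS` are hypothetical and exist only for this example. Each returned flag describes the vertex that follows the block.

```python
from enum import Flag, auto

# Assumption: "64K characters" is taken to mean 64 * 1024 = 65,536.
MAX_BLOCK_CHARS = 64 * 1024


class VertexFlag(Flag):
    """Hypothetical stand-in for the vertex flags described above."""
    NONE = 0
    TEXT_BLOCK_SPLIT = auto()
    OVERFLOW_SPLIT = auto()


def split_text_blocks(text, delimiter="\n\n", max_chars=MAX_BLOCK_CHARS):
    """Split text into (block, flag) pairs.

    The flag describes the vertex after the block: a real split,
    an arbitrary overflow split, or NONE at the end of the input.
    """
    blocks = []
    pos = 0
    while pos < len(text):
        # Look for a delimiter that starts no later than max_chars past pos.
        cut = text.find(delimiter, pos, pos + max_chars + len(delimiter))
        if cut != -1:
            blocks.append((text[pos:cut], VertexFlag.TEXT_BLOCK_SPLIT))
            pos = cut + len(delimiter)
        elif len(text) - pos <= max_chars:
            # The remainder fits in one block and has no further split.
            blocks.append((text[pos:], VertexFlag.NONE))
            break
        else:
            # A full buffer was read without finding a split.
            blocks.append((text[pos:pos + max_chars], VertexFlag.OVERFLOW_SPLIT))
            pos += max_chars
    return blocks
```

With a small `max_chars` the effect is easy to see: `split_text_blocks("ab\n\ncdefgh", max_chars=4)` yields the block `"ab"` followed by a normal split, then `"cdef"` followed by an overflow split, then the final block `"gh"`.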