Excerpt
Creates N-Grams from TOKEN flagged lexical items. The size of the N-Gram can be specified by minimum and maximum settings. It will also break the N-Grams on SPLIT_FLAGS.


Info

Uses NGram Stage

Configuration

  • Parameter
    name: Min N-Gram size
    summary: Minimum size, in tokens, of an N-Gram
    type: integer
    default: 2
    required: true
  • Parameter
    name: Max N-Gram size
    summary: Maximum size, in tokens, of an N-Gram
    type: integer
    default: 2
    required: true
  • Parameter
    name: Split Flags
    summary: Splits the N-Gram if the next token has any of these flags (i.e. if a token has any of these flags, building of the N-Gram stops)
    type: string array
    default: ALL_PUNCTUATION, ALL_DIGITS, HAS_PUNTUATION, HAS_DIGIT
  • Parameter
    name: Ignore On Boundary Flags
    summary: Specifies flags for tokens that will not be added to an N-Gram
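
To illustrate how the minimum size, maximum size, and split flags interact, here is a minimal Python sketch. It is not the stage's actual implementation: the function name build_ngrams, the (text, flags) token representation, and the output ordering are hypothetical, and it assumes that a token carrying a split flag ends the current run and is itself left out of every N-Gram. The Ignore On Boundary Flags parameter is not modeled here.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Hypothetical token representation: (text, set of flag names).
Token = Tuple[str, Set[str]]


def build_ngrams(
    tokens: Iterable[Token],
    min_size: int = 2,
    max_size: int = 2,
    split_flags: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Build N-Grams of min_size..max_size tokens, breaking on split flags."""
    ngrams: List[str] = []
    segment: List[str] = []  # current run of tokens with no split flag

    def flush() -> None:
        # Emit every N-Gram of each allowed size from the current run.
        for size in range(min_size, max_size + 1):
            for start in range(len(segment) - size + 1):
                ngrams.append(" ".join(segment[start:start + size]))
        segment.clear()

    for text, flags in tokens:
        if flags & set(split_flags):
            # Assumption: a flagged token stops N-Gram building and is dropped.
            flush()
        else:
            segment.append(text)
    flush()
    return ngrams


tokens = [
    ("new", set()),
    ("york", set()),
    ("city", set()),
    (",", {"ALL_PUNCTUATION"}),
    ("big", set()),
    ("apple", set()),
]
result = build_ngrams(tokens, min_size=2, max_size=3,
                      split_flags=frozenset({"ALL_PUNCTUATION"}))
print(result)
```

With these settings no N-Gram spans the comma, so "city big" is never produced, while "new york city" is produced because the maximum size is 3.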

General Settings

Include Page
Generic Processor Config