...
Operates On: Lexical Items with TOKEN
...
Generic Configuration Parameters
...
- The tokens to process must lie between two vertices marked with these flags (e.g. ["TEXT_BLOCK_SPLIT"]).
...
- Tokens marked with these flags are ignored by this stage; no processing is performed on them.
...
- Tokens must have all of the specified flags in order to be processed.
...
- Tokens must have at least one of the flags specified in this array.
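The three token-level filters above can be sketched as a single predicate. This is only an illustration of how such rules might combine, not the stage's actual implementation: the function and parameter names (`should_process`, `skip_flags`, `required_flags`, `at_least_one_flag`) are hypothetical, and the sketch assumes that a single matching skip flag is enough to exclude a token. The boundary check between two flagged vertices is not modeled.

```python
from typing import Iterable, Set


def should_process(token_flags: Set[str],
                   skip_flags: Iterable[str] = (),
                   required_flags: Iterable[str] = (),
                   at_least_one_flag: Iterable[str] = ()) -> bool:
    """Return True if a token with `token_flags` passes all three filters.

    All names here are hypothetical; an empty filter is treated as "no constraint".
    """
    skip = set(skip_flags)
    required = set(required_flags)
    any_of = set(at_least_one_flag)

    # Assumption: a token carrying any one of the skip flags is ignored.
    if token_flags & skip:
        return False
    # The token must carry every required flag.
    if not required <= token_flags:
        return False
    # If an "at least one" list is given, the token must carry one of its flags.
    if any_of and not (token_flags & any_of):
        return False
    return True
```

For instance, under these assumptions a token flagged `{"A", "SKIP"}` is rejected when `skip_flags=["SKIP"]`, while a token flagged `{"A"}` is accepted when `at_least_one_flag=["A", "C"]`.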
Configuration Parameters
- splitChars (string, optional) - List of characters on which tokens are split.
  - If not present, tokens are split on any sequence of punctuation.
- dontSplitChars (string, optional) - List of characters which will NOT be used to split tokens.
  - This is typically used to specify exceptions (punctuation characters that should not split tokens) when splitChars is missing.
  - These characters are included in the produced tokens.
- splitFlag (string, optional) - The flag to set on the vertex between the two resulting tokens.
  - If missing, defaults to ALL_PUNCTUATION.
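The interaction between splitChars and dontSplitChars can be illustrated with a minimal Python sketch. It is not the stage's implementation: `split_token` and its parameter names are hypothetical, "punctuation" is assumed to mean ASCII punctuation, the splitting characters are assumed to be dropped from the output, and dontSplitChars is assumed to apply even when splitChars is given. The flag placed on the vertex between tokens (splitFlag) is not modeled.

```python
import string
from typing import List, Optional


def split_token(text: str,
                split_chars: Optional[str] = None,
                dont_split_chars: str = "") -> List[str]:
    """Split `text` on runs of separator characters (hypothetical helper)."""
    if split_chars is None:
        # Assumption: with no splitChars, any ASCII punctuation is a separator.
        candidates = set(string.punctuation)
    else:
        candidates = set(split_chars)
    # Characters listed in dont_split_chars never split and stay in the tokens.
    separators = candidates - set(dont_split_chars)

    pieces: List[str] = []
    current: List[str] = []
    for ch in text:
        if ch in separators:
            # A run of separators ends the current piece once; empty pieces are skipped.
            if current:
                pieces.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        pieces.append("".join(current))
    return pieces


print(split_token("state-of-the-art"))                        # ['state', 'of', 'the', 'art']
print(split_token("state-of-the-art", dont_split_chars="-"))  # ['state-of-the-art']
print(split_token("a/b-c", split_chars="/"))                  # ['a', 'b-c']
```

The second call shows the typical use of dontSplitChars described above: the default punctuation rule stays in effect, but the hyphen is exempted and remains part of the token.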
...