Configuration
- Split Flag - The flag to be put on the vertex between the two tokens.
- If missing, defaults to ALL_PUNCTUATION.
Split as Vertex
- Split Characters - List of characters which should be used to split tokens.
- A new vertex will be created covering the split characters.
- Don't Split Characters - List of characters which will NOT be used to split tokens.
- Typically used to list exceptions (characters that must not split tokens) when Split Characters is missing.
- These characters are included in the produced tokens.
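
The Split as Vertex behaviour can be illustrated with a minimal Python sketch. It is not the component's actual implementation: the function name `split_as_vertex`, the "token"/"vertex" labels, the grouping of adjacent split characters into one vertex, and the fallback to ASCII punctuation when Split Characters is missing are all assumptions made for illustration.

```python
import string


def split_as_vertex(text, split_chars=None, dont_split_chars=""):
    """Return (piece, kind) pairs, where kind is "token" or "vertex"."""
    if split_chars is None:
        # Assumption: with no Split Characters, every ASCII punctuation mark splits.
        split_chars = string.punctuation
    # Don't Split Characters are exceptions: they stay inside the tokens.
    active = set(split_chars) - set(dont_split_chars)

    pieces = []
    for ch in text:
        kind = "vertex" if ch in active else "token"
        if pieces and pieces[-1][1] == kind:
            # Assumption: adjacent characters of the same kind share one piece.
            pieces[-1] = (pieces[-1][0] + ch, kind)
        else:
            pieces.append((ch, kind))
    return pieces


if __name__ == "__main__":
    print(split_as_vertex("state-of-the-art", split_chars="-"))
    print(split_as_vertex("e-mail,please", dont_split_chars="-"))
```

In the second call the hyphen is listed as an exception, so "e-mail" stays one token while the comma still becomes a vertex.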
Split In Between (Before/After) - Split occurs when the split characters are located in the middle of the token text.
- Split Before character - If any character in this list occurs inside a token, that token will be split just before that character.
- Split After character - If any character in this list occurs inside a token, that token will be split just after that character.
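
A minimal Python sketch of the before/after rules follows. The function name `split_in_between` and its parameters are hypothetical, and the sketch assumes "inside a token" means the character is neither the first nor the last character of the token text.

```python
def split_in_between(text, before_chars="", after_chars=""):
    """Split text before/after listed characters that occur inside it."""
    pieces = []
    start = 0
    last = len(text) - 1
    for i, ch in enumerate(text):
        # Assumption: only characters strictly inside the token trigger a split.
        interior = 0 < i < last
        if interior and ch in before_chars and start < i:
            # Split just before this character.
            pieces.append(text[start:i])
            start = i
        if interior and ch in after_chars:
            # Split just after this character.
            pieces.append(text[start:i + 1])
            start = i + 1
    if start < len(text):
        pieces.append(text[start:])
    return pieces


if __name__ == "__main__":
    print(split_in_between("US$100", before_chars="$"))   # ['US', '$100']
    print(split_in_between("100%off", after_chars="%"))   # ['100%', 'off']
    print(split_in_between("$100", before_chars="$"))     # ['$100']
```

A character listed in both sets ends up as a piece of its own, because the token is split on each side of it.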
Split As Prefix/Suffix - Split occurs if the split characters are located at the beginning (prefix) or the end (suffix) of the token text.
- Split Prefix Characters - List of split characters that appear at the beginning of the token.
- Split Suffix Characters - List of split characters that appear at the end of the token.
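
The prefix/suffix rules can be sketched in Python as below. This is an illustration only: the function name `split_prefix_suffix` is hypothetical, and the sketch assumes that each matching leading or trailing character is split off as its own piece and that at least one character is always left as the core token.

```python
def split_prefix_suffix(text, prefix_chars="", suffix_chars=""):
    """Peel listed characters off the start and end of a token."""
    if not text:
        return []
    prefixes = []
    suffixes = []
    start, end = 0, len(text)
    # Peel matching characters off the front, keeping at least one character.
    while end - start > 1 and text[start] in prefix_chars:
        prefixes.append(text[start])
        start += 1
    # Peel matching characters off the back, keeping at least one character.
    while end - start > 1 and text[end - 1] in suffix_chars:
        suffixes.append(text[end - 1])
        end -= 1
    suffixes.reverse()  # restore left-to-right order
    return prefixes + [text[start:end]] + suffixes


if __name__ == "__main__":
    print(split_prefix_suffix('"hello!"', prefix_chars='"(', suffix_chars='"!)'))
    print(split_prefix_suffix("well-known", suffix_chars="-"))
```

The second call leaves "well-known" intact, since the hyphen is in the middle of the token rather than at its end.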