This stage identifies tokens that look like phone numbers and flags them as "PHONE".
Operates On: Lexical Items with TOKEN or SEMANTIC_TAG and possibly other flags as specified below.
Generic Configuration Parameters
-
boundaryFlags ( type=string array | optional )
- List of vertex flags that indicate the beginning and end of a text block.
Tokens to process must be between two vertices marked with these flags (e.g., ["TEXT_BLOCK_SPLIT"]).
-
skipFlags ( type=string array | optional )
- Flags to be skipped by this stage.
Tokens marked with these flags are ignored by this stage, and no processing is performed on them.
-
requiredFlags ( type=string array | optional )
- Lex-item flags required on every token to be processed.
A token must have all of the specified flags in order to be processed.
-
atLeastOneFlag ( type=string array | optional )
- Lex-item flags of which every token to be processed needs at least one.
A token must have at least one of the flags specified in this array.
-
confidenceAdjustment ( type=double | default=1 | required )
- Adjustment factor, from 0.0 to 2.0, to apply to the confidence value (applies to every pattern match).
- 0.0 to < 1.0 decreases the confidence value
- 1.0 leaves the confidence value unchanged
- > 1.0 to 2.0 increases the confidence value
-
debug ( type=boolean | default=false | optional )
- Enables all debug log functionality for the stage, if any.
-
enable ( type=boolean | default=true | optional )
- Indicates whether the current stage should be considered by the Pipeline Manager.
- Only applies to automatic pipeline building.
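The flag-based parameters above can be read as a gate that decides whether a token is processed at all, and confidenceAdjustment as a factor applied to each match. The following is a minimal Python sketch of that reading, not the stage's actual implementation. The function names `should_process` and `adjust_confidence` are hypothetical, boundaryFlags is not modeled, and treating the adjustment as a plain multiplication is an assumption based on the ranges listed above.

```python
from typing import Iterable, Set


def should_process(
    token_flags: Set[str],
    skip_flags: Iterable[str] = (),
    required_flags: Iterable[str] = (),
    at_least_one_flag: Iterable[str] = (),
) -> bool:
    """Hypothetical gate combining skipFlags, requiredFlags and atLeastOneFlag."""
    skip = set(skip_flags)
    required = set(required_flags)
    any_of = set(at_least_one_flag)

    # skipFlags: a token carrying a skipped flag is ignored.
    if token_flags & skip:
        return False
    # requiredFlags: the token must carry all of the listed flags.
    if not required <= token_flags:
        return False
    # atLeastOneFlag: if the list is non-empty, one of its flags must be present.
    if any_of and not (token_flags & any_of):
        return False
    return True


def adjust_confidence(confidence: float, confidence_adjustment: float = 1.0) -> float:
    """Assumed behavior: scale the confidence by a factor between 0.0 and 2.0."""
    if not 0.0 <= confidence_adjustment <= 2.0:
        raise ValueError("confidenceAdjustment must be between 0.0 and 2.0")
    return confidence * confidence_adjustment
```

Under these assumptions, a token flagged only TOKEN is rejected when requiredFlags is ["TOKEN", "NUMBER"], and a factor of 0.5 halves the confidence of every match.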
Configuration Parameters
"enableUS": false
"enableUK": false
"enableCR": false
"filterAreaCode": false
"filterPhoneLength": false
"phoneLength": 10
-
enableUS ( type=boolean | default=unchecked | optional )
- Enable validation logic for US phone numbers.
-
enableUK ( type=boolean | default=unchecked | optional )
- Enable validation logic for UK phone numbers.
-
enableCR ( type=boolean | default=unchecked | optional )
- Enable validation logic for Costa Rican phone numbers.
-
filterAreaCode ( type=boolean | default=unchecked | optional )
- Filter phone numbers by area code.
-
filterPhoneLength ( type=boolean | default=unchecked | optional )
- Enable length checking.
-
phoneLength ( type=integer | default=10 | optional )
- Phone number length to check against when length checking is enabled.
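As an illustration of how filterPhoneLength and phoneLength might interact, here is a short Python sketch. It is not the stage's real logic: `DEFAULTS`, `resolve_config` and `passes_length_filter` are hypothetical names, "unchecked" is read as false (matching the sample configuration), and counting only digits, including any country code, is an assumption.

```python
# Defaults taken from the sample configuration above; "unchecked" is read as False.
DEFAULTS = {
    "enableUS": False,
    "enableUK": False,
    "enableCR": False,
    "filterAreaCode": False,
    "filterPhoneLength": False,
    "phoneLength": 10,
}


def resolve_config(overrides=None):
    """Merge user-supplied values over the defaults, rejecting unknown keys."""
    config = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown parameter: {key}")
        config[key] = value
    return config


def passes_length_filter(candidate, config):
    """Assumed check: the digit count must equal phoneLength when filtering is on."""
    if not config["filterPhoneLength"]:
        return True
    digits = sum(ch.isdigit() for ch in candidate)
    return digits == config["phoneLength"]
```

With these assumptions and filterPhoneLength enabled, "800-555-5555" (10 digits) passes while "1-800-555-5555" (11 digits) does not. Whether the real stage counts a leading country code is not stated in this document.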
Example Output
V------------------------[please call 1-800-555-5555 thank you]------------------------V
^-[please]-V-[call]-V----------------[1-800-555-5555]----------------V-[thank]-V-[you]-^
                    ^-[1]-V-[-]-V-[800]-V-[-]-V-[555]-V-[-]-V-[5555]-^
                    ^-------------------[{phone}]--------------------^
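The graph above can be approximated in a few lines of Python to show its three layers: whitespace tokens, the sub-tokens of the phone candidate, and the {phone} tag that spans it. This is a simplified sketch under stated assumptions. The `PHONE_LIKE` pattern and both function names are hypothetical and are not the patterns the stage actually uses, and pairing SEMANTIC_TAG with PHONE on the tag is inferred from the Output Flags list.

```python
import re

# Hypothetical pattern: optional 1-3 digit prefix, then a 3-3-4 digit grouping.
PHONE_LIKE = re.compile(r"^(?:\d{1,3}[-. ])?\d{3}[-. ]\d{3}[-. ]\d{4}$")


def split_token(token):
    """Split a token into digit runs and single punctuation characters."""
    return re.findall(r"\d+|[^\d\s]", token)


def tag_tokens(text):
    """Return (label, flags, sub_tokens) for each whitespace-separated token."""
    result = []
    for token in text.split():
        if PHONE_LIKE.match(token):
            # The tag spans the whole candidate, as in the last row of the graph.
            result.append(("{phone}", {"SEMANTIC_TAG", "PHONE"}, split_token(token)))
        else:
            result.append((token, {"TOKEN"}, []))
    return result


for label, flags, parts in tag_tokens("please call 1-800-555-5555 thank you"):
    print(label, sorted(flags), parts)
```

For the sample sentence, the third entry carries the {phone} label and the sub-tokens 1, -, 800, -, 555, -, 5555, mirroring the middle rows of the graph.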
Output Flags
Lex-Item Flags:
- NUMBER - Flagged on all tokens that are numbers.
- SEMANTIC_TAG - Identifies all lexical items which are semantic tags.
- PHONE - Identifies the token as a phone number.
Vertex Flags: