The Sentence Splitter stage splits text into sentences using the GAIA API. As part of a pipeline, it can split sentences either with a regex pattern or with the NLTK library. The regex and NLP options are mutually exclusive: if use_nlp is enabled, the regex is ignored.


The use_nlp option makes the stage use the NLTK library, which splits sentences more precisely at a performance cost.

This option requires the NLTK Python library to be installed; otherwise the stage fails with an ImportError.

Install the library with pip beforehand (for example, pip install nltk) so that it is ready to use when the stage runs.
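As a rough sketch of guarding against that ImportError, the snippet below checks whether NLTK can be imported before deciding on a value for use_nlp. The helper name nltk_available is hypothetical and not part of the GAIA API; it relies only on Python's standard importlib module.

```python
import importlib.util


def nltk_available() -> bool:
    """Return True if the nltk package can be found (hypothetical helper)."""
    return importlib.util.find_spec("nltk") is not None


# Fall back to the regex splitter when NLTK is not installed.
use_nlp = nltk_available()
print("use_nlp =", use_nlp)
```

The resulting flag could then be passed as the use_nlp property when the stage is configured, so the pipeline falls back to the regex pattern instead of failing.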


Properties

Property  Description                                                     Default           Type     Required
type      Stage class name                                                -                 string   Yes
enable    Enable stage for execution                                      true              boolean  No
name      Name for this specific stage                                    -                 string   No
use_nlp   Tells the stage to use NLTK instead of a regex pattern          False             boolean  No
regex     The regex pattern used to split the sentences                   [//.|//!|//?]\s+  string   No
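To illustrate what the default regex matches, the sketch below applies it with Python's standard re module. This is only an approximation of the behavior, not the stage's actual implementation: the helper split_with_regex is hypothetical, and the page does not say whether the stage keeps or drops the matched punctuation (re.split drops it).

```python
import re
from typing import List

# Default pattern copied from the Properties table.
DEFAULT_PATTERN = r"[//.|//!|//?]\s+"


def split_with_regex(text: str, pattern: str = DEFAULT_PATTERN) -> List[str]:
    """Split text wherever the pattern matches and drop empty fragments.

    Hypothetical helper for illustration; not part of the GAIA API.
    """
    return [part for part in re.split(pattern, text) if part]


sentences = split_with_regex("Hello world. How are you? Fine!")
print(sentences)  # ['Hello world', 'How are you', 'Fine!']

# Abbreviations show the limits of a purely pattern-based split.
print(split_with_regex("Dr. Smith arrived. He sat."))  # ['Dr', 'Smith arrived', 'He sat.']
```

The second call splits after "Dr.", which is the kind of case where a trained tokenizer such as NLTK's is intended to be more precise, at the performance cost noted above.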

Example Configuration

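_split_sentence = SentenceSplitterStage(
    use_nlp=False,                # split with the regex pattern rather than NLTK
    regex=r"[//.|//!|//?]\s+",    # raw string so \s reaches the regex engine unchanged
    enable=True,
    name='split_sentence_stage',
)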