...

Step 8. Configure Saga Parser on Aspire

Config Path: Location of the config.json downloaded earlier

Create Python Bridge per engine: Option to create and start a separate Python bridge for EACH SAGA engine in use.

Python Bridge path: Folder path of the Python bridge you want to spawn. The folder MUST already contain the venv, with all the requirements installed.
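Before pointing Aspire at a bridge folder, it can help to confirm that the folder really contains a virtual environment. The sketch below is only an illustration: the helper name `find_venv` is hypothetical, and it assumes the environment was created with Python's standard `venv` module, which writes a `pyvenv.cfg` file at the environment root. It does not check that the requirements are installed.

```python
from pathlib import Path
from typing import Optional


def find_venv(bridge_dir: Path) -> Optional[Path]:
    """Return the first subfolder of bridge_dir that looks like a venv, or None.

    Hypothetical helper: it only looks for the pyvenv.cfg marker file and
    does NOT verify that the bridge requirements are installed.
    """
    if not bridge_dir.is_dir():
        return None
    for child in sorted(bridge_dir.iterdir()):
        # The standard venv module writes pyvenv.cfg at the environment root.
        if child.is_dir() and (child / "pyvenv.cfg").is_file():
            return child
    return None
```

A result of `None` suggests the venv still has to be created in that folder before the bridge can be spawned.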

Match Type: Type of SAGA output match (Match Extraction or Analytics).

Match Extraction: This response type returns an array with all the Semantic Tag matches.

Analytics: This response type returns an array with all non-token matches.

Process fields: Path of the content you want to process inside the AspireObject.

Engine Pool Size: Number of SAGA engines.

Create Engines Beforehand: Create the engines BEFORE crawling instead of at the time of the actual crawls.
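The difference between creating engines beforehand and creating them at crawl time can be pictured as eager versus lazy pool initialization. The following is a generic sketch of that idea, not the Saga Parser's actual implementation: the class `EnginePool`, its parameters, and the `factory` callable are all hypothetical names used for illustration.

```python
import queue
import threading


class EnginePool:
    """Generic fixed-size pool; all names here are illustrative assumptions."""

    def __init__(self, size, factory, create_beforehand):
        self._size = size
        self._factory = factory          # callable that builds one engine
        self._idle = queue.Queue()
        self._created = 0
        self._lock = threading.Lock()
        if create_beforehand:
            # Eager: pay the full start-up cost once, before any crawl.
            for _ in range(size):
                self._idle.put(factory())
            self._created = size

    def acquire(self):
        with self._lock:
            # Lazy: build a new engine only when none is idle and the
            # pool has not yet reached its configured size.
            if self._idle.empty() and self._created < self._size:
                self._created += 1
                return self._factory()
        # Otherwise wait until an engine is returned to the pool.
        return self._idle.get()

    def release(self, engine):
        self._idle.put(engine)
```

With eager creation the first documents are not delayed by engine start-up; with lazy creation, start-up is faster but the first requests absorb the cost.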

Tags/Processors: Select if you want to use SAGA tags or a specific Processor (pipeline stage).

Tags: List of SAGA tags you want to process. At least ONE tag is required.

Use Exact Tags: Match tags by their exact names. If you use container tags, you probably want to disable this.

Processor: The specific processor from a pipeline that you want to run.

Include Flags: Names of the flags you want to use. The default is SEMANTIC_TAG, and this option cannot be empty.

Exclude Flags: Flags you want to skip and not add to the final output.
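One way to picture how the two flag lists interact is as a filter over the matches. The sketch below is an assumption about the semantics, not a description of the parser's internals: it keeps a match when it carries at least one included flag and none of the excluded ones. The function `filter_matches`, the dictionary shape of a match, and the flag name `HYPOTHETICAL_FLAG` are invented for illustration; only `SEMANTIC_TAG` comes from the settings above.

```python
from typing import Dict, Iterable, List, Set


def filter_matches(
    matches: Iterable[Dict],
    include_flags: Set[str],
    exclude_flags: Set[str] = frozenset(),
) -> List[Dict]:
    """Keep matches with an included flag and no excluded flag (assumed semantics)."""
    if not include_flags:
        # Mirrors the rule that Include Flags cannot be empty.
        raise ValueError("include_flags cannot be empty")
    kept = []
    for match in matches:
        flags = set(match.get("flags", []))
        if flags & include_flags and not flags & exclude_flags:
            kept.append(match)
    return kept


# Hypothetical match records; the real output format may differ.
sample = [
    {"text": "alpha", "flags": ["SEMANTIC_TAG"]},
    {"text": "beta", "flags": ["SEMANTIC_TAG", "HYPOTHETICAL_FLAG"]},
    {"text": "gamma", "flags": ["HYPOTHETICAL_FLAG"]},
]
kept = filter_matches(sample, {"SEMANTIC_TAG"}, {"HYPOTHETICAL_FLAG"})
```

Under these assumptions only the "alpha" record survives: "beta" carries an excluded flag and "gamma" carries no included flag.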

Cache Results: Enabling this caches the most frequently used results to improve performance.

Debug: Enable debug log messages.
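The constraints stated in the field descriptions above can be summarized as a small pre-flight check. This is a sketch only: the class `SagaParserSettings` and its attribute names are hypothetical and simply mirror the form labels, they are not the component's real configuration keys. The rule that the pool size must be at least 1 is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MATCH_TYPES = ("Match Extraction", "Analytics")


@dataclass
class SagaParserSettings:
    """Hypothetical mirror of the form fields, for validation only."""

    config_path: str
    process_fields: str
    match_type: str = "Match Extraction"
    engine_pool_size: int = 1
    use_tags: bool = True  # True: Tags mode, False: Processor mode
    tags: List[str] = field(default_factory=list)
    processor: str = ""
    include_flags: List[str] = field(default_factory=lambda: ["SEMANTIC_TAG"])
    exclude_flags: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none found."""
        found = []
        if self.match_type not in MATCH_TYPES:
            found.append("Match Type must be Match Extraction or Analytics")
        if self.engine_pool_size < 1:
            # Assumption: a pool needs at least one engine.
            found.append("Engine Pool Size must be at least 1")
        if self.use_tags and not self.tags:
            found.append("Tags needs at least one tag")
        if not self.use_tags and not self.processor:
            found.append("Processor must be set in Processor mode")
        if not self.include_flags:
            found.append("Include Flags cannot be empty")
        return found
```

Calling `problems()` before saving gives a quick list of fields to revisit; it does not replace the validation Aspire itself performs.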

Step 9. Save the configuration

...
