...

Step 8. Configure Saga Parser on Aspire



      Config Path: Location of the config.json downloaded earlier

      Create Python Bridge per engine: Option to create and start a separate Python bridge for each SAGA engine used.

                Python Bridge path: Folder path of the Python bridge you want to spawn (its venv MUST already be created, with all the requirements installed).

      Match Type: Type of SAGA output match (Match Extraction or Analytics).

                Match Extraction: This response type returns an array with all the Semantic Tag matches.

                Analytics: This response type returns an array with any non-Token matches.

      Process fields: Path of the content you want to process inside the AspireObject.

      Engine Pool Size: Number of SAGA engines.

      Create Engines Beforehand: Create the engines BEFORE crawling instead of at crawl time.

      Tags/Processors: Select if you want to use SAGA tags or a specific Processor (pipeline stage).

              Tags: List of SAGA tags you want to process. It needs to have at least ONE tag.

              Use Exact Tags: Enable this to match tags by their exact names (if you use container tags, you probably want to disable this).

              Processor: Specific processor you want to process from a pipeline.

      Include Flags: Names of the flags you want to use. The default is SEMANTIC_TAG, and this option cannot be empty.

      Exclude Flags: Flags you want to skip and not add to the final output.

      Cache Results: Enabling this will cache the most used results to improve performance.

      Debug: Enable debug log messages.
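
The constraints stated in the option list above (Match Type is one of two values, Tags needs at least one tag, Include Flags cannot be empty and defaults to SEMANTIC_TAG) can be sketched as a small pre-flight check. This is only an illustration: the class name, field names, and the validate function are hypothetical and do not reflect the actual Aspire or SAGA configuration schema.

```python
from dataclasses import dataclass, field
from typing import List

# The two response types named in the Match Type option.
MATCH_TYPES = ("Match Extraction", "Analytics")


@dataclass
class SagaParserSettings:
    # Hypothetical field names mirroring the form labels above;
    # this is NOT the real Aspire component schema.
    config_path: str                      # location of the downloaded config.json
    match_type: str                       # "Match Extraction" or "Analytics"
    use_tags: bool = True                 # True: Tags mode, False: Processor mode
    tags: List[str] = field(default_factory=list)
    processor: str = ""
    include_flags: List[str] = field(default_factory=lambda: ["SEMANTIC_TAG"])
    exclude_flags: List[str] = field(default_factory=list)


def validate(settings: SagaParserSettings) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if settings.match_type not in MATCH_TYPES:
        problems.append("Match Type must be one of: " + ", ".join(MATCH_TYPES))
    if settings.use_tags and not settings.tags:
        problems.append("Tags needs at least one tag")
    if not settings.include_flags:
        problems.append("Include Flags cannot be empty")
    return problems


if __name__ == "__main__":
    # Example values are made up for illustration.
    example = SagaParserSettings(
        config_path="config.json",
        match_type="Match Extraction",
        tags=["person"],
    )
    print(validate(example))
```

Running a check like this before saving the configuration catches an empty tag list or an empty Include Flags field early, rather than at crawl time.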


Step 9. Save the configuration

...