How to Configure Saga Parser in Aspire 5

Step-by-step guide

Step 1. Download Saga Parser

Go to https://repository.searchtechnologies.com/artifactory/public/com/accenture/saga/binaries/1.3.2.1/
Download Saga_Aspire.zip
Extract the files from the archive

Step 2. Copy the .jar files to Aspire

Copy the bundles folder from the files extracted from Saga_Aspire.zip.
Paste the bundles folder into the Aspire 5 parent directory.
Verify that app-saga-parser and aspire-saga-parser were copied to Aspire 5: {AspireParentDIR}\bundles\aspire
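
The verification in this step can be scripted. The following is a minimal sketch, not part of Aspire: the function name missing_bundles is hypothetical, and it assumes each bundle shows up under bundles\aspire as a file or folder whose name starts with the bundle name.

```python
from pathlib import Path

# Bundle names taken from this guide; how they appear on disk is an assumption.
EXPECTED_BUNDLES = ("app-saga-parser", "aspire-saga-parser")


def missing_bundles(aspire_home, expected=EXPECTED_BUNDLES):
    """Return the expected bundle names that have no entry under bundles/aspire."""
    bundle_dir = Path(aspire_home) / "bundles" / "aspire"
    if not bundle_dir.is_dir():
        # Nothing was copied yet, so every bundle is missing.
        return list(expected)
    present = [entry.name for entry in bundle_dir.iterdir()]
    # A bundle counts as present if any file or folder name starts with its name.
    return [name for name in expected
            if not any(p.startswith(name) for p in present)]
```

Calling missing_bundles with your own {AspireParentDIR} should return an empty list once both bundles are in place.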

Step 3. Configure the settings.json of Aspire 5

Go to {AspireParentDIR}\config
Open settings.json
Add the two Saga Parser bundles inside "repositories":

bundle

"bundleVersions": {
  "bundle": [
    {
      "@artifactId": "app-saga-parser",
      "@groupId": "com.accenture.aspire",
      "@version": "5.0.3.132100"
    },
    {
      "@artifactId": "aspire-saga-parser",
      "@groupId": "com.accenture.aspire",
      "@version": "5.0.3.132100"
    }
  ]
}

Step 4. Add Saga Config in Aspire 5

Create config.json

config.json

{
  "config": {
    "libraryJars": ["./lib"],
    "tagManager": {
      "resource": "saga-provider:saga_tags"
    },
    "pipelineManager": {
      "resource": "saga-provider:saga_pipelines"
    },
    "providers": [
      {
        "name": "filesystem-provider",
        "type": "FileSystem",
        "baseDir": "./config"
      },
      {
        "name": "saga-provider",
        "type": "Elastic",
        "nodeUrls": ["http://localhost:9200"],
        "timestamp": "updatedAt",
        "authentication": "none",
        "indexName": "saga",
        "exclude": [
          "updatedAt",
          "createdAt"
        ],
        "maxResults": 2000000
      }
    ]
  }
}

Paste the created config.json file into {AspireParentDIR}\config\saga
Make sure that the values in config.json are correct for your environment (for example, nodeUrls and indexName).
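
Part of that check can be automated before starting Aspire. The sketch below is illustrative only: check_saga_config is a hypothetical helper, and it assumes (based on the sample above, not on Saga documentation) that each "resource" value has the form provider:name and that the prefix must match the "name" of an entry in "providers".

```python
import json


def check_saga_config(text):
    """Return a list of problems found in the text of a Saga config.json."""
    problems = []
    config = json.loads(text).get("config", {})
    provider_names = {p.get("name") for p in config.get("providers", [])}
    for manager in ("tagManager", "pipelineManager"):
        resource = config.get(manager, {}).get("resource", "")
        # Assumed format: "<provider name>:<resource name>".
        provider, sep, name = resource.partition(":")
        if not sep or not name:
            problems.append(f"{manager}: resource should look like 'provider:name'")
        elif provider not in provider_names:
            problems.append(f"{manager}: unknown provider '{provider}'")
    return problems
```

Passing the contents of the config.json shown above returns an empty list, because both resources point at saga-provider, which is defined under "providers". The function also raises json.JSONDecodeError if the file is not valid JSON.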

Step 5. Run Aspire 5

Step 6. Add Saga Parser in the Extension Manager

Go to Tools>Extension Manager
Click New
Fill out the required inputs:

Type name - Name of the Extension

Extension type - Choose application

Maven Coordinates - com.accenture.aspire:app-saga-parser:{Saga Parser Version}
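
The Maven Coordinates value follows the groupId:artifactId:version pattern, so it can be derived from the bundle entry added to settings.json in Step 3. A small sketch, assuming the version entered here is the same one used for app-saga-parser in settings.json (maven_coordinates is a hypothetical helper name):

```python
def maven_coordinates(bundle):
    """Join a bundle entry's group, artifact, and version with colons."""
    return ":".join((bundle["@groupId"], bundle["@artifactId"], bundle["@version"]))


# Entry copied from the settings.json snippet in Step 3.
app_bundle = {
    "@artifactId": "app-saga-parser",
    "@groupId": "com.accenture.aspire",
    "@version": "5.0.3.132100",
}

print(maven_coordinates(app_bundle))
# prints: com.accenture.aspire:app-saga-parser:5.0.3.132100
```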

Step 7. Add Saga Parser to your workflow

The Saga Parser is located under Publisher>Aspire Saga Parser

Step 8. Configure Saga Parser on Aspire

Config Path: Location of the config.json created in Step 4.

Create Python Bridge per engine: Creates and starts one Python bridge for each SAGA engine in use.

Python Bridge path: Folder path of the Python bridge you want to spawn (its venv MUST already be created, with all the requirements installed).

Match Type: Type of SAGA output match (Match Extraction or Analytics).

Match Extraction: This response type returns an array with all the Semantic Tag matches.

Analytics: This response type returns an array with any non-Token matches.

Process fields: Path of the content you want to process inside the AspireObject.

Engine Pool Size: Number of SAGA engines.

Create Engines Beforehand: Create the engines BEFORE crawling instead of at the time of the actual crawls.

Tags/Processors: Select if you want to use SAGA tags or a specific Processor (pipeline stage).

Tags: List of SAGA tags you want to process. It needs to have at least ONE tag.

Use Exact Tags: Match tags by their exact names (if you use container tags, you probably want to disable this).

Processor: Specific processor you want to process from a pipeline.

Include Flags: Names of the flags you want to use. The default is SEMANTIC_TAG, and this option cannot be empty.

Exclude Flags: Flags you want to skip and not add to the final output.

Cache Results: Enabling this will cache the most used results to improve performance.

Debug: Enable debug log messages.
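
A few of the options above carry explicit rules (Match Type has two allowed values, Tags needs at least one entry, Include Flags cannot be empty). The sketch below checks those rules on a plain dictionary before you type the values into the UI. It is only an illustration: the dictionary keys (match_type, mode, tags, include_flags) are hypothetical labels invented for this example and are not Aspire's internal setting names.

```python
MATCH_TYPES = ("Match Extraction", "Analytics")
DEFAULT_INCLUDE_FLAGS = ["SEMANTIC_TAG"]


def check_parser_settings(settings):
    """Return rule violations for a dict of Saga Parser settings.

    The keys are hypothetical labels for the UI fields, not Aspire's own names.
    """
    problems = []
    if settings.get("match_type") not in MATCH_TYPES:
        problems.append("match_type must be 'Match Extraction' or 'Analytics'")
    # The tag list is only required when tags (not a processor) are selected.
    if settings.get("mode") == "tags" and not settings.get("tags"):
        problems.append("tags must list at least one SAGA tag")
    # A missing key falls back to the default; an explicit empty list is an error.
    if not settings.get("include_flags", DEFAULT_INCLUDE_FLAGS):
        problems.append("include_flags cannot be empty")
    return problems
```

An empty result means the planned values satisfy the rules listed in this step; it says nothing about whether the tags or flags actually exist in your SAGA index.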

Step 9. Save the configuration
