How to Configure Saga Parser in Aspire 5

Step-by-step guide

Step 1. Download Saga Parser

Go to https://repository.sca.com/artifactory/public/com/accenture/saga/binaries/1.3.4/
Download Saga_Aspire.zip
Extract the files

Step 2. Copy the .jar files to Aspire

Copy the bundles folder from the extracted contents of Saga_Aspire.zip.
Paste the bundles folder into the Aspire 5 parent directory.
Verify that app-saga-parser and aspire-saga-parser were copied to Aspire 5: {AspireParentDIR}\bundles\aspire
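
The verification in the last step can be scripted. The following is a minimal sketch, assuming the copied bundles appear under bundles/aspire as files or folders whose names start with the artifact names; the exact naming inside that folder is an assumption, and missing_bundles is a hypothetical helper, not part of Aspire.

```python
from pathlib import Path

# Artifact names taken from the guide. How they are named on disk
# (for example with a version suffix) is an assumption, so match by prefix.
EXPECTED_BUNDLES = ("app-saga-parser", "aspire-saga-parser")


def missing_bundles(aspire_parent_dir: str) -> list[str]:
    """Return the expected bundle names with no match under bundles/aspire."""
    bundle_dir = Path(aspire_parent_dir) / "bundles" / "aspire"
    if not bundle_dir.is_dir():
        # Nothing was copied at all, so every expected bundle is missing.
        return list(EXPECTED_BUNDLES)
    names = [entry.name for entry in bundle_dir.iterdir()]
    return [
        bundle
        for bundle in EXPECTED_BUNDLES
        if not any(name.startswith(bundle) for name in names)
    ]
```

Calling missing_bundles with the Aspire parent directory returns an empty list when both bundles are present.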

Usage of add-ons from Saga-Parser

If you use the Saga-Parser from Aspire together with one of the recognizers or processors described in the add-on stages, make sure the corresponding .jar file is also in the Aspire "lib" folder. You can get these .jar files from the "lib" sub-folder of Saga_Aspire.zip.

The location from which Saga-Parser loads the JARs is configurable in the Saga-Parser component UI; make sure the path to the "lib" folder is correct.
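
A quick way to confirm that the add-on JARs are in place is to compare the "lib" folder against the list you expect. This is only a sketch: the JAR file names depend on which add-on stages you use, so the list passed in is supplied by you, and missing_addon_jars is a hypothetical helper.

```python
from pathlib import Path


def missing_addon_jars(lib_dir: str, required_jars: list[str]) -> list[str]:
    """Return the names in required_jars that are not present in lib_dir."""
    lib = Path(lib_dir)
    # A missing folder is treated the same as an empty one.
    present = {p.name for p in lib.glob("*.jar")} if lib.is_dir() else set()
    return [name for name in required_jars if name not in present]
```

For example, missing_addon_jars("./lib", ["my-addon.jar"]) returns ["my-addon.jar"] until that (hypothetical) file has been copied.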

Step 3. Configure the settings.json of Aspire 5

Go to {AspireParentDIR}\config
Open settings.json
Add the Saga Parser bundles inside "repositories":

bundle

"repositories": {
    ...

    "bundleVersions": {
        "bundle": [
            {
                "@artifactId": "app-saga-parser",
                "@groupId": "com.accenture.aspire",
                "@version": "5.2.2.134000"
            },
            {
                "@artifactId": "aspire-saga-parser",
                "@groupId": "com.accenture.aspire",
                "@version": "5.2.2.134000"
            }
        ]
    }
}
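
If you prefer to apply this change with a script, the sketch below adds the two entries to an already loaded settings document without duplicating them. It assumes the nesting shown in the snippet above (repositories, bundleVersions, bundle as a list); the rest of settings.json is elided in this guide, so treat that structure as an assumption. add_saga_bundles is a hypothetical helper.

```python
GROUP_ID = "com.accenture.aspire"
ARTIFACT_IDS = ("app-saga-parser", "aspire-saga-parser")


def add_saga_bundles(settings: dict, version: str = "5.2.2.134000") -> dict:
    """Add or update the Saga Parser bundle entries in a settings dict."""
    # Nesting is taken from the snippet in this guide (assumption).
    repositories = settings.setdefault("repositories", {})
    bundle_versions = repositories.setdefault("bundleVersions", {})
    bundles = bundle_versions.setdefault("bundle", [])

    for artifact_id in ARTIFACT_IDS:
        existing = next(
            (
                b for b in bundles
                if b.get("@artifactId") == artifact_id
                and b.get("@groupId") == GROUP_ID
            ),
            None,
        )
        if existing is None:
            bundles.append(
                {
                    "@artifactId": artifact_id,
                    "@groupId": GROUP_ID,
                    "@version": version,
                }
            )
        else:
            # Entry already present: only bring the version in line.
            existing["@version"] = version
    return settings
```

Load settings.json with json.load, pass the result to add_saga_bundles, and write it back with json.dump. Running it twice leaves exactly two Saga entries.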

Step 4. Add Saga Config in Aspire 5

Create config.json. The "logging_provider" section applies ONLY to version 1.3.4 and later.
NOTE: The tagManager and pipelineManager keys are also removed in version 1.3.4 and later.

config.json

{
  "config": {
    "libraryJars": ["./lib"],
    "tagManager": {
      "resource": "saga-provider:saga_tags"
    },
    "pipelineManager": {
      "resource": "saga-provider:saga_pipelines"
    },
    "providers": [
      {
        "name": "filesystem-provider",
        "type": "FileSystem",
        "baseDir": "./config"
      },
      {
        "name": "saga-provider",
        "type": "Elastic",
        "nodeUrls": ["http://localhost:9200"],
        "timestamp": "updatedAt",
        "authentication": "none",
        "caFilePath": "",
        "trustAllSSL": false,
        "indexName": "saga",
        "exclude": [
          "updatedAt",
          "createdAt"
        ],
        "maxResults": 2000000
      }
    ],
    "logging_provider": {
      "name": "saga-parser-logging-provider",
      "type": "Elastic",
      "nodeUrls": ["http://localhost:9200"],
      "timestamp": "updatedAt",
      "indexName": "saga",
      "authentication": "none",
      "timeout": 90,
      "delay": 5,
      "retries": 3,
      "exclude": [ ]
    }
  }
}
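
One easy mistake in this file is a "resource" value that points at a provider that is not declared under "providers". The sketch below checks for that. It assumes, based only on the sample above, that a resource has the form "<provider name>:<resource name>"; undefined_provider_refs is a hypothetical helper.

```python
def undefined_provider_refs(config_doc: dict) -> list[str]:
    """List manager keys whose resource names an undeclared provider."""
    cfg = config_doc.get("config", {})
    declared = {p.get("name") for p in cfg.get("providers", [])}

    problems = []
    for key in ("tagManager", "pipelineManager"):
        resource = cfg.get(key, {}).get("resource")
        if resource is None:
            # The key is absent, which is expected where it was removed.
            continue
        # Assumed format: "<provider name>:<resource name>".
        provider_name = resource.split(":", 1)[0]
        if provider_name not in declared:
            problems.append(f"{key} -> {provider_name}")
    return problems
```

An empty list means every referenced provider is declared.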

For OpenSearch, replace the "saga-provider" section of type "Elastic" with the corresponding OpenSearch configuration as shown here:

    {
        "name": "saga-provider",
        "type": "OpenSearch",
        "nodeUrls": ["http://localhost:9200"],
        "timestamp": "updatedAt",
        "indexName": "saga",
        "trustAllSSL": false,
        "timeout": 90,
        "delay": 5,
        "retries": 3,
        "include": [],
        "exclude": [],
        "track_total_hits": true,
        "maxResults": 10000
    }
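
The replacement described above can also be done programmatically. This is a minimal sketch that swaps the "Elastic" provider for the OpenSearch block shown in this guide and leaves every other provider untouched; use_opensearch is a hypothetical helper, and the values are copied from the sample, so adjust them to your cluster.

```python
import copy

# Values copied from the OpenSearch sample in this guide.
OPENSEARCH_PROVIDER = {
    "name": "saga-provider",
    "type": "OpenSearch",
    "nodeUrls": ["http://localhost:9200"],
    "timestamp": "updatedAt",
    "indexName": "saga",
    "trustAllSSL": False,
    "timeout": 90,
    "delay": 5,
    "retries": 3,
    "include": [],
    "exclude": [],
    "track_total_hits": True,
    "maxResults": 10000,
}


def use_opensearch(config_doc: dict, provider_name: str = "saga-provider") -> dict:
    """Return a copy of config_doc with the named Elastic provider replaced."""
    updated = copy.deepcopy(config_doc)  # do not mutate the caller's data
    providers = updated["config"]["providers"]
    for index, provider in enumerate(providers):
        if provider.get("name") == provider_name and provider.get("type") == "Elastic":
            replacement = copy.deepcopy(OPENSEARCH_PROVIDER)
            replacement["name"] = provider_name
            providers[index] = replacement
    return updated
```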

Paste the created config.json file into {AspireParentDIR}\config\saga
Make sure that the values in config.json are correct for your environment.

You also need to add these two files: eventAppenderTemplate.json and sagaLoggerTemplate.json (version 1.3.4).

eventAppenderTemplate.json

{
  "mdc": {
    "$resolver": "mdc"
  },
  "exception": {
    "exception_class": {
      "$resolver": "exception",
      "field": "className"
    },
    "exception_message": {
      "$resolver": "exception",
      "field": "message"
    },
    "stacktrace": {
      "$resolver": "exception",
      "field": "stackTrace",
      "stackTrace": {
        "stringified": true
      }
    }
  },
  "line_number": {
    "$resolver": "source",
    "field": "lineNumber"
  },
  "class": {
    "$resolver": "source",
    "field": "className"
  },
  "@version": 1,
  "source_host": "${hostName}",
  "message": {
    "$resolver": "message",
    "stringified": true
  },
  "thread_name": {
    "$resolver": "thread",
    "field": "name"
  },
  "@timestamp": {
    "$resolver": "timestamp"
  },
  "level": {
    "$resolver": "level",
    "field": "name"
  },
  "file": {
    "$resolver": "source",
    "field": "fileName"
  },
  "method": {
    "$resolver": "source",
    "field": "methodName"
  },
  "logger_name": {
    "$resolver": "logger",
    "field": "name"
  }
}

sagaLoggerTemplate.json

{
  "index_patterns": [
    "*_logger*"
  ],
  "priority": 300,
  "template": {
    "settings": {
      "index": {
        "refresh_interval": "5s"
      }
    },
    "mappings": {
      "properties": {
        "class": {
          "type": "text",
          "index": false
        },
        "message": {
          "type": "text",
          "index": false
        },
        "date_time": {
          "type": "date",
          "format": "date_time"
        },
        "level": {
          "type": "text",
          "index": false
        }
      }
    }
  }
}
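
Before starting Aspire, it can help to confirm that all three files exist and are valid JSON. The sketch below assumes the two template files live in the same folder as config.json ({AspireParentDIR}\config\saga); this guide does not state their location explicitly, so adjust the path if your setup differs. check_saga_config_dir is a hypothetical helper.

```python
import json
from pathlib import Path

REQUIRED_FILES = (
    "config.json",
    "eventAppenderTemplate.json",
    "sagaLoggerTemplate.json",
)


def check_saga_config_dir(saga_dir: str) -> dict[str, str]:
    """Map each required file name to 'ok', 'missing', or a JSON error."""
    results = {}
    for name in REQUIRED_FILES:
        path = Path(saga_dir) / name
        if not path.is_file():
            results[name] = "missing"
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[name] = "ok"
        except json.JSONDecodeError as exc:
            results[name] = f"invalid JSON: {exc}"
    return results
```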

Step 5. Run Aspire 5

Step 6. Add Saga Parser in the Extension Manager

Go to Tools > Extension Manager
Click New
Fill out the necessary inputs:

Type name - Name of the Extension

Extension type - Choose application

Maven Coordinates - com.accenture.aspire:app-saga-parser:{Saga Parser Version}
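
The Maven Coordinates value is a colon-separated groupId:artifactId:version string. As a small illustration, the sketch below builds it from a version number. The version used in the usage note is the one from the settings.json sample in Step 3; whether that is the right value for your installation is an assumption, and maven_coordinates is a hypothetical helper.

```python
def maven_coordinates(
    version: str,
    artifact_id: str = "app-saga-parser",
    group_id: str = "com.accenture.aspire",
) -> str:
    """Build a groupId:artifactId:version string for the Extension Manager."""
    parts = (group_id, artifact_id, version)
    # Each part must be non-empty and must not contain the separator.
    if any(not part or ":" in part for part in parts):
        raise ValueError("coordinates parts must be non-empty and contain no ':'")
    return ":".join(parts)
```

For example, maven_coordinates("5.2.2.134000") gives com.accenture.aspire:app-saga-parser:5.2.2.134000.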

Step 7. Add Saga Parser to your workflow

The Saga Parser is located under Publisher > Aspire Saga Parser

Step 8. Configure Saga Parser on Aspire

Config Path: Location of the config.json created in Step 4.

Create Python Bridge per engine: Option to create and start a python bridge PER SAGA engine used.

Python Bridge path: Folder path to the python bridge you want to spawn (it MUST have the venv created and with all the requirements installed).

Python Bridge Base Port Number: The base port number on which the Python Bridge(s) will be created.

Protocol: The transport protocol that the Python bridge serves on.

Disable SSL verification: When the Python bridge uses HTTPS, the Saga Parser tries to verify the certificate and drops the connection if the certificate is self-signed. This option disables that verification (SSL verification MUST also be disabled in the SAGA configuration).

Python server authentication: Checked only if the python server has authentication (username/password) enabled.

Username: Specify the username to authenticate.

Password: Specify the password to authenticate.

Match Type: Type of SAGA output match (Match Extraction or Analytics).

Match Extraction: This response type returns an array with all the Semantic Tag matches.

Analytics: This response type returns an array with any non-token matches.

Process fields: Path of the content you want to process inside the AspireObject.

Engine Pool Size: Number of SAGA engines.

Create Engines Beforehand: Create the engines BEFORE crawling instead of at crawl time.

Tags/Processors: Select if you want to use SAGA tags or a specific Processor (pipeline stage).

Tags: List of SAGA tags you want to process. It needs to have at least ONE tag.

Use Exact Tags: Use the exact names of the tags (if you use container tags, you probably want to disable this).

Processor: Specific processor you want to process from a pipeline.

Include Flags: Names of the flags you want to use. The default is SEMANTIC_TAG, and this option cannot be empty.

Exclude Flags: Flags you want to skip and not add to the final output.

Cache Results: Enabling this will cache the most used results to improve performance.

Debug: Enable debug log messages.
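
Several of the fields above carry constraints (at least one tag, a non-empty Include Flags list, one of two match types). The sketch below collects those rules in one place as a pre-flight check. The class and field names are hypothetical and do not correspond to any Aspire API; the rules for the config path, the processor, and the pool size are assumptions rather than statements from this guide.

```python
from dataclasses import dataclass, field

MATCH_TYPES = ("Match Extraction", "Analytics")


@dataclass
class SagaParserSettings:
    """Hypothetical container mirroring some of the Step 8 fields."""
    config_path: str
    match_type: str = "Match Extraction"
    use_tags: bool = True  # True: Tags, False: Processor
    tags: list[str] = field(default_factory=list)
    processor: str = ""
    include_flags: list[str] = field(default_factory=lambda: ["SEMANTIC_TAG"])
    engine_pool_size: int = 1


def validate(settings: SagaParserSettings) -> list[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    errors = []
    if not settings.config_path:
        errors.append("Config Path: must not be empty")  # assumption
    if settings.match_type not in MATCH_TYPES:
        errors.append("Match Type: must be Match Extraction or Analytics")
    if settings.use_tags and not settings.tags:
        errors.append("Tags: at least one tag is required")
    if not settings.use_tags and not settings.processor:
        errors.append("Processor: a processor must be selected")  # assumption
    if not settings.include_flags:
        errors.append("Include Flags: cannot be empty")
    if settings.engine_pool_size < 1:
        errors.append("Engine Pool Size: must be at least 1")  # assumption
    return errors
```

For example, validate(SagaParserSettings(config_path="config/saga/config.json", tags=["my-tag"])) returns an empty list, where "my-tag" stands in for one of your own SAGA tags.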

Step 9. Save the configuration
