Step 1. Download Saga Parser
Step 2. Copy the .jar files to Aspire
Usage of add-ons from Saga-Parser
If you are using the Saga-Parser from Aspire together with one of the recognizers/processors described in add-on stages, make sure the corresponding .jar file is also inside the "lib" folder in Aspire. You can get these .jar files from the "lib" sub-folder of the SagaAspire.zip file.
The location from which Saga-Parser loads the JARs is configurable in the Saga-Parser component UI; make sure the path to the "lib" folder is correct.
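The copy step can be scripted. The following is a minimal sketch, assuming SagaAspire.zip has already been extracted; the function name copy_jars and both folder paths are hypothetical and should be adjusted to your installation.

```python
import shutil
from pathlib import Path


def copy_jars(src_lib: Path, dest_lib: Path) -> list[str]:
    """Copy every .jar found directly in src_lib into dest_lib.

    Returns the names of the copied files so the result can be checked.
    """
    dest_lib.mkdir(parents=True, exist_ok=True)
    copied = []
    for jar in sorted(src_lib.glob("*.jar")):
        shutil.copy2(jar, dest_lib / jar.name)
        copied.append(jar.name)
    return copied


# Example call (hypothetical paths):
# copy_jars(Path("SagaAspire/lib"), Path("/opt/aspire/lib"))
```

Only files ending in .jar are copied; anything else in the source folder is left alone.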
Step 3. Configure the settings.json of Aspire 5
"repositories": {
  ...
  "bundleVersions": {
    "bundle": [
      {
        "@artifactId": "app-saga-parser",
        "@groupId": "com.accenture.aspire",
        "@version": "5.2.2.134000"
      },
      {
        "@artifactId": "aspire-saga-parser",
        "@groupId": "com.accenture.aspire",
        "@version": "5.2.2.134000"
      }
    ]
  }
}
Step 4. Add Saga Config in Aspire 5
Create config.json. The "logging_provider" section applies ONLY to versions 1.3.4 and later.
NOTE: The tagManager and pipelineManager keys are removed in versions 1.3.4 and later.
{
  "config": {
    "libraryJars": ["./lib"],
    "tagManager": {
      "resource": "saga-provider:saga_tags"
    },
    "pipelineManager": {
      "resource": "saga-provider:saga_pipelines"
    },
    "providers": [
      {
        "name": "filesystem-provider",
        "type": "FileSystem",
        "baseDir": "./config"
      },
      {
        "name": "saga-provider",
        "type": "Elastic",
        "nodeUrls": ["http://localhost:9200"],
        "timestamp": "updatedAt",
        "authentication": "none",
        "caFilePath": "",
        "trustAllSSL": false,
        "indexName": "saga",
        "exclude": ["updatedAt", "createdAt"],
        "maxResults": 2000000
      }
    ],
    "logging_provider": {
      "name": "saga-parser-logging-provider",
      "type": "Elastic",
      "nodeUrls": ["http://localhost:9200"],
      "timestamp": "updatedAt",
      "indexName": "saga",
      "authentication": "none",
      "timeout": 90,
      "delay": 5,
      "retries": 3,
      "exclude": []
    }
  }
}
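Because the example above contains both the version-specific "logging_provider" section and the tagManager/pipelineManager keys, it may help to trim it for the version in use. The sketch below reads the notes as "1.3.4 and later", which is an assumption about the "1.3.4^" shorthand; the helper names parse_version and adapt_config are hypothetical.

```python
import copy


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.3.4' into (1, 3, 4) for comparison."""
    return tuple(int(part) for part in version.split("."))


def adapt_config(config: dict, saga_version: str) -> dict:
    """Return a copy of config keeping only the keys that apply to saga_version.

    Assumption: 1.3.4 and later drop tagManager/pipelineManager and use
    logging_provider; earlier versions are the other way around.
    """
    adapted = copy.deepcopy(config)
    inner = adapted["config"]
    if parse_version(saga_version) >= (1, 3, 4):
        inner.pop("tagManager", None)
        inner.pop("pipelineManager", None)
    else:
        inner.pop("logging_provider", None)
    return adapted


# Trimmed-down stand-in for the full config.json shown above.
example = {
    "config": {
        "libraryJars": ["./lib"],
        "tagManager": {"resource": "saga-provider:saga_tags"},
        "pipelineManager": {"resource": "saga-provider:saga_pipelines"},
        "providers": [],
        "logging_provider": {"name": "saga-parser-logging-provider"},
    }
}
print(sorted(adapt_config(example, "1.3.4")["config"]))
```

The input dictionary is left untouched, so the same base config can be adapted for several versions.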
For OpenSearch, replace the "saga-provider" section of type "Elastic" with the corresponding OpenSearch configuration as shown here:
{
  "name": "saga-provider",
  "type": "OpenSearch",
  "nodeUrls": ["http://localhost:9200"],
  "timestamp": "updatedAt",
  "indexName": "saga",
  "trustAllSSL": false,
  "timeout": 90,
  "delay": 5,
  "retries": 3,
  "include": [],
  "exclude": [],
  "track_total_hits": true,
  "maxResults": 10000
}
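The swap from the Elastic provider to the OpenSearch one can also be done programmatically. This is a minimal sketch that works on the parsed config.json as a dictionary; the function name replace_provider is hypothetical, and the shortened example config stands in for the full file.

```python
import copy

# OpenSearch provider values taken from the snippet above.
OPENSEARCH_PROVIDER = {
    "name": "saga-provider",
    "type": "OpenSearch",
    "nodeUrls": ["http://localhost:9200"],
    "timestamp": "updatedAt",
    "indexName": "saga",
    "trustAllSSL": False,
    "timeout": 90,
    "delay": 5,
    "retries": 3,
    "include": [],
    "exclude": [],
    "track_total_hits": True,
    "maxResults": 10000,
}


def replace_provider(config: dict, replacement: dict) -> dict:
    """Return a copy of config where the provider sharing replacement's name is swapped."""
    updated = copy.deepcopy(config)
    providers = updated["config"]["providers"]
    for index, provider in enumerate(providers):
        if provider["name"] == replacement["name"]:
            providers[index] = copy.deepcopy(replacement)
            return updated
    raise KeyError(f"no provider named {replacement['name']!r}")


# Shortened stand-in for config.json.
example = {
    "config": {
        "providers": [
            {"name": "filesystem-provider", "type": "FileSystem", "baseDir": "./config"},
            {"name": "saga-provider", "type": "Elastic", "nodeUrls": ["http://localhost:9200"]},
        ]
    }
}
result = replace_provider(example, OPENSEARCH_PROVIDER)
print([p["type"] for p in result["config"]["providers"]])  # ['FileSystem', 'OpenSearch']
```

Matching on the provider name keeps the "filesystem-provider" entry and the provider order unchanged.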
For version 1.3.4, you also need to add these two files: eventAppenderTemplate.json and sagaLoggerTemplate.json.
eventAppenderTemplate.json:
{
  "mdc": { "$resolver": "mdc" },
  "exception": {
    "exception_class": { "$resolver": "exception", "field": "className" },
    "exception_message": { "$resolver": "exception", "field": "message" },
    "stacktrace": {
      "$resolver": "exception",
      "field": "stackTrace",
      "stackTrace": { "stringified": true }
    }
  },
  "line_number": { "$resolver": "source", "field": "lineNumber" },
  "class": { "$resolver": "source", "field": "className" },
  "@version": 1,
  "source_host": "${hostName}",
  "message": { "$resolver": "message", "stringified": true },
  "thread_name": { "$resolver": "thread", "field": "name" },
  "@timestamp": { "$resolver": "timestamp" },
  "level": { "$resolver": "level", "field": "name" },
  "file": { "$resolver": "source", "field": "fileName" },
  "method": { "$resolver": "source", "field": "methodName" },
  "logger_name": { "$resolver": "logger", "field": "name" }
}
sagaLoggerTemplate.json:
{
  "index_patterns": ["*_logger*"],
  "priority": 300,
  "template": {
    "settings": {
      "index": { "refresh_interval": "5s" }
    },
    "mappings": {
      "properties": {
        "class": { "type": "text", "index": false },
        "message": { "type": "text", "index": false },
        "date_time": { "type": "date", "format": "date_time" },
        "level": { "type": "text", "index": false }
      }
    }
  }
}
Step 5. Run Aspire 5
Step 6. Add Saga Parser in the Extension Manager
Type name - Name of the Extension
Extension type - Choose application
Maven Coordinates - com.accenture.aspire:app-saga-parser:{Saga Parser Version}
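As an illustration of the Maven Coordinates format, the sketch below fills in the version placeholder. It assumes the Saga Parser version is the same value pinned for app-saga-parser in settings.json (5.2.2.134000 in the Step 3 example); the function name maven_coordinates is hypothetical.

```python
def maven_coordinates(version: str,
                      group_id: str = "com.accenture.aspire",
                      artifact_id: str = "app-saga-parser") -> str:
    """Build the groupId:artifactId:version string expected by the Extension Manager."""
    version = version.strip()
    if not version:
        raise ValueError("a Saga Parser version is required")
    return f"{group_id}:{artifact_id}:{version}"


print(maven_coordinates("5.2.2.134000"))
# com.accenture.aspire:app-saga-parser:5.2.2.134000
```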
Step 7. Add Saga Parser to your workflow
Step 8. Configure Saga Parser on Aspire
Config Path: Location of the config.json created in Step 4.
Create Python Bridge per engine: Option to create and start one Python Bridge PER SAGA engine used.
Python Bridge path: Folder path of the Python Bridge you want to spawn (its venv MUST already be created, with all the requirements installed).
Python Bridge Base Port Number: The base port number on which the Python Bridge(s) will be created.
Protocol: Transport protocol that the Python Bridge serves on.
Disable SSL verification: When the Python Bridge uses HTTPS, the Saga Parser tries to verify its certificate and drops the connection if the certificate is self-signed. This option disables that verification (SSL verification MUST also be disabled in the SAGA configuration).
Python server authentication: Check this only if the Python server has authentication (username/password) enabled.
Username: Specify the username to authenticate.
Password: Specify the password to authenticate.
Match Type: Type of SAGA output match (Match Extraction or Analytics).
Match Extraction: This response type returns an array with all the Semantic Tag matches.
Analytics: This response type returns an array with all non-Token matches.
Process fields: Path of the content you want to process inside the AspireObject.
Engine Pool Size: Number of SAGA engines.
Create Engines Beforehand: Create the engines BEFORE crawling starts instead of at crawl time.
Tags/Processors: Select if you want to use SAGA tags or a specific Processor (pipeline stage).
Tags: List of SAGA tags you want to process. It needs to have at least ONE tag.
Use Exact Tags: Enable this to match tags by their exact names (if you use container tags, you probably want to disable this).
Processor: Specific processor you want to process from a pipeline.
Include Flags: The names of the flags that you want to use. The default is SEMANTIC_TAG, and this option cannot be empty.
Exclude Flags: Flags you want to skip and not add to the final output.
Cache Results: Enabling this will cache the most used results to improve performance.
Debug: Enable debug log messages.
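Two of the settings above carry explicit constraints: Tags needs at least one entry, and Include Flags cannot be empty. The sketch below checks them before saving. The class and attribute names are hypothetical and only mirror the UI labels; it also assumes the Tags rule applies only when tags (rather than a processor) are selected.

```python
from dataclasses import dataclass, field


@dataclass
class SagaParserSettings:
    """Hypothetical container; attribute names only mirror the UI labels."""
    use_tags: bool = True  # Tags/Processors selector (True means Tags)
    tags: list[str] = field(default_factory=list)
    processor: str = ""
    include_flags: list[str] = field(default_factory=lambda: ["SEMANTIC_TAG"])
    exclude_flags: list[str] = field(default_factory=list)


def validate(settings: SagaParserSettings) -> list[str]:
    """Return a list of problems; an empty list means the stated rules hold."""
    problems = []
    # Assumption: the "at least ONE tag" rule matters only in Tags mode.
    if settings.use_tags and not settings.tags:
        problems.append("Tags: at least one tag is required")
    if not settings.include_flags:
        problems.append("Include Flags: must not be empty")
    return problems


# "address" is a made-up tag name used only for illustration.
print(validate(SagaParserSettings(tags=["address"])))
print(validate(SagaParserSettings(include_flags=[])))
```

The first call reports no problems; the second reports both a missing tag and empty Include Flags.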
Step 9. Save the configuration