Release date: September 12th, 2023

Tech Stack

NoSQL DB provider supported

Elasticsearch versions 7.14 - 7.17 and 8.8.1
OpenSearch v. 1.1

Java supported

OpenJDK 17

Python supported

Python version 3.11.4

Node.js

Node.js LTS v. 18.17.1

New Features

UI/UX

Copy ProcessorID button for pipeline stages

Processors and Recognizers

Addition of Japanese, Korean and Chinese tokenizer settings in pattern-based recognizers (Entity, Bestbets, GeoNames)

Improvements

Aspire Saga-Parser

Upgraded Aspire libraries to version 5.2
Refactor to decouple Saga from Saga-Parser (it will be available in Aspire 5.3)
Addition of configurable cache to improve processing performance
New setting to run a Python Bridge instance for every Saga Engine created in the Aspire Worker
Removed maximum number of engines setting

Server

Fixed Critical and High vulnerabilities found in security scans
Support to work with Elasticsearch 8.8.1
Addition of HealthCheck endpoint to Saga and Python Bridge
Change in SSO authentication to use OIDC instead of SAML
Addition of T5 and MiniLM models to Python Bridge
Addition of SSL and Authentication to the Python Bridge
Migration to latest version 5 of Javalin
Javalin max payload size is now configurable in config.json file
Improved security by moving logs of processed data from Info to Debug level
Migration to Apache 2 licensed version of Elasticsearch client library
Migrated to Java 17
Returning a user error when processing text with a Processing Unit that doesn't exist
Addition of the "id" field in the importer for the FAQ recognizer
HTTPS port is now configurable

UI/UX

Country/Language settings in the PostalCode and Number recognizers are now dropdowns instead of free-text boxes
Increased maximum length of the hostname in the Python Model recognizer
Upgrade to Angular 15

Recognizers/Processors

Accuracy improvements in PostalCode for USA
Support for additional scientific notation (ex. 2.1E+11) in the Number recognizer
Supporting identification of SSN without format in the FederalId recognizer
Accuracy improvement in FederalId recognizer by discarding invalid SSNs
New setting in DateTime recognizer to identify dates in the past/future/both
Accuracy improvement in CreditCard recognizer by implementing LUHN check
Performance improvement at loading time in Regex and SimpleRegex recognizers
Performance improvement at loading time in the CreditCard recognizer
Performance improvement at loading time in the Entity recognizer
Supporting many-to-many synonyms in the Synonym stage
Phone Number recognizer matches UK number formats more closely, including area codes
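
The LUHN check mentioned for the CreditCard recognizer is a standard checksum for card numbers. The following is a minimal Python sketch of that checksum, not Saga's actual implementation; the function name `luhn_valid` and the handling of spaces and dashes are assumptions made for illustration.

```python
def luhn_valid(number: str) -> bool:
    """Return True if `number` passes the Luhn checksum.

    Hypothetical helper for illustration only; spaces and dashes are
    ignored, and any other non-digit character makes the input invalid.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Same as summing the two digits of the doubled value.
                digit -= 9
        total += digit
    # A valid number has a total that is a multiple of 10.
    return total % 10 == 0


if __name__ == "__main__":
    for candidate in ("4111 1111 1111 1111", "4111 1111 1111 1112"):
        print(candidate, luhn_valid(candidate))
```

A candidate match that fails this check can be discarded before it is tagged, which is presumably how the check reduces false positives among digit sequences that only look like card numbers.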

Docker

Image tag now includes base layer name (ex. saga-server:1.3.3-javacio17-base)
Now using CIO recommended base layer

...

NPE in Saga-Parser when retrieving tags
Error in Bestbets recognizer when property is null
Missing entries in Saga export file
Importer fails in Linux
Error when selecting a model in FAQ recognizer when default model is not present
Error starting Saga when no indices are present in Elasticsearch
Python bridge is not downloading Bert models in Windows
Error loading a dictionary on start up after importing a bad .sg file
Saga-Parser failing on initialization when numeric settings are passed as string
Saga failing when processing big payload from Elasticsearch
FederalId recognizer not recognizing 11 digit numbers
NPE in synonym stage
Wrong matching text in Regex recognizer when special characters are processed
Index out of bounds error when processing big content with no breaks
Lat/Long recognizer not supporting degree sign when next to a number
Stack overflow error in LucenePipeline stage when configured with Korean tokenizer
Export for data science not working
UI pagination shows wrong numbers after a search is done
Intent recognizer not working with TensorFlow model
Intent and FAQ recognizers not working
Error in Sentence Breaker OpenNLP when used from Saga-Parser
Corrupted data in the Classification Watcher
Processing Unit is not cleaning the Saga Graph when an error occurs traversing the graph
API is creating Processing Unit without a tag
NPE in FederalId recognizer
Preview pop-up overflows with large text
Hitting Enter in the preview textbox adds a new line
Releasing vector to memory pool error when processing lots of documents
Unknown query error in GeoNames recognizer
Bert models stuck loading in Python Bridge
UI issues adding a stage in the Lucene Pipeline processor
Invalid combination of arguments error in Python Bridge when using a Bert model in the Intent Recognizer
Error when using Lucene Pipeline processor with no tokenizer (added validation and a default tokenizer)
NPE in Saga-Parser when using Aspire with embedded JRE
Phone Number recognizer was not checking the area code correctly
Engine provider should wait for resources to be loaded before creating new engines
Error when selecting many tags in Export for Data Science
Pagination at the bottom of the UI is not refreshing correctly
Error when trying to authenticate without host/port in Python Model recognizer

Note: you can find all the details in Jira here.

Known Issues

Spell checker only works with Elasticsearch
Saga-Parser, when using recognizers that need Python Bridge, cannot connect to a Python Bridge set up with HTTPS; only HTTP works
Debug setting in Saga-Parser causes an NPE during crawl. This setting needs to be disabled
Statistics screen is not rendering correctly after an Evaluation with a dataset is performed
Check Tag button in preview screen is not working properly
Depending on connection quality a false "Server Down" message could appear in the UI
