This page explains how to install and use the SAGA Suite (Saga-Server/Enterprise Search UI and Python Bridge) using Docker.
If you only need one element of the SAGA Suite, we've got you covered: further down this page there are more specific instructions for each of the three products.
Tip |
---|
For a better understanding of this documentation, please review these Docker documentation topics: |
Note |
---|
This Docker quickstart guide provides an easy way to run, test and develop with SAGA. For a production deployment, or to expose these services publicly, you will need to make all the required changes yourself, including changing configurations, creating certificates, enabling security measures and anything else your production environment needs. |
To run all elements of the SAGA Suite together with Elasticsearch, saga-quickstart.zip contains the files needed. The steps are the following:
Open a command prompt (CMD/PowerShell in Windows, sh/bash in Linux) and log in to the Docker Artifactory repository
Code Block |
---|
docker login docker.repository.sca.accenture.com |
You will be prompted for your credentials (registered email and password).
Info |
---|
This step is only required the first time. Once you have logged in, Docker remembers the credentials, so you do not need to re-run this command. |
On Linux, you may first need to give the shell file execution permissions before running it.
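For scripted setups (for example a CI job), the login can also be done without the interactive prompt. The following is a minimal sketch, assuming the credentials are available in two environment variables. The names `SAGA_REGISTRY_USER` and `SAGA_REGISTRY_PASSWORD`, the helper function and the `DOCKER` override are hypothetical and not part of the quickstart; `--username` and `--password-stdin` are standard `docker login` options.

```shell
# Registry host taken from the login command above.
REGISTRY="docker.repository.sca.accenture.com"

# Hypothetical helper: logs in without the interactive prompt.
# DOCKER can be overridden, e.g. DOCKER="sudo docker" when docker requires sudo.
registry_login() {
    # The password is read from stdin instead of being passed as an argument.
    printf '%s' "$SAGA_REGISTRY_PASSWORD" |
        ${DOCKER:-docker} login --username "$SAGA_REGISTRY_USER" --password-stdin "$REGISTRY"
}

# Usage (variable names are assumptions):
#   SAGA_REGISTRY_USER="you@example.com" SAGA_REGISTRY_PASSWORD="..." registry_login
```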
Code Block |
---|
chmod +x saga-quickstart.sh |
Info |
---|
The shell script has two more options in case you need to run it with sudo permissions. This happens when Docker is not configured properly and every docker command has to be run with sudo. |
Warning |
---|
If anything is already running on those ports, the quickstart will fail because Docker cannot start the images properly while the ports are in use. |
Info |
---|
The first time you run these scripts, it may take some time to start everything because they need to download the images and then run them in order. |
When you are done with your work, you can remove everything by executing the cleanup scripts (bat in Windows, sh in Linux).
saga-cleanup.bat - Windows
saga-cleanup.sh - Linux
This cleanup script stops the running Docker containers, deletes them once they have stopped and removes the network they were running on.
If you only need the Docker image of Saga-Server, you do not need to download, run and mount the whole quickstart described above.
You can pull the image from our official repository using:
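If the cleanup script is not at hand, the same three steps can be approximated manually. This is only a sketch of what the paragraph above describes, not the contents of saga-cleanup.sh: the container names, the network name and the helper function are hypothetical placeholders, so check `docker ps` and `docker network ls` for the real names.

```shell
# Hypothetical names; replace with the ones shown by `docker ps` and `docker network ls`.
CONTAINERS="saga-server esui saga-python-bridge elasticsearch"
NETWORK="saga-network"

# DOCKER can be overridden, e.g. DOCKER="sudo docker" when docker requires sudo.
saga_manual_cleanup() {
    for name in $CONTAINERS; do
        ${DOCKER:-docker} stop "$name"   # stop the running container
        ${DOCKER:-docker} rm "$name"     # delete it once stopped
    done
    ${DOCKER:-docker} network rm "$NETWORK"   # remove the network they ran on
}

# Usage: saga_manual_cleanup
```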
Code Block |
---|
//================================ CIO based images ==========================================
docker pull docker.repository.sca.accenture.com/docker/saga-server:1.3.3-javacio17-base
//OR
docker pull docker.repository.sca.accenture.com/docker/saga-server:1.3.3-tensor4-javacio17-base |
Info |
---|
If you want the latest features of SAGA, you can download the SNAPSHOT versions of our next release. They may be unstable, so use them with caution. The tags for the SNAPSHOT versions are these ones: //OR docker pull docker.repository.sca.accenture.com/docker/saga-server:1.3.4-SNAPSHOT-tensor-javacio17-base |
Code Block |
---|
//=============================== Alpine based images ========================================
// ONLY FOR 1.3.4 ONWARDS
docker pull docker.repository.sca.accenture.com/docker/saga-server:1.3.4-alpine3.19
//OR
docker pull docker.repository.sca.accenture.com/docker/saga-server:1.3.4-tensor-alpine3.19 |
From there you can configure and use it as you need. However, there are some things you need to know before pulling and running this Docker image:
If it is your first time pulling the images, you need to log in with Docker to our Artifactory first; otherwise, you will not be able to pull the image. (ONLY NEEDED FOR CIO IMAGES)
Code Block |
---|
docker login docker.repository.sca.accenture.com |
Then, as mentioned before, enter your Artifactory credentials.
SAGA_CONFIG: this JSON-like string contains the full config that overrides the default one shipped with the image. It must be a single-line string because of environment variable limitations in Docker.
For example:
This is a fragment of the config for SAGA to run.
Code Block | ||
---|---|---|
| ||
{
  "config": {
    "apiPort": 8080,
    "secureApiPort": 443,
    "host": "0.0.0.0",
    "allow-domains": "localhost",
    "serverTimeout": 30000,
    "maxRequestPayloadSize": 1000000,
    "cors": {
      "allow_origins": [
        "http://localhost:8080",
        "https://localhost",
        "https://login.microsoftonline.com"
      ],
      "allow_credentials": "true",
      "allow_methods": ["*"],
      "allow_headers": ["*"],
      "expose_headers": ["*"]
    },
    "security": {
      "enable": false,
      "encryptionKeyFile": "./bin/saga.ek",
      "inactiveInterval": 600,
      "defaultRole": "admin",
      "users": [{
        "username": "admin",
        "password": "notpassword",
        "roles": "admin"
      }],
      "type": "none",
      "openid": {
        "serverURL": "http://localhost:8080",
        "clientId": "clientId",
        "discoveryURI": "discoveryURI"
      }
    },
    "ssl": {
      "enable": false,
      "keyStore": "./bin/saga.jks",
      "keyStorePassword": "encrypted:KCe8RrPQ8MV3po8NqHo0G7q7sa6T6yzf1JrTQ5VD0uty0elmrqRuybaAmrEHJ37d"
    },
    "libraryJars": [
      "./lib"
    ],
    "exportSettings": {
      "maxSize": 40,
      "batchSize": 5000
    },
    "restHandlers": [],
    "models": [],
    "uiHandlers": [],
    "providers": [
      {
        "name": "filesystem-provider",
        "type": "FileSystem",
        "baseDir": "./config"
      },
      {
        "name": "saga-provider",
        "type": "Elastic",
        "nodeUrls": ["http://localhost:9200"],
        "timestamp": "updatedAt",
        "indexName": "saga",
        "encryptionKeyFile": "./bin/saga.ek",
        "authentication": "none",
        "caFilePath": "",
        "timeout": 90,
        "delay": 5,
        "retries": 3,
        "exclude": []
      }
    ],
    "gpt3": {
      "key": "",
      "openAIHost": "https://api.openai.com",
      "openAIAPIVersion": "v1"
    }
  }
} |
And this is the same config in the one-liner style that the SAGA_CONFIG environment variable accepts:
Code Block | ||
---|---|---|
| ||
{"config":{"apiPort":8080,"secureApiPort":443,"host":"0.0.0.0","allow-domains":"localhost","serverTimeout":30000,"maxRequestPayloadSize":1000000,"cors":{"allow_origins":["http://localhost:8080","https://localhost","https://login.microsoftonline.com"],"allow_credentials":"true","allow_methods":["*"],"allow_headers":["*"],"expose_headers":["*"]},"security":{"enable":false,"encryptionKeyFile":"./bin/saga.ek","inactiveInterval":600,"defaultRole":"admin","users":[{"username":"admin","password":"notpassword","roles":"admin"}],"type":"none","openid":{"serverURL":"http://localhost:8080","clientId":"clientId","discoveryURI":"discoveryURI"}},"ssl":{"enable":false,"keyStore":"./bin/saga.jks","keyStorePassword":"encrypted:KCe8RrPQ8MV3po8NqHo0G7q7sa6T6yzf1JrTQ5VD0uty0elmrqRuybaAmrEHJ37d"},"libraryJars":["./lib"],"exportSettings":{"maxSize":40,"batchSize":5000},"restHandlers":[],"models":[],"uiHandlers":[],"providers":[{"name":"filesystem-provider","type":"FileSystem","baseDir":"./config"},{"name":"saga-provider","type":"Elastic","nodeUrls":["http://localhost:9200"],"timestamp":"updatedAt","indexName":"saga","encryptionKeyFile":"./bin/saga.ek","authentication":"none","caFilePath":"","timeout":90,"delay":5,"retries":3,"exclude":[]}],"gpt3":{"key":"","openAIHost":"https://api.openai.com","openAIAPIVersion":"v1"}}} |
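The one-liner does not have to be written by hand. The sketch below compacts a JSON file into a single line with `python3` (an assumption about what is installed; `jq -c .` would do the same job) and stores it in a shell variable. The small stand-in file is only there to keep the example self-contained; in practice you would point it at your own config.json.

```shell
# Stand-in for a real config.json (the real file is much larger).
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
{
  "config": {
    "apiPort": 8080,
    "host": "0.0.0.0"
  }
}
EOF

# Parse the file and print it back without any whitespace between tokens.
SAGA_CONFIG="$(python3 -c 'import json, sys; print(json.dumps(json.load(open(sys.argv[1])), separators=(",", ":")))' "$config_file")"

echo "$SAGA_CONFIG"
# prints {"config":{"apiPort":8080,"host":"0.0.0.0"}}
```

The resulting value can then be passed to the container with the standard `-e` option of `docker run`, for example `docker run -e SAGA_CONFIG="$SAGA_CONFIG" <image>`.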
Note |
---|
SAGA_CONFIG was a temporary measure: we needed a way to configure environment variables quickly. This method has been deprecated (you can still use it). You can now use your own config files with all the environment variables you want to configure, and SAGA will use them accordingly. |
SAGA_CONFIG_PATH: Path to a config file that will be used instead of the default ones. Added to make mounting a docker volume easier.
Info |
---|
This path points to a config file, e.g. <PATH>/config.json. The files that need to be in the same path as the config file are: |
JAVA_MAX_META_MEMORY : The amount of initial memory Saga will start with. Default is '1024m'.
Info |
---|
All these values can be marked as 'g' for GB, 'm' for MB and 'k' for KB. |
SAGA_ELASTIC_PASSWORD: The password to use on the providers section when connecting to Elastic/Opensearch.
Info |
---|
These last two environment variables are used together: they call the "SAGA-Secure" jar to encrypt the password and automatically update the values in the SAGA config file. |
SAGA_DISABLE_SSL_VERIFY: Disables all SSL certificate verification in SAGA for all requests (HTTP/Direct to provider). (USE IT ONLY FOR DEVELOPMENT/STAGING)
ELASTIC_CA_PATH: The custom certificate path to connect to Elastic/OpenSearch via HTTPS.
SAGA_CA_PATH: The custom certificate path to use for enabling HTTPS in SAGA.
Info |
---|
These last two environment variables are used inside the Docker entrypoint to import certificates into keystores. THEY NEED TO POINT TO AN EXISTING FILE INSIDE THE CONTAINER. |
Note | |||||
---|---|---|---|---|---|
| |||||
If you need to set multiple variables at the same time, you can use an environment file and pass it to the container at runtime to have them ready to use. We have an example file in the repo of SAGA, but it looks something like this:
|
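As an illustration only, an environment file built from the variables described on this page might look like the sketch below. The file name `saga.env`, every value, the in-container paths and the helper function are assumptions, not copied from the example file in the SAGA repo. `--env-file` is a standard `docker run` option; it reads plain `KEY=VALUE` lines and does not strip quotes, so the values are left unquoted.

```shell
# Write a hypothetical environment file; all values are placeholders.
cat > saga.env <<'EOF'
SAGA_CONFIG_PATH=/saga/mounted/config.json
JAVA_MAX_META_MEMORY=1024m
SAGA_ELASTIC_PASSWORD=changeme
ELASTIC_CA_PATH=/saga/mounted/elastic-ca.crt
SAGA_CA_PATH=/saga/mounted/saga.crt
EOF

# Pass the whole file to the container at runtime.
# DOCKER can be overridden, e.g. DOCKER="sudo docker" when docker requires sudo.
run_saga_with_env_file() {
    ${DOCKER:-docker} run --env-file saga.env \
        docker.repository.sca.accenture.com/docker/saga-server:1.3.3-javacio17-base
}

# Usage: run_saga_with_env_file
```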
Note | ||
---|---|---|
| ||
You can also use volumes to mount specific files into the running container, e.g. the certificate files and the config.json. See the Docker documentation regarding the configuration and usage of volumes. In Kubernetes you can mount volumes as well. |
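A minimal sketch of such a mount, assuming a local `config.json` and a CA certificate named `elastic-ca.crt` in the current directory. The in-container paths under `/saga/mounted`, the published port and the helper function are hypothetical; `-v`, `-e` and `-p` are standard `docker run` options, and the variable names are the ones described earlier on this page.

```shell
IMAGE="docker.repository.sca.accenture.com/docker/saga-server:1.3.3-javacio17-base"

# Hypothetical in-container locations; adjust to the layout of your image.
CONFIG_IN_CONTAINER="/saga/mounted/config.json"
CA_IN_CONTAINER="/saga/mounted/elastic-ca.crt"

# DOCKER can be overridden, e.g. DOCKER="sudo docker" when docker requires sudo.
run_saga_with_volumes() {
    # Mount both files, then point SAGA at the mounted copies.
    ${DOCKER:-docker} run -p 8080:8080 \
        -v "$PWD/config.json:$CONFIG_IN_CONTAINER" \
        -v "$PWD/elastic-ca.crt:$CA_IN_CONTAINER" \
        -e SAGA_CONFIG_PATH="$CONFIG_IN_CONTAINER" \
        -e ELASTIC_CA_PATH="$CA_IN_CONTAINER" \
        "$IMAGE"
}

# Usage: run_saga_with_volumes
```

Because both variables point at the mounted paths, they satisfy the requirement that the files exist inside the container.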
As well as the SAGA image, you can download the Enterprise Search UI docker image and run it where you need it using:
Code Block |
---|
docker pull docker.repository.sca.accenture.com/docker/esui:latest |
Some of the basic configuration that can be done via environment variables is the following:
For the Saga-Python-Bridge, the image can be downloaded using:
Code Block |
---|
//================================ CIO based images ==========================================
docker pull docker.repository.sca.accenture.com/docker/saga-python-bridge:1.3.3-ubuntu22.04cio-base
// ONLY FOR 1.3.4 ONWARDS
docker pull docker.repository.sca.accenture.com/docker/saga-python-bridge:1.3.4-ubuntu22.04cio-basic
docker pull docker.repository.sca.accenture.com/docker/saga-python-bridge:1.3.4-ubuntu22.04cio-all
//=============================== Debian 12 based images ========================================
// ONLY FOR 1.3.4 ONWARDS
docker pull docker.repository.sca.accenture.com/docker/saga-python-bridge:1.3.4-debian12-basic
docker pull docker.repository.sca.accenture.com/docker/saga-python-bridge:1.3.4-debian12-all |
Info |
---|
Regarding the images, the "basic" tag has just the dependencies required to make the Python Bridge work with no models, and the "all" tag has all the libraries installed. The difference in size is between 5 and 10 GB, depending on the image. |
From there you can configure and use it as you need. However, there are some things you need to know before pulling and running this Docker image:
If it is your first time pulling the images, you need to log in with Docker to our Artifactory first; otherwise, you will not be able to pull the image. (ONLY NEEDED FOR CIO IMAGES)
Code Block |
---|
docker login docker.repository.sca.accenture.com |
Then, as mentioned before, enter your Artifactory credentials.
PB_CONFIG: this JSON-like string contains the full config that overrides the default one shipped with the image. It must be a single-line string because of environment variable limitations in Docker.
For example:
This is a fragment of the config for the Python Bridge to run.
Code Block | ||
---|---|---|
| ||
{
"host": "0.0.0.0",
"port": 5000,
"ssl": {
"enabled": false,
"secure_port": 5443,
"certificate": {
"cert": "cert.pem",
"key": "key.pem"
}
},
"authentication": {
"enabled": false,
"credentials": {
"user": "admin",
"password": "password"
}
},
"threads": 30,
"logging": {
"level": "info",
"loggers": {
"werkzeug": "info",
"gensim.utils": "warn",
"pytorch_pretrained_bert.modeling": "warn",
"pytorch_pretrained_bert.tokenization": "warn"
}
},
"models_data_dir": "models_data",
"model_types": {
"Classification_watcher_example": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"Classification_watcher_example"
],
"default_model": "Classification_watcher_example"
},
"LatentSemanticIndexing": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"lsi"
]
},
"Bert": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"bert-base-uncased"
],
"default_model": "bert-base-uncased"
},
"BioBert": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"biobert-base-cased-v1.2"
],
"default_model": "biobert-base-cased-v1.2"
},
"BertQA": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"bert-large-uncased-whole-word-masking-finetuned-squad"
],
"default_model": "bert-large-uncased-whole-word-masking-finetuned-squad"
},
"SBert": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"bert-base-nli-stsb-mean-tokens",
"bert-base-nli-mean-tokens",
"distilbert-base-nli-stsb-mean-tokens"
],
"default_model": "bert-base-nli-stsb-mean-tokens"
},
"SentimentAnalysisVader": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"vader"
]
},
"SentimentAnalysisTextBlob": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"textBlob"
]
},
"TfidfVectorizer": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"Tfidf"
]
},
"GTRT5": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"gtr-t5-base",
"gtr-t5-xl"
],
"default_model": "gtr-t5-base"
},
"T5": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"sentence-t5-base"
]
},
"MiniLM": {
"enabled": false,
"input_data_as_tokens": false,
"model_names": [
"all-MiniLM-L12-v2"
]
}
}
} |
And this is the one-liner style that the PB_CONFIG environment variable accepts:
Code Block | ||
---|---|---|
| ||
{"host":"0.0.0.0","port":5000,"ssl":{"enabled":false,"secure_port":5443,"certificate":{"cert":"cert.pem","key":"key.pem"}},"authentication":{"enabled":false,"credentials":{"user":"admin","password":"password"}},"threads":30,"logging":{"level":"info","loggers":{"werkzeug":"info","gensim.utils":"warn","pytorch_pretrained_bert.modeling":"warn","pytorch_pretrained_bert.tokenization":"warn"}},"models_data_dir":"models_data","model_types":{"Classification_watcher_example":{"enabled":true,"input_data_as_tokens":false,"model_names":["Classification_watcher_example"],"default_model":"Classification_watcher_example"},"LatentSemanticIndexing":{"enabled":false,"input_data_as_tokens":false,"model_names":["lsi"]},"Bert":{"enabled":true,"input_data_as_tokens":false,"model_names":["bert-base-uncased"],"default_model":"bert-base-uncased"},"BioBert":{"enabled":false,"input_data_as_tokens":false,"model_names":["biobert-base-cased-v1.2"],"default_model":"biobert-base-cased-v1.2"},"BertQA":{"enabled":false,"input_data_as_tokens":false,"model_names":["bert-large-uncased-whole-word-masking-finetuned-squad"],"default_model":"bert-large-uncased-whole-word-masking-finetuned-squad"},"SBert":{"enabled":false,"input_data_as_tokens":false,"model_names":["bert-base-nli-stsb-mean-tokens","bert-base-nli-mean-tokens","distilbert-base-nli-stsb-mean-tokens"],"default_model":"bert-base-nli-stsb-mean-tokens"},"SentimentAnalysisVader":{"enabled":true,"input_data_as_tokens":false,"model_names":["vader"]},"SentimentAnalysisTextBlob":{"enabled":true,"input_data_as_tokens":false,"model_names":["textBlob"]},"TfidfVectorizer":{"enabled":true,"input_data_as_tokens":false,"model_names":["Tfidf"]},"GTRT5":{"enabled":true,"input_data_as_tokens":false,"model_names":["gtr-t5-base","gtr-t5-xl"],"default_model":"gtr-t5-base"},"T5":{"enabled":true,"input_data_as_tokens":false,"model_names":["sentence-t5-base"]},"MiniLM":{"enabled":true,"input_data_as_tokens":false,"model_names":["all-MiniLM-L12-v2"]}}} |
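A one-liner like this can be derived from a config file instead of being edited by hand. The sketch below, which assumes `python3` is available, switches on the model types named on the command line and prints the compact result. The trimmed stand-in file and the variable names are for illustration only. Note that flipping `enabled` in the config does not install any dependencies.

```shell
# Stand-in for the Python Bridge config.json, trimmed to two model types.
pb_config_file="$(mktemp)"
cat > "$pb_config_file" <<'EOF'
{
  "host": "0.0.0.0",
  "port": 5000,
  "model_types": {
    "Bert": {"enabled": false, "model_names": ["bert-base-uncased"]},
    "SentimentAnalysisVader": {"enabled": false, "model_names": ["vader"]}
  }
}
EOF

# Enable the model types given as arguments and print single-line JSON.
PB_CONFIG="$(python3 - "$pb_config_file" SentimentAnalysisVader <<'PY'
import json, sys
path, wanted = sys.argv[1], set(sys.argv[2:])
with open(path) as fh:
    cfg = json.load(fh)
for name, model in cfg["model_types"].items():
    if name in wanted:
        model["enabled"] = True
print(json.dumps(cfg, separators=(",", ":")))
PY
)"

echo "$PB_CONFIG"
```

The value can then be passed with the standard `-e` option of `docker run`, for example `docker run -e PB_CONFIG="$PB_CONFIG" <image>`.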
Note |
---|
This PB_CONFIG method has been deprecated (you can still use it). You can now use your own config files with all the environment variables you want to configure, and the Python Bridge will use them accordingly.
|
Info |
---|
If you do not need to install libraries (for example, if the image already has them), this environment variable should not be used! |
VADER_ENABLED: Enables the Vader models inside the config file and installs the extra dependencies related to them.
Info |
---|
All these "enabled" variables must always be set if you are using any of these models. Alternatively, you can use the config.json file and enable them from there, but you will also need to install the required dependencies manually. |
Note | |||||
---|---|---|---|---|---|
| |||||
If you need to set multiple variables at the same time, you can use an environment file and pass it to the container at runtime to have them ready to use. We have an example file in the repo of Python Bridge, but it looks something like this:
|
Note | ||
---|---|---|
| ||
You can also use volumes to mount specific files into the running container, e.g. the certificate files and the config.json. See the Docker documentation regarding the configuration and usage of volumes. In Kubernetes you can mount volumes as well. |