Python Bridge can be configured using the config.json file in the config folder:
{
    "host": "0.0.0.0",
    "port": 5000,
    "ssl": {
        "enabled": false,
        "secure_port": 5443,
        "certificate": {
            "cert": "cert.pem",
            "key": "key.pem"
        }
    },
    "authentication": {
        "enabled": false,
        "credentials": {
            "user": "admin",
            "password": "password"
        }
    },
    "threads": 30,
    "logging": {
        "level": "info",
        "loggers": {
            "werkzeug": "info",
            "gensim.utils": "warn",
            "pytorch_pretrained_bert.modeling": "warn",
            "pytorch_pretrained_bert.tokenization": "warn"
        }
    },
    "models_data_dir": "models_data",
    "model_types": {
        "LatentSemanticIndexing": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["lsi"]
        },
        "Bert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-uncased"],
            "default_model": "bert-base-uncased"
        },
        "BioBert": {
            "enabled": false,
            "input_data_as_tokens": false,
            "model_names": ["biobert-base-cased-v1.2"],
            "default_model": "biobert-base-cased-v1.2"
        },
        "BertQA": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-large-uncased-whole-word-masking-finetuned-squad"],
            "default_model": "bert-large-uncased-whole-word-masking-finetuned-squad"
        },
        "SBert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-nli-stsb-mean-tokens", "bert-base-nli-mean-tokens", "distilbert-base-nli-stsb-mean-tokens"],
            "default_model": "bert-base-nli-stsb-mean-tokens"
        },
        "SentimentAnalysisVader": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["vader"]
        },
        "SentimentAnalysisTextBlob": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["textBlob"]
        },
        "TfidfVectorizer": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["Tfidf"]
        }
    }
}
host ( type=string | default=0.0.0.0 | required ) - Host address on which the service listens for requests. With the default value the service accepts requests from all interfaces.
port ( type=integer | default=5000 | required ) - Port on which the service listens for requests.
ssl ( type=json | optional ) - Specifies whether SSL is enabled and, if so, its settings (secure port and certificate files).
authentication ( type=json | optional ) - Specifies whether basic authentication is enabled and, if so, its credentials.
logging ( type=json | optional ) - Sets the logging level for the root logger and, in the loggers subsection, the level for specific loggers.
models_data_dir ( type=string | default=models_data | required ) - Path to the folder storing the models.
model_types ( type=json | required ) - Section holding the model types to load. If a new model type is implemented, it must also be registered here.
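As a minimal sketch of how these settings can be consumed from Python (the function names and default path are illustrative assumptions, not part of Python Bridge's actual API):

```python
import json


def load_config(path="config/config.json"):
    """Load the Python Bridge configuration file from disk.

    The default path assumes config.json lives in the config folder,
    as described above.
    """
    with open(path) as f:
        return json.load(f)


def enabled_model_types(config):
    """Return the names of the model types whose 'enabled' flag is true."""
    return [name for name, settings in config["model_types"].items()
            if settings.get("enabled", False)]
```

With the sample configuration above, `enabled_model_types` would skip `BioBert` (its `enabled` flag is `false`) and return the remaining model types.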
Each model type is declared with a specific structure (booleans use JSON's lowercase true/false):

"Model_Type": {
    "enabled": true|false,
    "input_data_as_tokens": true|false,
    "model_names": ["name1", "name2", "name3"],
    "default_model": "name1"
}
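A small validator can catch malformed entries at startup. This is a sketch, not part of Python Bridge itself; the rule that default_model must appear in model_names is an assumption inferred from the sample configuration above.

```python
REQUIRED_KEYS = {"enabled", "input_data_as_tokens", "model_names"}


def validate_model_type(name, entry):
    """Check that a model_types entry follows the structure described above.

    'default_model' is optional, but when present it should be one of the
    names listed in 'model_names' (assumption based on the sample config).
    """
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        raise ValueError(f"{name}: missing keys {sorted(missing)}")
    default = entry.get("default_model")
    if default is not None and default not in entry["model_names"]:
        raise ValueError(f"{name}: default_model {default!r} not in model_names")
```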
A model can be retrained, or a new model can be generated with the same algorithm but different parameters; each such variant can be considered a different version of the model. Every version is stored in its own numbered folder inside the model_name folder. An example of the directory tree and how each folder is named is shown below:
models_data
│
├───Model_Name
│   ├───name1
│   │   ├───1
│   │   ├───2
│   │   └───3
│   ├───name2
│   │   ├───1
│   │   └───2
│   └───name3
│       └───1
├───Bert
│   └───bert-base-uncased
│       └───1
├───BertQA
│   └───bert-large-uncased-whole-word-masking-finetuned-squad
│       └───1
├───LatentSemanticIndexing
│   └───lsi
│       ├───1
│       └───2
├───SBert
│   ├───bert-base-nli-mean-tokens
│   │   └───1
│   ├───bert-base-nli-stsb-mean-tokens
│   │   └───1
│   └───distilbert-base-nli-stsb-mean-tokens
│       └───1
├───SentimentAnalysisTextBlob
│   └───textBlob
│       └───1
├───SentimentAnalysisVader
│   └───vader
│       └───1
└───TfidfVectorizer
    └───tfidf
        └───1