Python Bridge can be configured using the config.json file in the config folder:
{
    "host": "0.0.0.0",
    "port": 5000,
    "ssl": {
        "enabled": false,
        "secure_port": 5443,
        "certificate": {
            "cert": "cert.pem",
            "key": "key.pem"
        }
    },
    "authentication": {
        "enabled": false,
        "credentials": {
            "user": "admin",
            "password": "password"
        }
    },
    "threads": 30,
    "logging": {
        "level": "info",
        "loggers": {
            "werkzeug": "info",
            "gensim.utils": "warn",
            "pytorch_pretrained_bert.modeling": "warn",
            "pytorch_pretrained_bert.tokenization": "warn"
        }
    },
    "models_data_dir": "models_data",
    "model_types": {
        "LatentSemanticIndexing": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["lsi"]
        },
        "Bert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-uncased"],
            "default_model": "bert-base-uncased"
        },
        "BioBert": {
            "enabled": false,
            "input_data_as_tokens": false,
            "model_names": ["biobert-base-cased-v1.2"],
            "default_model": "biobert-base-cased-v1.2"
        },
        "BertQA": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-large-uncased-whole-word-masking-finetuned-squad"],
            "default_model": "bert-large-uncased-whole-word-masking-finetuned-squad"
        },
        "SBert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-nli-stsb-mean-tokens", "bert-base-nli-mean-tokens", "distilbert-base-nli-stsb-mean-tokens"],
            "default_model": "bert-base-nli-stsb-mean-tokens"
        },
        "SentimentAnalysisVader": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["vader"]
        },
        "SentimentAnalysisTextBlob": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["textBlob"]
        },
        "TfidfVectorizer": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["Tfidf"]
        }
    }
}
host ( type=string | default=0.0.0.0 | required ) - Host address on which the service listens for requests. With the default value the service accepts requests from all interfaces.
port ( type=integer | default=5000 | required ) - Port on which the service listens for requests.
ssl ( type=json | optional ) - Specifies whether SSL is enabled and, if so, its settings (secure port and certificate files).
authentication ( type=json | optional ) - Specifies whether basic authentication is enabled and, if so, its credentials.
logging ( type=json | optional ) - Sets the logging level for the root logger and, in the loggers subsection, the level for specific loggers.
models_data_dir ( type=string | default=models_data | required ) - Path to the folder storing the models.
model_types ( type=json | required ) - Section holding the model types to load. If a new model type is implemented, it must also be registered here.
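As a minimal sketch of how these settings can be consumed from Python (the function names and default path are illustrative assumptions, not part of Python Bridge's actual API):

```python
import json


def load_config(path="config/config.json"):
    """Load the Python Bridge configuration file from disk.

    The default path assumes config.json lives in the config folder,
    as described above.
    """
    with open(path) as f:
        return json.load(f)


def enabled_model_types(config):
    """Return the names of the model types whose 'enabled' flag is true."""
    return [name for name, settings in config["model_types"].items()
            if settings.get("enabled", False)]
```

With the sample configuration above, `enabled_model_types` would skip `BioBert` (its `enabled` flag is `false`) and return the remaining model types.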
Each model type is declared with a specific structure (booleans use JSON's lowercase true/false):

"Model_Type": {
    "enabled": true|false,
    "input_data_as_tokens": true|false,
    "model_names": ["name1", "name2", "name3"],
    "default_model": "name1"
}
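A small validator can catch malformed entries at startup. This is a sketch, not part of Python Bridge itself; the rule that default_model must appear in model_names is an assumption inferred from the sample configuration above.

```python
REQUIRED_KEYS = {"enabled", "input_data_as_tokens", "model_names"}


def validate_model_type(name, entry):
    """Check that a model_types entry follows the structure described above.

    'default_model' is optional, but when present it should be one of the
    names listed in 'model_names' (assumption based on the sample config).
    """
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        raise ValueError(f"{name}: missing keys {sorted(missing)}")
    default = entry.get("default_model")
    if default is not None and default not in entry["model_names"]:
        raise ValueError(f"{name}: default_model {default!r} not in model_names")
```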
A model can be retrained, or a new model can be generated with the same algorithm but different parameters; each such variant can be considered a different version of the model. Every version is stored in its own numbered folder inside the model_name folder. An example of the directory tree and how each folder is named is shown below:
models_data
│
├───Model_Name
│   ├───name1
│   │   ├───1
│   │   ├───2
│   │   └───3
│   ├───name2
│   │   ├───1
│   │   └───2
│   └───name3
│       └───1
├───Bert
│   └───bert-base-uncased
│       └───1
├───BertQA
│   └───bert-large-uncased-whole-word-masking-finetuned-squad
│       └───1
├───LatentSemanticIndexing
│   └───lsi
│       ├───1
│       └───2
├───SBert
│   ├───bert-base-nli-mean-tokens
│   │   └───1
│   ├───bert-base-nli-stsb-mean-tokens
│   │   └───1
│   └───distilbert-base-nli-stsb-mean-tokens
│       └───1
├───SentimentAnalysisTextBlob
│   └───textBlob
│       └───1
├───SentimentAnalysisVader
│   └───vader
│       └───1
└───TfidfVectorizer
    └───tfidf
        └───1