Python Bridge can be configured using the config.json file in the config folder:

Default Python Bridge Configuration
{
    "host": "0.0.0.0",
    "port": 5000,
    "ssl": {
        "enabled": false,
        "secure_port": 5443,
        "certificate": {
            "cert": "cert.pem",
            "key": "key.pem"
        }
    },
    "authentication": {
        "enabled": false,
        "credentials": {
            "user": "admin",
            "password": "password"
        }
    },
    "threads": 30,
    "logging": {
        "level": "info",
        "loggers": {
            "werkzeug": "info",
            "gensim.utils": "warn",
            "pytorch_pretrained_bert.modeling": "warn",
            "pytorch_pretrained_bert.tokenization": "warn"
        }
    },
    "models_data_dir": "models_data",
    "model_types": {
        "LatentSemanticIndexing" : {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["lsi"]
        },
        "Bert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-uncased"],
            "default_model": "bert-base-uncased"
        },
        "BioBert": {
            "enabled": false,
            "input_data_as_tokens": false,
            "model_names": ["biobert-base-cased-v1.2"],
            "default_model": "biobert-base-cased-v1.2"
        },
        "BertQA": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-large-uncased-whole-word-masking-finetuned-squad"],
            "default_model": "bert-large-uncased-whole-word-masking-finetuned-squad"
        },
        "SBert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-nli-stsb-mean-tokens", "bert-base-nli-mean-tokens", "distilbert-base-nli-stsb-mean-tokens"],
            "default_model": "bert-base-nli-stsb-mean-tokens"
        },
        "SentimentAnalysisVader": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["vader"]
        },
        "SentimentAnalysisTextBlob": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["textBlob"]
        },
        "TfidfVectorizer": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["Tfidf"]
        }
    }
}


Line 2, host ( type=string | default=0.0.0.0 | required ) - Identifies the address on which the service listens for requests. By default, the service accepts requests on all network interfaces.

Line 3, port ( type=integer | default=5000 | required ) - Port on which the service listens for requests
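As a rough illustration of how these two settings might be read, the sketch below loads a config file and returns the bind address. The function name load_bind_address is hypothetical, and falling back to the documented defaults when a key is missing is an assumption, not confirmed Python Bridge behavior.

```python
import json

# Documented defaults for the two settings (taken from the default config above).
DEFAULTS = {"host": "0.0.0.0", "port": 5000}


def load_bind_address(config_path):
    """Return (host, port) from a Python Bridge style config file.

    Hypothetical helper: missing keys fall back to the documented defaults.
    """
    with open(config_path, encoding="utf-8") as fh:
        config = json.load(fh)

    host = config.get("host", DEFAULTS["host"])
    port = config.get("port", DEFAULTS["port"])

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 0 < port < 65536:
        raise ValueError(f"port must be an integer in 1..65535, got {port!r}")
    return host, port
```

Calling load_bind_address("config/config.json") on the default file would be expected to yield the host and port shown at the top of the configuration.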

ssl ( type=json | optional ) - Specifies the availability and settings for SSL operation

  • enabled ( type=boolean | default=false | required ) - Enables or disables SSL configuration
  • secure_port ( type=integer | default=5443 | required ) - The port number for secure connection
  • cert ( type=string | required ) - The path to the certificate file
  • key ( type=string | required ) - The path to the key file
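The ssl section can be checked before the server starts. The following is a minimal sketch, assuming that cert and key are nested under certificate as in the default configuration; resolve_ssl is a hypothetical helper and not part of Python Bridge.

```python
def resolve_ssl(config):
    """Return the SSL settings to use, or None when SSL is absent or disabled.

    Hypothetical helper: raises ValueError if SSL is enabled but the
    certificate or key path is missing.
    """
    ssl_cfg = config.get("ssl")
    if not ssl_cfg or not ssl_cfg.get("enabled", False):
        return None

    certificate = ssl_cfg.get("certificate", {})
    missing = [name for name in ("cert", "key") if not certificate.get(name)]
    if missing:
        raise ValueError(f"ssl is enabled but missing: {', '.join(missing)}")

    return {
        # 5443 is the documented default for secure_port.
        "port": ssl_cfg.get("secure_port", 5443),
        "cert": certificate["cert"],
        "key": certificate["key"],
    }
```

Returning None for a disabled section keeps the caller's logic simple: a plain listener is used unless a complete SSL configuration is present.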

authentication ( type=json | optional ) - contains the settings for enabling basic authentication

  • enabled ( type=boolean | default=false | required ) - enables or disables basic authentication
  • user ( type=string | default=admin | required ) - the user id for basic authentication
  • password ( type=string | default=password | required ) - the authentication password
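To illustrate what basic authentication against these credentials involves, the sketch below validates an HTTP Authorization header using only the standard library. It is not the server's actual implementation; check_basic_auth is a hypothetical name, and the header format follows the standard "Basic <base64(user:password)>" scheme.

```python
import base64
import hmac


def check_basic_auth(header_value, auth_config):
    """Return True if the request may proceed under the given authentication section."""
    # When authentication is disabled, every request is allowed.
    if not auth_config.get("enabled", False):
        return True
    if not header_value or not header_value.startswith("Basic "):
        return False

    try:
        decoded = base64.b64decode(header_value[len("Basic "):], validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return False

    user, sep, password = decoded.partition(":")
    if not sep:
        return False

    creds = auth_config["credentials"]
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode("utf-8"), creds["user"].encode("utf-8"))
    pass_ok = hmac.compare_digest(password.encode("utf-8"), creds["password"].encode("utf-8"))
    return user_ok and pass_ok
```

Because the default credentials are admin and password, they should be changed before authentication is enabled on a reachable host.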

In the logging section, the logging level can be specified for the root logger and for specific loggers

  • level ( type=string | default=info | required ) - Level of the root logger and, by extension, of all loggers without a set level
  • loggers ( type=json | optional ) - Section for specific loggers; each logger is identified by its name and level (e.g. "name": "level")
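The logging section maps naturally onto Python's standard logging module. The sketch below shows one way such a section could be applied; apply_logging_config is a hypothetical helper, and the way Python Bridge itself applies these levels may differ.

```python
import logging


def apply_logging_config(logging_config):
    """Set the root logger level and any per-logger levels from the config section."""

    def to_level(name):
        # "info" -> logging.INFO, "warn" -> logging.WARN (an alias of logging.WARNING).
        level = getattr(logging, name.upper(), None)
        if not isinstance(level, int):
            raise ValueError(f"unknown logging level: {name!r}")
        return level

    logging.getLogger().setLevel(to_level(logging_config["level"]))
    for logger_name, level_name in logging_config.get("loggers", {}).items():
        logging.getLogger(logger_name).setLevel(to_level(level_name))


apply_logging_config({
    "level": "info",
    "loggers": {"werkzeug": "info", "gensim.utils": "warn"},
})
```

Loggers that are not listed under loggers keep no level of their own and therefore inherit the root level.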

models_data_dir ( type=string | default=models_data | required ) - path to folder storing the models

model_types ( type=json | required ) - section holding the types of model to load. If a new type is added, it needs to be added here too


Security features such as SSL and Basic Authentication are only supported when using server-cherrypy.py.

Model Type Structure

Each type of model is constructed with a specific structure:

"Model_Type" : {
  "enabled": {True, False},
  "input_data_as_tokens": {True, False},
  "model_names": ["name1", "name2", "name3"],
  "default_model": "name1"
}
  • Model_Type ( type=json | optional ) - The model type identifies the logic to implement (i.e., the name of the class).
    • enabled ( type=boolean | optional ) - Tells Python Bridge whether to include this model (True) or leave it out (False).
    • input_data_as_tokens ( type=boolean | optional ) - If set to True, the model expects to receive the sentence as Saga tokens; if set to False, it expects the sentence as raw text.
    • model_names ( type=string array | optional ) - Holds the names of the actual models to implement.
      • Each model is stored in a folder with the same name inside the models_data folder, giving a path like "models_data\Model_Name\name1".
    • default_model ( type=string | optional ) - Holds the name of the default model for this model type.
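The structure above can be walked to find which models are enabled and where their folders are expected. This is a minimal sketch, assuming the folder layout described in the model_names entry; enabled_model_dirs is a hypothetical helper, and treating a missing enabled flag as False is an assumption.

```python
from pathlib import Path


def enabled_model_dirs(config):
    """Map each enabled model type to the expected folders of its models."""
    base = Path(config["models_data_dir"])
    result = {}
    for type_name, type_cfg in config["model_types"].items():
        # Assumption: a model type without an "enabled" flag is skipped.
        if not type_cfg.get("enabled", False):
            continue
        result[type_name] = [
            base / type_name / model_name
            for model_name in type_cfg.get("model_names", [])
        ]
    return result


example = {
    "models_data_dir": "models_data",
    "model_types": {
        "Bert": {"enabled": True, "model_names": ["bert-base-uncased"]},
        "BioBert": {"enabled": False, "model_names": ["biobert-base-cased-v1.2"]},
    },
}
for type_name, dirs in enabled_model_dirs(example).items():
    print(type_name, [str(d) for d in dirs])
```

Using pathlib keeps the path separator correct on both Windows and POSIX systems.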

Model Versioning

A model can be retrained, or a new model can be generated using the same algorithm with different parameters, so each of those results can be considered a different version. Each of these versions can be stored in its own folder inside the model_name folder. An example of the directory tree and how each folder is named can be seen below:

models_data
│
├───Model_Name
│   ├───name1
│   │   ├───1
│   │   ├───2
│   │   └───3
│   ├───name2
│   │   ├───1
│   │   └───2
│   └───name3
│       └───1
├───Bert
│   └───bert-base-uncased
│       └───1
├───BertQA
│   └───bert-large-uncased-whole-word-masking-finetuned-squad
│       └───1
├───LatentSemanticIndexing
│   └───lsi
│       ├───1
│       └───2
├───SBert
│   ├───bert-base-nli-mean-tokens
│   │   └───1
│   ├───bert-base-nli-stsb-mean-tokens
│   │   └───1
│   └───distilbert-base-nli-stsb-mean-tokens
│       └───1
├───SentimentAnalysisTextBlob
│   └───textBlob
│       └───1
├───SentimentAnalysisVader
│   └───vader
│       └───1
└───TfidfVectorizer
    └───Tfidf
        └───1
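Given the numbered version folders in the tree above, a caller may want to locate the newest one. The sketch below picks the folder with the highest number; this selection rule is an assumption, since the text does not state which version Python Bridge loads, and latest_version_dir is a hypothetical helper.

```python
from pathlib import Path


def latest_version_dir(models_data_dir, model_type, model_name):
    """Return the highest-numbered version folder of a model, or None if there is none."""
    model_dir = Path(models_data_dir) / model_type / model_name
    if not model_dir.is_dir():
        return None

    # Only folders whose names are plain numbers count as versions.
    versions = [p for p in model_dir.iterdir() if p.is_dir() and p.name.isdecimal()]
    if not versions:
        return None

    # Compare numerically so that "10" sorts after "2".
    return max(versions, key=lambda p: int(p.name))
```

For the example tree, latest_version_dir("models_data", "LatentSemanticIndexing", "lsi") would point at the folder named 2.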