The implementation of each model can be completely different, so we will not go into the details of any particular model. Each model implementation in Python Bridge is a set of expected functions focused on functionality.

Model Class Creation

Every model class must be under the package models, inside the Python Bridge folder, and must extend the class ModelWrapper, which can be imported from models.model_wrapper. An example of a model class with the bare minimum can be found below:

from models.model_wrapper import ModelWrapper


class Test(ModelWrapper):
    
    def __init__(self):
        super().__init__()

    def initialize(self, model_dir, **kwargs):
        pass

    def load(self, model_dir, name, version):
        pass

    def save(self):
        pass

    def clear(self):
        pass

    def feed(self, data):
        pass

    def train(self, **kwargs):
        pass

    def predict(self, data: list):
        pass

    def regress(self, data):
        pass

    def classify(self, data) -> (str, float):
        pass

Implementation Class

Every class extending ModelWrapper must implement the following methods. If for any reason you don't need one of them, you can leave its body as pass.

Initialize

def initialize(self, model_dir, **kwargs):

Receives the path to the model type and the configuration for the path.

Load

def load(self, model_dir, name, version):

Receives the path to the model type, the name of the model, and its version. The model is expected to be loaded in this method.
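
As a rough illustration of the loading step, the sketch below joins the three arguments into a folder path and reads a model file from it. It assumes that the version folder sits at model_dir/name/version and that the model is stored as a pickle file called model.pkl. The file name, the pickle format, and the TinyModel class are hypothetical and are not part of Python Bridge; a real model class would extend ModelWrapper and use whatever storage format its library expects.

```python
import pickle
import tempfile
from pathlib import Path


class TinyModel:
    """Hypothetical holder class; it does not extend the real ModelWrapper."""

    MODEL_FILE = "model.pkl"  # assumed file name, not defined by Python Bridge

    def __init__(self):
        self.model = None
        self.model_path = None

    def load(self, model_dir, name, version):
        # Assumed layout: <model_dir>/<name>/<version>/model.pkl
        self.model_path = Path(model_dir) / name / str(version) / self.MODEL_FILE
        if not self.model_path.is_file():
            raise FileNotFoundError(f"No model file at {self.model_path}")
        with self.model_path.open("rb") as handle:
            self.model = pickle.load(handle)

    def save(self):
        # Writes the currently loaded model back to the path it came from.
        if self.model is None or self.model_path is None:
            raise RuntimeError("No model is loaded")
        with self.model_path.open("wb") as handle:
            pickle.dump(self.model, handle)


def demo():
    # Build a throwaway folder tree, store a dummy model, and load it back.
    with tempfile.TemporaryDirectory() as root:
        version_dir = Path(root) / "Test" / "test" / "1"
        version_dir.mkdir(parents=True)
        with (version_dir / TinyModel.MODEL_FILE).open("wb") as handle:
            pickle.dump({"weights": [1, 2, 3]}, handle)

        wrapper = TinyModel()
        wrapper.load(Path(root) / "Test", "test", 1)
        return wrapper.model
```

Raising an error when the file is missing is a design choice of this sketch; it makes a wrong name or version visible immediately instead of leaving the wrapper without a model.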

Save

def save(self):

Saves the currently loaded model.

Clear

def clear(self):

Removes the currently loaded model and any training data in memory.

Feed

def feed(self, data):

Receives a list of string tokens to be added to the training data.

Train

def train(self, **kwargs):

Trains the model with the fed documents. The model can be either kept in memory or saved.
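
To make the relationship between feed, train, and clear more concrete, here is a minimal sketch of a class that accumulates documents in memory and builds a token-count "model" from them. ToyTrainer, the min_count option, and the idea that each feed call adds exactly one document are assumptions made for illustration; they are not taken from any Python Bridge model.

```python
from collections import Counter


class ToyTrainer:
    """Hypothetical example; not one of the Python Bridge model classes."""

    def __init__(self):
        self.documents = []  # training data kept in memory
        self.model = None    # set by train()

    def feed(self, data):
        # Assumption: each call adds one document, given as a list of string tokens.
        self.documents.append(list(data))

    def train(self, **kwargs):
        # 'min_count' is an assumed option used only in this sketch.
        min_count = kwargs.get("min_count", 1)
        counts = Counter(token for doc in self.documents for token in doc)
        self.model = {tok: n for tok, n in counts.items() if n >= min_count}

    def clear(self):
        # Drop both the trained model and the accumulated documents.
        self.documents = []
        self.model = None


trainer = ToyTrainer()
trainer.feed(["red", "apple"])
trainer.feed(["green", "apple"])
trainer.train(min_count=2)
print(trainer.model)  # {'apple': 2}
```

Here the trained model stays in memory; an implementation that prefers to persist it could call its own save method at the end of train.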

Predict

def predict(self, data: list):

Retrieves a vector or an array of vectors from processing the data with the loaded model, which is returned inside a JSON { 'vector': [ ] }. The value of the vector key must always be an array.
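
The return shape described above can be sketched with a small helper that wraps whatever the model produces so that the value under 'vector' is always a list. The helper name to_vector_response is hypothetical and only illustrates the shape of the response; how a real predict method computes its vectors depends entirely on the model.

```python
import json


def to_vector_response(result):
    """Wrap a vector, or a list of vectors, so that 'vector' is always a list."""
    # Objects such as NumPy arrays expose tolist(); convert them first.
    if hasattr(result, "tolist"):
        result = result.tolist()
    # A bare number becomes a one-element list.
    if not isinstance(result, (list, tuple)):
        result = [result]
    return {"vector": list(result)}


single = to_vector_response([0.1, 0.2, 0.3])
batch = to_vector_response([[0.1, 0.2], [0.3, 0.4]])
print(json.dumps(single))  # {"vector": [0.1, 0.2, 0.3]}
```

Converting to plain lists before returning also keeps the result serializable with the standard json module.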

Regress

def regress(self, data):

...

Implements regression. We have not yet implemented a model for this particular method; at the moment it exists for future implementation.

Classify

def classify(self, data) -> (str, float):

Retrieves a label or multiple labels from the data, using the loaded model. We recommend returning the label along with its confidence.
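
As one possible shape for the recommended return value, the sketch below returns a (label, confidence) pair from a toy keyword count. The word lists, the "neutral" fallback, the confidence formula, and the assumption that data arrives as a list of tokens are all invented for this example; a real implementation would take the label and score from its loaded model.

```python
from typing import Tuple

# Hypothetical keyword lists, used only to keep the example self-contained.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}


def classify(data) -> Tuple[str, float]:
    """Return a label and a confidence between 0.0 and 1.0."""
    tokens = [token.lower() for token in data]
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos == neg:
        # No evidence either way (this also covers the case of no matches).
        return "neutral", 0.0
    label = "positive" if pos > neg else "negative"
    # Share of matched keywords that support the chosen label.
    confidence = max(pos, neg) / (pos + neg)
    return label, confidence


label, confidence = classify(["Great", "food", "bad", "service", "good"])
print(label)  # positive
```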

...

Model Class References

Now that the class is ready, we need to make it available for use. To do this, we need to add a reference to it in two files.

models/__init__.py

Inside the models package there is an __init__.py file which exposes the model classes to the server. Any new class needs to be added to this file. An example can be seen below with the Test class:

from .latent_semantic_indexing import LatentSemanticIndexing
from .bert import Bert
from .biobert import BioBert
from .bert_qa import BertQA
from .sbert import SBert
from .sentiment_analysis_vader import SentimentAnalysisVader
from .sentiment_analysis_text_blob import SentimentAnalysisTextBlob
from .model_wrapper import ModelWrapper
from .tfidf_vectorizer import TfidfVectorizer

from .test import Test

config/config.json

The other reference lies in the config.json file in the config folder. This file has a section called "model_types", which refers to the classes available. As in the __init__.py file, any new class needs to be referenced in this file.

Note
"model_names" references the actual model data: each name in model_names refers to a folder, which in turn contains folders representing the versions of the model.

"model_types": {
        "LatentSemanticIndexing" : {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["lsi"]
        },
        "Bert": {
            "enabled": false,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-uncased"],
            "default_model": "bert-base-uncased"
        },
        "BioBert": {
            "enabled": false,
            "input_data_as_tokens": false,
            "model_names": ["biobert-base-cased-v1.2"],
            "default_model": "biobert-base-cased-v1.2"
        },
        "BertQA": {
            "enabled": false,
            "input_data_as_tokens": false,
            "model_names": ["bert-large-uncased-whole-word-masking-finetuned-squad"],
            "default_model": "bert-large-uncased-whole-word-masking-finetuned-squad"
        },
        "SBert": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["bert-base-nli-stsb-mean-tokens", "bert-base-nli-mean-tokens", "distilbert-base-nli-stsb-mean-tokens"],
            "default_model": "bert-base-nli-stsb-mean-tokens"
        },
        "SentimentAnalysisVader": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["vader"]
        },
        "SentimentAnalysisTextBlob": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["textBlob"]
        },
        "TfidfVectorizer": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["Tfidf"]
        },
        "Test": {
            "enabled": true,
            "input_data_as_tokens": false,
            "model_names": ["test"]
        }
    }
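
Because a small syntax slip in config.json (a missing colon or comma) makes the whole file unreadable, it can help to check an entry before starting the server. The sketch below parses a shortened "model_types" fragment with the standard json module and runs a few sanity checks. The check_model_types helper and the rules it applies (for example, that default_model should appear in model_names) are assumptions inferred from the example above, not rules documented by Python Bridge.

```python
import json

# Shortened copy of the "model_types" section, wrapped in braces to be valid JSON.
CONFIG_FRAGMENT = """
{
  "model_types": {
    "SBert": {
      "enabled": true,
      "input_data_as_tokens": false,
      "model_names": ["bert-base-nli-stsb-mean-tokens", "bert-base-nli-mean-tokens"],
      "default_model": "bert-base-nli-stsb-mean-tokens"
    },
    "Bert": {
      "enabled": false,
      "input_data_as_tokens": false,
      "model_names": ["bert-base-uncased"],
      "default_model": "bert-base-uncased"
    },
    "Test": {
      "enabled": true,
      "input_data_as_tokens": false,
      "model_names": ["test"]
    }
  }
}
"""


def check_model_types(config):
    """Return a list of human-readable problems found in 'model_types'."""
    problems = []
    for type_name, entry in config.get("model_types", {}).items():
        if not isinstance(entry.get("enabled"), bool):
            problems.append(f"{type_name}: 'enabled' must be true or false")
        names = entry.get("model_names")
        if not isinstance(names, list) or not names:
            problems.append(f"{type_name}: 'model_names' must be a non-empty list")
            continue
        default = entry.get("default_model")
        if default is not None and default not in names:
            problems.append(f"{type_name}: default_model '{default}' is not in model_names")
    return problems


# json.loads raises json.JSONDecodeError on syntax errors such as a missing colon.
config = json.loads(CONFIG_FRAGMENT)
enabled = [name for name, entry in config["model_types"].items() if entry["enabled"]]
print(enabled)                    # ['SBert', 'Test']
print(check_model_types(config))  # []
```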

Model Data

Every model used by an implementation needs to be stored in a specific path, composed of the name of the model type, a representative name for the model, and a folder representing the version (the version doesn't have to be a number; it can be a name). As can be seen below, the model for the Test class was added following this structure:

models_data
│
├───Bert
│   └───bert-base-uncased
│       └───1
├───BertQA
│   └───bert-large-uncased-whole-word-masking-finetuned-squad
│       └───1
├───LatentSemanticIndexing
│   └───lsi
│       ├───1
│       └───2
├───SBert
│   ├───bert-base-nli-mean-tokens
│   │   └───1
│   ├───bert-base-nli-stsb-mean-tokens
│   │   └───1
│   └───distilbert-base-nli-mean-tokens
│       └───1
├───SentimentAnalysisTextBlob
│   └───textBlob
│       └───1
├───SentimentAnalysisVader
│   └───vader
│       └───1
├───TfidfVectorizer
│   └───tfidf
│       └───1
└───Test
    └───test
        └───1
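
The folder convention above can be expressed as a small path helper. The sketch below composes the path from the model type, model name, and version, and lists the version folders that exist for a model. The function names model_version_path and list_versions are hypothetical, and the demo builds its own temporary folder tree rather than touching a real models_data folder.

```python
import tempfile
from pathlib import Path


def model_version_path(models_data, model_type, model_name, version):
    """Compose <models_data>/<model_type>/<model_name>/<version>."""
    return Path(models_data) / model_type / model_name / str(version)


def list_versions(models_data, model_type, model_name):
    """Return the names of the version folders of one model, sorted as text."""
    model_dir = Path(models_data) / model_type / model_name
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.iterdir() if p.is_dir())


def demo():
    # Recreate a small part of the tree in a temporary folder and query it.
    with tempfile.TemporaryDirectory() as root:
        for version in ("1", "2"):
            model_version_path(root, "LatentSemanticIndexing", "lsi", version).mkdir(parents=True)
        model_version_path(root, "Test", "test", "1").mkdir(parents=True)
        return (
            list_versions(root, "LatentSemanticIndexing", "lsi"),
            list_versions(root, "Test", "test"),
            list_versions(root, "Test", "missing"),
        )
```

Versions are sorted as plain strings because a version may be a name rather than a number; with numeric versions beyond 9, a numeric sort would be needed to get the expected order.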
