The implementation of each model can be completely different, so we are not going to explain the details of each particular model. In Python Bridge, each model implementation is a set of expected functions, each focused on one piece of functionality.
Every model class must be under the package models, inside the Python Bridge folder, and must extend the class ModelWrapper, which can be imported from models.model_wrapper. An example of a model class with the bare minimum can be found below:
```python
from models.model_wrapper import ModelWrapper


class Test(ModelWrapper):
    def __init__(self):
        super().__init__()

    def initialize(self, model_dir, **kwargs):
        pass

    def load(self, model_dir, name, version):
        pass

    def save(self):
        pass

    def clear(self):
        pass

    def feed(self, data):
        pass

    def train(self, **kwargs):
        pass

    def predict(self, data: list):
        pass

    def regress(self, data):
        pass

    def classify(self, data) -> (str, float):
        pass
```
```python
def initialize(self, model_dir, **kwargs):
```
Receives the path to the model type and the configuration for the model.
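As a purely illustrative sketch, `initialize` might simply record the directory and the configuration for later use; the attribute names below are assumptions, not part of the ModelWrapper contract:

```python
def initialize(self, model_dir, **kwargs):
    # Hypothetical sketch: keep the model-type directory and the
    # configuration passed by the server for later use.
    self.model_dir = model_dir
    self.config = kwargs
    self.model = None
```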
```python
def load(self, model_dir, name, version):
```
Receives the path to the model type, the name of the model, and its version. This function is expected to perform the loading of the model.
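For example, a pickle-based model might resolve the folder from the name and version and read the serialized model from it. The `model.pkl` file name and the folder layout are assumptions for illustration only:

```python
import os
import pickle

def load(self, model_dir, name, version):
    # Hypothetical layout: <model_dir>/<name>/<version>/model.pkl
    path = os.path.join(model_dir, name, str(version), "model.pkl")
    with open(path, "rb") as f:
        self.model = pickle.load(f)
    self.model_path = path  # remembered so save() can reuse it
```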
```python
def save(self):
```
Saves the currently loaded model.
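Continuing the same hypothetical pickle-based sketch, `save` could write the in-memory model back to the path it was loaded from:

```python
def save(self):
    # Persist the in-memory model back to the location it was loaded
    # from (self.model_path is the attribute assumed in load above).
    with open(self.model_path, "wb") as f:
        pickle.dump(self.model, f)
```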
```python
def clear(self):
```
Removes the currently loaded model and any training data held in memory.
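Under the same assumptions, `clear` can simply drop the references so the objects can be garbage-collected; the `training_data` attribute is hypothetical:

```python
def clear(self):
    # Drop the loaded model and any accumulated training data so
    # Python can garbage-collect them.
    self.model = None
    self.training_data = []
```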
Inside the models package there is an __init__.py file which exposes the model classes to the server. Any new class needs to be added to this file; an example can be seen below with the Test class:
```python
from .latent_semantic_indexing import LatentSemanticIndexing
from .bert import Bert
from .biobert import BioBert
from .bert_qa import BertQA
from .sbert import SBert
from .sentiment_analysis_vader import SentimentAnalysisVader
from .sentiment_analysis_text_blob import SentimentAnalysisTextBlob
from .model_wrapper import ModelWrapper
from .tfidf_vectorizer import TfidfVectorizer
from .test import Test
```
config/config.json
The other reference lies in the config.json file inside the config folder. In this file there is a section called "model_types", which refers to the classes available. As in the __init__.py file, any new class needs to be referenced in this file.
Note: "model_names" references the actual model data; each name in model_names refers to a folder, which in turn contains folders representing the versions of the model.
```json
"model_types": {
    "LatentSemanticIndexing": {
        "enabled": true,
        "input_data_as_tokens": false,
        "model_names": ["lsi"]
    },
    "Bert": {
        "enabled": false,
        "input_data_as_tokens": false,
        "model_names": ["bert-base-uncased"],
        "default_model": "bert-base-uncased"
    },
    "BioBert": {
        "enabled": false,
        "input_data_as_tokens": false,
        "model_names": ["biobert-base-cased-v1.2"],
        "default_model": "biobert-base-cased-v1.2"
    },
    "BertQA": {
        "enabled": false,
        "input_data_as_tokens": false,
        "model_names": ["bert-large-uncased-whole-word-masking-finetuned-squad"],
        "default_model": "bert-large-uncased-whole-word-masking-finetuned-squad"
    },
    "SBert": {
        "enabled": true,
        "input_data_as_tokens": false,
        "model_names": ["bert-base-nli-stsb-mean-tokens", "bert-base-nli-mean-tokens", "distilbert-base-nli-stsb-mean-tokens"],
        "default_model": "bert-base-nli-stsb-mean-tokens"
    },
    "SentimentAnalysisVader": {
        "enabled": true,
        "input_data_as_tokens": false,
        "model_names": ["vader"]
    },
    "SentimentAnalysisTextBlob": {
        "enabled": true,
        "input_data_as_tokens": false,
        "model_names": ["textBlob"]
    },
    "TfidfVectorizer": {
        "enabled": true,
        "input_data_as_tokens": false,
        "model_names": ["Tfidf"]
    },
    "Test": {
        "enabled": true,
        "input_data_as_tokens": false,
        "model_names": ["test"]
    }
}
```