...
Every model class must be placed in the models package, inside the Python Bridge folder, and must extend the ModelWrapper class, which can be imported from models.model_wrapper. An example of a model class with the bare minimum can be found below:
Code Block |
---|
language | py |
---|
theme | FadeToGrey |
---|
|
from models.model_wrapper import ModelWrapper


class Test(ModelWrapper):
    def __init__(self):
        super().__init__()

    def initialize(self, model_dir, **kwargs):
        pass

    def load(self, model_dir, name, version):
        pass

    def save(self):
        pass

    def clear(self):
        pass

    def feed(self, data):
        pass

    def train(self, **kwargs):
        pass

    def predict(self, data: list):
        pass

    def regress(self, data):
        pass

    def classify(self, data) -> (str, float):
        pass
|
Implementation Class
Every class extending ModelWrapper must implement the following methods. If for any reason you don't need one of them, you can leave its body as pass.
Initialize
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def initialize(self, model_dir, **kwargs): |
Receives the path to the model type and the configuration for that path.
Load
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def load(self, model_dir, name, version): |
Receives the path to the model type, the name of the model, and its version. The model is expected to be loaded in this method.
Save
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def save(self): |
Saves the currently loaded model.
Clear
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def clear(self): |
Removes the currently loaded model and any training data held in memory.
Feed
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def feed(self, data): |
Receives a list of string tokens to be added to the training data.
Train
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def train(self, **kwargs): |
Trains the model with the fed documents. The trained model can either be kept in memory or saved.
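The interplay between feed, train and clear can be illustrated with a minimal sketch. The class ToyVocabularyModel, its token-counting "model" and the min_count option are hypothetical and only serve the illustration; the sketch also assumes that each call to feed passes one document as a list of string tokens. It deliberately does not extend ModelWrapper so that it can run on its own.

```python
from collections import Counter


class ToyVocabularyModel:
    """Hypothetical model: feed() buffers documents, train() counts tokens."""

    def __init__(self):
        self.documents = []  # training data held in memory
        self.model = None    # populated by train()

    def feed(self, data):
        # Assumption: `data` is one document given as a list of string tokens.
        self.documents.append(list(data))

    def train(self, **kwargs):
        # `min_count` is a hypothetical option, not a documented parameter.
        min_count = kwargs.get("min_count", 1)
        counts = Counter(token for doc in self.documents for token in doc)
        # The trained "model" is kept in memory; save() could persist it.
        self.model = {t: c for t, c in counts.items() if c >= min_count}

    def clear(self):
        # Drop both the trained model and the buffered training data.
        self.documents = []
        self.model = None


toy = ToyVocabularyModel()
toy.feed(["the", "bridge", "loads", "models"])
toy.feed(["the", "bridge", "trains", "models"])
toy.train(min_count=2)
print(toy.model)
```

Keeping the fed documents separate from the trained model makes it possible to retrain with different options without feeding the data again.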
Predict
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def predict(self, data: list): |
Returns a vector or an array of vectors, obtained by processing the data with the loaded model. The result is returned inside a JSON object, { 'vector': [ ] }; the value of the vector key must always be an array.
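As a minimal sketch of this return contract, the example below always places an array under the vector key, even when the input is empty. The class ToyEmbedder and its length-based "embedding" are hypothetical stand-ins; a real implementation would call the loaded model instead.

```python
import json


class ToyEmbedder:
    """Hypothetical stand-in for a ModelWrapper subclass (assumption)."""

    def _embed(self, token: str) -> list:
        # Placeholder "model": a 2-dimensional vector derived from the token.
        return [float(len(token)), float(sum(map(ord, token)) % 10)]

    def predict(self, data: list):
        # One vector per input token; the value is a list in every case.
        vectors = [self._embed(token) for token in data]
        return {'vector': vectors}


response = ToyEmbedder().predict(["hello", "bridge"])
print(json.dumps(response))
```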
Regress
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def regress(self, data): |
...
Implements regression. No model has been implemented for this method yet; for now it exists for future implementations.
Classify
Code Block |
---|
language | py |
---|
theme | RDarkFadeToGrey |
---|
|
def classify(self, data) -> (str, float): |
Returns one or more labels for the data, using the loaded model. We recommend returning each label along with its confidence.
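The recommended (label, confidence) return shape could look like the sketch below. The class ToySentiment, its keyword sets and its scoring rule are hypothetical and exist only to show the shape of the result; they are not one of the models shipped with the Python Bridge.

```python
class ToySentiment:
    """Hypothetical classifier used only to illustrate the return shape."""

    POSITIVE = {"good", "great", "love"}
    NEGATIVE = {"bad", "awful", "hate"}

    def classify(self, data) -> (str, float):
        tokens = [token.lower() for token in data]
        pos = sum(token in self.POSITIVE for token in tokens)
        neg = sum(token in self.NEGATIVE for token in tokens)
        total = pos + neg
        if total == 0:
            # No evidence either way: neutral label with zero confidence.
            return "neutral", 0.0
        label = "positive" if pos >= neg else "negative"
        # Confidence is the share of matched keywords that support the label.
        confidence = max(pos, neg) / total
        return label, confidence


label, confidence = ToySentiment().classify(
    ["I", "love", "this", "great", "bad", "tool"]
)
print(label, round(confidence, 2))
```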
Model Class References
Now that the class is ready, we need to make it available for use. To do this, we need to add a reference to it in 2 files.
models/__init__.py
Inside the models package there is an __init__.py file, which exposes the model classes to the server. Any new class needs to be added to this file; an example with the Test class can be seen below.
Code Block |
---|
from .latent_semantic_indexing import LatentSemanticIndexing
from .bert import Bert
from .sentiment_analysis_vader import SentimentAnalysisVader
from .sentiment_analysis_text_blob import SentimentAnalysisTextBlob
from .model_wrapper import ModelWrapper
from .test import Test |
config/config.json
The other reference lives in the config.json file in the config folder. This file has a section called "model_types", which lists the available classes. As with the __init__.py file, any new class needs to be referenced in this file.
Note |
---|
"model_names" refers to the actual model data: each name in model_names refers to a folder, which in turn contains folders representing the versions of the model |
Code Block |
---|
|
"model_types": {
    "LatentSemanticIndexing": {
        "model_names": ["lsi"]
    },
    "Bert": {
        "model_names": ["bert-base-uncased"],
        "default_model": "bert-base-uncased"
    },
    "SentimentAnalysisVader": {
        "model_names": ["vader"]
    },
    "SentimentAnalysisTextBlob": {
        "model_names": ["textBlob"]
    },
    "Test": {
        "model_names": ["test"]
    }
} |
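Because a class has to be referenced in both files, a quick consistency check can catch a forgotten entry. The sketch below is only an illustration: it assumes "model_types" sits at the top level of config.json, uses an inline JSON string instead of reading the real file, and replaces the models package with a stand-in namespace. The helper unregistered_model_types and the "Missing" entry are hypothetical.

```python
import json
from types import SimpleNamespace

# Inline stand-in for config/config.json (assumed top-level layout).
CONFIG_TEXT = """
{
  "model_types": {
    "Bert": {"model_names": ["bert-base-uncased"], "default_model": "bert-base-uncased"},
    "Test": {"model_names": ["test"]},
    "Missing": {"model_names": ["missing"]}
  }
}
"""


def unregistered_model_types(config: dict, package) -> list:
    """Return model types named in the config that `package` does not expose."""
    return [name for name in config["model_types"] if not hasattr(package, name)]


# Stand-in for `import models`; the real package would expose the real classes.
fake_models = SimpleNamespace(Bert=object, Test=object)

config = json.loads(CONFIG_TEXT)
missing = unregistered_model_types(config, fake_models)
print(missing)
```

In a real setup, the stand-in namespace would be replaced by the imported models package and the inline string by the contents of config.json.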
Model Data
Every model used by an implementation needs to be stored in a specific path, composed of the name of the model type, a representative name for the model, and a folder representing the version (the version doesn't have to be a number; it can be a name).
...
As can be seen below, the model for the Test class was added following this structure:
Code Block |
---|
models_data
│
├───Bert
│   ├───bert-base-uncased
│   │   └───1
├───LatentSemanticIndexing
│   └───lsi
│       ├───1
│       └───2
├───SentimentAnalysisTextBlob
│   └───textBlob
│       └───1
├───SentimentAnalysisVader
│   └───vader
│       └───1
├───TfidfVectorizer
│   └───tfidf
│       └───1
└───Test
    └───test
        └───1 |
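The layout above, model type, then model name, then version, can be expressed with a couple of small path helpers. This is a sketch under assumptions: the helper names model_path and available_versions are hypothetical, and the example builds a throwaway copy of part of the tree in a temporary directory rather than touching a real models_data folder.

```python
import tempfile
from pathlib import Path


def model_path(root, model_type: str, name: str, version) -> Path:
    """Compose <root>/<model type>/<model name>/<version>."""
    # Versions may be numbers or names, so they are always converted to str.
    return Path(root) / model_type / name / str(version)


def available_versions(root, model_type: str, name: str) -> list:
    """List version folder names for one model, sorted alphabetically."""
    model_dir = Path(root) / model_type / name
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.iterdir() if p.is_dir())


# Build a throwaway copy of part of the layout to exercise the helpers.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "models_data"
    for version in ("1", "2"):
        model_path(root, "LatentSemanticIndexing", "lsi", version).mkdir(parents=True)
    model_path(root, "Test", "test", 1).mkdir(parents=True)

    lsi_versions = available_versions(root, "LatentSemanticIndexing", "lsi")
    test_versions = available_versions(root, "Test", "test")
    unknown_versions = available_versions(root, "Test", "nope")

print(lsi_versions, test_versions, unknown_versions)
```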