Model Management
The Rorodata Platform has built-in support for managing multiple versions of machine learning models. Along with storing the models, it tracks the metadata needed to identify what went into building each model and lets you attach related files.
Overview
Every project can have zero or more model repositories. Each model repository manages multiple versions of one model. Each version is called a ModelImage, which contains the serialized model object, associated metadata and attached files.
+-----------------------------------------------+
|               ModelRepository A               |
|                                               |
|   ModelImage - v1         ModelImage - v2     |
|  +---------------+       +---------------+    |
|  |   Model v1    |       |   Model v2    |    |
|  +---------------+       +---------------+    |
|  |  Metadata v1  |       |  Metadata v2  |    |
|  +---------------+       +---------------+    |
+-----------------------------------------------+

+-----------------------------------------------+
|               ModelRepository B               |
|                                               |
|   ModelImage - v1         ModelImage - v2     |
|  +---------------+       +---------------+    |
|  |   Model v1    |       |   Model v2    |    |
|  +---------------+       +---------------+    |
|  |  Metadata v1  |       |  Metadata v2  |    |
|  +---------------+       +---------------+    |
+-----------------------------------------------+
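The repository and image relationship can be mimicked with a small in-memory sketch. This is not the Rorodata implementation: `SketchRepository` and `SketchImage` are hypothetical stand-ins, and numbering versions from 1 in save order is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchImage:
    """Hypothetical stand-in for a ModelImage: one model plus its metadata."""
    model: Any
    metadata: Dict[str, Any] = field(default_factory=dict)
    version: Optional[int] = None  # assigned when the image is saved


@dataclass
class SketchRepository:
    """Hypothetical stand-in for a ModelRepository: versions of one model."""
    name: str
    images: List[SketchImage] = field(default_factory=list)

    def save(self, image: SketchImage) -> int:
        # Assumption: versions are consecutive integers starting at 1.
        image.version = len(self.images) + 1
        self.images.append(image)
        return image.version


repo = SketchRepository("credit-risk")
v1 = repo.save(SketchImage(model="model-v1", metadata={"Accuracy": 0.83}))
v2 = repo.save(SketchImage(model="model-v2", metadata={"Accuracy": 0.89}))
```

Each saved image keeps its own model and its own metadata, so older versions stay untouched when a new one is added.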
The system stores the models and attachments in an S3 bucket and the metadata in a SQL database.
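A rough sketch of that split, using only the standard library: a dictionary stands in for the S3 bucket and an in-memory SQLite database stands in for the SQL database. The table name, column names, key layout, and the use of `pickle` and JSON are all assumptions for illustration, not the platform's actual schema.

```python
import json
import pickle
import sqlite3

blob_store = {}  # stand-in for the S3 bucket: key -> bytes
db = sqlite3.connect(":memory:")  # stand-in for the SQL database
db.execute(
    "CREATE TABLE model_image "
    "(repo TEXT, version INTEGER, blob_key TEXT, metadata TEXT)"
)


def save_image(repo, model, metadata):
    """Store the pickled model as a blob and its metadata as a table row."""
    row = db.execute(
        "SELECT COALESCE(MAX(version), 0) FROM model_image WHERE repo = ?",
        (repo,),
    ).fetchone()
    version = row[0] + 1
    blob_key = f"{repo}/v{version}/model.pkl"  # hypothetical key layout
    blob_store[blob_key] = pickle.dumps(model)
    db.execute(
        "INSERT INTO model_image VALUES (?, ?, ?, ?)",
        (repo, version, blob_key, json.dumps(metadata)),
    )
    return version


def load_image(repo, version):
    """Look up the metadata row, then fetch and unpickle the blob it names."""
    blob_key, metadata = db.execute(
        "SELECT blob_key, metadata FROM model_image "
        "WHERE repo = ? AND version = ?",
        (repo, version),
    ).fetchone()
    return pickle.loads(blob_store[blob_key]), json.loads(metadata)


version = save_image("credit-risk", {"weights": [0.1, 0.2]}, {"Accuracy": 0.89})
model, metadata = load_image("credit-risk", version)
```

Keeping the large binary objects in blob storage and only the small, queryable fields in the database means metadata can be listed or searched without downloading any model.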
The Python Interface
Here is a sample script to save a new version of an ML model:
import roro
project = roro.get_current_project()
model_repo = project.get_model_repository("credit-risk")
# run the training algorithm to build the model
model = train_machine_learning_model()
model_image = model_repo.new_model_image(model)
model_image['Input-Data-Source'] = 's3://credit-risk-dataset-201706'
model_image['Accuracy'] = 0.89
model_image.save(comment="Built new model using the data till June 2017")
Here is another script that loads a saved model and uses it for predictions:
import roro
project = roro.get_current_project()
# Take the repo for required model
model_repo = project.get_model_repository("credit-risk")
# get the image of the latest version or any tag
model_image = model_repo.get_model_image(tag="latest")
# get the actual model object
model = model_image.get_model()
def predict(features):
    return model.predict(features)
The API
class Project

    get_model_repository(name)
        Returns the ModelRepository with the given name.

    list_model_repositories()
        Returns all the model repositories associated with this project.

    create_model_repository(name)
        Creates a new ModelRepository with the given name.

class ModelRepository

    new_model_image(self, model)
        Creates a new model image.
        The save method must be called on the model image object after preparing the image by adding metadata and attachments.

    get_model_image(self, version=None, tag=None)
        Returns the model image with the given version number or tag name.

    Returns all the tags available in this repository.

    add_tag(self, tag, version)
        Tags the specified version of the model image with the given tag name.
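How version and tag lookup might fit together can be sketched with a plain tag-to-version map. `TagIndex` and its methods are hypothetical and not part of the roro API; treating "latest" as an implicit tag for the newest saved version is an assumption based on the sample script.

```python
class TagIndex:
    """Hypothetical tag-to-version index for one model repository."""

    def __init__(self):
        self.versions = []  # saved version numbers, in save order
        self.tags = {}      # tag name -> version number

    def record_save(self, version):
        self.versions.append(version)

    def add_tag(self, tag, version):
        if version not in self.versions:
            raise KeyError(f"unknown version: {version}")
        self.tags[tag] = version

    def resolve(self, version=None, tag=None):
        """Return the version number selected by `version` or `tag`."""
        if version is not None:
            if version not in self.versions:
                raise KeyError(f"unknown version: {version}")
            return version
        if tag is None:
            raise ValueError("pass either version or tag")
        if tag == "latest":
            # Assumption: "latest" always points at the newest saved version.
            if not self.versions:
                raise KeyError("repository has no saved versions")
            return self.versions[-1]
        return self.tags[tag]


index = TagIndex()
for saved_version in (1, 2, 3):
    index.record_save(saved_version)
index.add_tag("production", 2)
```

With this layout, a tag such as "production" can keep pointing at version 2 while "latest" moves forward with every save.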

class ModelImage
    ModelImage represents one version of an ML model and its associated metadata.

    Metadata can be added to a model image like a dictionary.

        model_image['Accuracy'] = 0.83
        model_image['Input-Source'] = 's3://credit-risk-201706'

    Some metadata, such as the timestamp and author, is added automatically.

    save(self, comment)
        Saves the model image as a new version.

    __getattr__(self, name)
        Returns the metadata with the given name.

    __setattr__(self, name, value)
        Sets the value of the metadata attribute with the given name.

    get_metadata(self)
        Returns all metadata as a dictionary.

    set_metadata(self, metadata)
        Sets the model metadata.

    get_model(self)
        Returns the model object.

    version
        Version number of this model image.
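The dictionary-style metadata access can be illustrated with a minimal stand-in class. `MetadataSketch` is hypothetical; the automatic field names (`Timestamp`, `Author`, `Comment`), the moment they are filled in, and the choice to have `set_metadata` replace rather than merge are all assumptions. The sketch follows the item-access syntax shown in the example above.

```python
import os
from datetime import datetime, timezone


class MetadataSketch:
    """Hypothetical stand-in showing dictionary-style metadata on an image."""

    def __init__(self):
        self._metadata = {}

    def __getitem__(self, name):
        return self._metadata[name]

    def __setitem__(self, name, value):
        self._metadata[name] = value

    def get_metadata(self):
        # Return a copy so callers cannot mutate the stored metadata.
        return dict(self._metadata)

    def set_metadata(self, metadata):
        # Assumption: replaces the existing metadata instead of merging.
        self._metadata = dict(metadata)

    def save(self, comment):
        # Assumption: automatic fields are filled in at save time.
        self._metadata["Comment"] = comment
        self._metadata["Timestamp"] = datetime.now(timezone.utc).isoformat()
        self._metadata["Author"] = os.environ.get("USER", "unknown")


image = MetadataSketch()
image["Accuracy"] = 0.83
image["Input-Source"] = "s3://credit-risk-201706"
image.save(comment="Built new model using the data till June 2017")
saved = image.get_metadata()
```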
The Command-line API
The Rorodata platform also provides a way to browse through the models from the command line.
roro models
    Lists all model repositories in the project.

roro models:create name
    Creates a new model repository.

roro models:log [name]
    Shows a log of model saves.

roro models:show name:version-tag
    Shows the metadata of the model image specified by model name and version or tag.

roro models:download name:version-tag
    Downloads the model with the given name and version or tag.
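The `name:version-tag` argument can be split into its parts with a few lines of Python. `parse_model_spec` is a hypothetical helper, not part of the roro CLI, and the rule that an all-digit suffix means a version number while anything else is a tag is an assumption; the real command may parse the argument differently.

```python
def parse_model_spec(spec):
    """Split 'name:version-or-tag' into (name, version, tag)."""
    name, sep, ref = spec.partition(":")
    if not name or not sep or not ref:
        raise ValueError(f"expected name:version-tag, got {spec!r}")
    if ref.isdigit():
        # Assumption: an all-digit suffix is a version number.
        return name, int(ref), None
    return name, None, ref


by_tag = parse_model_spec("credit-risk:latest")
by_version = parse_model_spec("credit-risk:2")
```

The resulting version or tag maps directly onto the `version` and `tag` keyword arguments of `get_model_image`.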