Part 2: Training an ML Model

This part of the MLRun getting-started tutorial walks you through the steps for training a machine-learning (ML) model, including data exploration and model testing.

The tutorial consists of the following steps:

  1. Setup and configuration

  2. Creating a training function

  3. Exploring the data with an MLRun marketplace function

  4. Testing your model

By the end of this tutorial, you’ll know how to:

  • Create a training function, store models, and track experiments while running them.

  • Use artifacts as inputs to functions.

  • Leverage the MLRun functions marketplace.

  • View plot artifacts.

Prerequisites

The following steps are a continuation of the previous part of this getting-started tutorial and rely on the generated outputs. Therefore, make sure to first run part 1 of the tutorial.

Step 1: Setup and Configuration

Initializing Your MLRun Environment

Use the get_or_create_project MLRun method to create a new project or fetch it from the DB/repository if it already exists. Set the project and user_project parameters to the same values that you used in the call to this method in the Part 1: MLRun Basics tutorial notebook.

import mlrun

# Set the base project name
project_name_base = 'getting-started'

# Initialize the MLRun project object
project = mlrun.get_or_create_project(project_name_base, context="./", user_project=True)
> 2022-02-08 19:46:23,537 [info] loaded project getting-started from MLRun DB

Marking The Beginning of Your Function Code

The following code uses the # mlrun: start-code marker comment annotation to instruct MLRun to start processing code only from this location.

Note: You can add code to define function dependencies and perform additional configuration after the # mlrun: start-code marker and before the # mlrun: end-code marker.

# mlrun: start-code

Step 2: Creating a Training Function

Training functions generate models and various model statistics. You’ll want to store the models along with all the relevant data, metadata, and measurements. MLRun can do this automatically through its auto-logging capabilities.

To apply this MLOps functionality, call the framework-specific apply_mlrun() method, which manages the training process and automatically logs the framework-specific model details, data, metadata, and metrics.

To log the training results and store a model named my_model, add the following lines to your training code:

from mlrun.frameworks.sklearn import apply_mlrun
apply_mlrun(model=model, model_name='my_model', x_test=X_test, y_test=y_test)

The training job will automatically generate a set of results and versioned artifacts (run train_run.outputs to view the job outputs):

{'accuracy': 1.0,
 'f1_score': 1.0,
 'precision_score': 1.0,
 'recall_score': 1.0,
 'auc-micro': 1.0,
 'auc-macro': 1.0,
 'auc-weighted': 1.0,
 'feature-importance': 'v3io:///projects/getting-started-admin/artifacts/feature-importance.html',
 'test_set': 'store://artifacts/getting-started-admin/train-iris-train_iris_test_set:86fd0a3754c34f75b8afc5c2464959fc',
 'confusion-matrix': 'v3io:///projects/getting-started-admin/artifacts/confusion-matrix.html',
 'roc-curves': 'v3io:///projects/getting-started-admin/artifacts/roc-curves.html',
 'my_model': 'store://artifacts/getting-started-admin/my_model:86fd0a3754c34f75b8afc5c2464959fc'}

The complete training function, with the apply_mlrun() call in place, looks like this:

import mlrun
from sklearn import ensemble
from sklearn.model_selection import train_test_split
from mlrun.frameworks.sklearn import apply_mlrun


def train_iris(dataset: mlrun.DataItem, label_column: str):

    # Initialize our dataframes
    df = dataset.as_df()
    X = df.drop(label_column, axis=1)
    y = df[label_column]

    # Train/Test split Iris data-set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    # Pick an ideal ML model
    model = ensemble.RandomForestClassifier()
    
    # Wrap the model with MLRun features; pass the test set for analysis and accuracy measurements
    apply_mlrun(model=model, model_name='my_model', x_test=X_test, y_test=y_test)
    
    # Train our model
    model.fit(X_train, y_train)
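
To sanity-check the training logic outside MLRun, the same split-and-fit steps can be sketched with scikit-learn alone. The block below is an illustration, not part of the tutorial's function. It assumes scikit-learn is installed, loads scikit-learn's bundled Iris data instead of the project's dataset artifact, and fixes the classifier's random_state (an addition) so that repeated runs behave the same way. Run it as a standalone script, outside the code section that is marked for conversion.

```python
from sklearn import ensemble
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled Iris data: a stand-in for dataset.as_df() in the tutorial function
X, y = load_iris(return_X_y=True)

# Same split settings as in train_iris: 20% of the rows are held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# random_state on the classifier is an assumption added for repeatability
model = ensemble.RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out test set
accuracy = model.score(X_test, y_test)
print(f"train rows: {len(X_train)}, test rows: {len(X_test)}, accuracy: {accuracy:.2f}")
```

In the tutorial function, this evaluation is not written by hand: the apply_mlrun() call receives x_test and y_test and logs the resulting metrics for you.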

Marking The End of Your Function Code

The following code uses the # mlrun: end-code marker annotation to mark the end of the code section that should be converted to your MLRun function (the section that began with the # mlrun: start-code annotation) and to instruct MLRun to stop parsing the notebook at this point.

Important: Don’t remove the start-code and end-code annotation cells.

# mlrun: end-code
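
As a rough illustration of what the two markers delimit, the sketch below keeps only the lines that fall between a start marker and an end marker. It is a simplified stand-in written for this explanation, not MLRun's actual notebook parser; the helper name extract_marked_code and the sample lines are hypothetical.

```python
START_MARKER = "# mlrun: start-code"
END_MARKER = "# mlrun: end-code"


def extract_marked_code(lines):
    """Return the lines between the start and end markers (markers excluded)."""
    selected = []
    inside = False
    for line in lines:
        stripped = line.strip()
        if stripped == START_MARKER:
            inside = True   # begin collecting after the start marker
            continue
        if stripped == END_MARKER:
            break           # stop reading at the end marker
        if inside:
            selected.append(line)
    return selected


# Hypothetical notebook content, flattened to a list of lines
notebook_lines = [
    "import mlrun",
    "project = mlrun.get_or_create_project('getting-started')",
    "# mlrun: start-code",
    "def train_iris(dataset, label_column):",
    "    ...",
    "# mlrun: end-code",
    "train_run = train_iris_func.run()",
]

function_code = extract_marked_code(notebook_lines)
print("\n".join(function_code))
```

Only the train_iris definition survives; the project setup before the start marker and the run call after the end marker are left out of the extracted code.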

Converting the Code to an MLRun Function

Use the MLRun code_to_function method to convert the selected portions of your notebook code into an MLRun function in your project — a function object with embedded code, which can run on the cluster.

The following code converts your local train_iris function, which is defined between the # mlrun: start-code and # mlrun: end-code annotations (see the previous code cells), into a train_iris_func MLRun function. The function is stored and run under the current project (which was specified in the get_or_create_project call above).

The code sets the following code_to_function parameters:

  • name — the name of the new MLRun function (train_iris).

  • handler — the name of the function-handler method (train_iris; the default is main).

  • kind — the function’s runtime type (job for a Python process).

  • image — the name of the container image to use for running the job — “mlrun/mlrun”. This image contains the basic machine-learning Python packages (such as scikit-learn).

train_iris_func = mlrun.code_to_function(name='train_iris',
                                         handler='train_iris',
                                         kind='job',
                                         image='mlrun/mlrun')
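
The role of the handler parameter can be pictured with a small, MLRun-free sketch: the runtime holds the function's source code and calls the function whose name matches handler, passing the run parameters as keyword arguments. Everything here (run_handler, the sample source, and the parameter values) is hypothetical and only illustrates the lookup-by-name idea; it makes no claim about how MLRun executes jobs internally.

```python
def run_handler(source_code, handler="main", params=None):
    """Execute source_code, then call the function named by handler."""
    namespace = {}
    exec(source_code, namespace)  # defines the functions found in the source
    target = namespace.get(handler)
    if not callable(target):
        raise ValueError(f"handler {handler!r} not found in the code")
    return target(**(params or {}))


# Hypothetical function source with two possible entry points
sample_source = '''
def main():
    return "main was called"

def train_iris(label_column):
    return f"training with label column {label_column!r}"
'''

# Without an explicit handler, the sketch falls back to "main"
print(run_handler(sample_source))

# Naming the handler selects a different entry point in the same code
print(run_handler(sample_source, handler="train_iris",
                  params={"label_column": "label"}))
```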

Running the Function on a Cluster

# Our dataset location (URI)
dataset = project.get_artifact_uri('prep_data_cleaned_data')
# train_iris_func.spec.image_pull_policy = "Always"

# local=True runs the code in the local (notebook) environment;
# omit it to run the function as a job on the cluster
train_run = train_iris_func.run(inputs={'dataset': dataset},
                                params={'label_column': 'label'},
                                local=True)
> 2022-02-08 19:58:17,705 [info] starting run train-iris-train_iris uid=9d6806aec3134110bd5358479152aa5d DB=http://mlrun-api:8080
project:    getting-started-admin
iter:       0
start:      Feb 08 19:58:17
state:      completed
name:       train-iris-train_iris
labels:     v3io_user=admin, kind=, owner=admin, host=jupyter-b7945bb6c-zv48d
inputs:     dataset
parameters: label_column=label
results:    accuracy=1.0, f1_score=1.0, precision_score=1.0, recall_score=1.0, auc-micro=1.0, auc-macro=1.0, auc-weighted=1.0
artifacts:  feature-importance, test_set, confusion-matrix, roc-curves, my_model

> to track results use the .show() or .logs() methods
> 2022-02-08 19:58:19,385 [info] run executed, status=completed

Reviewing the Run Output

You can view extensive run information and artifacts from Jupyter Notebook and the MLRun dashboard, as well as browse the project artifacts from the dashboard.

The following code extracts and displays the model from the training-job outputs.

train_run.outputs['my_model']
'store://artifacts/getting-started-admin/my_model:86fd0a3754c34f75b8afc5c2464959fc'
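
The store:// URI shown above packs several pieces of information into one string. The sketch below splits such a URI into its visible parts, based purely on the layout of the URIs printed in this tutorial (store://artifacts/<project>/<key>:<suffix>). The helper split_store_uri and the field names are hypothetical, and MLRun's own URI handling may accept other forms.

```python
from typing import NamedTuple


class StoreUri(NamedTuple):
    kind: str      # e.g. "artifacts"
    project: str   # project name
    key: str       # artifact key
    suffix: str    # the part after the colon, as printed in the URI


def split_store_uri(uri: str) -> StoreUri:
    """Split a store:// URI of the form store://<kind>/<project>/<key>:<suffix>."""
    prefix = "store://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a store URI: {uri}")
    kind, project, rest = uri[len(prefix):].split("/", 2)
    key, _, suffix = rest.partition(":")
    return StoreUri(kind, project, key, suffix)


parts = split_store_uri(
    "store://artifacts/getting-started-admin/my_model:86fd0a3754c34f75b8afc5c2464959fc"
)
print(parts.project)
print(parts.key)
print(parts.suffix)
```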

Your project’s artifacts directory contains the results of the executed training job. The plots subdirectory has the HTML output artifacts for the selected run iteration, and the data subdirectory contains the artifacts for the test data set.

Use the following code to extract and display information from the run object: the accuracy that was achieved with the model, and the confusion-matrix and roc-curves HTML output artifacts for the optimal run iteration.

print(f'Accuracy: {train_run.outputs["accuracy"]}')
Accuracy: 1.0
# Display HTML output artifacts
train_run.artifact('confusion-matrix').show()
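
To make the reported metrics concrete, the sketch below computes a confusion matrix and the accuracy by hand for a handful of made-up labels. The label values are invented for illustration and are not taken from the tutorial's run; in the tutorial, MLRun produces these results for you during the training job.

```python
from collections import Counter

# Invented labels: five of the six predictions match the true class
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

# Count each (actual, predicted) pair
pair_counts = Counter(zip(y_true, y_pred))
labels = sorted(set(y_true) | set(y_pred))

# Rows are actual classes, columns are predicted classes
confusion = [
    [pair_counts[(actual, predicted)] for predicted in labels]
    for actual in labels
]

# Accuracy is the share of predictions that match the true label
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for actual, row in zip(labels, confusion):
    print(f"{actual:<12}{row}")
print(f"Accuracy: {accuracy:.3f}")
```

A perfect classifier, like the one reported in the run above with an accuracy of 1.0, would have non-zero counts only on the diagonal of this matrix.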