# Train, compare, and register models

This notebook provides a quick overview of training ML models using the [MLRun](https://www.mlrun.org/) MLOps orchestration framework.

Make sure you have reviewed the basics in the MLRun [**Quick Start Tutorial**](./01-mlrun-basics.ipynb).

Tutorial steps:
- [**Define an MLRun project and a training function**](#define-project)
- [**Run the function, log the artifacts and model**](#run-function)
- [**Hyper-parameter tuning and model/experiment comparison**](#hyper-param)
- [**Build and test the model serving functions**](#model-serving)


## MLRun installation and configuration

Before running this notebook, make sure the `mlrun` and `scikit-learn` packages are installed (`pip install mlrun scikit-learn~=1.0.0`) and that you have configured access to the MLRun service.

In [1]:
# install MLRun if it is not installed; run this only once, then restart the notebook kernel
%pip install mlrun scikit-learn~=1.0.0

Note: you may need to restart the kernel to use updated packages.


<a id="define-project"></a>
## Define an MLRun project and a training function

You should create, load, or use (get) an **{ref}`MLRun Project <Projects>`** that holds all your functions and assets.

**Get or create a new project:**

The `get_or_create_project()` method tries to load the project from the MLRun DB. If the project does not exist, it creates a new one.

In [2]:
import mlrun
project = mlrun.get_or_create_project("tutorial", context="src/", user_project=True)

> 2022-08-24 08:50:23,251 [info] loaded project tutorial from None or context and saved in MLRun DB
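With `user_project=True`, the project name is made unique per user, which is why the run outputs in this tutorial show `tutorial-jovyan` rather than `tutorial`. The sketch below only illustrates that naming idea. `user_project_name` is a hypothetical helper, and the exact rule MLRun applies (for example, where it reads the user name from) is an assumption here.

```python
import getpass
from typing import Optional


def user_project_name(name: str, username: Optional[str] = None) -> str:
    """Illustrative only: append the user name to the project name."""
    # Assumption: fall back to the OS login name when no user name is given.
    username = username or getpass.getuser()
    return f"{name}-{username}".lower()


print(user_project_name("tutorial", "jovyan"))  # tutorial-jovyan
```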


**Add (auto) MLOps to your training function:**

Training functions generate models and various model statistics. You'll want to store the models along with all the relevant data, metadata, and measurements. MLRun can apply all the MLOps functionality automatically ("Auto-MLOps") when you call the framework-specific `apply_mlrun()` method.

In the training function below note the **single** custom line you need to add to your code:

```python
apply_mlrun(model=model, model_name="my_model", x_test=x_test, y_test=y_test)
```

`apply_mlrun()` manages the training process and automatically logs the framework-specific model object along with its details, data, metadata, and metrics.
It accepts the model object and various optional parameters. When you provide the `x_test` and `y_test` data, it generates various plots and calculations to evaluate the model.
Metadata and parameters are recorded automatically (from the MLRun `context` object) and don't need to be specified.

**Function code:**

Run the following cell to generate the `trainer.py` file (or copy it manually):

**Create a serverless function object from the code above, and register it in the project:**

In [3]:
trainer = project.set_function("trainer.py", name="trainer", kind="job", image="mlrun/mlrun", handler="train")


<a id="run-function"></a>
## Run the training function and log the artifacts and model

**Create a dataset for training:**

In [4]:
import pandas as pd
from sklearn.datasets import load_breast_cancer

# load the breast cancer dataset and build a dataframe of its features
breast_cancer = load_breast_cancer()
breast_cancer_dataset = pd.DataFrame(data=breast_cancer.data, columns=breast_cancer.feature_names)
# append the target values as a "label" column
breast_cancer_labels = pd.DataFrame(data=breast_cancer.target, columns=["label"])
breast_cancer_dataset = pd.concat([breast_cancer_dataset, breast_cancer_labels], axis=1)

# save the combined dataset to a local CSV file
breast_cancer_dataset.to_csv("cancer-dataset.csv", index=False)

**Run the function (locally) using the generated dataset:**

In [5]:
trainer_run = project.run_function(
    "trainer",
    inputs={"dataset": "cancer-dataset.csv"},
    params={"n_estimators": 100, "learning_rate": 1e-1, "max_depth": 3},
    local=True,
)

> 2022-08-24 08:50:24,636 [info] starting run trainer-train uid=05c6e41b668f460fa67d7abf9dff9542 DB=http://mlrun-api:8080


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| tutorial-jovyan | ...9dff9542 | 0 | Aug 24 08:50:24 | completed | trainer-train | kind=<br>owner=jovyan<br>host=mlrun-jupyter-6b78bf965-knkrt | dataset | n_estimators=100<br>learning_rate=0.1<br>max_depth=3 | accuracy=0.956140350877193<br>f1_score=0.965034965034965<br>precision_score=0.9583333333333334<br>recall_score=0.971830985915493 | feature-importance<br>test_set<br>confusion-matrix<br>roc-curves<br>calibration-curve<br>model |





> 2022-08-24 08:50:32,906 [info] run executed, status=completed


<br>

**View the auto-generated results and artifacts:**

In [6]:
trainer_run.outputs

{'accuracy': 0.956140350877193,
 'f1_score': 0.965034965034965,
 'precision_score': 0.9583333333333334,
 'recall_score': 0.971830985915493,
 'feature-importance': 's3://mlrun/trainer-train/0/feature-importance.html',
 'test_set': 'store://artifacts/tutorial-jovyan/trainer-train_test_set:05c6e41b668f460fa67d7abf9dff9542',
 'confusion-matrix': 's3://mlrun/trainer-train/0/confusion-matrix.html',
 'roc-curves': 's3://mlrun/trainer-train/0/roc-curves.html',
 'calibration-curve': 's3://mlrun/trainer-train/0/calibration-curve.html',
 'model': 'store://artifacts/tutorial-jovyan/cancer_classifier:05c6e41b668f460fa67d7abf9dff9542'}
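The `outputs` dictionary mixes scalar results (the metrics) with links to logged artifacts. A minimal sketch of separating the two, using a shortened copy of the values printed above. `split_outputs` is a hypothetical helper written for illustration, not part of MLRun.

```python
from typing import Any, Dict, Tuple


def split_outputs(outputs: Dict[str, Any]) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Split run outputs into numeric results and artifact URIs."""
    results = {k: v for k, v in outputs.items() if isinstance(v, (int, float))}
    artifacts = {k: v for k, v in outputs.items() if isinstance(v, str) and "://" in v}
    return results, artifacts


# A subset of the outputs printed above
outputs = {
    "accuracy": 0.956140350877193,
    "f1_score": 0.965034965034965,
    "confusion-matrix": "s3://mlrun/trainer-train/0/confusion-matrix.html",
    "model": "store://artifacts/tutorial-jovyan/cancer_classifier:05c6e41b668f460fa67d7abf9dff9542",
}

results, artifacts = split_outputs(outputs)
print(sorted(results))    # ['accuracy', 'f1_score']
print(sorted(artifacts))  # ['confusion-matrix', 'model']
```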

In [7]:
trainer_run.artifact('feature-importance').show()

**Export model files + metadata into a zip:** (requires MLRun 1.1.0 or later)

You can `export()` the model package (files and metadata) to a zip file and load it on a remote system or cluster by running `model = project.import_artifact(key, path)`.

In [8]:
trainer_run.artifact('model').meta.export("model.zip")

<a id="hyper-param"></a>
## Hyper-parameter tuning and model/experiment comparison

Run a grid search over a few parameters and select the run with the highest accuracy (`max.accuracy`). <br>
(Read more about MLRun [Hyper-Param and Iterative jobs](https://docs.mlrun.org/en/stable/hyper-params.html).)

For basic usage, you can run the hyperparameter tuning job with these arguments:
* `hyperparams` for the hyperparameter options and the values to try.
* `selector` for specifying how to select the best model.
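To make the two arguments concrete, here is a conceptual sketch of what a grid search over `hyperparams` and a `max.accuracy` selector amount to. It is not MLRun's implementation: `expand_grid` and `select_best` are hypothetical helpers, and the order in which MLRun enumerates the combinations may differ from the order produced here.

```python
from itertools import product
from typing import Dict, List


def expand_grid(hyperparams: Dict[str, list]) -> List[dict]:
    """Return one parameter dict per combination of the listed values."""
    names = list(hyperparams)
    return [dict(zip(names, values)) for values in product(*hyperparams.values())]


def select_best(results: Dict[int, Dict[str, float]], selector: str) -> int:
    """Pick the iteration that wins under a '<max|min>.<metric>' selector."""
    operator, metric = selector.split(".", 1)
    pick = {"max": max, "min": min}[operator]
    return pick(results, key=lambda iteration: results[iteration][metric])


grid = expand_grid(
    {
        "n_estimators": [10, 100, 1000],
        "learning_rate": [1e-1, 1e-3],
        "max_depth": [2, 8],
    }
)
print(len(grid))  # 12 combinations: 3 * 2 * 2

# Accuracies of four iterations, copied from this tutorial's recorded run
recorded = {
    3: {"accuracy": 0.9649122807017544},
    6: {"accuracy": 0.956140350877193},
    9: {"accuracy": 0.9385964912280702},
    12: {"accuracy": 0.9385964912280702},
}
print(select_best(recorded, "max.accuracy"))  # 3
```

Each combination becomes one child iteration of the run, and the selector decides which iteration's results and model are reported as the run's outputs.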

**Running a remote function:**

To run the hyper-param task over the cluster, the input data must be available to the job, either through object storage or through the MLRun versioned artifact store.

The following line logs (and uploads) the dataframe as a project artifact:

In [9]:
dataset_artifact = project.log_dataset("cancer-dataset", df=breast_cancer_dataset, index=False)

Run the function over the remote Kubernetes cluster (`local` is not set):

In [10]:
hp_tuning_run = project.run_function(
    "trainer", 
    inputs={"dataset": dataset_artifact.uri}, 
    hyperparams={
        "n_estimators": [10, 100, 1000], 
        "learning_rate": [1e-1, 1e-3], 
        "max_depth": [2, 8]
    }, 
    selector="max.accuracy", 
)

> 2022-08-24 08:50:34,657 [info] starting run trainer-train uid=22118ca5268e45babf7668ce11064837 DB=http://mlrun-api:8080
> 2022-08-24 08:50:35,094 [info] Job is running in the background, pod: trainer-train-8c6f4
> 2022-08-24 08:51:27,920 [info] best iteration=3, used criteria max.accuracy
> 2022-08-24 08:51:28,380 [info] run executed, status=completed
final state: completed


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| tutorial-jovyan | ...11064837 | 0 | Aug 24 08:50:46 | completed | trainer-train | kind=job<br>owner=jovyan<br>mlrun/client_version=1.1.0-rc24 | dataset | | best_iteration=3<br>accuracy=0.9649122807017544<br>f1_score=0.9722222222222222<br>precision_score=0.958904109589041<br>recall_score=0.9859154929577465 | feature-importance<br>test_set<br>confusion-matrix<br>roc-curves<br>calibration-curve<br>model<br>iteration_results<br>parallel_coordinates |





> 2022-08-24 08:51:35,352 [info] run executed, status=completed


<br>

**View Hyper-param results and the selected run in the MLRun UI:**

![hprun](../_static/images/tutorial/hprun.png)

Interactive Parallel Coordinates Plot:

![pcp](../_static/images/tutorial/pcp.png)

<br>

**List the generated models and compare the different runs:**

In [11]:
hp_tuning_run.outputs

{'best_iteration': 3,
 'accuracy': 0.9649122807017544,
 'f1_score': 0.9722222222222222,
 'precision_score': 0.958904109589041,
 'recall_score': 0.9859154929577465,
 'feature-importance': 's3://mlrun/trainer-train/3/feature-importance.html',
 'test_set': 'store://artifacts/tutorial-jovyan/trainer-train_test_set:22118ca5268e45babf7668ce11064837',
 'confusion-matrix': 's3://mlrun/trainer-train/3/confusion-matrix.html',
 'roc-curves': 's3://mlrun/trainer-train/3/roc-curves.html',
 'calibration-curve': 's3://mlrun/trainer-train/3/calibration-curve.html',
 'model': 'store://artifacts/tutorial-jovyan/cancer_classifier:22118ca5268e45babf7668ce11064837',
 'iteration_results': 's3://mlrun/trainer-train/0/iteration_results.csv',
 'parallel_coordinates': 's3://mlrun/trainer-train/0/parallel_coordinates.html'}

In [12]:
# list the models in the project (can apply filters)
models = project.list_models()
for model in models:
    print(f"uri: {model.uri}, metrics: {model.metrics}")

uri: store://models/tutorial-jovyan/cancer_classifier#0:05c6e41b668f460fa67d7abf9dff9542, metrics: {'accuracy': 0.956140350877193, 'f1_score': 0.965034965034965, 'precision_score': 0.9583333333333334, 'recall_score': 0.971830985915493}
uri: store://models/tutorial-jovyan/cancer_classifier#1:22118ca5268e45babf7668ce11064837, metrics: {'accuracy': 0.956140350877193, 'f1_score': 0.965034965034965, 'precision_score': 0.9583333333333334, 'recall_score': 0.971830985915493}
uri: store://models/tutorial-jovyan/cancer_classifier#2:22118ca5268e45babf7668ce11064837, metrics: {'accuracy': 0.956140350877193, 'f1_score': 0.965034965034965, 'precision_score': 0.9583333333333334, 'recall_score': 0.971830985915493}
uri: store://models/tutorial-jovyan/cancer_classifier#3:22118ca5268e45babf7668ce11064837, metrics: {'accuracy': 0.9649122807017544, 'f1_score': 0.9722222222222222, 'precision_score': 0.958904109589041, 'recall_score': 0.9859154929577465}
uri: store://models/tutorial-jovyan/cancer_classifier#

In [13]:
# to view the full model object use:
# print(models[0].to_yaml())

In [14]:
# compare the runs (generate interactive parallel coordinates plot and a table)
project.list_runs(name="trainer-train", iter=True).compare()

| uid | iter | start | state | name | parameters | results |
| --- | --- | --- | --- | --- | --- | --- |
| ...11064837 | 12 | Aug 24 08:51:18 | completed | trainer-train | n_estimators=1000<br>learning_rate=0.001<br>max_depth=8 | accuracy=0.9385964912280702<br>f1_score=0.951048951048951<br>precision_score=0.9444444444444444<br>recall_score=0.9577464788732394 |
| ...11064837 | 11 | Aug 24 08:51:16 | completed | trainer-train | n_estimators=100<br>learning_rate=0.001<br>max_depth=8 | accuracy=0.6228070175438597<br>f1_score=0.7675675675675676<br>precision_score=0.6228070175438597<br>recall_score=1.0 |
| ...11064837 | 10 | Aug 24 08:51:14 | completed | trainer-train | n_estimators=10<br>learning_rate=0.001<br>max_depth=8 | accuracy=0.6228070175438597<br>f1_score=0.7675675675675676<br>precision_score=0.6228070175438597<br>recall_score=1.0 |
| ...11064837 | 9 | Aug 24 08:51:10 | completed | trainer-train | n_estimators=1000<br>learning_rate=0.1<br>max_depth=8 | accuracy=0.9385964912280702<br>f1_score=0.951048951048951<br>precision_score=0.9444444444444444<br>recall_score=0.9577464788732394 |
| ...11064837 | 8 | Aug 24 08:51:08 | completed | trainer-train | n_estimators=100<br>learning_rate=0.1<br>max_depth=8 | accuracy=0.9385964912280702<br>f1_score=0.951048951048951<br>precision_score=0.9444444444444444<br>recall_score=0.9577464788732394 |
| ...11064837 | 7 | Aug 24 08:51:06 | completed | trainer-train | n_estimators=10<br>learning_rate=0.1<br>max_depth=8 | accuracy=0.9385964912280702<br>f1_score=0.951048951048951<br>precision_score=0.9444444444444444<br>recall_score=0.9577464788732394 |
| ...11064837 | 6 | Aug 24 08:51:01 | completed | trainer-train | n_estimators=1000<br>learning_rate=0.001<br>max_depth=2 | accuracy=0.956140350877193<br>f1_score=0.965034965034965<br>precision_score=0.9583333333333334<br>recall_score=0.971830985915493 |
| ...11064837 | 5 | Aug 24 08:50:59 | completed | trainer-train | n_estimators=100<br>learning_rate=0.001<br>max_depth=2 | accuracy=0.6228070175438597<br>f1_score=0.7675675675675676<br>precision_score=0.6228070175438597<br>recall_score=1.0 |
| ...11064837 | 4 | Aug 24 08:50:57 | completed | trainer-train | n_estimators=10<br>learning_rate=0.001<br>max_depth=2 | accuracy=0.6228070175438597<br>f1_score=0.7675675675675676<br>precision_score=0.6228070175438597<br>recall_score=1.0 |
| ...11064837 | 3 | Aug 24 08:50:52 | completed | trainer-train | n_estimators=1000<br>learning_rate=0.1<br>max_depth=2 | accuracy=0.9649122807017544<br>f1_score=0.9722222222222222<br>precision_score=0.958904109589041<br>recall_score=0.9859154929577465 |


<a id="model-serving"></a>
## Build and test the model serving functions

MLRun serving can produce managed, real-time, serverless pipelines composed of various data processing and ML tasks. The pipelines use the Nuclio real-time serverless engine, which can be deployed anywhere. For more details and examples, see [MLRun Serving Graphs](https://docs.mlrun.org/en/stable/serving/serving-graph.html).

**Create a model serving function from our [code](src/serving.py)**

In [15]:
# create a serving function and add the model selected by the hyper-parameter run
serving_fn = mlrun.new_function("serving", image="mlrun/mlrun", kind="serving")
serving_fn.add_model("cancer-classifier", model_path=hp_tuning_run.outputs["model"], class_name="mlrun.frameworks.sklearn.SklearnModelServer")

<mlrun.serving.states.TaskStep at 0x7ffb4eba6bb0>

In [16]:
# create a mock server (a simulator of the real-time function)
server = serving_fn.to_mock_server()

# a single sample with the dataset's 30 feature values
my_data = {
    "inputs": [
        [
            1.371e+01, 2.083e+01, 9.020e+01, 5.779e+02, 1.189e-01, 1.645e-01,
            9.366e-02, 5.985e-02, 2.196e-01, 7.451e-02, 5.835e-01, 1.377e+00,
            3.856e+00, 5.096e+01, 8.805e-03, 3.029e-02, 2.488e-02, 1.448e-02,
            1.486e-02, 5.412e-03, 1.706e+01, 2.814e+01, 1.106e+02, 8.970e+02,
            1.654e-01, 3.682e-01, 2.678e-01, 1.556e-01, 3.196e-01, 1.151e-01,
        ]
    ]
}
server.test("/v2/models/cancer-classifier/infer", body=my_data)

> 2022-08-24 08:51:37,660 [info] model cancer-classifier was loaded
> 2022-08-24 08:51:37,661 [info] Loaded ['cancer-classifier']



X does not have valid feature names, but GradientBoostingClassifier was fitted with feature names



{'id': 'ec127766ed0d4e229496a61e8790047e',
 'model_name': 'cancer-classifier',
 'outputs': [0]}
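The request above follows a simple shape: the path names the model, and the body carries a list of rows under `"inputs"`, one row per sample. A minimal sketch of building such a request and checking the row width before sending it. `build_infer_request` is a hypothetical helper, and the all-zeros row is a placeholder rather than real data.

```python
from typing import Sequence, Tuple

N_FEATURES = 30  # the breast cancer dataset has 30 feature columns


def build_infer_request(model_name: str, rows: Sequence[Sequence[float]]) -> Tuple[str, dict]:
    """Return the (path, body) pair for an inference call on `model_name`."""
    for i, row in enumerate(rows):
        if len(row) != N_FEATURES:
            raise ValueError(f"row {i} has {len(row)} values, expected {N_FEATURES}")
    path = f"/v2/models/{model_name}/infer"
    body = {"inputs": [list(row) for row in rows]}
    return path, body


# Placeholder sample: 30 zeros stand in for real feature values
path, body = build_infer_request("cancer-classifier", [[0.0] * N_FEATURES])
print(path)  # /v2/models/cancer-classifier/infer
```

The returned pair can be passed to the mock server in the same way as the cell above: `server.test(path, body=body)`.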

## Done!

Congratulations! You've completed Part 2 of the MLRun getting-started tutorial.
Proceed to [**Part 3: Model serving**](03-model-serving.ipynb) to learn how to deploy and serve your model using a serverless function.