Create a basic training job
In this section, you create a simple job that trains a model and uses MLRun's auto-logging to record metrics, logs, and plots.
Define the training code
The training code is shown below. Note that a single MLRun call, apply_mlrun, adds all the MLOps capabilities:
%%writefile trainer.py
from sklearn import ensemble
from sklearn.model_selection import train_test_split

import mlrun
from mlrun.frameworks.sklearn import apply_mlrun


def train(
    dataset: mlrun.DataItem,  # data inputs are of type DataItem (abstract the data source)
    label_column: str = "label",
    n_estimators: int = 100,
    learning_rate: float = 0.1,
    max_depth: int = 3,
    model_name: str = "cancer_classifier",
):
    # Get the input dataframe (Use DataItem.as_df() to access any data source)
    df = dataset.as_df()

    # Initialize the x & y data
    X = df.drop(label_column, axis=1)
    y = df[label_column]

    # Train/Test split the dataset
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Pick an ideal ML model
    model = ensemble.GradientBoostingClassifier(
        n_estimators=n_estimators, learning_rate=learning_rate, max_depth=max_depth
    )

    # -------------------- The only line you need to add for MLOps -------------------------
    # Wraps the model with MLOps (test set is provided for analysis & accuracy measurements)
    apply_mlrun(model=model, model_name=model_name, x_test=X_test, y_test=y_test)
    # --------------------------------------------------------------------------------------

    # Train the model
    model.fit(X_train, y_train)
Writing trainer.py
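To build intuition for how one call can add logging while the model.fit call stays unchanged, here is a toy sketch of the general wrapping pattern. It is not MLRun's implementation: ToyModel, apply_logging, and the records list are hypothetical names used only for illustration, and the only metric recorded is accuracy on the supplied test set.

```python
from typing import Callable, List


class ToyModel:
    """Stand-in for an estimator; predicts the majority label seen in fit()."""

    def fit(self, X, y):
        labels = list(y)
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


def apply_logging(model, x_test, y_test, records: List[dict]):
    """Hypothetical helper: patch model.fit so every fit also records test accuracy."""
    original_fit: Callable = model.fit

    def fit_and_log(X, y):
        result = original_fit(X, y)
        predictions = model.predict(x_test)
        correct = sum(p == t for p, t in zip(predictions, y_test))
        records.append({"accuracy": correct / len(y_test)})
        return result

    # The instance attribute shadows the class method, so callers keep using model.fit
    model.fit = fit_and_log
    return model


records: List[dict] = []
model = apply_logging(ToyModel(), x_test=[[0], [1]], y_test=[1, 0], records=records)
model.fit([[0], [1], [2]], [1, 1, 0])  # the training call itself is unchanged
```

The point of the pattern is that the training code keeps calling fit as usual, and the extra bookkeeping happens as a side effect of that call.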
Create the job
Next, use code_to_function to package the code as a job that is ready to run on the cluster:
import mlrun

training_job = mlrun.code_to_function(
    name="basic-training",
    filename="trainer.py",
    kind="job",
    image="mlrun/mlrun",
    handler="train",
)
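Because handler is passed as a string, the Python interpreter does not catch a typo in it when you write the call. A minimal pre-flight check that uses only the standard library is sketched below. defines_function is a hypothetical helper, not part of MLRun, and the inline source stands in for the contents of trainer.py.

```python
import ast


def defines_function(source: str, name: str) -> bool:
    """Return True if `source` defines a top-level function called `name`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name
        for node in tree.body
    )


# Stand-in for the contents of trainer.py; in practice, read the file instead.
sample_source = '''
def train(dataset, label_column="label"):
    pass
'''

handler_found = defines_function(sample_source, "train")
typo_found = defines_function(sample_source, "trian")
```

Parsing with ast inspects the file without importing it, so the check does not require sklearn or mlrun to be installed locally.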
Run the job
Finally, run the job. The dataset here is read from S3, but in practice it is usually the output of a previous step in a pipeline.
run = training_job.run(
    inputs={
        "dataset": "https://igz-demo-datasets.s3.us-east-2.amazonaws.com/cancer-dataset.csv"
    },
    params={"n_estimators": 100, "learning_rate": 1e-1, "max_depth": 3},
)
> 2022-07-22 22:27:15,162 [info] starting run basic-training-train uid=bc1c6ad491c340e1a3b9b91bb520454f DB=http://mlrun-api:8080
> 2022-07-22 22:27:15,349 [info] Job is running in the background, pod: basic-training-train-kkntj
> 2022-07-22 22:27:20,927 [info] run executed, status=completed
final state: completed
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
---|---|---|---|---|---|---|---|---|---|---|
default | | 0 | Jul 22 22:27:18 | completed | basic-training-train | v3io_user=nick kind=job owner=nick mlrun/client_version=1.0.4 host=basic-training-train-kkntj | dataset | n_estimators=100 learning_rate=0.1 max_depth=3 | accuracy=0.956140350877193 f1_score=0.965034965034965 precision_score=0.9583333333333334 recall_score=0.971830985915493 | feature-importance test_set confusion-matrix roc-curves calibration-curve model |
> to track results use the .show() or .logs() methods or click here to open in UI
> 2022-07-22 22:27:21,640 [info] run executed, status=completed
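In the run call above, the keys of inputs and params use the same names as the arguments of the train handler, and the parameters column of the summary echoes them back. A small check like the following can catch a misspelled key before you submit a job. It is a sketch under assumptions: the train function here is a stand-in that only mirrors the argument names in trainer.py, and unmatched_names is a hypothetical helper, not an MLRun function.

```python
import inspect


def train(dataset, label_column="label", n_estimators=100,
          learning_rate=0.1, max_depth=3, model_name="cancer_classifier"):
    """Stand-in with the same argument names as train() in trainer.py."""


def unmatched_names(handler, inputs: dict, params: dict) -> list:
    """Return the keys of inputs/params that are not arguments of `handler`."""
    accepted = set(inspect.signature(handler).parameters)
    return sorted(name for name in {**inputs, **params} if name not in accepted)


inputs = {
    "dataset": "https://igz-demo-datasets.s3.us-east-2.amazonaws.com/cancer-dataset.csv"
}
params = {"n_estimators": 100, "learning_rate": 1e-1, "max_depth": 3}

# All keys from the tutorial's run call match the handler's arguments
ok = unmatched_names(train, inputs, params)

# A misspelled parameter name is reported
typo = unmatched_names(train, inputs, {"n_estimator": 100})
```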
View job results
Once the job is complete, you can view the output metrics and visualize the artifacts.
run.outputs
{'accuracy': 0.956140350877193,
'f1_score': 0.965034965034965,
'precision_score': 0.9583333333333334,
'recall_score': 0.971830985915493,
'feature-importance': 'v3io:///projects/default/artifacts/feature-importance.html',
'test_set': 'store://artifacts/default/basic-training-train_test_set:bc1c6ad491c340e1a3b9b91bb520454f',
'confusion-matrix': 'v3io:///projects/default/artifacts/confusion-matrix.html',
'roc-curves': 'v3io:///projects/default/artifacts/roc-curves.html',
'calibration-curve': 'v3io:///projects/default/artifacts/calibration-curve.html',
'model': 'store://artifacts/default/cancer_classifier:bc1c6ad491c340e1a3b9b91bb520454f'}
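As a quick sanity check on the logged metrics, the F1 score should equal the harmonic mean of precision and recall, which is the standard definition for a binary classifier. The sketch below copies the values from the output above; the variable names are illustrative.

```python
# Values copied from run.outputs above.
outputs = {
    "accuracy": 0.956140350877193,
    "f1_score": 0.965034965034965,
    "precision_score": 0.9583333333333334,
    "recall_score": 0.971830985915493,
}

precision = outputs["precision_score"]
recall = outputs["recall_score"]

# F1 is the harmonic mean of precision and recall.
f1_from_pr = 2 * precision * recall / (precision + recall)

# Gap between the recomputed and the logged value; only rounding noise is expected.
f1_gap = abs(f1_from_pr - outputs["f1_score"])
```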
run.artifact("confusion-matrix").show()
run.artifact("feature-importance").show()
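The store:// entries in run.outputs appear to encode a kind, the project, the artifact key, and a suffix that in this run matches the run uid from the log. The following sketch splits such a URI into those parts. The layout is inferred only from the two store:// values shown above, not from an MLRun specification, and split_store_uri is a hypothetical helper.

```python
def split_store_uri(uri: str) -> dict:
    """Hypothetical helper: split 'store://<kind>/<project>/<key>:<suffix>' into parts."""
    prefix = "store://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a store:// URI: {uri}")
    kind, project, rest = uri[len(prefix):].split("/", 2)
    # Everything after the first colon is kept as an opaque suffix
    key, _, suffix = rest.partition(":")
    return {"kind": kind, "project": project, "key": key, "suffix": suffix}


# Value copied from run.outputs["model"] above.
model_uri = "store://artifacts/default/cancer_classifier:bc1c6ad491c340e1a3b9b91bb520454f"
parts = split_store_uri(model_uri)
```

Here the key is the model_name passed to train, and the project is default, as shown in the run summary.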