MLRun execution context
After running a job, you need to be able to track it. To support this, MLRun exposes a job context object inside your code. The context provides access to job metadata, parameters, inputs, and secrets, along with an API for logging and monitoring results, log text, files, artifacts, and labels.
If context is specified as the first parameter in the function signature, MLRun injects the current job context into it. Alternatively, if the code does not run inside a function handler (e.g. in a Python main or a notebook), you can obtain the context object from the environment using the get_or_create_ctx() function.
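The injection rule described above can be pictured with a small sketch. This is only an illustration of the idea and is not MLRun's actual implementation: `call_handler`, `with_context`, and `without_context` are hypothetical names, and a plain string stands in for the real context object.

```python
import inspect


def call_handler(handler, context, params):
    """Pass `context` only when the handler's first parameter is named 'context'.

    Hypothetical helper used for illustration; MLRun's internals may differ.
    """
    names = list(inspect.signature(handler).parameters)
    if names and names[0] == "context":
        return handler(context, **params)
    return handler(**params)


def with_context(context, p1=1):
    # This handler asks for the context, so it receives it.
    return f"context={context!r}, p1={p1}"


def without_context(p1=1):
    # This handler does not ask for the context, so it only gets parameters.
    return f"p1={p1}"


# A plain string stands in for the real context object (assumption).
print(call_handler(with_context, "fake-ctx", {"p1": 2}))
print(call_handler(without_context, "fake-ctx", {"p1": 2}))
```

A handler written this way stays an ordinary Python function, so it can also be called directly with any object that offers the context methods it uses.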
Common context methods:
get_secret(key: str) - get the value of a secret
logger.info("started experiment..") - textual logs
log_result(key: str, value) - log simple values
set_label(key, value) - set a label tag for that task
log_artifact(key, body=None, local_path=None, ...) - log an artifact (body or local file)
log_dataset(key, df, ...) - log a dataframe object
log_model(key, ...) - log a model object
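Because a handler only interacts with the context through methods like those listed above, its logging behavior can be checked locally without a running MLRun service. The following is a minimal sketch under that assumption: `train` is a hypothetical handler, and `unittest.mock.MagicMock` from the standard library stands in for the real context, so nothing is actually stored or tracked.

```python
from unittest.mock import MagicMock


def train(context, p1=1):
    # Hypothetical handler that uses a few of the context methods listed above.
    key = context.get_secret("ACCESS_KEY")
    context.logger.info("started experiment..")
    context.log_result("accuracy", p1 * 2)
    context.set_label("framework", "sklearn")
    return key


# MagicMock records every call made on it; it is a stand-in, not an MLRun object.
fake_context = MagicMock()
fake_context.get_secret.return_value = "dummy-key"

returned = train(fake_context, p1=3)

# Verify that the handler made the calls we expect.
fake_context.get_secret.assert_called_once_with("ACCESS_KEY")
fake_context.logger.info.assert_called_once_with("started experiment..")
fake_context.log_result.assert_called_once_with("accuracy", 6)
fake_context.set_label.assert_called_once_with("framework", "sklearn")
```

The mock accepts any method call, so this only checks which calls the handler makes, not whether a real context would accept them.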
Example function and usage of the context object:
from mlrun.artifacts import ChartArtifact
import pandas as pd


def my_job(context, p1=1, p2="x"):
    # load MLRUN runtime context (will be set by the runtime framework)

    # get parameters from the runtime context (or use defaults)

    # access input metadata, values, files, and secrets (passwords)
    print(f"Run: {context.name} (uid={context.uid})")
    print(f"Params: p1={p1}, p2={p2}")
    print("accesskey = {}".format(context.get_secret("ACCESS_KEY")))
    print("file\n{}\n".format(context.get_input("infile.txt", "infile.txt").get()))

    # Run some useful code e.g. ML training, data prep, etc.

    # log scalar result values (job result metrics)
    context.log_result("accuracy", p1 * 2)
    context.log_result("loss", p1 * 3)
    context.set_label("framework", "sklearn")

    # log various types of artifacts (file, web page, table), will be versioned and visible in the UI
    context.log_artifact(
        "model",
        body=b"abc is 123",
        local_path="model.txt",
        labels={"framework": "xgboost"},
    )
    context.log_artifact(
        "html_result", body=b"<b> Some HTML <b>", local_path="result.html"
    )

    # create a chart output (will show in the pipelines UI)
    chart = ChartArtifact("chart")
    chart.labels = {"type": "roc"}
    chart.header = ["Epoch", "Accuracy", "Loss"]
    for i in range(1, 8):
        chart.add_row([i, i / 20 + 0.75, 0.30 - i / 20])
    context.log_artifact(chart)

    raw_data = {
        "first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
        "last_name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze"],
        "age": [42, 52, 36, 24, 73],
        "testScore": [25, 94, 57, 62, 70],
    }
    df = pd.DataFrame(raw_data, columns=["first_name", "last_name", "age", "testScore"])
    context.log_dataset("mydf", df=df, stats=True)
Example of creating the context objects from the environment:
import mlrun

if __name__ == "__main__":
    context = mlrun.get_or_create_ctx('train')
    p1 = context.get_param('p1', 1)
    p2 = context.get_param('p2', 'a-string')
    # do something
    context.log_result("accuracy", p1 * 2)
    # commit the tracking results to the DB (and mark as completed)
    context.commit(completed=True)
Note that the MLRun context is also a Python context manager and can be used in a with statement, which eliminates the need to call commit explicitly.
import mlrun

if __name__ == "__main__":
    with mlrun.get_or_create_ctx('train') as context:
        p1 = context.get_param('p1', 1)
        p2 = context.get_param('p2', 'a-string')
        # do something
        context.log_result("accuracy", p1 * 2)
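To see why a with statement can replace an explicit commit, it may help to look at how a context manager can commit on exit. The sketch below is a simplified, hypothetical stand-in (`TrackingContext` is not an MLRun class, and MLRun's own exit handling may differ); it only illustrates the general Python mechanism.

```python
class TrackingContext:
    """Hypothetical stand-in that commits automatically when the with block ends."""

    def __init__(self):
        self.results = {}
        self.committed = False
        self.completed = False

    def log_result(self, key, value):
        # Keep results in memory; a real tracker would persist them.
        self.results[key] = value

    def commit(self, completed=False):
        self.committed = True
        self.completed = completed

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Commit on exit; mark as completed only if the block raised no exception.
        self.commit(completed=exc_type is None)
        return False  # do not suppress exceptions


with TrackingContext() as context:
    context.log_result("accuracy", 2)

print(context.committed, context.completed, context.results)
```

In this sketch the commit happens in `__exit__`, so it runs even if the block raises, and the exception still propagates to the caller.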