MLRun execution context#

After running a job, you need to be able to track it. To gain the maximum value, MLRun uses the job context object inside the code. This provides access to job metadata, parameters, inputs, secrets, and API for logging and monitoring the results, as well as log text, files, artifacts, and labels.

Inside the function you can access the parameters/inputs by simply adding them as parameters to the function, or you can get them from the context object (using get_param() and get_input()).

  • If context is specified as the first parameter in the function signature, MLRun injects the current job context into it.

  • Alternatively, if it does not run inside a function handler (e.g. in Python main or Notebook) you can obtain the context object from the environment using the get_or_create_ctx() function.

Common context methods:

  • get_secret(key: str) — get the value of a secret

  • logger.info("started experiment..") — textual logs

  • log_result(key: str, value) — log simple values

  • set_label(key, value) — set a label tag for that task

  • log_artifact(key, body=None, local_path=None, ...) — log an artifact (body or local file)

  • log_dataset(key, df, ...) — log a dataframe object

  • log_model(key, ...) — log a model object

Example function and usage of the context object:

from mlrun.artifacts import PlotlyArtifact
import pandas as pd

def my_job(context, p1=1, p2="x"):
    # load MLRUN runtime context (will be set by the runtime framework)

    # get parameters from the runtime context (or use defaults)

    # access input metadata, values, files, and secrets (passwords)
    print(f"Run: {context.name} (uid={context.uid})")
    print(f"Params: p1={p1}, p2={p2}")
    print("accesskey = {}".format(context.get_secret("ACCESS_KEY")))
    print("file\n{}\n".format(context.get_input("infile.txt", "infile.txt").get()))

    # Run some useful code e.g. ML training, data prep, etc.

    # log scalar result values (job result metrics)
    context.log_result("accuracy", p1 * 2)
    context.log_result("loss", p1 * 3)
    context.set_label("framework", "sklearn")

    # log various types of artifacts (file, web page, table), will be versioned and visible in the UI
    context.log_artifact(
        "model",
        body=b"abc is 123",
        local_path="model.txt",
        labels={"framework": "xgboost"},
    )
    context.log_artifact(
        "html_result", body=b"<b> Some HTML <b>", local_path="result.html"
    )

     # create a plotly output (will show in the pipelines UI)
    x = np.arange(10)
    fig = go.Figure(data=go.Scatter(x=x, y=x**2))

    # Create a PlotlyArtifact using the figure and log it
    plotly_artifact = PlotlyArtifact(figure=fig, key="plotly")
    context.log_artifact(plotly_artifact)
    
    raw_data = {
        "first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
        "last_name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze"],
        "age": [42, 52, 36, 24, 73],
        "testScore": [25, 94, 57, 62, 70],
    }
    df = pd.DataFrame(raw_data, columns=["first_name", "last_name", "age", "testScore"])
    context.log_dataset("mydf", df=df, stats=True)

Example of creating the context objects from the environment:

if __name__ == "__main__":
    context = mlrun.get_or_create_ctx('train')
    p1 = context.get_param('p1', 1)
    p2 = context.get_param('p2', 'a-string')
    # do something
    context.log_result("accuracy", p1 * 2)

Note: The context object is expected to be used as part of a run. If you are looking for a similar API to use on your local environment (outside a local run) you can use the '~mlrun.projects.MLRunProject' object.