Using projects¶
You can add or update a project's functions, artifacts, or workflows using set_function(), set_artifact(), and set_workflow(), and set various project attributes (parameters, secrets, etc.).
Use the project run() method to run a registered workflow using a pipeline engine (e.g. Kubeflow Pipelines). The workflow executes its registered functions in a sequence/graph (DAG), and it can reference project parameters, secrets, and artifacts by name.
Projects can also be loaded, and their workflows/pipelines executed, using the CLI (the mlrun project command).
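For illustration, registering and running a workflow from code can be sketched as follows (the workflow name "main", the file path "./workflow.py", and the argument name are hypothetical, not fixed by MLRun):

```python
# hypothetical sketch: register a workflow file on an existing project object,
# then run it; "main" and "./workflow.py" are illustrative names
project.set_workflow("main", "./workflow.py")

# arguments are passed through to the pipeline; watch=True follows progress
run_id = project.run("main", arguments={"model_name": "mymodel"}, watch=True)
```

This assumes a project object was already loaded or created (e.g. via get_or_create_project) and a pipeline engine is available.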
Updating and using project functions¶
Projects host or link to functions that are used in job or workflow runs. You add functions to a project using set_function(), which registers them as part of the project definition (and YAML file). Alternatively, you can create functions using methods like code_to_function() and save them to the DB (under the same project). The preferred approach is set_function(), which also records the functions in the project spec.
The set_function() method allows you to add/update many types of functions:
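For illustration, a function registered with set_function() appears in the project YAML roughly as below (the project name, function name, and paths are hypothetical):

```yaml
# illustrative project.yaml fragment; names and paths are hypothetical
kind: project
metadata:
  name: myproj
spec:
  functions:
  - url: ./src/mycode.py
    name: ingest
    kind: job
    image: mlrun/mlrun
    with_repo: true
```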
marketplace functions - load/register a marketplace function into the project (func="hub://...")
notebook file - convert a notebook file into a function (func="path/to/file.ipynb")
python file - convert a python file into a function (func="path/to/file.py")
database function - function stored in the MLRun DB (func="db://project/func-name:version")
function yaml file - read the function object from a yaml file (func="path/to/file.yaml")
inline function spec - save the full function spec in the project definition file (func=func_object); not recommended
When loading a function from a code file (py, ipynb) you should also specify a container image and the runtime kind (the job kind is used by default). You can optionally specify the function handler (the entry point to invoke) and a name.
If the function is not a single-file function and requires access to multiple files/libraries in the project, set with_repo=True to add the entire repo code into the destination container during build or run time.
Note
When using with_repo=True the functions need to be deployed (function.deploy()) to build a container, unless you set project.spec.load_source_on_run=True, which instructs MLRun to load the git/archive repo into the function container at run time without requiring a build (this is simpler when developing; for production it's preferred to build the image with the code).
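The two options above can be sketched as follows (the function name "ingest" is illustrative, and assumes it was registered with with_repo=True):

```python
# option 1: build a container image that bakes the repo code in (production)
project.build_function("ingest")

# option 2 (dev shortcut): skip the build and load the repo into the
# container at run time instead
project.spec.load_source_on_run = True
run = project.run_function("ingest")
```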
Examples:
project.set_function('hub://sklearn_classifier', 'train')
project.set_function('http://.../mynb.ipynb', 'test', image="mlrun/mlrun")
project.set_function('./src/mycode.py', 'ingest',
                     image='myrepo/ing:latest', with_repo=True)
project.set_function('db://project/func-name:version')
project.set_function('./func.yaml')
project.set_function(func_object)
Once functions are registered or saved in the project, you can get their function object using project.get_function(key).
Example:
# get the data-prep function, add volume mount and run it with data input
project.get_function("data-prep").apply(v3io_mount())
run = project.run_function("data-prep", inputs={"data": data_url})
Run, Build, and Deploy functions¶
There is a set of methods used to deploy and run project functions. They can be used interactively or inside a pipeline (inside a pipeline, each call is automatically mapped to the relevant pipeline engine command).
run_function() - run a local or remote task as part of a local run or pipeline
build_function() - deploy an ML function, building a container with its dependencies for use in runs
deploy_function() - deploy real-time/online (Nuclio or serving based) functions
You can use those methods as project methods, or as global (mlrun.) methods, in which case the current project is assumed.
run = myproject.run_function("train", inputs={"data": data_url}) # will run the "train" function in myproject
run = mlrun.run_function("train", inputs={"data": data_url}) # will run the "train" function in the current/active project
The first parameter in those three methods is the function name (in the project), or a function object if you want to use a function you imported/created ad hoc, for example:
# import a serving function from the marketplace and deploy a trained model over it
serving = import_function("hub://v2_model_server", new_name="serving")
deploy = deploy_function(
serving,
models=[{"key": "mymodel", "model_path": train.outputs["model"]}],
)
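After deployment, the serving function can be smoke-tested directly. A sketch assuming the "mymodel" key from the example above (the request body shape depends on the deployed model, and the path follows the v2 model server convention):

```python
# hypothetical smoke test against the deployed serving endpoint;
# the input vector is illustrative only
resp = serving.invoke(
    "/v2/models/mymodel/infer",
    body={"inputs": [[5.1, 3.5, 1.4, 0.2]]},
)
print(resp)
```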