mlrun.runtimes

mlrun.runtimes#

class mlrun.runtimes.ApplicationRuntime(**kwargs)[source]#

Bases: RemoteRuntime

property api_gateway#

create_api_gateway(name: str | None = None, path: str | None = None, direct_port_access: bool = False, authentication_mode: APIGatewayAuthenticationMode = None, authentication_creds: tuple[str, str] | None = None, ssl_redirect: bool | None = None, set_as_default: bool = False, gateway_timeout: int | None = None, port: int | None = None)[source]#

Create the application API gateway. Once the application is deployed, the API gateway can be created. An application without an API gateway is not accessible.

Parameters:

name -- The name of the API gateway
path -- Optional path of the API gateway, default value is "/". The given path should be supported by the deployed application
direct_port_access -- Set True to allow direct port access to the application sidecar
authentication_mode -- API Gateway authentication mode
authentication_creds -- API Gateway basic authentication credentials as a tuple (username, password)
ssl_redirect -- Set True to force SSL redirect, False to disable. Defaults to mlrun.mlconf.force_api_gateway_ssl_redirect()
set_as_default -- Set the API gateway as the default for the application (status.api_gateway)
gateway_timeout -- nginx ingress timeout in sec (request timeout, when will the gateway return an error)
port -- The API gateway port, used only when direct_port_access=True

Returns:

The API gateway URL

delete_api_gateway(name: str)[source]#: Delete API gateway by name. Refreshes the application status to update api gateway and invocation URLs. :param name: The API gateway name

delete_probe(type: str)[source]#

Delete a Kubernetes probe configuration from the sidecar container

Parameters:: type -- Probe type - one of "readiness", "liveness", "startup"
Returns:: function object (self)

deploy(project='', tag='', verbose=False, builder_env: dict | None = None, force_build: bool = False, with_mlrun=None, skip_deployed=False, is_kfp=False, mlrun_version_specifier=None, show_on_failure: bool = False, create_default_api_gateway: bool = True, track_models: bool | None = None, wait: bool = True)[source]#

Deploy function, builds the application image if required (self.requires_build()) or force_build is True, Once the image is built, the function is deployed.

Parameters:

project -- Project name
tag -- Function tag
verbose -- Set True for verbose logging
builder_env -- Env vars dict for source archive config/credentials e.g. builder_env={"GIT_TOKEN": token}
force_build -- Set True to force rebuilding the application image. Use this when changing requirements, commands, or base image after the initial deployment. Code-only changes don't require force_build as the init container loads the new source at runtime.
with_mlrun -- Add the current mlrun package to the container build
skip_deployed -- Skip the build if we already have an image for the function
is_kfp -- Deploy as part of a kfp pipeline
mlrun_version_specifier -- Which mlrun package version to include (if not current)
show_on_failure -- Show logs only in case of build failure
create_default_api_gateway -- When deploy finishes the default API gateway will be created for the application. Disabling this flag means that the application will not be accessible until an API gateway is created for it.
track_models -- override state of self.spec.track_models. If not provided, uses the spec value (False by default, True after setup_model_monitoring() is called). When True, model endpoints are created at deployment time.
wait -- must be True for application functions. Application deploy performs readiness-dependent post-deploy steps (API gateway and sidecar), so external wait orchestration is unsupported; passing False raises MLRunInvalidArgumentError.

Returns:

The default API gateway URL if created or True if the function is ready (deployed)

classmethod deploy_reverse_proxy_image()[source]#

Build the reverse proxy image and save it. The reverse proxy image is used to route requests to the application sidecar. This is useful when you want to decrease build time by building the application image only once.

Parameters:: use_cache -- Use the cache when building the image

disable_default_http_trigger(**kwargs)#

enable_default_http_trigger(**kwargs)#

from_image(image)[source]#

Deploy the function with an existing nuclio processor image. This applies only for the reverse proxy and not the application image.

Parameters:: image -- image name

static get_filename_and_handler() -> (<class 'str'>, <class 'str'>)[source]#

Invoke the remote (live) function and return the results

example:

function.invoke("/api", body={"inputs": x})

Parameters:

path -- request sub path (e.g. /images)
body -- request body (str, bytes or a dict for json requests)
method -- HTTP method (GET, PUT, ..)
headers -- key/value dict with http headers
force_external_address -- use the external ingress URL
auth_info -- service AuthInfo
mock -- use mock server vs a real Nuclio function (for local simulations)
http_client_kwargs -- allow the user to pass any parameter supported in requests.request method see this link for more information: https://requests.readthedocs.io/en/latest/api/#requests.request

kind = 'application'#

pre_deploy_validation()[source]#

prepare_image_for_deploy()[source]#: if a function has a 'spec.image' it is considered to be deployed, but because we allow the user to set 'spec.image' for usability purposes, we need to check whether this is a built image or it requires to be built on top.

requires_build() → bool[source]#

Check if the application image needs to be built.

For ApplicationRuntime, store:// URIs don't require a build because the init container loads them at runtime. This allows redeploying with source code changes without rebuilding the image.

resolve_default_api_gateway_name()[source]#

reverse_proxy_image = None#

set_internal_application_port(port: int)[source]#

Set a Kubernetes probe configuration for the sidecar container

The config parameter serves as the base configuration, and individual parameters override values in config. If http_path is provided without http_port and config is not provided, the port will be enriched from the internal application port just before deployment.

Parameters:

type -- Probe type - one of "readiness", "liveness", "startup"
initial_delay_seconds -- Number of seconds after the container has started before probes are initiated
period_seconds -- How often (in seconds) to perform the probe
failure_threshold -- Minimum consecutive failures for the probe to be considered failed
timeout_seconds -- Number of seconds after which the probe times out
http_path -- If provided, use an HTTP probe with this path
http_port -- If HTTP probe is used and no port provided, the internal application port will be used
http_scheme -- "http" or "https" for HTTP probe. Defaults to "http"
config -- A full dict with the probe configuration (used as base, overridden by individual parameters)

Returns:

function object (self)

set_source_target(target_dir: str)[source]#

Configure the target directory where application source code will be extracted at runtime by the init container.

Parameters:: target_dir -- Absolute path inside the runtime container where the source code will be placed

property spec: ApplicationSpec#

property status: ApplicationStatus#

property url#

Add a sidecar container to the function pod

Parameters:

name -- Sidecar container name.
image -- Sidecar container image.
ports -- Sidecar container ports to expose. Can be a single port or a list of ports.
command -- Sidecar container command instead of the image entrypoint.
args -- Sidecar container command args (requires command to be set).

with_source_archive(source, workdir=None, pull_at_runtime: bool = False, target_dir: str | None = None)[source]#

load the code from git/tar/zip archive at build or runtime

Parameters:

source -- valid absolute path or URL to git, zip, or tar file, e.g. git://github.com/mlrun/something.git http://some/url/file.zip note path source must exist on the image or exist locally when run is local (it is recommended to use 'workdir' when source is a filepath instead)
workdir -- working dir relative to the archive root (e.g. './subdir') or absolute to the image root
pull_at_runtime -- load the archive into the container at runtime (via init container) vs on build
target_dir -- target dir on runtime pod for repo clone / archive extraction

class mlrun.runtimes.BaseRuntime(metadata=None, spec=None)[source]#

Bases: ModelObj

as_step(runspec: RunObject | RunTemplate = None, handler=None, name: str = '', project: str = '', params: dict | None = None, hyperparams=None, selector='', hyper_param_options: HyperParamOptions = None, inputs: dict[str, str | list | dict] | None = None, outputs: list | None = None, workdir: str = '', artifact_path: str = '', image: str = '', labels: dict | None = None, use_db=True, verbose=None, scrape_metrics=False, returns: list[str | LogHint] | None = None, auto_build: bool = False)[source]#

Run a local or remote task.

Parameters:

runspec -- run template object or dict (see RunTemplate)
handler -- name of the function handler
name -- execution name
project -- project name
params -- input parameters (dict)
hyperparams -- hyper parameters
selector -- selection criteria for hyper params
hyper_param_options -- hyper param options (selector, early stop, strategy, ..) see: HyperParamOptions
inputs -- Input objects to pass to the handler. Type hints can be given so the input will be parsed during runtime from mlrun.DataItem to the given type hint. The type hint can be given in the key field of the dictionary after a colon, e.g: "<key> : <type_hint>". An input can include a collection of inputs in a dict or list.
outputs -- list of outputs which can pass in the workflow
artifact_path -- default artifact output path (replace out_path)
workdir -- working directory of the executed job and the default path for artifact inputs
image -- container image to use
labels -- labels to tag the job/run with ({key:val, ..})
use_db -- save function spec in the db (vs the workflow file)
verbose -- add verbose prints/logs
scrape_metrics -- whether to add the mlrun/scrape-metrics label to this run's resources
returns --
List of log hints - configurations for how to log the returning values from the handler's run (as artifacts or results). The list's length must be equal to the amount of returning objects. A log hint may be given as:
- A LogHint object with the key and extra configurations.
- A "shortcut" string of the key to use to log the returning value as result or as an artifact. To specify The artifact type, it is possible to pass a string in the following structure: "<key> : <type>". Available artifact types can be seen in mlrun.ArtifactType. If no artifact type is specified, the object's default artifact type will be used. Itemization can also be specified before the key using the following structure: "<unbundle-level> * <key>". If unbundle level is not specified, the default is full unbundling.
auto_build -- when set to True and the function require build it will be built on the first function run, use only if you dont plan on changing the build config between runs

Returns:

mlrun_pipelines.models.PipelineNodeWrapper

clean_build_params()[source]#

doc()[source]#

enrich_runtime_spec(project_node_selector: dict[str, str])[source]#

export(target='', format='.yaml', secrets=None, strip=True)[source]#

save function spec to a local/remote path (default to./function.yaml)

Parameters:

target -- target path/url
format -- .yaml (default) or .json
secrets -- optional secrets dict/object for target path (e.g. s3)
strip -- strip status data

Returns:

self

full_image_path(image=None, client_version: str | None = None, client_python_version: str | None = None)[source]#

is_deployed()[source]#

is_model_monitoring_function()[source]#

kind = 'base'#

property metadata: BaseMetadata#

prepare_image_for_deploy()[source]#: if a function has a 'spec.image' it is considered to be deployed, but because we allow the user to set 'spec.image' for usability purposes, we need to check whether this is a built image or it requires to be built on top.

remove_auth_secret_volumes()[source]#

requires_build() → bool[source]#

Run a local or remote task.

Parameters:

runspec -- The run spec to generate the RunObject from. Can be RunTemplate | RunObject | dict.
handler -- Pointer or name of a function handler.
name -- Execution name.
project -- Project name.
params -- Input parameters (dict).
inputs -- Input objects to pass to the handler. Type hints can be given so the input will be parsed during runtime from mlrun.DataItem to the given type hint. The type hint can be given in the key field of the dictionary after a colon, e.g: "<key> : <type_hint>". An input can include a collection of inputs in a dict or list.
workdir -- Working directory of the executed job and the default path for artifact inputs
watch -- Watch/follow run log.
schedule -- ScheduleCronTrigger class instance or a standard crontab expression string (which will be converted to the class using its from_crontab constructor), see this link for help: https://apscheduler.readthedocs.io/en/3.x/modules/triggers/cron.html#module-apscheduler.triggers.cron
hyperparams -- Dict of param name and list of values to be enumerated. The default strategy is grid search and uses e.g. {"p1": [1,2,3]}. (Can be specified as a JSON file) For list, lists must be of equal length, e.g. {"p1": [1], "p2": [2]}. (Can be specified as JSON file or as a CSV file listing the parameter values per iteration.) You can specify strategy of type grid, list, random, and other options in the hyper_param_options parameter.
hyper_param_options -- Dict or HyperParamOptions struct of hyperparameter options.
verbose -- Add verbose prints/logs.
scrape_metrics -- Whether to add the mlrun/scrape-metrics label to this run's resources.
local -- Run the function locally vs on the runtime/cluster.
local_code_path -- Path of the code for local runs & debug.
auto_build -- When set to True and the function require build it will be built on the first function run, use only if you don't plan on changing the build config between runs.
param_file_secrets -- Dictionary of secrets to be used only for accessing the hyper-param parameter file. These secrets are only used locally and will not be stored anywhere
notifications -- List of notifications to push when the run is completed
returns --
List of log hints - configurations for how to log the returning values from the handler's run (as artifacts or results). The list's length must be equal to the amount of returning objects. A log hint may be given as:
- A LogHint object with the key and extra configurations.
- A "shortcut" string of the key to use to log the returning value as result or as an artifact. To specify The artifact type, it is possible to pass a string in the following structure: "<key> : <type>". Available artifact types can be seen in mlrun.ArtifactType. If no artifact type is specified, the object's default artifact type will be used. Packing kwargs can be passed alongside the artifact type using square brackets: "<key> : <type>[<kwarg1>=<value1>, <kwarg2>=<value2>]". Itemization can also be specified before the key using the following structure: "<unbundle-level> * <key>". If unbundle level is not specified, the default is full unbundling.
state_thresholds -- Dictionary of states to time thresholds. The state will be matched against the k8s resource's status. The threshold should be a time string that conforms to timelength python package standards and is at least 1 minute (-1 for infinite). If the phase is active for longer than the threshold, the run will be aborted. See mlconf.function.spec.state_thresholds for the state options and default values.
reset_on_run -- When True, function python modules would reload prior to code execution. This ensures latest code changes are executed. This argument must be used in conjunction with the local=True argument.
output_path -- Default artifact output path.
retry -- Retry configuration for the run, can be a dict or an instance of Retry. The count field in the Retry object specifies the number of retry attempts. If count=0, the run will not be retried. The backoff field specifies the retry backoff strategy between retry attempts. If not provided, the default backoff delay is 30 seconds.

Returns:

Run context object (RunObject) with run metadata, results and status

save(tag='', versioned=False, refresh=False) → str[source]#

set_categories(categories: list[str])[source]#

set_db_connection(conn)[source]#

set_label(key, value)[source]#

skip_image_enrichment()[source]#

property spec: FunctionSpec#

property status: FunctionStatus#

store_run(runobj: RunObject)[source]#

try_auto_mount_based_on_config()[source]#

property uri#

validate()[source]#

Validate that this runtime is allowed to run on the current system.

Subclasses override this to enforce runtime-specific preconditions (raising an mlrun.errors exception when violated).

validate_and_enrich_service_account(allowed_service_accounts, forbidden_service_accounts, default_service_account)[source]#

with_code(from_file='', body=None, with_doc=True)[source]#

Update the function code This function eliminates the need to build container images every time we edit the code

Parameters:

from_file -- blank for current notebook, or path to .py/.ipynb file
body -- will use the body as the function code
with_doc -- update the document of the function parameters

Returns:

function object

with_commands(commands: list[str], overwrite: bool = False, prepare_image_for_deploy: bool = True)[source]#

add commands to build spec.

Parameters:

commands -- list of commands to run during build
overwrite -- overwrite existing commands
prepare_image_for_deploy -- prepare the image/base_image spec for deployment

Returns:

function object

with_requirements(requirements: list[str] | None = None, overwrite: bool = False, prepare_image_for_deploy: bool = True, requirements_file: str | None = '')[source]#

add package requirements from file or list to build spec.

Parameters:

requirements -- a list of python packages
requirements_file -- a local python requirements file path
overwrite -- overwrite existing requirements
prepare_image_for_deploy -- prepare the image/base_image spec for deployment

Returns:

function object

class mlrun.runtimes.DaskCluster(spec=None, metadata=None)[source]#

Bases: KubejobRuntime

property client#

close(running=True)[source]#

cluster()[source]#

deploy(watch=True, with_mlrun=None, skip_deployed=False, is_kfp=False, mlrun_version_specifier=None, builder_env: dict | None = None, show_on_failure: bool = False, force_build: bool = False)[source]#

deploy function, build container with dependencies

Parameters:

watch -- wait for the deploy to complete (and print build logs)
with_mlrun -- add the current mlrun package to the container build
skip_deployed -- skip the build if we already have an image for the function
is_kfp -- deploy as part of a kfp pipeline
mlrun_version_specifier -- which mlrun package version to include (if not current)
builder_env -- Kaniko builder pod env vars dict (for config/credentials) e.g. builder_env={"GIT_TOKEN": token}
show_on_failure -- show logs only in case of build failure
force_build -- force building the image, even when no changes were made

Returns:

True if the function is ready (deployed)

get_status()[source]#

property initialized#

is_deployed()[source]#: check if the function is deployed (has a valid container)

kind = 'dask'#

Run a local or remote task.

Parameters:

runspec -- The run spec to generate the RunObject from. Can be RunTemplate | RunObject | dict.
handler -- Pointer or name of a function handler.
name -- Execution name.
project -- Project name.
params -- Input parameters (dict).
inputs -- Input objects to pass to the handler. Type hints can be given so the input will be parsed during runtime from mlrun.DataItem to the given type hint. The type hint can be given in the key field of the dictionary after a colon, e.g: "<key> : <type_hint>". An input can include a collection of inputs in a dict or list.
workdir -- Working directory of the executed job and the default path for artifact inputs
watch -- Watch/follow run log.
schedule -- ScheduleCronTrigger class instance or a standard crontab expression string (which will be converted to the class using its from_crontab constructor), see this link for help: https://apscheduler.readthedocs.io/en/3.x/modules/triggers/cron.html#module-apscheduler.triggers.cron
hyperparams -- Dict of param name and list of values to be enumerated. The default strategy is grid search and uses e.g. {"p1": [1,2,3]}. (Can be specified as a JSON file) For list, lists must be of equal length, e.g. {"p1": [1], "p2": [2]}. (Can be specified as JSON file or as a CSV file listing the parameter values per iteration.) You can specify strategy of type grid, list, random, and other options in the hyper_param_options parameter.
hyper_param_options -- Dict or HyperParamOptions struct of hyperparameter options.
verbose -- Add verbose prints/logs.
scrape_metrics -- Whether to add the mlrun/scrape-metrics label to this run's resources.
local -- Run the function locally vs on the runtime/cluster.
local_code_path -- Path of the code for local runs & debug.
auto_build -- When set to True and the function require build it will be built on the first function run, use only if you don't plan on changing the build config between runs.
param_file_secrets -- Dictionary of secrets to be used only for accessing the hyper-param parameter file. These secrets are only used locally and will not be stored anywhere
notifications -- List of notifications to push when the run is completed
returns --
List of log hints - configurations for how to log the returning values from the handler's run (as artifacts or results). The list's length must be equal to the amount of returning objects. A log hint may be given as:
- A LogHint object with the key and extra configurations.
- A "shortcut" string of the key to use to log the returning value as result or as an artifact. To specify The artifact type, it is possible to pass a string in the following structure: "<key> : <type>". Available artifact types can be seen in mlrun.ArtifactType. If no artifact type is specified, the object's default artifact type will be used. Packing kwargs can be passed alongside the artifact type using square brackets: "<key> : <type>[<kwarg1>=<value1>, <kwarg2>=<value2>]". Itemization can also be specified before the key using the following structure: "<unbundle-level> * <key>". If unbundle level is not specified, the default is full unbundling.
state_thresholds -- Dictionary of states to time thresholds. The state will be matched against the k8s resource's status. The threshold should be a time string that conforms to timelength python package standards and is at least 1 minute (-1 for infinite). If the phase is active for longer than the threshold, the run will be aborted. See mlconf.function.spec.state_thresholds for the state options and default values.
reset_on_run -- When True, function python modules would reload prior to code execution. This ensures latest code changes are executed. This argument must be used in conjunction with the local=True argument.
output_path -- Default artifact output path.
retry -- Retry configuration for the run, can be a dict or an instance of Retry. The count field in the Retry object specifies the number of retry attempts. If count=0, the run will not be retried. The backoff field specifies the retry backoff strategy between retry attempts. If not provided, the default backoff delay is 30 seconds.

Returns:

Run context object (RunObject) with run metadata, results and status

set_state_thresholds(state_thresholds: dict[str, str], patch: bool = True)[source]#

Set the threshold for a specific state of the runtime. The threshold is the amount of time that the runtime will wait before aborting the run if the job is in the matching state. The threshold time string must conform to timelength python package standards and be at least 1 minute (e.g. 1000s, 1 hour 30m, 1h etc. or -1 for infinite). If the threshold is not set for a state, the default threshold will be used.

Parameters:

state_thresholds --
A dictionary of state to threshold. The supported states are:
- pending_scheduled - The pod/crd is scheduled on a node but not yet running
- pending_not_scheduled - The pod/crd is not yet scheduled on a node
- executing - The pod/crd started and is running
- image_pull_backoff - The pod/crd is in image pull backoff
See mlrun.mlconf.function.spec.state_thresholds for the default thresholds.
patch -- Whether to merge the given thresholds with the existing thresholds (True, default) or override them (False)

property spec: DaskSpec#

property status: DaskStatus#

validate()[source]#

Validate that this runtime is allowed to run on the current system.

Subclasses override this to enforce runtime-specific preconditions (raising an mlrun.errors exception when violated).

with_limits(mem=None, cpu=None, gpus=None, gpu_type='nvidia.com/gpu', patch: bool = False)[source]#

Set pod cpu/memory/gpu limits (max values)

Parameters:

mem -- set limit for memory e.g. '500M', '2G', etc.
cpu -- set limit for cpu e.g. '0.5', '2', etc.
gpus -- set limit for gpu
gpu_type -- set gpu type e.g. "nvidia.com/gpu"
patch -- by default it overrides the whole limits section, if you wish to patch specific resources use patch=True

with_requests(mem=None, cpu=None, patch: bool = False)[source]#

Set requested (desired) pod cpu/memory resources

Parameters:

mem -- set request for memory e.g. '200M', '1G', etc.
cpu -- set request for cpu e.g. '0.1', '1', etc.
patch -- by default it overrides the whole requests section, if you wish to patch specific resources use patch=True

with_scheduler_limits(mem: str | None = None, cpu: str | None = None, gpus: int | None = None, gpu_type: str = 'nvidia.com/gpu', patch: bool = False)[source]#: set scheduler pod resources limits by default it overrides the whole limits section, if you wish to patch specific resources use patch=True.

with_scheduler_requests(mem: str | None = None, cpu: str | None = None, patch: bool = False)[source]#: set scheduler pod resources requests by default it overrides the whole requests section, if you wish to patch specific resources use patch=True.

with_worker_limits(mem: str | None = None, cpu: str | None = None, gpus: int | None = None, gpu_type: str = 'nvidia.com/gpu', patch: bool = False)[source]#: set worker pod resources limits by default it overrides the whole limits section, if you wish to patch specific resources use patch=True.

with_worker_requests(mem: str | None = None, cpu: str | None = None, patch: bool = False)[source]#: set worker pod resources requests by default it overrides the whole requests section, if you wish to patch specific resources use patch=True.

class mlrun.runtimes.DatabricksRuntime(spec=None, metadata=None)[source]#

Bases: KubejobRuntime

get_internal_parameters(runobj: RunObject)[source]#: Return the internal function parameters + code.

kind = 'databricks'#

Run a local or remote task.

Parameters:

runspec -- The run spec to generate the RunObject from. Can be RunTemplate | RunObject | dict.
handler -- Pointer or name of a function handler.
name -- Execution name.
project -- Project name.
params -- Input parameters (dict).
inputs -- Input objects to pass to the handler. Type hints can be given so the input will be parsed during runtime from mlrun.DataItem to the given type hint. The type hint can be given in the key field of the dictionary after a colon, e.g: "<key> : <type_hint>". An input can include a collection of inputs in a dict or list.
workdir -- Working directory of the executed job and the default path for artifact inputs
watch -- Watch/follow run log.
schedule -- ScheduleCronTrigger class instance or a standard crontab expression string (which will be converted to the class using its from_crontab constructor), see this link for help: https://apscheduler.readthedocs.io/en/3.x/modules/triggers/cron.html#module-apscheduler.triggers.cron
hyperparams -- Dict of param name and list of values to be enumerated. The default strategy is grid search and uses e.g. {"p1": [1,2,3]}. (Can be specified as a JSON file) For list, lists must be of equal length, e.g. {"p1": [1], "p2": [2]}. (Can be specified as JSON file or as a CSV file listing the parameter values per iteration.) You can specify strategy of type grid, list, random, and other options in the hyper_param_options parameter.
hyper_param_options -- Dict or HyperParamOptions struct of hyperparameter options.
verbose -- Add verbose prints/logs.
scrape_metrics -- Whether to add the mlrun/scrape-metrics label to this run's resources.
local -- Run the function locally vs on the runtime/cluster.
local_code_path -- Path of the code for local runs & debug.
auto_build -- When set to True and the function require build it will be built on the first function run, use only if you don't plan on changing the build config between runs.
param_file_secrets -- Dictionary of secrets to be used only for accessing the hyper-param parameter file. These secrets are only used locally and will not be stored anywhere
notifications -- List of notifications to push when the run is completed
returns --
List of log hints - configurations for how to log the returning values from the handler's run (as artifacts or results). The list's length must be equal to the amount of returning objects. A log hint may be given as:
- A LogHint object with the key and extra configurations.
- A "shortcut" string of the key to use to log the returning value as result or as an artifact. To specify The artifact type, it is possible to pass a string in the following structure: "<key> : <type>". Available artifact types can be seen in mlrun.ArtifactType. If no artifact type is specified, the object's default artifact type will be used. Packing kwargs can be passed alongside the artifact type using square brackets: "<key> : <type>[<kwarg1>=<value1>, <kwarg2>=<value2>]". Itemization can also be specified before the key using the following structure: "<unbundle-level> * <key>". If unbundle level is not specified, the default is full unbundling.
state_thresholds -- Dictionary of states to time thresholds. The state will be matched against the k8s resource's status. The threshold should be a time string that conforms to timelength python package standards and is at least 1 minute (-1 for infinite). If the phase is active for longer than the threshold, the run will be aborted. See mlconf.function.spec.state_thresholds for the state options and default values.
reset_on_run -- When True, function python modules would reload prior to code execution. This ensures latest code changes are executed. This argument must be used in conjunction with the local=True argument.
output_path -- Default artifact output path.
retry -- Retry configuration for the run, can be a dict or an instance of Retry. The count field in the Retry object specifies the number of retry attempts. If count=0, the run will not be retried. The backoff field specifies the retry backoff strategy between retry attempts. If not provided, the default backoff delay is 30 seconds.

Returns:

Run context object (RunObject) with run metadata, results and status

property spec: DatabricksSpec#

class mlrun.runtimes.HandlerRuntime(metadata=None, spec=None)[source]#

Bases: BaseRuntime, ParallelRunner

kind = 'handler'#

class mlrun.runtimes.KubeResource(spec=None, metadata=None)[source]#

Bases: BaseRuntime

A parent class for runtimes that generate k8s resources when executing.

apply(modifier: Callable[[KubeResource], KubeResource]) → KubeResource[source]#

Apply a modifier to the runtime which is used to change the runtimes k8s object's spec. All modifiers accept Kube, apply some changes on its spec and return it so modifiers can be chained one after the other.

Parameters:: modifier -- a modifier callable object
Returns:: the runtime (self) after the modifications

detect_preemptible_affinity(affinity: V1Affinity) → list[str][source]#

Check whether any provided affinity rules match preemptible affinity configs.

Parameters:: affinity -- User-provided affinity object.
Returns:: List of formatted expressions that overlap with preemptible terms.

static detect_preemptible_node_selector(node_selector: dict[str, str]) → list[str][source]#

Check whether any provided node selector matches preemptible selectors.

Parameters:: node_selector -- User-provided node selector mapping.
Returns:: List of "key='value'" strings that match a preemptible selector.

detect_preemptible_tolerations(tolerations: list[V1Toleration]) → list[str][source]#

Check whether any provided toleration matches preemptible tolerations.

Parameters:: tolerations -- User-provided tolerations.
Returns:: List of formatted toleration strings that are considered preemptible.

get_default_priority_class_name()[source]#

get_env(name, default=None)[source]#: Get the pod environment variable for the given name, if not found return the default If it's a scalar value, will return it, if the value is from source, return the k8s struct (V1EnvVarSource)

has_user_set_plain_env(name: str) → bool[source]#

Check whether name is present in the runtime spec as a plain-value env var.

Returns True only for env vars with a plain .value and not flagged as auto-mount-injected (see mark_env_auto_mount_injected). Returns False for secret-injected vars (.value_from) and for plain values that an auto-mount modifier wrote — the latter must yield to project-secret injection. Auto-mount modifiers and secret-store injection consult this to defer to user-set values instead of overriding them.

is_env_exists(name)[source]#: Check whether there is an environment variable define for the given key

kind = 'job'#

list_valid_priority_class_names()[source]#

mark_env_auto_mount_injected(name: str) → None[source]#

Flag name as written by an auto-mount modifier (plain-value path).

Called by mount_s3 (and any future modifier that injects plain values) right after writing the env var, so that project-secret injection can override the value via has_user_set_plain_env. Idempotent.

Ordering contract: must be called after set_env/_set_env, never before. Any write to name via _set_env clears the marker (see the comment there), so flagging first and writing second would silently wipe the flag.

raise_preemptible_warning(node_selector: dict[str, str] | None, tolerations: list[V1Toleration] | None, affinity: V1Affinity | None) → None[source]#

Detect conflicts and emit a single consolidated warning if needed.

Parameters:

node_selector -- User-provided node selector.
tolerations -- User-provided tolerations.
affinity -- User-provided affinity.

Warns:

PreemptionWarning - Emitted when any of the provided selectors, tolerations, or affinity terms match the configured preemptible settings. The message lists the conflicting items.

set_env(name: str, value: str | None = None, value_from: Any | None = None)[source]#: Set an environment variable. If value comes from a Secret, validate on client-side only.

set_env_from_secret(name: str, secret: str | None = None, secret_key: str | None = None)[source]#: Set an environment variable from a Kubernetes Secret. Client-side guard forbids MLRun internal auth/project secrets; no-op on API.

set_env_from_secret_ref(secret_name: str)[source]#: Mount all keys from a Kubernetes Secret as environment variables. Uses envFrom.secretRef so every key in the secret becomes an env var.

set_envs(env_vars: dict | None = None, file_path: str | None = None)[source]#

set pod environment var from key/value dict or .env file

Parameters:

env_vars -- dict with env key/values
file_path -- .env file with key=value lines

set_image_pull_configuration(image_pull_policy: str | None = None, image_pull_secret_name: str | None = None)[source]#

Configure the image pull parameters for the runtime.

Parameters:

image_pull_policy -- The policy to use when pulling. One of IfNotPresent, Always or Never
image_pull_secret_name -- Name of a k8s secret containing image repository's authentication credentials

set_state_thresholds(state_thresholds: dict[str, str], patch: bool = True)[source]#

Set the threshold for a specific state of the runtime. The threshold is the amount of time that the runtime will wait before aborting the run if the job is in the matching state. The threshold time string must conform to timelength python package standards and be at least 1 minute (e.g. 1000s, 1 hour 30m, 1h etc. or -1 for infinite). If the threshold is not set for a state, the default threshold will be used.

Parameters:

state_thresholds --
A dictionary of state to threshold. The supported states are:
- pending_scheduled - The pod/crd is scheduled on a node but not yet running
- pending_not_scheduled - The pod/crd is not yet scheduled on a node
- executing - The pod/crd started and is running
- image_pull_backoff - The pod/crd is in image pull backoff
See mlrun.mlconf.function.spec.state_thresholds for the default thresholds.
patch -- Whether to merge the given thresholds with the existing thresholds (True, default) or override them (False)

property spec: KubeResourceSpec#

try_auto_mount_based_on_config(override_params=None)[source]#

validate_and_enrich_service_account(allowed_service_accounts, forbidden_service_accounts, default_service_account)[source]#

with_annotations(annotations: dict)[source]#: set a key/value annotations in the metadata of the pod

with_limits(mem: str | None = None, cpu: str | None = None, gpus: int | None = None, gpu_type: str = 'nvidia.com/gpu', patch: bool = False)[source]#

Set pod cpu/memory/gpu limits (max values)

Parameters:

mem -- set limit for memory e.g. '500M', '2G', etc.
cpu -- set limit for cpu e.g. '0.5', '2', etc.
gpus -- set limit for gpu
gpu_type -- set gpu type e.g. "nvidia.com/gpu"
patch -- by default it overrides the whole limits section, if you wish to patch specific resources use patch=True

with_node_selection(node_name: str | None = None, node_selector: dict[str, str] | None = None, affinity: V1Affinity | None = None, tolerations: list[V1Toleration] | None = None)[source]#

Configure Kubernetes node scheduling for this function.

Updates one or more scheduling hints: exact node pinning, label-based selection, affinity/anti-affinity rules, and taint tolerations. Passing None leaves the current value unchanged; pass an empty dict/list (e.g., {}, []) to clear.

Parameters:

node_name -- Exact Kubernetes node name to pin the pod to.
node_selector -- Mapping of label selectors. Use {} to clear.
affinity -- kubernetes.client.V1Affinity constraints.
tolerations -- List of kubernetes.client.V1Toleration. Use [] to clear.

Warns:

PreemptionWarning - Emitted if provided selectors/tolerations/affinity conflict with the function's preemption mode.

Example usage:

Prefer a GPU pool and allow scheduling on spot nodes:

job.with_node_selection(
    node_selector={"nodepool": "gpu"},
    tolerations=[
        k8s_client.V1Toleration(key="spot", operator="Exists")
    ],
)

with_preemption_mode(mode: PreemptionModes | str)[source]#

Preemption mode controls whether pods can be scheduled on preemptible nodes. Tolerations, node selector, and affinity are populated on preemptible nodes corresponding to the function spec.

The supported modes are:

allow - The function can be scheduled on preemptible nodes
constrain - The function can only run on preemptible nodes
prevent - The function cannot be scheduled on preemptible nodes
none - No preemptible configuration will be applied on the function

The default preemption mode is configurable in mlrun.mlconf.function_defaults.preemption_mode, by default it's set to prevent

Parameters:: mode -- allow | constrain | prevent | none defined in PreemptionModes

with_priority_class(name: str | None = None)[source]#

Enables to control the priority of the pod If not passed - will default to mlrun.mlconf.default_function_priority_class_name

Parameters:: name -- The name of the priority class

with_requests(mem: str | None = None, cpu: str | None = None, patch: bool = False)[source]#

Set requested (desired) pod cpu/memory resources

Parameters:

mem -- set request for memory e.g. '200M', '1G', etc.
cpu -- set request for cpu e.g. '0.1', '1', etc.
patch -- by default it overrides the whole requests section, if you wish to patch specific resources use patch=True

with_security_context(security_context: V1SecurityContext)[source]#

Set security context for the pod. For Iguazio we handle security context internally - see mlrun.common.schemas.function.SecurityContextEnrichmentModes

Example:

from kubernetes import client as k8s_client

security_context = k8s_client.V1SecurityContext(
    run_as_user=1000,
    run_as_group=3000,
)
function.with_security_context(security_context)

More info: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod

Parameters:: security_context -- The security context for the pod

class mlrun.runtimes.KubejobRuntime(spec=None, metadata=None)[source]#

Bases: KubeResource

build_config(image='', base_image=None, commands: list | None = None, secret=None, source=None, extra=None, load_source_on_run=None, with_mlrun=None, auto_build=None, requirements=None, overwrite=True, prepare_image_for_deploy=True, requirements_file=None, builder_env=None, extra_args=None)[source]#

specify builder configuration for the deploy operation

Parameters:

image -- target image name/path
base_image -- base image name/path
commands -- list of docker build (RUN) commands e.g. ['pip install pandas']
secret -- k8s secret for accessing the docker registry
source -- source git/tar archive to load code from in to the context/workdir e.g. git://github.com/mlrun/something.git#development
extra -- extra Dockerfile lines
load_source_on_run -- load the archive code into the container at runtime vs at build time
with_mlrun -- add the current mlrun package to the container build
auto_build -- when set to True and the function require build it will be built on the first function run, use only if you dont plan on changing the build config between runs
requirements -- a list of packages to install
requirements_file -- requirements file to install
overwrite -- overwrite existing build configuration (currently applies to requirements and commands) * False: the new params are merged with the existing * True: the existing params are replaced by the new ones
prepare_image_for_deploy -- prepare the image/base_image spec for deployment
extra_args -- A string containing additional builder arguments in the format of command-line options, e.g. extra_args="--skip-tls-verify --build-arg A=val"
builder_env -- Kaniko builder pod env vars dict (for config/credentials) e.g. builder_env={"GIT_TOKEN": token}

deploy(watch: bool = True, with_mlrun: bool | None = None, skip_deployed: bool = False, is_kfp: bool = False, mlrun_version_specifier: bool | None = None, builder_env: dict | None = None, show_on_failure: bool = False, force_build: bool = False) → bool[source]#

Deploy function, build container with dependencies

Parameters:

watch -- Wait for the deploy to complete (and print build logs)
with_mlrun -- Add the current mlrun package to the container build
skip_deployed -- Skip the build if we already have an image for the function
is_kfp -- Deploy as part of a kfp pipeline
mlrun_version_specifier -- Which mlrun package version to include (if not current)
builder_env -- Kaniko builder pod env vars dict (for config/credentials) e.g. builder_env={"GIT_TOKEN": token}
show_on_failure -- Show logs only in case of build failure
force_build -- Set True for force building the image, even when no changes were made

Returns:

True if the function is ready (deployed)

deploy_step(image=None, base_image=None, commands: list | None = None, secret_name='', with_mlrun=True, skip_deployed=False)[source]#

is_deployed()[source]#: check if the function is deployed (has a valid container)

kind = 'job'#

property serving_spec#

with_source_archive(source, workdir=None, handler=None, pull_at_runtime=True, target_dir=None)[source]#

load the code from git/tar/zip archive at runtime or build

Parameters:

source -- valid absolute path or URL to git, zip, or tar file, e.g. git://github.com/mlrun/something.git http://some/url/file.zip note path source must exist on the image or exist locally when run is local (it is recommended to use 'workdir' when source is a filepath instead)
handler -- default function handler
workdir -- working dir relative to the archive root (e.g. './subdir') or absolute to the image root
pull_at_runtime -- load the archive into the container at job runtime vs on build/deploy
target_dir -- target dir on runtime pod or repo clone / archive extraction

class mlrun.runtimes.LocalRuntime(metadata=None, spec=None)[source]#

Bases: BaseRuntime, ParallelRunner

is_deployed()[source]#

kind = 'local'#

to_job(image='', func_name: str | None = None)[source]#

with_source_archive(source, workdir=None, handler=None, target_dir=None)[source]#

load the code from git/tar/zip archive at runtime or build

Parameters:

source -- valid path to git, zip, or tar file, e.g. git://github.com/mlrun/something.git http://some/url/file.zip
handler -- default function handler
workdir -- working dir relative to the archive root (e.g. './subdir') or absolute
target_dir -- local target dir for repo clone (by default its <current-dir>/code)

class mlrun.runtimes.MpiRuntimeV1(spec=None, metadata=None)[source]#

Bases: AbstractMPIJobRuntime

crd_group = 'kubeflow.org'#

crd_plural = 'mpijobs'#

crd_version = 'v1'#

property spec: MPIV1ResourceSpec#

class mlrun.runtimes.RemoteRuntime(spec=None, metadata=None)[source]#

Bases: KubeResource

Add a RabbitMQ trigger to the function.

Allows consuming messages from RabbitMQ queues or topic-based routing. See https://docs.nuclio.io/en/latest/reference/triggers/rabbitmq.html for more details.

Parameters:

url -- RabbitMQ connection URL in AMQP format (e.g., 'amqp://host:port' or 'amqp://user:pass@host:port') or a datastore profile URL (e.g., 'ds://profile-name')
exchange_name -- The exchange that contains the queue (required unless using a datastore profile that provides it)
name -- Trigger name (default: 'rabbitmq')
queue_name -- Specific queue to consume from. Either queue_name or topics must be specified, but not both.
topics -- List of topics (routing keys) to subscribe to. Creates a unique queue and binds it to these routing keys. Either queue_name or topics must be specified, but not both.
username -- RabbitMQ username (can also be embedded in URL)
password -- RabbitMQ password (can also be embedded in URL)
prefetch_count -- Broker channel prefetch limit (0 = unlimited)
durable_exchange -- Whether the exchange should survive broker restart
durable_queue -- Whether the queue should survive broker restart
on_error -- Error handling strategy: 'ack' or 'nack'
requeue_on_error -- Whether to requeue failed messages (when on_error='nack')
reconnect_duration -- Total time to attempt reconnection (e.g., '5m')
reconnect_interval -- Time between reconnection attempts (e.g., '15s')
num_workers -- Number of workers processing messages concurrently
worker_termination_timeout -- Timeout for worker termination (e.g., '10s')

Example usage:

function.add_rabbitmq_trigger(
    url="amqp://rabbitmq-host:5672",
    exchange_name="my-exchange",
    queue_name="my-queue",
    username="user",
    password="pass",
)

Or with topics (routing keys):

function.add_rabbitmq_trigger(
    url="amqp://rabbitmq-host:5672",
    exchange_name="my-exchange",
    topics=["key1", "key2"],
)

Or using a datastore profile:

function.add_rabbitmq_trigger(url="ds://my-rabbitmq-profile")

When using a datastore profile (ds:// URL), all parameters from the profile are used as defaults. Any parameter explicitly passed to this method will override the corresponding profile value, including falsy values like 0 or False:

# Profile has prefetch_count=10, but explicit 0 overrides it
function.add_rabbitmq_trigger(
    url="ds://my-rabbitmq-profile",
    prefetch_count=0,  # Overrides profile's prefetch_count=10
)

add_trigger(name, spec)[source]#

add a nuclio trigger object/dict

Parameters:

name -- trigger name
spec -- trigger object or dict

add_v3io_stream_trigger(stream_path, name='stream', group='serving', seek_to='earliest', shards=1, extra_attributes=None, ack_window_size=None, **kwargs)[source]#

add v3io stream trigger to the function

Parameters:

stream_path -- v3io stream path (e.g. 'v3io:///projects/myproj/stream1')
name -- trigger name
group -- consumer group
seek_to -- start seek from: "earliest", "latest", "time", "sequence"
shards -- number of shards (used to set number of replicas)
extra_attributes -- key/value dict with extra trigger attributes
ack_window_size -- stream ack window size (the consumer group will be updated with the event id - ack_window_size, on failure the events in the window will be retransmitted)
kwargs -- extra V3IOStreamTrigger class attributes

delete_probe(*args, **kwargs)[source]#

Delete a Kubernetes probe configuration from the sidecar container

This method is only available for ApplicationRuntime.

deploy(project='', tag='', verbose=False, builder_env: dict | None = None, force_build: bool = False, track_models: bool | None = None, wait: bool = True, timeout: int | None = None)[source]#

Deploy the nuclio function to the cluster

Parameters:

project -- project name
tag -- function tag
verbose -- set True for verbose logging
builder_env -- env vars dict for source archive config/credentials e.g. builder_env={"GIT_TOKEN": token}
force_build -- set True for force building the image
track_models -- override state of self.spec.track_models. If not provided, uses the spec value (False by default, True after setup_model_monitoring() is called). When True, model endpoints are created at deployment time.
wait -- when True (default), wait for readiness and return the invocation command (str). When False, submit and return self so the caller can later call wait_for_deployment() or poll db.get_nuclio_deploy_status.
timeout -- optional deadline in seconds for the readiness wait when wait=True; forwarded to wait_for_deployment. None waits indefinitely. Ignored when wait=False.

Returns:

the invocation command (str) when wait=True; the function object (self) when wait=False.

deploy_step(project='', models=None, env=None, tag=None, verbose=None, use_function_from_db=None)[source]#

return as a Kubeflow pipeline step (ContainerOp), recommended to use mlrun.deploy_function() instead

Parameters:

project -- project name, defaults to function project
models -- model name and paths
env -- dict of environment variables
tag -- version tag
verbose -- verbose output
use_function_from_db -- use the function from the DB instead of the local function object

disable_default_http_trigger(**kwargs)[source]#

enable_default_http_trigger(**kwargs)[source]#

from_image(image)[source]#

Deploy the function with an existing nuclio processor image.

Parameters:: image -- image name

get_url(force_external_address: bool = False)[source]#

This method returns function's url.

Parameters:: force_external_address -- use the external ingress URL
Returns:: returns function's url

Invoke the remote (live) function and return the results

example:

function.invoke("/api", body={"inputs": x})

Parameters:

path -- request sub path (e.g. /images)
body -- request body (str, bytes or a dict for json requests)
method -- HTTP method (GET, PUT, ..)
headers -- key/value dict with http headers
force_external_address -- use the external ingress URL
auth_info -- service AuthInfo
mock -- use mock server vs a real Nuclio function (for local simulations)
http_client_kwargs -- allow the user to pass any parameter supported in requests.request method see this link for more information: https://requests.readthedocs.io/en/latest/api/#requests.request

kind = 'remote'#

mask_sensitive_data_in_config()[source]#

pre_deploy_validation()[source]#

set_config(key, value)[source]#

set_probe(*args, **kwargs)[source]#

Set a Kubernetes probe configuration for the sidecar container

This method is only available for ApplicationRuntime.

set_state_thresholds(state_thresholds: dict[str, int], patch: bool = True)[source]#

Set the threshold for a specific state of the runtime. The threshold is the amount of time that the runtime will wait before aborting the run if the job is in the matching state. The threshold time string must conform to timelength python package standards and be at least 1 minute (e.g. 1000s, 1 hour 30m, 1h etc. or -1 for infinite). If the threshold is not set for a state, the default threshold will be used.

Parameters:

state_thresholds --
A dictionary of state to threshold. The supported states are:
- pending_scheduled - The pod/crd is scheduled on a node but not yet running
- pending_not_scheduled - The pod/crd is not yet scheduled on a node
- executing - The pod/crd started and is running
- image_pull_backoff - The pod/crd is in image pull backoff
See mlrun.mlconf.function.spec.state_thresholds for the default thresholds.
patch -- Whether to merge the given thresholds with the existing thresholds (True, default) or override them (False)

setup_model_monitoring(general_model_endpoint_instructions: ModelEndpointInstruction | None = None, extra_model_endpoint_instructions: list[ModelEndpointInstruction] | list[dict] | None = None) → RemoteRuntime[source]#

Setup model monitoring on the RemoteRuntime create by default model endpoint represent the Runtime, Optional configure custom model endpoint or extra model endpoints to be created at deployment time.

Each instruction describes a USER_EP model endpoint that will be registered when this function is deployed. Calling this method sets the track_models flag on the spec so the deployment stage knows to create the endpoints.

Instructions must not set function_name or function_tag; both are derived from the runtime's metadata.name / metadata.tag at deployment time. Setting either raises MLRunInvalidArgumentError.

Parameters:

general_model_endpoint_instructions -- Optional ModelEndpointInstruction parameter for main model endpoint instructions, if not provided a default one will be created with the USER_EP endpoint type and the default name f'{function_name}_model_endpoint'.
extra_model_endpoint_instructions -- List of ModelEndpointInstruction objects or equivalent dicts, one per model endpoint to register.

Returns:

The runtime object, for method chaining.

skip_image_enrichment()[source]#

property spec: NuclioSpec#

property status: NuclioStatus#

wait_for_deployment(verbose: bool = False, timeout: int | None = None) → str[source]#

Finalize a submitted Nuclio deploy.

Waits for terminal deploy status, handles model-endpoint tasks, enriches the invocation command from status, and returns it. deploy(wait=True) calls this directly; callers of deploy(wait=False) can call it explicitly.

Parameters:

verbose -- print verbose build logs
timeout -- optional deadline in seconds for reaching a terminal deploy state. None waits indefinitely. Raises mlrun.errors.MLRunTimeoutError on timeout. The deadline is checked once per status poll, so the wait may overshoot timeout by up to one poll interval.

Returns:

the function's invocation command

with_annotations(annotations: dict)[source]#: set a key/value annotations for function

update/add nuclio HTTP trigger settings

Note: gateway timeout is the maximum request time before an error is returned, while the worker timeout if the max time a request will wait for until it will start processing, gateway_timeout must be greater than the worker_timeout.

Parameters:

workers -- Number of worker processes. Defaults to 8 in synchronous mode and 1 in asynchronous mode. Set to 0 to use Nuclio’s default worker count.
port -- TCP port to listen on. by default, nuclio will choose a random port as long as the function service is NodePort. if the function service is ClusterIP, the port is ignored.
host -- Ingress hostname
paths -- list of Ingress sub paths
canary -- k8s ingress canary (% traffic value between 0 and 100)
secret -- k8s secret name for SSL certificate
worker_timeout -- worker wait timeout in sec (how long a message should wait in the worker queue before an error is returned)
gateway_timeout -- nginx ingress timeout in sec (request timeout, when will the gateway return an error)
trigger_name -- alternative nuclio trigger name
annotations -- key/value dict of ingress annotations
extra_attributes -- key/value dict of extra nuclio trigger attributes
batching_spec -- BatchingSpec object that defines batching configuration. By default, batching is disabled.
async_spec -- AsyncSpec object defines async configuration. By default, mode will be sync. If number of max connections won't be set, the default value will be set to 1000 according to nuclio default.

Returns:

function object (self)

with_node_selection(**kwargs)[source]#

Configure Kubernetes node scheduling for this function.

Updates one or more scheduling hints: exact node pinning, label-based selection, affinity/anti-affinity rules, and taint tolerations. Passing None leaves the current value unchanged; pass an empty dict/list (e.g., {}, []) to clear.

Parameters:

node_name -- Exact Kubernetes node name to pin the pod to.
node_selector -- Mapping of label selectors. Use {} to clear.
affinity -- kubernetes.client.V1Affinity constraints.
tolerations -- List of kubernetes.client.V1Toleration. Use [] to clear.

Warns:

PreemptionWarning - Emitted if provided selectors/tolerations/affinity conflict with the function's preemption mode.

Example usage:

Prefer a GPU pool and allow scheduling on spot nodes:

job.with_node_selection(
    node_selector={"nodepool": "gpu"},
    tolerations=[
        k8s_client.V1Toleration(key="spot", operator="Exists")
    ],
)

with_preemption_mode(**kwargs)[source]#

Preemption mode controls whether pods can be scheduled on preemptible nodes. Tolerations, node selector, and affinity are populated on preemptible nodes corresponding to the function spec.

The supported modes are:

allow - The function can be scheduled on preemptible nodes
constrain - The function can only run on preemptible nodes
prevent - The function cannot be scheduled on preemptible nodes
none - No preemptible configuration will be applied on the function

The default preemption mode is configurable in mlrun.mlconf.function_defaults.preemption_mode, by default it's set to prevent

Parameters:: mode -- allow | constrain | prevent | none defined in PreemptionModes

with_priority_class(**kwargs)[source]#

Enables to control the priority of the pod If not passed - will default to mlrun.mlconf.default_function_priority_class_name

Parameters:: name -- The name of the priority class

with_service_type(service_type: str, add_templated_ingress_host_mode: str | None = None)[source]#

Enables to control the service type of the pod and the addition of templated ingress host

Parameters:

service_type -- service type (ClusterIP, NodePort), defaults to mlrun.mlconf.httpdb.nuclio.service_type
add_templated_ingress_host_mode -- add templated ingress host mode (never, always, onClusterIP), see mlrun.mlconf.httpdb.nuclio.add_templated_ingress_host_mode for the default and more information

Add a sidecar container to the function pod

Parameters:

name -- Sidecar container name.
image -- Sidecar container image.
ports -- Sidecar container ports to expose. Can be a single port or a list of ports.
command -- Sidecar container command instead of the image entrypoint.
args -- Sidecar container command args (requires command to be set).

with_source_archive(source, workdir=None, handler=None, runtime='')[source]#

Load nuclio function from remote source

Note: remote source may require credentials, those can be stored in the project secrets or passed in the function.deploy() using the builder_env dict, see the required credentials per source:

v3io - "V3IO_ACCESS_KEY".
git - "GIT_USERNAME", "GIT_PASSWORD".
AWS S3 - "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY" or "AWS_SESSION_TOKEN".

Parameters:

source -- a full path to the nuclio function source (code entry) to load the function from
handler -- a path to the function's handler, including path inside archive/git repo
workdir -- working dir relative to the archive root (e.g. 'subdir')
runtime -- (optional) the runtime of the function (defaults to mlrun.mlconf.default_nuclio_runtime)

Examples:

git:

fn.with_source_archive(
    "git://github.com/org/repo#my-branch",
    handler="main:handler",
    workdir="path/inside/repo",
)

s3:

fn.spec.nuclio_runtime = "golang"
fn.with_source_archive(
    "s3://my-bucket/path/in/bucket/my-functions-archive",
    handler="my_func:Handler",
    workdir="path/inside/functions/archive",
    runtime="golang",
)

with_v3io(local='', remote='')[source]#

Add v3io volume to the function

Parameters:

local -- local path (mount path inside the function container)
remote -- v3io path

class mlrun.runtimes.RemoteSparkRuntime(spec=None, metadata=None)[source]#

Bases: KubejobRuntime

default_image = '.remote-spark-default-image'#

deploy(watch=True, with_mlrun=None, skip_deployed=False, is_kfp=False, mlrun_version_specifier=None, builder_env: dict | None = None, show_on_failure: bool = False, force_build: bool = False)[source]#

deploy function, build container with dependencies

Parameters:

watch -- wait for the deploy to complete (and print build logs)
with_mlrun -- add the current mlrun package to the container build
skip_deployed -- skip the build if we already have an image for the function
is_kfp -- deploy as part of a kfp pipeline
mlrun_version_specifier -- which mlrun package version to include (if not current)
builder_env -- Kaniko builder pod env vars dict (for config/credentials) e.g. builder_env={"GIT_TOKEN": token}
show_on_failure -- show logs only in case of build failure
force_build -- force building the image, even when no changes were made

:return True if the function is ready (deployed)

classmethod deploy_default_image()[source]#

is_deployed()[source]#: check if the function is deployed (has a valid container)

kind = 'remote-spark'#

property spec: RemoteSparkSpec#

with_security_context(security_context: V1SecurityContext)[source]#: With security context is not supported for spark runtime. Driver / Executor processes run with uid / gid 1000 as long as security context is not defined. If in the future we want to support setting security context it will work only from spark version 3.2 onwards.

with_spark_service(spark_service, provider='iguazio', with_v3io_mount=True)[source]#: Attach spark service to function

class mlrun.runtimes.RuntimeKinds[source]#

Bases: object

static abortable_runtimes()[source]#

static all()[source]#

application = 'application'#

dask = 'dask'#

databricks = 'databricks'#

handler = 'handler'#

static handlerless_runtimes()[source]#

static is_local_runtime(kind)[source]#

static is_log_collectable_runtime(kind: str | None)[source]#: whether log collector can collect logs for that runtime :param kind: kind name :return: whether log collector can collect logs for that runtime

job = 'job'#

local = 'local'#

static local_runtimes()[source]#

mpijob = 'mpijob'#

nuclio = 'nuclio'#

static nuclio_runtimes()[source]#

static pure_nuclio_deployed_runtimes()[source]#

remote = 'remote'#

remotespark = 'remote-spark'#

static requires_absolute_artifacts_path(kind)[source]#: Returns True if the runtime kind requires absolute artifacts' path (i.e. is local), False otherwise.

static requires_image_name_for_execution(kind)[source]#

static requires_k8s_name_validation(kind: str) → bool[source]#

Returns True if the runtime kind creates Kubernetes resources that use the function name.

Function names for k8s-deployed runtimes must conform to DNS-1123 label requirements: - Lowercase alphanumeric characters or '-' - Start and end with an alphanumeric character - Maximum 63 characters

Local runtimes (local, handler) run on the local machine and don't create k8s resources, so they don't require k8s naming validation.

Parameters:: kind -- Runtime kind string (job, spark, serving, local, etc.)
Returns:: True if function name needs k8s DNS-1123 validation, False otherwise

static resolve_nuclio_runtime(kind: str, sub_kind: str)[source]#

static resolve_nuclio_sub_kind(kind: str)[source]#

static retriable_runtimes()[source]#

static runtime_with_handlers()[source]#

serving = 'serving'#

spark = 'spark'#

static supports_from_notebook(kind)[source]#

class mlrun.runtimes.ServingRuntime(spec=None, metadata=None)[source]#

Bases: RemoteRuntime

MLRun Serving Runtime

add_child_function(name, url=None, image=None, requirements=None, kind=None)[source]#

in a multi-function pipeline add child function

example:

fn.add_child_function("enrich", "./enrich.ipynb", "mlrun/mlrun")

Parameters:

name -- child function name
url -- function/code url, support .py, .ipynb, .yaml extensions
image -- base docker image for the function
requirements -- py package requirements file path OR list of packages
kind -- mlrun function/runtime kind

Returns:

function object

Add ml model and/or route to the function.

Example, create a function (from the notebook), add a model class, and deploy:

fn = code_to_function(kind="serving")
fn.add_model("boost", model_path, model_class="MyClass", my_arg=5)
fn.deploy()

Only works with router topology. For nested topologies (model under router under flow) need to add router to flow and use router.add_route()

Parameters:

key -- model api key (or name:version), will determine the relative url/path
model_path -- path to mlrun model artifact or model directory file/object path
class_name -- V2 Model python class name or a model class instance (can also module.submodule.class and it will be imported automatically)
model_url -- url of a remote model serving endpoint (cannot be used with model_path)
handler -- for advanced users!, override default class handler name (do_event)
router_step -- router step name (to determine which router we add the model to in graphs with multiple router steps)
child_function -- child function name, when the model runs in a child function
creation_strategy --
Strategy for creating or updating the model endpoint:
- overwrite: If model endpoints with the same name exist, delete the latest one. Create a new model endpoint entry and set it as latest.
- inplace (default): If model endpoints with the same name exist, update the latest entry. Otherwise, create a new entry.
- archive: If model endpoints with the same name exist, preserve them. Create a new model endpoint with the same name and set it to latest.
outputs -- list of the model outputs (e.g. labels), if provided will override the outputs that were configured in the model artifact. Note that those outputs need to be equal to the model serving function outputs (length, and order).
class_args -- extra kwargs to pass to the model serving class __init__ (can be read in the model using .get_param(key) method)

add_trigger(name: str, spec: NuclioTrigger | dict)[source]#

Add a nuclio trigger object/dict.

Overrides parent to validate streaming compatibility.

Parameters:

name -- trigger name
spec -- trigger object or dict

deploy(**kwargs)#

Deploy the nuclio function to the cluster

Parameters:

project -- project name
tag -- function tag
verbose -- set True for verbose logging
builder_env -- env vars dict for source archive config/credentials e.g. builder_env={"GIT_TOKEN": token}
force_build -- set True for force building the image
track_models -- override state of self.spec.track_models. If not provided, uses the spec value (False by default, True after setup_model_monitoring() is called). When True, model endpoints are created at deployment time.
wait -- when True (default), wait for readiness and return the invocation command (str). When False, submit and return self so the caller can later call wait_for_deployment() or poll db.get_nuclio_deploy_status.
timeout -- optional deadline in seconds for the readiness wait when wait=True; forwarded to wait_for_deployment. None waits indefinitely. Ignored when wait=False.

Returns:

the invocation command (str) when wait=True; the function object (self) when wait=False.

kind = 'serving'#

plot(filename=None, format=None, source=None, **kw)[source]#

plot/save graph using graphviz

example:

serving_fn = mlrun.new_function(
    "serving", image="mlrun/mlrun", kind="serving"
)
serving_fn.add_model(
    "my-classifier",
    model_path=model_path,
    class_name="mlrun.frameworks.sklearn.SKLearnModelServer",
)
serving_fn.plot(rankdir="LR")

Parameters:

filename -- target filepath for the image (None for the notebook)
format -- The output format used for rendering ('pdf', 'png', etc.)
source -- source step to add to the graph
kw -- kwargs passed to graphviz, e.g. rankdir="LR" (see: https://graphviz.org/doc/info/attrs.html)

Returns:

graphviz graph object

remove_states(keys: list)[source]#: remove one, multiple, or all states/models from the spec (blank list for all)

property serving_spec#

set_api_handler_config(config: APIHandlerConfig | dict) → None[source]#

Set the API handler configuration for the serving function.

Parameters:: config -- APIHandlerConfig object or dictionary containing the configuration for handling different API endpoints and their actions.

Example:

# Using APIHandlerConfig object
from mlrun.serving.endpoint_mapping import APIHandlerConfig
from mlrun.common.schemas.serving import APIHandlerAction
from http import HTTPMethod

api_config = APIHandlerConfig()
api_config.add_endpoint_handler(
    "/v1/models", HTTPMethod.GET, APIHandlerAction.ALLOW
)
serving_fn.set_api_handler_config(api_config)

# Using dictionary
serving_fn.set_api_handler_config(
    {"endpoints": {("GET", "/v1/models"): {"action": "allow"}}}
)

set_openai_frontend(endpoints: list[OpenAIEndpoint] | None = None, prefix: str | None = None) → None[source]#

Wire up OpenAI-compatible API handler endpoints in one call.

Registers pre-built input and output body mappings for each selected OpenAI operation group. If endpoints is None, all supported groups are registered.

Validation scope — top-level only. Mandatory field validation applies only to the top-level keys of the request/response body. Full structural validation is delegated to the OpenAI Python SDK, which deserializes responses into typed objects and raises a ValidationError on any structural mismatch.

Parameters:

endpoints -- Optional list of OpenAIEndpoint values selecting which operation groups to enable. Defaults to all groups.
prefix -- Optional path prefix to prepend to every registered endpoint path. Use this when clients send requests with a path prefix (e.g. prefix="/v1" registers /v1/chat/completions instead of /chat/completions). Defaults to None (no prefix).

Example:

from mlrun.serving.openai_mappings import OpenAIEndpoint

fn.set_openai_frontend()  # all groups, no prefix
fn.set_openai_frontend(prefix="/v1")  # all groups, /v1/ prefix
fn.set_openai_frontend([OpenAIEndpoint.RESPONSES], prefix="/v1")

set_streaming(enabled: bool = True) → None[source]#

Enable or disable streaming mode for the serving function.

When streaming is enabled, the function handler yields results as they arrive from streaming steps in the graph, allowing for real-time streaming responses (e.g., for LLM token streaming).

Streaming is only supported with HTTP triggers. When streaming is enabled, non-HTTP triggers cannot be added to the function.

Parameters:: enabled -- Enable or disable streaming mode. Default is True.

Example:

# Create a serving function with streaming enabled
serving_fn = mlrun.code_to_function(kind="serving")
serving_fn.set_topology("flow", engine="async")
serving_fn.set_streaming(enabled=True)

set_topology(topology=None, class_name=None, engine=None, exist_ok=False, allow_cyclic: bool = False, max_iterations: int | None = None, **class_args) → RootFlowStep | RouterStep[source]#

set the serving graph topology (router/flow) and root class or params

examples:

# simple model router topology
graph = fn.set_topology("router")
fn.add_model(name, class_name="ClassifierModel", model_path=model_uri)

# async flow topology
graph = fn.set_topology("flow", engine="async")
graph.to("MyClass").to(name="to_json", handler="json.dumps").respond()

topology options are:

router - root router + multiple child route states/models
         route is usually determined by the path (route key/name)
         can specify special router class and router arguments

flow   - workflow (DAG) with a chain of states
         flow supports both "sync" and "async" engines, with "async" being the default.
         Branches are not allowed in sync mode.
         when using async mode calling state.respond() will mark the state as the
         one which generates the (REST) call response

Parameters:

topology --
- graph topology, router or flow
class_name --
- optional for router, router class name/path or router object
engine --
- optional for flow, sync or async engine
exist_ok --
- allow overriding existing topology
allow_cyclic --
- allow cyclic graphs (only for async flow)
max_iterations --
- optional, max iterations for cyclic graphs (only for async flow), default 100
class_args --
- optional, router/flow class init args

Returns:

graph object (fn.spec.graph)

set_tracking(stream_path: str | None = None, sampling_percentage: float = 100, stream_args: dict | None = None, enable_tracking: bool = True) → None[source]#

Apply on your serving function to monitor a deployed model, including real-time dashboards to detect drift and analyze performance.

Parameters:

stream_path -- Path/url of the tracking stream e.g. v3io:///users/mike/mystream you can use the "dummy://" path for test/simulation.
sampling_percentage -- Down sampling events that will be pushed to the monitoring stream based on a specified percentage. e.g. 50 for 50%. By default, all events are pushed.
stream_args -- Stream initialization parameters, e.g. shards, retention_in_hours, ..
enable_tracking -- Enabled/Disable model-monitoring tracking. Default True (tracking enabled).

Example:

# initialize a new serving function
serving_fn = mlrun.import_function(
    "hub://v2-model-server", new_name="serving"
)
# apply model monitoring
serving_fn.set_tracking()

setup_model_monitoring(general_model_endpoint_instructions=None, extra_model_endpoint_instructions=None)[source]#

Setup model monitoring on the RemoteRuntime create by default model endpoint represent the Runtime, Optional configure custom model endpoint or extra model endpoints to be created at deployment time.

Each instruction describes a USER_EP model endpoint that will be registered when this function is deployed. Calling this method sets the track_models flag on the spec so the deployment stage knows to create the endpoints.

Instructions must not set function_name or function_tag; both are derived from the runtime's metadata.name / metadata.tag at deployment time. Setting either raises MLRunInvalidArgumentError.

Parameters:

general_model_endpoint_instructions -- Optional ModelEndpointInstruction parameter for main model endpoint instructions, if not provided a default one will be created with the USER_EP endpoint type and the default name f'{function_name}_model_endpoint'.
extra_model_endpoint_instructions -- List of ModelEndpointInstruction objects or equivalent dicts, one per model endpoint to register.

Returns:

The runtime object, for method chaining.

property spec: ServingSpec#

to_job(func_name: str | None = None) → KubejobRuntime[source]#

Convert this ServingRuntime to a KubejobRuntime, so that the graph can be run as a standalone job.

Parameters:: func_name -- Optional custom name for the job function. If not provided, automatically appends '-batch' suffix to the serving function name to prevent database collision.
Returns:: KubejobRuntime configured to execute the serving graph as a batch job.

Note

The job will have a different name than the serving function to prevent database collision. The original serving function remains unchanged and can still be invoked after running the job.

to_mock_server(namespace=None, current_function='*', track_models=False, workdir=None, stream_profile: DatastoreProfile | None = None, **kwargs) → GraphServer[source]#

create mock server object for local testing/emulation

Parameters:

namespace -- one or list of namespaces/modules to search the steps classes/functions in
current_function -- specify if you want to simulate a child function, * for all functions
track_models -- allow model tracking (disabled by default in the mock server)
workdir -- working directory to locate the source code (if not the current one)
stream_profile -- stream profile to use for the mock server output stream.

with_secrets(kind, source)[source]#

register a secrets source (file, env or dict)

read secrets from a source provider to be used in workflows, example:

task.with_secrets('file', 'file.txt')
task.with_secrets('inline', {'key': 'val'})
task.with_secrets('env', 'ENV1,ENV2')
task.with_secrets('vault', ['secret1', 'secret2'...])

# If using an empty secrets list [] then all accessible secrets will be available.
task.with_secrets('vault', [])

# To use with Azure key vault, a k8s secret must be created with the following keys:
# kubectl -n <namespace> create secret generic azure-key-vault-secret \
#     --from-literal=tenant_id=<service principal tenant ID> \
#     --from-literal=client_id=<service principal client ID> \
#     --from-literal=secret=<service principal secret key>

task.with_secrets('azure_vault', {
    'name': 'my-vault-name',
    'k8s_secret': 'azure-key-vault-secret',
    # An empty secrets list may be passed ('secrets': []) to access all vault secrets.
    'secrets': ['secret1', 'secret2'...]
})

Parameters:

kind -- secret type (file, inline, env)
source -- secret data or link (see example)

Returns:

The Runtime (function) object

class mlrun.runtimes.Spark3Runtime(spec=None, metadata=None)[source]#

Bases: KubejobRuntime

apiVersion = 'sparkoperator.k8s.io/v1beta2'#

code_path = '/etc/config/mlrun'#

code_script = 'spark-function-code.py'#

default_mlrun_image = '.spark-job-default-image'#

deploy(watch=True, with_mlrun=True, skip_deployed=False, is_kfp=False, mlrun_version_specifier=None, builder_env: dict | None = None, show_on_failure: bool = False, force_build: bool = False)[source]#

deploy function, build container with dependencies

Parameters:

watch -- wait for the deploy to complete (and print build logs)
with_mlrun -- add the current mlrun package to the container build
skip_deployed -- skip the build if we already have an image for the function
is_kfp -- deploy as part of a kfp pipeline
mlrun_version_specifier -- which mlrun package version to include (if not current)
builder_env -- Kaniko builder pod env vars dict (for config/credentials) e.g. builder_env={"GIT_TOKEN": token}
show_on_failure -- show logs only in case of build failure
force_build -- set True for force building the image, even when no changes were made

Returns:

True if the function is ready (deployed)

classmethod deploy_default_image(with_gpu=False)[source]#

disable_monitoring()[source]#

gpu_suffix = '-cuda'#

gpus(gpus, gpu_type='nvidia.com/gpu')[source]#

group = 'sparkoperator.k8s.io'#

is_deployed()[source]#: check if the function is deployed (has a valid container)

kind = 'spark'#

plural = 'sparkapplications'#

property spec: Spark3JobSpec#

version = 'v1beta2'#

with_cores(executor_cores: int | None = None, driver_cores: int | None = None)[source]#

Allows to configure spark.executor.cores and spark.driver.cores parameters. The values must be integers greater than or equal to 1. If a parameter is not specified, it defaults to 1.

Spark operator has multiple options to control the number of cores available to the executor and driver. The .coreLimit and .coreRequest parameters can be set for both executor and driver, but they only control the k8s properties of the pods created to run the driver/executor. Spark itself uses the spec.[executor|driver].cores parameter to set the parallelism of tasks and cores assigned to each task within the pod. This function sets the .cores parameters for the job executed.

See kubeflow/spark-operator#581 for a discussion about those parameters and their meaning in Spark operator.

Parameters:

executor_cores -- Number of cores to use for executor (spark.executor.cores)
driver_cores -- Number of cores to use for driver (spark.driver.cores)

with_driver_host_path_volume(host_path: str, mount_path: str, type: str = '', volume_name: str = 'host-path-volume')[source]#

Add a host path volume and mounts it to the driver pod More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath

Parameters:

host_path -- Path of the directory on the host. If the path is a symlink, it follows the link to the real path
mount_path -- Path within the container at which the volume should be mounted. Must not contain ':'
type -- Type for HostPath Volume Defaults to ""
volume_name -- Volume's name. Must be a DNS_LABEL and unique within the pod

with_driver_limits(cpu: str | None = None, gpus: int | None = None, gpu_type: str = 'nvidia.com/gpu', patch: bool = False)[source]#: set driver pod cpu limits by default it overrides the whole limits section, if you wish to patch specific resources use patch=True.

with_driver_node_selection(node_name: str | None = None, node_selector: dict[str, str] | None = None, affinity: V1Affinity | None = None, tolerations: list[V1Toleration] | None = None)[source]#

Enables control of which k8s node the spark executor will run on.

Parameters:

node_name -- The name of the k8s node
node_selector -- Label selector, only nodes with matching labels are eligible to be picked
affinity -- Expands the types of constraints you can express - see https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity for details
tolerations -- Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints - see https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration for details

with_driver_preemption_mode(mode: PreemptionModes | str)[source]#

Preemption mode controls whether the spark driver can be scheduled on preemptible nodes. Tolerations, node selector, and affinity are populated on preemptible nodes corresponding to the function spec.

The supported modes are:

allow - The function can be scheduled on preemptible nodes
constrain - The function can only run on preemptible nodes
prevent - The function cannot be scheduled on preemptible nodes
none - No preemptible configuration will be applied on the function

The default preemption mode is configurable in mlrun.mlconf.function_defaults.preemption_mode. By default it's set to prevent

Parameters:: mode -- allow | constrain | prevent | none defined in PreemptionModes

with_driver_requests(mem: str | None = None, cpu: str | None = None, patch: bool = False)[source]#: set driver pod required cpu/memory/gpu resources by default it overrides the whole requests section, if you wish to patch specific resources use patch=True.

with_dynamic_allocation(min_executors=None, max_executors=None, initial_executors=None)[source]#

Allows to configure spark's dynamic allocation

Parameters:

min_executors -- Min. number of executors
max_executors -- Max. number of executors
initial_executors -- Initial number of executors

with_executor_host_path_volume(host_path: str, mount_path: str, type: str = '', volume_name: str = 'host-path-volume')[source]#

Add an host path volume and mount it to the executor pod/s More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath

Parameters:

host_path -- Path of the directory on the host. If the path is a symlink, it follows the link to the real path
mount_path -- Path within the container at which the volume should be mounted. Must not contain ':'
type -- Type for HostPath Volume Defaults to ""
volume_name -- Volume's name. Must be a DNS_LABEL and unique within the pod

with_executor_limits(cpu: str | None = None, gpus: int | None = None, gpu_type: str = 'nvidia.com/gpu', patch: bool = False)[source]#: set executor pod limits by default it overrides the whole limits section, if you wish to patch specific resources use patch=True.

with_executor_node_selection(node_name: str | None = None, node_selector: dict[str, str] | None = None, affinity: V1Affinity | None = None, tolerations: list[V1Toleration] | None = None)[source]#

Enables control of which k8s node the spark executor will run on.

Parameters:

node_name -- The name of the k8s node
node_selector -- Label selector, only nodes with matching labels are eligible to be picked
affinity -- Expands the types of constraints you can express - see https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity for details
tolerations -- Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints - see https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration for details

with_executor_preemption_mode(mode: PreemptionModes | str)[source]#

Preemption mode controls whether the spark executor can be scheduled on preemptible nodes. Tolerations, node selector, and affinity are populated on preemptible nodes corresponding to the function spec.

The supported modes are:

allow - The function can be scheduled on preemptible nodes
constrain - The function can only run on preemptible nodes
prevent - The function cannot be scheduled on preemptible nodes
none - No preemptible configuration will be applied on the function

The default preemption mode is configurable in mlrun.mlconf.function_defaults.preemption_mode, by default it's set to prevent

Parameters:: mode -- allow | constrain | prevent | none defined in PreemptionModes

with_executor_requests(mem: str | None = None, cpu: str | None = None, patch: bool = False)[source]#: set executor pod required cpu/memory/gpu resources by default it overrides the whole requests section, if you wish to patch specific resources use patch=True.

with_igz_spark(mount_v3io_to_executor=True)[source]#

Configures the pods (driver and executors) to have V3IO access (via file system and via Hadoop).

Parameters:: mount_v3io_to_executor -- When False, limits the file system mount to driver pod only. Default is True.

with_limits(mem=None, cpu=None, gpus=None, gpu_type='nvidia.com/gpu', patch: bool = False)[source]#

Set pod cpu/memory/gpu limits (max values)

Parameters:

mem -- set limit for memory e.g. '500M', '2G', etc.
cpu -- set limit for cpu e.g. '0.5', '2', etc.
gpus -- set limit for gpu
gpu_type -- set gpu type e.g. "nvidia.com/gpu"
patch -- by default it overrides the whole limits section, if you wish to patch specific resources use patch=True

with_node_selection(node_name: str | None = None, node_selector: dict[str, str] | None = None, affinity: V1Affinity | None = None, tolerations: list[V1Toleration] | None = None)[source]#

Configure Kubernetes node scheduling for this function.

Updates one or more scheduling hints: exact node pinning, label-based selection, affinity/anti-affinity rules, and taint tolerations. Passing None leaves the current value unchanged; pass an empty dict/list (e.g., {}, []) to clear.

Parameters:

node_name -- Exact Kubernetes node name to pin the pod to.
node_selector -- Mapping of label selectors. Use {} to clear.
affinity -- kubernetes.client.V1Affinity constraints.
tolerations -- List of kubernetes.client.V1Toleration. Use [] to clear.

Warns:

PreemptionWarning - Emitted if provided selectors/tolerations/affinity conflict with the function's preemption mode.

Example usage:

Prefer a GPU pool and allow scheduling on spot nodes:

job.with_node_selection(
    node_selector={"nodepool": "gpu"},
    tolerations=[
        k8s_client.V1Toleration(key="spot", operator="Exists")
    ],
)

with_preemption_mode(mode: PreemptionModes | str)[source]#: Use with_driver_preemption_mode / with_executor_preemption_mode to setup preemption_mode for spark operator

with_requests(mem=None, cpu=None, patch: bool = False)[source]#

Set requested (desired) pod cpu/memory resources

Parameters:

mem -- set request for memory e.g. '200M', '1G', etc.
cpu -- set request for cpu e.g. '0.1', '1', etc.
patch -- by default it overrides the whole requests section, if you wish to patch specific resources use patch=True

with_restart_policy(restart_type='OnFailure', retries=0, retry_interval=10, submission_retries=3, submission_retry_interval=20)[source]#: set restart policy restart_type=OnFailure/Never/Always

with_security_context(security_context: V1SecurityContext)[source]#: With security context is not supported for spark runtime. Driver / Executor processes run with uid / gid 1000 as long as security context is not defined. If in the future we want to support setting security context it will work only from spark version 3.2 onwards.

with_source_archive(source, workdir=None, handler=None, pull_at_runtime=True, target_dir=None)[source]#

load the code from git/tar/zip archive at runtime or build

Parameters:

source -- valid path to git, zip, or tar file, e.g. git://github.com/mlrun/something.git http://some/url/file.zip
handler -- default function handler
workdir -- working dir relative to the archive root (e.g. './subdir') or absolute to the image root
pull_at_runtime -- not supported for spark runtime, must be False
target_dir -- target dir on runtime pod for repo clone / archive extraction

mlrun.runtimes.nuclio#

class mlrun.runtimes.nuclio.api_gateway.APIGateway(**kwargs)[source]#

Bases: ModelObj

property authentication#

property description#

classmethod from_scheme(api_gateway: APIGateway)[source]#

property host#

Invoke the API gateway.

Parameters:

method -- (str, optional) The HTTP method for the invocation.
headers -- (dict, optional) The HTTP headers for the invocation.
credentials -- (Optional[tuple[str, str]], optional) The (username,password) for the invocation if required can also be set by the environment variable (_, V3IO_ACCESS_KEY) for access key authentication.
path -- (str, optional) The sub-path for the invocation.
body -- (Optional[Union[str, bytes, dict]]) The body of the invocation.
kwargs -- (dict) Additional keyword arguments.

Returns:

The response from the API gateway invocation.

property invoke_url#

Get the invoke URL.

Returns:: (str) The invoke URL.

is_ready()[source]#

property metadata: APIGatewayMetadata#

property name#

property path#

property project#

property spec: APIGatewaySpec#

property status: APIGatewayStatus#

sync()[source]#: Synchronize the API gateway from the server.

to_scheme() → APIGateway[source]#

wait_for_readiness(max_wait_time=90)[source]#

Wait for the API gateway to become ready within the maximum wait time.

Parameters:: max_wait_time -- int - Maximum time to wait in seconds (default is 90 seconds).
Returns:: True if the entity becomes ready within the maximum wait time, False otherwise
Return type:: bool

with_access_key_auth(**kwargs)#

with_annotations(annotations: dict)[source]#: set a key/value annotations in the metadata of the api gateway

with_basic_auth(username: str, password: str)[source]#

Set basic authentication for the API gateway.

Parameters:

username -- (str) The username for basic authentication.
password -- (str) The password for basic authentication.

Set canary function for the API gateway

Parameters:

functions -- The list of functions associated with the API gateway Can be a list of function names (["my-func1", "my-func2"]) or a list of nuclio functions of types RemoteRuntime OR ServingRuntime OR ApplicationRuntime
canary -- The canary percents for the API gateway of type list[int]; for instance: [20,80]

with_force_ssl_redirect()[source]#: Set SSL redirect annotation for the API gateway.

with_gateway_timeout(gateway_timeout: int)[source]#: Set gateway proxy connect/read/send timeout annotations :param gateway_timeout: The timeout in seconds

with_iguazio_auth(**kwargs)#

with_ports(ports: list[int])[source]#

Set ports for the API gateway

Parameters:: ports -- The ports of the API gateway, as a list of integers that correspond to the functions in the functions list. for instance: [8050] or [8050, 8081]

class mlrun.runtimes.nuclio.api_gateway.APIGatewayAuthenticator(*args, **kwargs)[source]#: Bases: Authenticator, ModelObj

class mlrun.runtimes.nuclio.api_gateway.APIGatewayMetadata(name: str, namespace: str | None = None, labels: dict | None = None, annotations: dict | None = None, creation_timestamp: str | None = None)[source]#

Bases: ModelObj

Parameters:

name -- The name of the API gateway
namespace -- The namespace of the API gateway
labels -- The labels of the API gateway
annotations -- The annotations of the API gateway
creation_timestamp -- The creation timestamp of the API gateway

class mlrun.runtimes.nuclio.api_gateway.APIGatewaySpec(functions: list[str | ~mlrun.runtimes.nuclio.function.RemoteRuntime | ~mlrun.runtimes.nuclio.serving.ServingRuntime | ~mlrun.runtimes.nuclio.application.application.ApplicationRuntime] | ~mlrun.runtimes.nuclio.function.RemoteRuntime | ~mlrun.runtimes.nuclio.serving.ServingRuntime | ~mlrun.runtimes.nuclio.application.application.ApplicationRuntime, project: str | None = None, description: str = '', host: str | None = None, path: str = '/', authentication: ~mlrun.runtimes.nuclio.api_gateway.APIGatewayAuthenticator | None = <mlrun.runtimes.nuclio.api_gateway.NoneAuth object>, canary: list[int] | None = None, ports: list[int] | None = None)[source]#

Bases: ModelObj

Parameters:

functions -- The list of functions associated with the API gateway Can be a list of function names (["my-func1", "my-func2"]) or a list or a single entity of RemoteRuntime OR ServingRuntime OR ApplicationRuntime
project -- The project name
description -- Optional description of the API gateway
path -- Optional path of the API gateway, default value is "/"
authentication -- The authentication for the API gateway of type BasicAuth
host -- The host of the API gateway (optional). If not set, it will be automatically generated
canary -- The canary percents for the API gateway of type list[int]; for instance: [20,80] (optional)
ports -- The ports of the API gateway, as a list of integers that correspond to the functions in the functions list. for instance: [8050] or [8050, 8081] (optional)

enrich()[source]#

validate(project: str, functions: list[str | RemoteRuntime | ServingRuntime | ApplicationRuntime] | RemoteRuntime | ServingRuntime | ApplicationRuntime, canary: list[int] | None = None, ports: list[int] | None = None)[source]#

class mlrun.runtimes.nuclio.api_gateway.APIGatewayStatus(state: APIGatewayState | None = None)[source]#: Bases: ModelObj

class mlrun.runtimes.nuclio.api_gateway.AccessKeyAuth(*args, **kwargs)[source]#

Bases: APIGatewayAuthenticator

An API gateway authenticator with access key authentication.

property authentication_mode: str#

class mlrun.runtimes.nuclio.api_gateway.Authenticator(*args, **kwargs)[source]#

Bases: Protocol

property authentication_mode: str#

classmethod from_scheme(api_gateway_spec: APIGatewaySpec)[source]#

to_scheme() → dict[str, APIGatewayBasicAuth | None] | None[source]#

class mlrun.runtimes.nuclio.api_gateway.BasicAuth(username=None, password=None)[source]#

Bases: APIGatewayAuthenticator

An API gateway authenticator with basic authentication.

Parameters:

username -- (str) The username for basic authentication.
password -- (str) The password for basic authentication.

property authentication_mode: str#

to_scheme() → dict[str, APIGatewayBasicAuth | None] | None[source]#

class mlrun.runtimes.nuclio.api_gateway.IguazioAuth(*args, **kwargs)[source]#

Bases: APIGatewayAuthenticator

An API gateway authenticator with Iguazio authentication.

property authentication_mode: str#

class mlrun.runtimes.nuclio.api_gateway.NoneAuth[source]#

Bases: APIGatewayAuthenticator

An API gateway authenticator with no authentication.

mlrun.runtimes.serving#

class mlrun.runtimes.nuclio.serving.APIHandlerConfig(enabled: bool = True, endpoints: dict[str, dict | EndpointConfig] | None = None, include_url_info: bool = False)[source]#

Bases: ModelObj

Configuration for API handler in serving graph.

Parameters:

enabled -- Whether the API handler step is active.
endpoints -- Map of endpoint key ("METHOD:path") to EndpointConfig (or a raw dict that will be deserialized).
include_url_info --
When True, inject the request's URL info into the handler as keyword arguments — mlrun_request_path (the normalized, URL-decoded matched path) and mlrun_request_method (the HTTP method string, e.g. "GET"). Both are passed together so a dispatcher handler can distinguish endpoints that share a path template but differ by method (e.g. GET vs DELETE on /responses/{id}).

Decoding matches Flask/FastAPI semantics: an encoded slash (%2F) in a segment becomes indistinguishable from a path separator.

The handler signature MUST accept these names (either as explicit parameters or via **kwargs); otherwise Python raises TypeError: unexpected keyword argument.

add_endpoint_config(endpoint: EndpointConfig) → None[source]#

Add a pre-built EndpointConfig directly.

Parameters:: endpoint -- The endpoint configuration to add.

add_endpoint_handler(path: str, http_method: ~http.HTTPMethod | str = <HTTPMethod.POST>, action: ~mlrun.common.schemas.serving.APIHandlerAction = APIHandlerAction.ALLOW, description: str | None = None, input_body_mappings: ~mlrun.serving.endpoint_mapping.BodyMappings | None = None, output_body_mappings: ~mlrun.serving.endpoint_mapping.BodyMappings | None = None) → None[source]#

Add an endpoint handler configuration.

Parameters:

path -- URL path for the endpoint (e.g., /v1/models or /api/v1/*)
http_method -- HTTP method for the endpoint (HTTPMethod enum or string like "GET", "POST")
action -- Action to take for this endpoint (APIHandlerAction)
description -- Optional description of the endpoint
input_body_mappings -- Optional input BodyMappings for this endpoint (REST → graph). If None, the request body is passed through as-is.
output_body_mappings -- Optional output BodyMappings for this endpoint (graph → REST). If None, the response is returned as-is.

Raises:

mlrun.errors.MLRunValueError -- If the path contains an invalid wildcard * pattern

property endpoints: dict[str, EndpointConfig]#: Get the endpoints as a dict keyed by endpoint key ("METHOD:path").

get_endpoint_config(method: HTTPMethod | str, path: str) → EndpointConfig | None[source]#: Get endpoint configuration for a specific method and path.

remove_endpoint_handler(path: str, http_method: ~http.HTTPMethod | str = <HTTPMethod.POST>) → None[source]#

Remove an endpoint handler configuration.

Parameters:

path -- URL path for the endpoint to remove
http_method -- HTTP method for the endpoint to remove (HTTPMethod enum or string)

mlrun.runtimes

Contents

mlrun.runtimes#

mlrun.runtimes.nuclio#

mlrun.runtimes.serving#