mlrun.db

class mlrun.db.httpdb.HTTPRunDB(base_url, user='', password='', token='')[source]

Bases: mlrun.db.base.RunDBInterface

Interface for accessing and manipulating the mlrun persistent store, which maintains the full state and catalog of objects that MLRun uses. The HTTPRunDB class serves as a client-side proxy to the MLRun API service, which maintains the actual data store, and accesses the server through REST APIs.

The class provides functions for accessing and modifying the various objects that are used by MLRun in its operation. The functions provided follow some standard guidelines, which are:

  • Every object in MLRun exists in the context of a project (except projects themselves). When referencing an object through any API, a project name must be provided. The default for most APIs is an empty project name, which is replaced by the name of the default project (usually default). Therefore, calling an API to list functions, for example, without providing a project name returns functions from the default project only, not from all projects.

  • Many objects can be assigned labels, and listed/queried by label. The label parameter for query APIs allows for listing objects that:

    • Have a specific label, by asking for label="<label_name>". In this case the actual value of the label doesn’t matter and every object with that label will be returned

    • Have a label with a specific value. This is done by specifying label="<label_name>=<label_value>". In this case only objects whose label matches the value will be returned

  • Most objects have a create method as well as a store method. Create can only be called when such an object does not exist yet, while store allows for either creating a new object or overwriting an existing object.

  • Some objects have a versioned option, in which case overwriting the same object with a different version of it does not delete the previous version, but rather creates a new version of the object and keeps both versions. Versioned objects usually have a uid property which is based on their content and makes it possible to reference a specific version of an object (besides tagging objects, which also allows for easy referencing).

  • Many objects have both a store function and a patch function. These are used in the same way as the corresponding REST verbs - a store is passed a full object and will basically perform a PUT operation, replacing the full object (if it exists) while patch receives just a dictionary containing the differences to be applied to the object, and will merge those changes to the existing object. The patch operation also has a strategy assigned to it which determines how the merge logic should behave. The strategy can be either replace or additive. For further details on those strategies, refer to https://pypi.org/project/mergedeep/
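The difference between store and patch, and between the replace and additive strategies, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not MLRun's or mergedeep's actual implementation: the `store` and `patch` helpers and the sample object are hypothetical stand-ins, and only dicts and lists are handled.

```python
import copy


def store(existing, new_obj):
    """PUT-like: the stored object is replaced wholesale by the new one."""
    return copy.deepcopy(new_obj)


def patch(existing, changes, strategy="replace"):
    """PATCH-like: merge `changes` into a copy of `existing`.

    Nested dicts are merged key by key. For other values, "replace"
    overwrites the old value, while "additive" extends existing lists.
    """
    if strategy not in ("replace", "additive"):
        raise ValueError(f"unknown strategy: {strategy}")
    result = copy.deepcopy(existing)
    for key, value in changes.items():
        old = result.get(key)
        if isinstance(old, dict) and isinstance(value, dict):
            result[key] = patch(old, value, strategy)
        elif strategy == "additive" and isinstance(old, list) and isinstance(value, list):
            result[key] = old + copy.deepcopy(value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical object and change set, for illustration only
feature_set = {
    "metadata": {"name": "stocks", "labels": {"owner": "admin"}},
    "spec": {"entities": ["ticker"]},
}
changes = {"metadata": {"labels": {"team": "ml"}}, "spec": {"entities": ["exchange"]}}

replaced = patch(feature_set, changes, "replace")
added = patch(feature_set, changes, "additive")
print(replaced["spec"]["entities"])  # ['exchange']
print(added["spec"]["entities"])     # ['ticker', 'exchange']
```

In both strategies the labels dictionary ends up with both keys, because nested dictionaries are merged rather than replaced; the strategies differ only in how the list value is treated.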

abort_run(uid, project='', iter=0)[source]

Abort a run that is currently running. This removes the run’s runtime resources and marks its state as aborted.

api_call(method, path, error=None, params=None, body=None, json=None, headers=None, timeout=45)[source]

Perform a direct REST API call on the mlrun API server.

Caution

For advanced usage - prefer using the various APIs exposed through this class, rather than directly invoking REST calls.

Parameters
  • method – REST method (POST, GET, PUT…)

  • path – Path to endpoint executed, for example "projects"

  • error – Error to return if API invocation fails

  • body – Payload to be passed in the call. If using JSON objects, prefer using the json param

  • json – JSON payload to be passed in the call

  • headers – REST headers, passed as a dictionary: {"<header-name>": "<header-value>"}

  • timeout – API call timeout

Returns

Python HTTP response object
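A call through api_call might look like the sketch below. To keep it runnable without a server, `RecordingDB` is a hypothetical stand-in that only mirrors the api_call signature shown above and records what it was asked to send. The payload and header values are illustrative assumptions, not a documented request format.

```python
class RecordingDB:
    """Hypothetical stand-in mirroring the api_call signature; sends nothing."""

    def __init__(self):
        self.calls = []

    def api_call(self, method, path, error=None, params=None, body=None,
                 json=None, headers=None, timeout=45):
        call = {"method": method, "path": path, "error": error,
                "params": params, "body": body, "json": json,
                "headers": headers, "timeout": timeout}
        self.calls.append(call)
        return call


db = RecordingDB()

# GET on the "projects" path, with a custom error message and timeout
db.api_call("GET", "projects", error="failed to list projects", timeout=10)

# JSON payloads go in the json parameter rather than body;
# headers are passed as a plain dictionary
db.api_call("POST", "projects",
            json={"example-key": "example-value"},
            headers={"x-example-header": "value"})
```

With a connected HTTPRunDB instance in place of the stand-in, the same keyword arguments would apply and the return value would be the HTTP response object.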

connect(secrets=None)[source]

Connect to the MLRun API server. Must be called prior to executing any other method. The code utilizes the URL for the API server from the configuration - mlconf.dbpath.

For example:

from mlrun import get_run_db, mlconf

mlconf.dbpath = mlconf.dbpath or 'http://mlrun-api:8080'
db = get_run_db().connect()
create_feature_set(feature_set: Union[dict, mlrun.api.schemas.feature_store.FeatureSet], project='', versioned=True)dict[source]

Create a new FeatureSet and save in the mlrun DB. The feature-set must not previously exist in the DB.

Parameters
  • feature_set – The new FeatureSet to create.

  • project – Name of project this feature-set belongs to.

  • versioned – Whether to maintain versions for this feature-set. All versions of a versioned object will be kept in the DB and can be retrieved until explicitly deleted.

Returns

The FeatureSet object (as dict).

create_feature_vector(feature_vector: Union[dict, mlrun.api.schemas.feature_store.FeatureVector], project='', versioned=True)dict[source]

Create a new FeatureVector and save in the mlrun DB.

Parameters
  • feature_vector – The new FeatureVector to create.

  • project – Name of project this feature-vector belongs to.

  • versioned – Whether to maintain versions for this feature-vector. All versions of a versioned object will be kept in the DB and can be retrieved until explicitly deleted.

Returns

The FeatureVector object (as dict).

create_or_patch(project: str, endpoint_id: str, model_endpoint: mlrun.api.schemas.model_endpoints.ModelEndpoint, access_key: Optional[str] = None)[source]

Creates or updates a KV record with the given model_endpoint record

Parameters
  • project – The name of the project

  • endpoint_id – The id of the endpoint

  • model_endpoint – An object representing the model endpoint

  • access_key – V3IO access key. When None, it is looked up in the environment

create_project(project: Union[dict, mlrun.projects.project.MlrunProject, mlrun.api.schemas.project.Project])mlrun.projects.project.MlrunProject[source]

Create a new project. A project with the same name must not exist prior to creation.

create_project_secrets(project: str, provider: Union[str, mlrun.api.schemas.secret.SecretProviderName] = <SecretProviderName.vault: 'vault'>, secrets: Optional[dict] = None)[source]

Create project-context secrets using either vault or kubernetes provider. When using with Vault, this will create needed Vault structures for storing secrets in project-context, and store a set of secret values. The method generates Kubernetes service-account and the Vault authentication structures that are required for function Pods to authenticate with Vault and be able to extract secret values passed as part of their context.

Note

This method used with Vault is currently in technical preview, and requires a HashiCorp Vault infrastructure properly set up and connected to the MLRun API server.

When used with Kubernetes, this will make sure that the project-specific k8s secret is created, and will populate it with the secrets provided, replacing their values if they exist.

Parameters
  • project – The project context for which to generate the infra and store secrets.

  • provider – The name of the secrets-provider to work with. Accepts a SecretProviderName enum.

  • secrets

    A set of secret values to store. Example:

    secrets = {'password': 'myPassw0rd', 'aws_key': '111222333'}
    db.create_project_secrets(
        "project1",
        provider=mlrun.api.schemas.SecretProviderName.vault,
        secrets=secrets
    )
    

create_schedule(project: str, schedule: mlrun.api.schemas.schedule.ScheduleInput)[source]

Create a new schedule on the given project. The details on the actual object to schedule as well as the schedule itself are within the schedule object provided. The ScheduleCronTrigger follows the guidelines in https://apscheduler.readthedocs.io/en/v3.6.3/modules/triggers/cron.html. It also supports a from_crontab() function that accepts a crontab-formatted string (see https://en.wikipedia.org/wiki/Cron for more information on the format).

Example:

from mlrun.api import schemas

# Execute the get_data_func function every Tuesday at 15:30
schedule = schemas.ScheduleInput(
    name="run_func_on_tuesdays",
    kind="job",
    scheduled_object=get_data_func,
    cron_trigger=schemas.ScheduleCronTrigger(day_of_week='tue', hour=15, minute=30),
)
db.create_schedule(project_name, schedule)
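
To make the crontab format mentioned above concrete, the following sketch splits a standard five-field crontab string into named fields. It is a plain-Python illustration and not the from_crontab() implementation; the `crontab_to_fields` helper is hypothetical, and the field names are assumed to follow the cron trigger's keyword arguments.

```python
CRON_FIELDS = ("minute", "hour", "day", "month", "day_of_week")


def crontab_to_fields(expr):
    """Split a standard 5-field crontab string into named cron fields."""
    values = expr.split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(values)}")
    return dict(zip(CRON_FIELDS, values))


# Every Tuesday at 15:30, the same schedule as the example above
fields = crontab_to_fields("30 15 * * tue")
print(fields)
```

A weekday name is used here rather than a number to avoid ambiguity in weekday numbering.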
create_user_secrets(user: str, provider: Union[str, mlrun.api.schemas.secret.SecretProviderName] = <SecretProviderName.vault: 'vault'>, secrets: Optional[dict] = None)[source]

Create user-context secret in Vault. Please refer to create_project_secrets() for more details and status of this functionality.

Note

This method is currently in technical preview, and requires a HashiCorp Vault infrastructure properly set up and connected to the MLRun API server.

Parameters
  • user – The user context for which to generate the infra and store secrets.

  • provider – The name of the secrets-provider to work with. Currently only vault is supported.

  • secrets – A set of secret values to store within the Vault.

del_artifact(key, tag=None, project='')[source]

Delete an artifact.

del_artifacts(name=None, project=None, tag=None, labels=None, days_ago=0)[source]

Delete artifacts referenced by the parameters.

Parameters
  • name – Name of artifacts to delete. Note that this is a like query, and is case-insensitive. See list_artifacts() for more details.

  • project – Project that artifacts belong to.

  • tag – Choose artifacts that are assigned this tag.

  • labels – Choose artifacts which are labeled.

  • days_ago – This parameter is deprecated and not used.

del_run(uid, project='', iter=0)[source]

Delete details of a specific run from DB.

Parameters
  • uid – Unique ID for the specific run to delete.

  • project – Project that the run belongs to.

  • iter – Iteration within a specific task.

del_runs(name=None, project=None, labels=None, state=None, days_ago=0)[source]

Delete a group of runs identified by the parameters of the function.

Example:

db.del_runs(state='completed')
Parameters
  • name – Name of the task which the runs belong to.

  • project – Project to which the runs belong.

  • labels – Filter runs that are labeled using these specific label values.

  • state – Filter only runs which are in this state.

  • days_ago – Filter runs whose start time is newer than this parameter.

delete_endpoint_record(project: str, endpoint_id: str, access_key: Optional[str] = None)[source]

Deletes the KV record of a given model endpoint. The project and endpoint_id are used for lookup.

Parameters
  • project – The name of the project

  • endpoint_id – The id of the endpoint

  • access_key – V3IO access key. When None, it is looked up in the environment

delete_feature_set(name, project='', tag=None, uid=None)[source]

Delete a FeatureSet object from the DB. If tag or uid are specified, then just the version referenced by them will be deleted. Using both is not allowed. If none are specified, then all instances of the object whose name is name will be deleted.

delete_feature_vector(name, project='', tag=None, uid=None)[source]

Delete a FeatureVector object from the DB. If tag or uid are specified, then just the version referenced by them will be deleted. Using both is not allowed. If none are specified, then all instances of the object whose name is name will be deleted.
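The tag and uid rules shared by delete_feature_set and delete_feature_vector can be sketched as a selection over stored versions. This is an illustration only: the `versions_to_delete` helper and the version records are hypothetical, and the real selection happens on the server.

```python
def versions_to_delete(versions, tag=None, uid=None):
    """Pick which stored versions a delete call would target."""
    if tag is not None and uid is not None:
        raise ValueError("tag and uid cannot be used together")
    if tag is not None:
        return [v for v in versions if v["tag"] == tag]
    if uid is not None:
        return [v for v in versions if v["uid"] == uid]
    # Neither given: every instance with this name is targeted
    return list(versions)


versions = [
    {"uid": "a1", "tag": "v1"},
    {"uid": "b2", "tag": "latest"},
]
print(versions_to_delete(versions, tag="latest"))  # only the 'latest' version
print(len(versions_to_delete(versions)))           # 2, all versions
```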

delete_function(name: str, project: str = '')[source]

Delete a function belonging to a specific project.

delete_project(name: str, deletion_strategy: Union[str, mlrun.api.schemas.constants.DeletionStrategy] = <DeletionStrategy.restricted: 'restricted'>)[source]

Delete a project.

Parameters
  • name – Name of the project to delete.

  • deletion_strategy

    How to treat child objects of the project. Possible values are:

    • restricted (default) - Project must not have any child objects when deleted. If using this mode while child objects exist, the operation will fail.

    • cascade - Automatically delete all child objects when deleting the project.
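
The two deletion strategies can be sketched with an in-memory stand-in. The `delete_project` function below is a hypothetical local model of the behavior described above, not the client method itself, and the project contents are made up for illustration.

```python
def delete_project(projects, name, deletion_strategy="restricted"):
    """Remove `name` from `projects`, a dict of project name -> child objects."""
    if deletion_strategy not in ("restricted", "cascade"):
        raise ValueError(f"unknown deletion strategy: {deletion_strategy}")
    children = projects[name]
    if deletion_strategy == "restricted" and children:
        raise RuntimeError(f"project {name} still has {len(children)} child objects")
    # With cascade, the child objects are dropped together with the project
    del projects[name]


projects = {"iris": ["function:train", "run:1234"], "empty": []}

delete_project(projects, "empty")  # succeeds, no child objects

try:
    delete_project(projects, "iris")  # restricted: fails, children exist
except RuntimeError as err:
    refused = str(err)

delete_project(projects, "iris", deletion_strategy="cascade")  # succeeds
```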

delete_project_secrets(project: str, provider: Union[str, mlrun.api.schemas.secret.SecretProviderName] = <SecretProviderName.kubernetes: 'kubernetes'>, secrets: Optional[List[str]] = None)[source]

Delete project-context secrets from Kubernetes.

Parameters
  • project – The project name.

  • provider – The name of the secrets-provider to work with. Currently only kubernetes is supported.

  • secrets – A list of secret names to delete. An empty list will delete all secrets assigned to this specific project.

delete_runtime(kind: str, label_selector: Optional[str] = None, force: bool = False, grace_period: int = '14400')[source]

Delete runtimes of a specific kind. See delete_runtimes() for more details.

delete_runtime_object(kind: str, object_id: str, label_selector: Optional[str] = None, force: bool = False, grace_period: int = '14400')[source]

Delete a specific runtime object identified by its ID. The object ID can be retrieved from the runtime query functions, and used to target a specific runtime to delete. The parameters are the same as those used in delete_runtimes().

delete_runtimes(label_selector: Optional[str] = None, force: bool = False, grace_period: int = '14400')[source]

Delete all runtimes which are matching the specific label selector provided. This will delete runtimes of all applicable kinds. For deleting runtimes of a specific kind, use the delete_runtime() function.

Parameters
  • label_selector – Delete runtimes with this label assigned.

  • force – Force deletion. This parameter is passed to the Kubernetes deletion API for force-delete of pods.

  • grace_period – Grace period for the deleted resources before they are evacuated. This is passed to the Kubernetes deletion API.

delete_schedule(project: str, name: str)[source]

Delete a specific schedule by name.

get_background_task(project: str, name: str)mlrun.api.schemas.background_task.BackgroundTask[source]

Retrieve updated information on a background task being executed.

get_builder_status(func, offset=0, logs=True, last_log_timestamp=0, verbose=False)[source]

Retrieve the status of a build operation currently in progress.

Parameters
  • func – Function object that is being built.

  • offset – Offset into the build logs to retrieve logs from.

  • logs – Should build logs be retrieved.

  • last_log_timestamp – Last timestamp of logs that were already retrieved. Function will return only logs later than this parameter.

  • verbose – Add verbose logs into the output.

Returns

The following parameters:

  • Text of builder logs.

  • Timestamp of last log retrieved, to be used in subsequent calls to this function.

The function also updates internal members of the func object to reflect build process info.

get_endpoint(project: str, endpoint_id: str, start: Optional[str] = None, end: Optional[str] = None, metrics: Optional[List[str]] = None, feature_analysis: bool = False, access_key: Optional[str] = None)mlrun.api.schemas.model_endpoints.ModelEndpoint[source]

Returns a ModelEndpoint object with additional metrics and feature related data.

Parameters
  • project – The name of the project

  • endpoint_id – The id of the model endpoint

  • metrics – A list of metrics to return for each endpoint, read more in ‘TimeMetric’

  • start – The start time of the metrics

  • end – The end time of the metrics

  • feature_analysis – When True, the base feature statistics and current feature statistics will be added to the output of the resulting object

  • access_key – V3IO access key. When None, it is looked up in the environment

get_feature_set(name: str, project: str = '', tag: Optional[str] = None, uid: Optional[str] = None)mlrun.feature_store.feature_set.FeatureSet[source]

Retrieve a FeatureSet object. If both tag and uid are not specified, then the object tagged latest will be retrieved.

Parameters
  • name – Name of object to retrieve.

  • project – Project the FeatureSet belongs to.

  • tag – Tag of the specific object version to retrieve.

  • uid – uid of the object to retrieve (can only be used for versioned objects).

get_feature_vector(name: str, project: str = '', tag: Optional[str] = None, uid: Optional[str] = None)mlrun.feature_store.feature_vector.FeatureVector[source]

Return a specific feature-vector referenced by its tag or uid. If none are provided, latest tag will be used.

get_function(name, project='', tag=None, hash_key='')[source]

Retrieve details of a specific function, identified by its name and potentially a tag or function hash.

get_log(uid, project='', offset=0, size=- 1)[source]

Retrieve a log.

Parameters
  • uid – Log unique ID

  • project – Name of the project to which the log belongs

  • offset – Retrieve partial log, get up to size bytes starting at offset offset from beginning of log

  • size – See offset. If set to -1 (the default) will retrieve all data to end of log.

Returns

The following objects:

  • state - The state of the runtime object which generates this log, if it exists. In case no known state exists, this will be unknown.

  • content - The actual log content.
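
The offset and size semantics can be sketched with a local byte string standing in for the stored log. The `read_log` helper is hypothetical and only models the slicing described above; it does not return the state value and does not contact a server.

```python
def read_log(log, offset=0, size=-1):
    """Return up to `size` bytes of `log` starting at `offset` (-1: to the end)."""
    if size == -1:
        return log[offset:]
    return log[offset:offset + size]


log = b"line 1\nline 2\nline 3\n"

# Fetch the log in 8-byte chunks, advancing the offset by what was received
offset, chunks = 0, []
while True:
    chunk = read_log(log, offset=offset, size=8)
    if not chunk:
        break
    chunks.append(chunk)
    offset += len(chunk)

print(b"".join(chunks) == log)  # True
```

Tracking the offset this way allows a caller to poll for new log content without fetching what it already has.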

get_pipeline(run_id: str, namespace: Optional[str] = None, timeout: int = 10)[source]

Retrieve details of a specific pipeline using its run ID (as provided when the pipeline was executed).

get_project(name: str)mlrun.projects.project.MlrunProject[source]

Get details for a specific project.

get_runtime(kind: str, label_selector: Optional[str] = None)Dict[source]

Return a list of runtime resources of a given kind, and potentially matching a specified label. There may be multiple runtime resources returned from this function. This function is similar to the list_runtimes() function, only it focuses on a specific kind, rather than list all runtimes of all kinds which generate runtime pods.

Example:

project_pods = db.get_runtime('job', label_selector='mlrun/project=iris')['resources']['pod_resources']
for pod in project_pods:
    print(pod["name"])
Parameters
  • kind – The kind of runtime to query. May be one of ['dask', 'job', 'spark', 'mpijob']

  • label_selector – A label filter that will be passed to Kubernetes for filtering the results according to their labels.

get_schedule(project: str, name: str, include_last_run: bool = False)mlrun.api.schemas.schedule.ScheduleOutput[source]

Retrieve details of the schedule in question. Besides returning the details of the schedule object itself, this function also returns the next scheduled run for this specific schedule, as well as potentially the results of the last run executed through this schedule.

Parameters
  • project – Project name.

  • name – Name of the schedule object to query.

  • include_last_run – Whether to include the results of the schedule’s last run in the response.

invoke_schedule(project: str, name: str)[source]

Execute the object referenced by the schedule immediately.

kind = 'http'
list_artifact_tags(project=None)[source]

Return a list of all the tags assigned to artifacts in the scope of the given project.

list_artifacts(name=None, project=None, tag=None, labels=None, since=None, until=None, iter: Optional[int] = None, best_iteration: bool = False)[source]

List artifacts filtered by various parameters.

Examples:

# Show latest version of all artifacts in project
latest_artifacts = db.list_artifacts('', tag='latest', project='iris')
# check different artifact versions for a specific artifact
result_versions = db.list_artifacts('results', tag='*', project='iris')
Parameters
  • name – Name of artifacts to retrieve. Name is used as a like query, and is not case-sensitive. This means that querying for name may return artifacts named my_Name_1 or surname.

  • project – Project name.

  • tag – Return artifacts assigned this tag.

  • labels – Return artifacts that have these labels.

  • since – Not in use in HTTPRunDB.

  • until – Not in use in HTTPRunDB.

  • iter – Return artifacts from a specific iteration (where iter=0 means the root iteration). If None (default) return artifacts from all iterations.

  • best_iteration – Returns the artifact which belongs to the best iteration of a given run, in the case of artifacts generated from a hyper-param run. If only a single iteration exists, will return the artifact from that iteration. If using best_iteration, the iter parameter must not be used.
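
The name matching described above behaves like a case-insensitive substring search. The sketch below approximates it locally; `name_matches` is a hypothetical helper, and the actual query runs in the server's database.

```python
def name_matches(query, artifact_name):
    """Case-insensitive substring match, approximating a LIKE '%query%' filter."""
    return query.lower() in artifact_name.lower()


names = ["my_Name_1", "surname", "results", "model"]
matched = [n for n in names if name_matches("name", n)]
print(matched)  # ['my_Name_1', 'surname']
```

An empty query matches every name, which is consistent with passing '' as the name in the first example above.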

list_endpoints(project: str, model: Optional[str] = None, function: Optional[str] = None, labels: Optional[List[str]] = None, start: str = 'now-1h', end: str = 'now', metrics: Optional[List[str]] = None, access_key: Optional[str] = None)mlrun.api.schemas.model_endpoints.ModelEndpointList[source]

Returns a list of ModelEndpointState objects. Each object represents the current state of a model endpoint. This function supports filtering by the following parameters: 1) model 2) function 3) labels. By default, when no filters are applied, all available endpoints for the given project will be listed.

In addition, this function provides a facade for listing endpoint related metrics. This facade is time-based and depends on the ‘start’ and ‘end’ parameters. By default, when the metrics parameter is None, no metrics are added to the output of this function.

Parameters
  • project – The name of the project

  • model – The name of the model to filter by

  • function – The name of the function to filter by

  • labels – A list of labels to filter by. Label filters work by either filtering a specific value of a label (i.e. list(“key==value”)) or by looking for the existence of a given key (i.e. “key”)

  • metrics – A list of metrics to return for each endpoint, read more in ‘TimeMetric’

  • start – The start time of the metrics

  • end – The end time of the metrics

  • access_key – V3IO access key. When None, it is looked up in the environment

list_entities(project: str, name: Optional[str] = None, tag: Optional[str] = None, labels: Optional[List[str]] = None)List[dict][source]

Retrieve a list of entities and their mapping to the containing feature-sets. This function is similar to the list_features() function, and uses the same logic. However, the entities are matched against the name rather than the features.

list_feature_sets(project: str = '', name: Optional[str] = None, tag: Optional[str] = None, state: Optional[str] = None, entities: Optional[List[str]] = None, features: Optional[List[str]] = None, labels: Optional[List[str]] = None, partition_by: Optional[Union[mlrun.api.schemas.constants.FeatureStorePartitionByField, str]] = None, rows_per_partition: int = 1, partition_sort_by: Optional[Union[mlrun.api.schemas.constants.SortField, str]] = None, partition_order: Union[mlrun.api.schemas.constants.OrderType, str] = <OrderType.desc: 'desc'>)List[mlrun.feature_store.feature_set.FeatureSet][source]

Retrieve a list of feature-sets matching the criteria provided.

Parameters
  • project – Project name.

  • name – Name of feature-set to match. This is a like query, and is case-insensitive.

  • tag – Match feature-sets with specific tag.

  • state – Match feature-sets with a specific state.

  • entities – Match feature-sets which contain entities whose name is in this list.

  • features – Match feature-sets which contain features whose name is in this list.

  • labels – Match feature-sets which have these labels.

  • partition_by – Field to group results by. Only allowed value is name. When partition_by is specified, the partition_sort_by parameter must be provided as well.

  • rows_per_partition – How many top rows (per sorting defined by partition_sort_by and partition_order) to return per group. Default value is 1.

  • partition_sort_by – What field to sort the results by, within each partition defined by partition_by. Currently the only allowed value is updated.

  • partition_order – Order of sorting within partitions - asc or desc. Default is desc.

Returns

List of matching FeatureSet objects.
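
The partitioning parameters can be read as "group, sort within each group, keep the top rows". The sketch below models that over plain dictionaries; `top_rows_per_partition` and the sample rows are hypothetical, and the real grouping is performed by the server.

```python
from collections import defaultdict


def top_rows_per_partition(rows, partition_by="name", partition_sort_by="updated",
                           partition_order="desc", rows_per_partition=1):
    """Group rows by one field, sort each group, keep the first rows of each."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_by]].append(row)
    result = []
    for group in groups.values():
        ordered = sorted(group, key=lambda r: r[partition_sort_by],
                         reverse=(partition_order == "desc"))
        result.extend(ordered[:rows_per_partition])
    return result


rows = [
    {"name": "stocks", "tag": "v1", "updated": "2021-01-01"},
    {"name": "stocks", "tag": "v2", "updated": "2021-03-01"},
    {"name": "quotes", "tag": "v1", "updated": "2021-02-01"},
]

# Most recently updated row per feature-set name
latest = top_rows_per_partition(rows)
print([(r["name"], r["tag"]) for r in latest])  # [('stocks', 'v2'), ('quotes', 'v1')]
```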

list_feature_vectors(project: str = '', name: Optional[str] = None, tag: Optional[str] = None, state: Optional[str] = None, labels: Optional[List[str]] = None, partition_by: Optional[Union[mlrun.api.schemas.constants.FeatureStorePartitionByField, str]] = None, rows_per_partition: int = 1, partition_sort_by: Optional[Union[mlrun.api.schemas.constants.SortField, str]] = None, partition_order: Union[mlrun.api.schemas.constants.OrderType, str] = <OrderType.desc: 'desc'>)List[mlrun.feature_store.feature_vector.FeatureVector][source]

Retrieve a list of feature-vectors matching the criteria provided.

Parameters
  • project – Project name.

  • name – Name of feature-vector to match. This is a like query, and is case-insensitive.

  • tag – Match feature-vectors with specific tag.

  • state – Match feature-vectors with a specific state.

  • labels – Match feature-vectors which have these labels.

  • partition_by – Field to group results by. Only allowed value is name. When partition_by is specified, the partition_sort_by parameter must be provided as well.

  • rows_per_partition – How many top rows (per sorting defined by partition_sort_by and partition_order) to return per group. Default value is 1.

  • partition_sort_by – What field to sort the results by, within each partition defined by partition_by. Currently the only allowed value is updated.

  • partition_order – Order of sorting within partitions - asc or desc. Default is desc.

Returns

List of matching FeatureVector objects.

list_features(project: str, name: Optional[str] = None, tag: Optional[str] = None, entities: Optional[List[str]] = None, labels: Optional[List[str]] = None)List[dict][source]

List feature-sets which contain specific features. This function may return multiple versions of the same feature-set if a specific tag is not requested. Note that the various filters of this function actually refer to the feature-set object containing the features, not to the features themselves.

Parameters
  • project – Project which contains these features.

  • name – Name of the feature to look for. The name is used in a like query, and is not case-sensitive. For example, looking for feat will return features which are named MyFeature as well as defeat.

  • tag – Return feature-sets which contain the features looked for, and are tagged with the specific tag.

  • entities – Return only feature-sets which contain an entity whose name is contained in this list.

  • labels – Return only feature-sets which are labeled as requested.

Returns

A list of mappings from feature to a digest of the feature-set, which contains the feature-set meta-data. Multiple entries may be returned for any specific feature due to multiple tags or versions of the feature-set.

list_functions(name=None, project=None, tag=None, labels=None)[source]

Retrieve a list of functions, filtered by specific criteria.

Parameters
  • name – Return only functions with a specific name.

  • project – Return functions belonging to this project. If not specified, the default project is used.

  • tag – Return function versions with specific tags.

  • labels – Return functions that have specific labels assigned to them.

Returns

List of function objects (as dictionary).
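
Label filters follow the convention described at the top of this class: a bare label name matches any value, while name=value requires an exact value. A minimal local sketch of that rule, assuming labels are held as a dictionary of strings (`matches_label` is a hypothetical helper, not part of the client):

```python
def matches_label(labels, selector):
    """Check a "<label_name>" or "<label_name>=<label_value>" selector."""
    name, sep, value = selector.partition("=")
    if name not in labels:
        return False
    # Without "=", the presence of the label is enough
    return not sep or labels[name] == value


labels = {"owner": "admin", "stage": "prod"}
print(matches_label(labels, "owner"))        # True
print(matches_label(labels, "owner=admin"))  # True
print(matches_label(labels, "owner=guest"))  # False
```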

list_pipelines(project: str, namespace: Optional[str] = None, sort_by: str = '', page_token: str = '', filter_: str = '', format_: Union[str, mlrun.api.schemas.constants.Format] = <Format.metadata_only: 'metadata_only'>, page_size: Optional[int] = None)mlrun.api.schemas.pipeline.PipelinesOutput[source]

Retrieve a list of KFP pipelines. This function can be invoked to get all pipelines from all projects, by specifying project=*, in which case pagination can be used and the various sorting and pagination properties can be applied. If a specific project is requested, then the pagination options cannot be used and pagination is not applied.

Parameters
  • project – Project name. Can be * for query across all projects.

  • namespace – Kubernetes namespace in which the pipelines are executing.

  • sort_by – Field to sort the results by.

  • page_token – Use for pagination, to retrieve next page.

  • filter_ – Kubernetes filter to apply to the query, can be used to filter on specific object fields.

  • format_

    Result format. Can be one of:

    • full - return the full objects.

    • metadata_only (default) - return just metadata of the pipelines objects.

    • name_only - return just the names of the pipeline objects.

  • page_size – Size of a single page when applying pagination.

list_project_secret_keys(project: str, provider: Union[str, mlrun.api.schemas.secret.SecretProviderName] = <SecretProviderName.vault: 'vault'>, token: Optional[str] = None)mlrun.api.schemas.secret.SecretKeysData[source]

Retrieve project-context secret keys from Vault or Kubernetes.

Note

This method for Vault functionality is currently in technical preview, and requires a HashiCorp Vault infrastructure properly set up and connected to the MLRun API server.

Parameters
  • project – The project name.

  • provider – The name of the secrets-provider to work with. Accepts a SecretProviderName enum.

  • token – Vault token to use for retrieving secrets. Only in use if provider is vault. Must be a valid Vault token, with permissions to retrieve secrets of the project in question.

list_project_secrets(project: str, token: Optional[str] = None, provider: Union[str, mlrun.api.schemas.secret.SecretProviderName] = <SecretProviderName.vault: 'vault'>, secrets: Optional[List[str]] = None)mlrun.api.schemas.secret.SecretsData[source]

Retrieve project-context secrets from Vault.

Note

This method for Vault functionality is currently in technical preview, and requires a HashiCorp Vault infrastructure properly set up and connected to the MLRun API server.

Parameters
  • project – The project name.

  • token – Vault token to use for retrieving secrets. Must be a valid Vault token, with permissions to retrieve secrets of the project in question.

  • provider – The name of the secrets-provider to work with. Currently only vault is accepted.

  • secrets – A list of secret names to retrieve. An empty list [] will retrieve all secrets assigned to this specific project. kubernetes provider only supports an empty list.

list_projects(owner: Optional[str] = None, format_: Union[str, mlrun.api.schemas.constants.Format] = <Format.full: 'full'>, labels: Optional[List[str]] = None, state: Optional[Union[str, mlrun.api.schemas.project.ProjectState]] = None)List[Union[mlrun.projects.project.MlrunProject, str]][source]

Return a list of the existing projects, potentially filtered by specific criteria.

Parameters
  • owner – List only projects belonging to this specific owner.

  • format_

    Format of the results. Possible values are:

    • full (default value) - Return full project objects.

    • name_only - Return just the names of the projects.

  • labels – Filter by labels attached to the project.

  • state – Filter by project’s state. Can be either online or archived.

list_runs(name=None, uid=None, project=None, labels=None, state=None, sort=True, last=0, iter=False, start_time_from: Optional[datetime.datetime] = None, start_time_to: Optional[datetime.datetime] = None, last_update_time_from: Optional[datetime.datetime] = None, last_update_time_to: Optional[datetime.datetime] = None)[source]

Retrieve a list of runs, filtered by various options. Example:

runs = db.list_runs(name='download', project='iris', labels='owner=admin')
# If running in Jupyter, can use the .show() function to display the results
db.list_runs(name='', project=project_name).show()
Parameters
  • name – Name of the run to retrieve.

  • uid – Unique ID of the run.

  • project – Project that the runs belong to.

  • labels – List runs that have a specific label assigned. Currently only a single label filter can be applied, otherwise result will be empty.

  • state – List only runs whose state is specified.

  • sort – Whether to sort the result according to their start time. Otherwise results will be returned by their internal order in the DB (order will not be guaranteed).

  • last – Deprecated - currently not used.

  • iter – If True return runs from all iterations. Otherwise, return only runs whose iter is 0.

  • start_time_from – Filter by run start time in [start_time_from, start_time_to].

  • start_time_to – Filter by run start time in [start_time_from, start_time_to].

  • last_update_time_from – Filter by run last update time in (last_update_time_from, last_update_time_to).

  • last_update_time_to – Filter by run last update time in (last_update_time_from, last_update_time_to).
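The time filters take datetime.datetime values. The sketch below builds the two start-time arguments for "the last N hours". The helper name start_time_window is hypothetical, and the use of naive local datetimes is an assumption, since the text above does not say which timezone the server expects.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def start_time_window(hours: float, now: Optional[datetime] = None) -> Dict[str, datetime]:
    """Build start_time_from / start_time_to keyword arguments for list_runs()."""
    # Assumption: naive datetimes are acceptable; pass `now` explicitly to control this.
    end = now if now is not None else datetime.now()
    return {"start_time_from": end - timedelta(hours=hours), "start_time_to": end}


# Usage, assuming `db` is a connected HTTPRunDB instance:
# runs = db.list_runs(project="iris", **start_time_window(24))
```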

list_runtimes(label_selector: Optional[str] = None) → List [source]

List current runtime resources, which are usually (but not limited to) Kubernetes pods or CRDs. This function applies to runs of type ['dask', 'job', 'spark', 'mpijob'], and returns, per runtime kind, a list of the resources (which may have already completed their execution).

Parameters

label_selector – A label filter that will be passed to Kubernetes for filtering the results according to their labels.

list_schedules(project: str, name: Optional[str] = None, kind: Optional[mlrun.api.schemas.schedule.ScheduleKinds] = None, include_last_run: bool = False) → mlrun.api.schemas.schedule.SchedulesOutput[source]

Retrieve a list of schedules, optionally filtered by name or kind.

Parameters
  • project – Project name.

  • name – Name of schedule to retrieve. Can be omitted to list all schedules.

  • kind – Kind of schedule objects to retrieve, can be either job or pipeline.

  • include_last_run – Whether to also return, for each schedule, the results of its last run.

patch_feature_set(name, feature_set_update: dict, project='', tag=None, uid=None, patch_mode: Union[str, mlrun.api.schemas.constants.PatchMode] = <PatchMode.replace: 'replace'>)[source]

Modify (patch) an existing FeatureSet object. The object is identified by its name (and the project it belongs to), and optionally by a tag or its uid (for versioned objects). If both tag and uid are omitted, the object with tag latest is modified.

Parameters
  • name – Name of the object to patch.

  • feature_set_update

    The modifications needed in the object. This parameter only has the changes in it, not a full object. Example:

    feature_set_update = {"status": {"processed" : True}}
    

    Will apply the field status.processed to the existing object.

  • project – Project which contains the modified object.

  • tag – The tag of the object to modify.

  • uid – uid of the object to modify.

  • patch_mode – The strategy for merging the changes with the existing object. Can be either replace or additive.
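As a sketch of how the parameters above fit together, the helper below sends only the changed field and selects the merge strategy by name. The helper name mark_feature_set_processed is hypothetical, and db is assumed to be a connected HTTPRunDB instance. The patch mode is passed as a plain string, which the signature allows.

```python
def mark_feature_set_processed(db, name: str, project: str = "", tag=None):
    """Set status.processed to True on an existing feature set."""
    # Only the changed fields are sent, not a full FeatureSet object.
    feature_set_update = {"status": {"processed": True}}
    return db.patch_feature_set(
        name,
        feature_set_update,
        project=project,
        tag=tag,  # when tag and uid are both omitted, the object tagged latest is modified
        patch_mode="additive",
    )
```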

patch_feature_vector(name, feature_vector_update: dict, project='', tag=None, uid=None, patch_mode: Union[str, mlrun.api.schemas.constants.PatchMode] = <PatchMode.replace: 'replace'>)[source]

Modify (patch) an existing FeatureVector object. The object is identified by its name (and the project it belongs to), and optionally by a tag or its uid (for versioned objects). If both tag and uid are omitted, the object with tag latest is modified.

Parameters
  • name – Name of the object to patch.

  • feature_vector_update – The modifications needed in the object. This parameter only has the changes in it, not a full object.

  • project – Project which contains the modified object.

  • tag – The tag of the object to modify.

  • uid – uid of the object to modify.

  • patch_mode – The strategy for merging the changes with the existing object. Can be either replace or additive.

patch_project(name: str, project: dict, patch_mode: Union[str, mlrun.api.schemas.constants.PatchMode] = <PatchMode.replace: 'replace'>) → mlrun.projects.project.MlrunProject[source]

Patch an existing project object.

Parameters
  • name – Name of project to patch.

  • project – The actual changes to the project object.

  • patch_mode – The strategy for merging the changes with the existing object. Can be either replace or additive.

read_artifact(key, tag=None, iter=None, project='')[source]

Read an artifact, identified by its key, tag and iteration.

read_run(uid, project='', iter=0)[source]

Read the details of a stored run from the DB.

Parameters
  • uid – The run’s unique ID.

  • project – Project name.

  • iter – Iteration within a specific execution.

remote_builder(func, with_mlrun, mlrun_version_specifier=None, skip_deployed=False)[source]

Build the pod image for a function, for execution on a remote cluster. The build is executed by the MLRun API server, which creates a Docker image from the provided function and any build instructions it contains. This is a prerequisite for remotely executing a function, unless using a pre-deployed image.

Parameters
  • func – Function to build.

  • with_mlrun – Whether to add the MLRun package to the built image. This is not required if using a base image that already has MLRun in it.

  • mlrun_version_specifier – Version of MLRun to include in the built image.

  • skip_deployed – Skip the build if we already have an image for the function.

remote_start(func_url) → mlrun.api.schemas.background_task.BackgroundTask[source]

Execute a function remotely. Used for dask functions.

Parameters

func_url – URL to the function to be executed.

Returns

A BackgroundTask object, with details on the execution process and its status.

remote_status(kind, selector)[source]

Retrieve status of a function being executed remotely (relevant to dask functions).

Parameters
  • kind – The kind of the function, currently dask is supported.

  • selector – Selector clause to be applied to the Kubernetes status query to filter the results.

store_artifact(key, artifact, uid, iter=None, tag=None, project='')[source]

Store an artifact in the DB.

Parameters
  • key – Identifying key of the artifact.

  • artifact – The actual artifact to store.

  • uid – A unique ID for this specific version of the artifact.

  • iter – The task iteration which generated this artifact. If iter is not None, the iteration is added to the provided key to generate a unique key for the artifact of that specific iteration.

  • tag – Tag of the artifact.

  • project – Project that the artifact belongs to.

store_feature_set(feature_set: Union[dict, mlrun.api.schemas.feature_store.FeatureSet], name=None, project='', tag=None, uid=None, versioned=True) → dict[source]

Save a FeatureSet object in the mlrun DB. The feature-set can be either a new object or a modification to an existing object referenced by the parameters of the function.

Parameters
  • feature_set – The FeatureSet to store.

  • project – Name of project this feature-set belongs to.

  • tag – The tag of the object to replace in the DB, for example latest.

  • uid – The uid of the object to replace in the DB. If using this parameter, the modified object must have the same uid as the previously existing object. This cannot be used for non-versioned objects.

  • versioned – Whether to maintain versions for this feature-set. All versions of a versioned object will be kept in the DB and can be retrieved until explicitly deleted.

Returns

The FeatureSet object (as dict).

store_feature_vector(feature_vector: Union[dict, mlrun.api.schemas.feature_store.FeatureVector], name=None, project='', tag=None, uid=None, versioned=True) → dict[source]

Store a FeatureVector object in the mlrun DB. The feature-vector can be either a new object or a modification to an existing object referenced by the parameters of the function.

Parameters
  • feature_vector – The FeatureVector to store.

  • project – Name of project this feature-vector belongs to.

  • tag – The tag of the object to replace in the DB, for example latest.

  • uid – The uid of the object to replace in the DB. If using this parameter, the modified object must have the same uid as the previously existing object. This cannot be used for non-versioned objects.

  • versioned – Whether to maintain versions for this feature-vector. All versions of a versioned object will be kept in the DB and can be retrieved until explicitly deleted.

Returns

The FeatureVector object (as dict).

store_function(function, name, project='', tag=None, versioned=False)[source]

Store a function object. Function is identified by its name and tag, and can be versioned.

store_log(uid, project='', body=None, append=False)[source]

Save a log persistently.

Parameters
  • uid – Unique ID of the log.

  • project – Name of the project to which this log belongs.

  • body – The actual log to store.

  • append – Whether to append the log provided in body to an existing log with the same uid, or to create a new log. If set to False, an existing log with the same uid will be overwritten.
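The append flag makes it possible to upload a log in pieces. In the sketch below, the first piece starts a fresh log and every later piece is appended to it. The helper name store_log_in_chunks is hypothetical, db is assumed to be a connected HTTPRunDB instance, and the use of bytes for body is an assumption, since the text above does not state the expected type.

```python
from typing import Iterable


def store_log_in_chunks(db, uid: str, chunks: Iterable[bytes], project: str = "") -> int:
    """Upload a log piece by piece and return the number of chunks sent."""
    count = 0
    for chunk in chunks:
        # First chunk: append=False, so any existing log with this uid is overwritten.
        # Later chunks: append=True, so they extend the log just written.
        db.store_log(uid, project=project, body=chunk, append=count > 0)
        count += 1
    return count
```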

store_project(name: str, project: Union[dict, mlrun.projects.project.MlrunProject, mlrun.api.schemas.project.Project]) → mlrun.projects.project.MlrunProject[source]

Store a project in the DB. This operation overwrites an existing project of the same name, if one exists.

store_run(struct, uid, project='', iter=0)[source]

Store run details in the DB. This method is usually called from within other mlrun flows and not called directly by the user.

submit_job(runspec, schedule: Optional[Union[str, mlrun.api.schemas.schedule.ScheduleCronTrigger]] = None)[source]

Submit a job for remote execution.

Parameters
  • runspec – The runtime object spec (Task) to execute.

  • schedule – A Cron trigger on which to schedule this job, given either as a string or as a ScheduleCronTrigger object. If not specified, the job is submitted immediately.
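A sketch of submitting a job on a daily schedule is shown below. The helper name submit_nightly is hypothetical and db is assumed to be a connected HTTPRunDB instance. It is also an assumption that the string form of schedule is a standard five-field crontab expression, as the text above does not spell out the format.

```python
def submit_nightly(db, runspec, hour: int = 2):
    """Submit `runspec` to run once a day at the given hour, minute 0."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    # Assumed crontab layout: minute hour day-of-month month day-of-week.
    schedule = f"0 {hour} * * *"
    return db.submit_job(runspec, schedule=schedule)
```

Calling db.submit_job(runspec) without a schedule submits the job immediately, as described above.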

submit_pipeline(pipeline, arguments=None, experiment=None, run=None, namespace=None, artifact_path=None, ops=None, ttl=None)[source]

Submit a KFP pipeline for execution.

Parameters
  • pipeline – Pipeline function or path to .yaml/.zip pipeline file.

  • arguments – A dictionary of arguments to pass to the pipeline.

  • experiment – A name to assign for the specific experiment.

  • run – A name for this specific run.

  • namespace – Kubernetes namespace to execute the pipeline in.

  • artifact_path – A path to artifacts used by this pipeline.

  • ops – Transformers to apply on all ops in the pipeline.

  • ttl – Set the TTL for the pipeline after its completion.

update_run(updates: dict, uid, project='', iter=0)[source]

Update the details of a stored run in the DB.

update_schedule(project: str, name: str, schedule: mlrun.api.schemas.schedule.ScheduleUpdate)[source]

Update an existing schedule, replacing it with the details contained in the schedule object.

watch_log(uid, project='', watch=True, offset=0)[source]

Retrieve logs of a running process, and watch the progress of the execution until it completes. This method will print out the logs and continue to periodically poll for, and print, new logs as long as the state of the runtime which generates this log is either pending or running.

Parameters
  • uid – The uid of the log object to watch.

  • project – Project that the log belongs to.

  • watch – If set to True, the function continues tracking the log as described above. Otherwise, it is practically equivalent to the get_log() function.

  • offset – Minimal offset in the log to watch.

Returns

The final state of the log being watched.
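The polling behavior described above can be sketched in a few lines. This is an illustration of the general pattern, not MLRun's implementation: poll_log and its fetch argument are hypothetical, with fetch(offset) assumed to return a (state, new_bytes) pair for the log content starting at that offset.

```python
import time
from typing import Callable, Tuple


def poll_log(
    fetch: Callable[[int], Tuple[str, bytes]],
    offset: int = 0,
    interval: float = 1.0,
    emit: Callable[[str], None] = print,
) -> str:
    """Print new log content until the state is neither pending nor running."""
    while True:
        state, data = fetch(offset)
        if data:
            emit(data.decode(errors="replace"))
            offset += len(data)  # the next poll asks only for content not yet seen
        if state not in ("pending", "running"):
            return state  # final state, mirroring the documented return value
        time.sleep(interval)
```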

class mlrun.api.schemas.secret.SecretProviderName(value)[source]

Bases: str, enum.Enum

Enum containing names of valid providers for secrets.

kubernetes = 'kubernetes'
vault = 'vault'
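Because the enum derives from both str and enum.Enum, its members can be compared with plain strings and constructed from them. The sketch below redefines a local mirror of the enum so that it runs without MLRun installed. With MLRun available, the documented class itself would be imported instead.

```python
from enum import Enum


class SecretProviderName(str, Enum):
    """Local stand-in for the enum documented above, for illustration only."""

    kubernetes = "kubernetes"
    vault = "vault"


# Members compare equal to their string values because the enum subclasses str.
assert SecretProviderName.vault == "vault"
# A string value converts back to the corresponding member.
assert SecretProviderName("kubernetes") is SecretProviderName.kubernetes
```

Constructing the enum from a value that is not one of the listed providers raises ValueError, which gives a simple way to validate user input.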