Install MLRun on Kubernetes#
Note
These instructions install the community edition, which currently includes MLRun v1.4.0. See the release documentation.
Prerequisites#
Access to a Kubernetes cluster. To install MLRun on your cluster, you must have administrator permissions. MLRun fully supports Kubernetes releases 1.22, 1.23, and 1.26. For a local installation on Windows or Mac, Docker Desktop is recommended.
The Kubernetes command-line tool (kubectl) compatible with your Kubernetes cluster is installed. Refer to the kubectl installation instructions for more information.
Helm 3.6 CLI is installed. Refer to the Helm installation instructions for more information.
An accessible docker-registry (such as Docker Hub). The registry's URL and credentials are consumed by the applications via a pre-created secret.
Storage: 8Gi. Set a default storage class for the Kubernetes cluster so that the pods have persistent storage. See the Kubernetes documentation for more information.
RAM: A minimum of 8Gi is required for running all the initial MLRun components. The amount of RAM required for running MLRun jobs depends on the job's requirements.
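To verify that a default storage class is set (and to mark one as default if it is not), you can use a sketch like the following; `standard` is a placeholder for whatever storage class exists in your cluster:

```shell
# List storage classes; the default one is marked "(default)"
kubectl get storageclass

# Mark a class (here the hypothetical "standard") as the cluster default
kubectl patch storageclass standard \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```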
Note
The MLRun Community Edition resources are initially configured with the default cluster/namespace resource limits. You can modify these resources externally if needed.
Community Edition flavors#
The MLRun CE (Community Edition) includes the following components:
MLRun - mlrun/mlrun
MLRun API
MLRun UI
MLRun DB (MySQL)
Nuclio - nuclio/nuclio
Jupyter - jupyter/notebook (+MLRun integrated)
MPI Operator - kubeflow/mpi-operator
Minio - minio/minio
Spark Operator - GoogleCloudPlatform/spark-on-k8s-operator
Pipelines - kubeflow/pipelines
Prometheus stack - prometheus-community/helm-charts
Prometheus
Grafana
Installing the chart#
Note
These instructions use mlrun as the namespace (-n parameter). You can choose a different namespace in your Kubernetes cluster.
Create a namespace for the deployed components:
kubectl create namespace mlrun
Add the Community Edition helm chart repo:
helm repo add mlrun-ce https://mlrun.github.io/ce
Run the following command to ensure that the repo is installed and available:
helm repo list
It should output something like:
NAME      URL
mlrun-ce  https://mlrun.github.io/ce
Update the repo to make sure you're getting the latest chart:
helm repo update
Create a secret with your docker-registry credentials, named registry-credentials:
kubectl --namespace mlrun create secret docker-registry registry-credentials \
--docker-server <your-registry-server> \
--docker-username <your-username> \
--docker-password <your-password> \
--docker-email <your-email>
Note: If using Docker Hub, the registry server is https://registry.hub.docker.com/. Refer to the Docker ID documentation for creating a user with login credentials to configure in the secret.
Where:
<your-registry-server> is your private Docker registry FQDN (https://index.docker.io/v1/ for Docker Hub).
<your-username> is your Docker username.
<your-password> is your Docker password.
<your-email> is your Docker email.
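After creating the secret, you can confirm that it was stored correctly; the decoded output should contain your registry server and username:

```shell
# Confirm the secret exists in the mlrun namespace
kubectl --namespace mlrun get secret registry-credentials

# Decode and inspect the stored docker config (contains your credentials)
kubectl --namespace mlrun get secret registry-credentials \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
```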
Note
First-time MLRun users experience a relatively longer installation time because all required images are pulled locally for the first time (it takes an average of 10-15 minutes, mostly depending on your internet speed).
To install the chart with the release name mlrun-ce, use the following command. Note the reference to the pre-created registry-credentials secret in global.registry.secretName:
helm --namespace mlrun \
install mlrun-ce \
--wait \
--timeout 960s \
--set global.registry.url=<registry-url> \
--set global.registry.secretName=registry-credentials \
--set global.externalHostAddress=<host-machine-address> \
mlrun-ce/mlrun-ce
Where:
<registry-url> is the registry URL that can be authenticated by the registry-credentials secret (e.g., index.docker.io/<your-username> for Docker Hub).
<host-machine-address> is the IP address of the host machine (or $(minikube ip) if using minikube).
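For example, a minikube installation pulling images for the hypothetical Docker Hub user `jane` might look like this (both the registry path and the user are placeholders for your own setup):

```shell
helm --namespace mlrun \
    install mlrun-ce \
    --wait \
    --timeout 960s \
    --set global.registry.url=index.docker.io/jane \
    --set global.registry.secretName=registry-credentials \
    --set global.externalHostAddress=$(minikube ip) \
    mlrun-ce/mlrun-ce
```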
When the installation is complete, the helm command prints the URLs and ports of all the MLRun CE services.
Note: There is currently a known issue with installing the chart on Macs with Apple silicon (M1/M2): the pipelines MySQL database fails to start. The workaround for now is to opt out of pipelines by installing the chart with the --set pipelines.enabled=false flag.
Configuring the online feature store#
The MLRun Community Edition now supports the online feature store. To enable it, you need to first deploy a Redis service that is accessible to your MLRun CE cluster. To deploy a Redis service, refer to the Redis documentation.
When you have a Redis service deployed, you can configure MLRun CE to use it by adding the following helm value configuration to your helm install command:
--set mlrun.api.extraEnvKeyValue.MLRUN_REDIS__URL=<redis-address>
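For example, assuming a Redis service reachable inside the cluster at the hypothetical address `redis://redis.redis.svc.cluster.local:6379`, an in-place upgrade could look like:

```shell
helm --namespace mlrun \
    upgrade mlrun-ce mlrun-ce/mlrun-ce \
    --reuse-values \
    --set mlrun.api.extraEnvKeyValue.MLRUN_REDIS__URL=redis://redis.redis.svc.cluster.local:6379
```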
Usage#
Your applications are now available in your local browser:
Jupyter Notebook - http://<host-machine-address>:30040
Nuclio - http://<host-machine-address>:30050
MLRun UI - http://<host-machine-address>:30060
MLRun API (external) - http://<host-machine-address>:30070
MinIO API - http://<host-machine-address>:30080
MinIO UI - http://<host-machine-address>:30090
Pipeline UI - http://<host-machine-address>:30100
Grafana UI - http://<host-machine-address>:30110
Check state
You can check the current state of the installation via the command kubectl -n mlrun get pods, where the main information is in the Ready and Status columns. If all images have already been pulled locally, it typically takes about a minute for all services to start.
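Instead of polling manually, you can block until every pod reports ready (a sketch; adjust the timeout to your environment and connection speed):

```shell
# Wait up to 10 minutes for all pods in the namespace to become Ready
kubectl --namespace mlrun wait pods --all --for=condition=Ready --timeout=600s
```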
Note
You can change the ports by providing values to the helm install command. You can add and configure a Kubernetes ingress-controller for better security and control over external access.
Start working#
Open the Jupyter notebook on jupyter-notebook UI and run the code in the examples/mlrun_basics.ipynb notebook.
Important
Make sure to save your changes in the data folder within Jupyter Lab. The root folder and any other folders do not retain changes when you restart Jupyter Lab.
Configuring the remote environment#
You can use your code on a local machine while running your functions on a remote cluster. Refer to Set up your environment for more information.
Advanced chart configuration#
Configurable values are documented in the values.yaml file and in the values.yaml files of all sub-charts. Override them using the standard Helm methods (such as --set flags or a custom values file).
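To review every configurable value before overriding it, you can dump the chart's defaults to a local file and install from an edited copy:

```shell
# Dump the chart defaults (including sub-chart values) to a local file
helm show values mlrun-ce/mlrun-ce > my-values.yaml

# Edit my-values.yaml as needed, then install using the customized file
helm --namespace mlrun install mlrun-ce -f my-values.yaml mlrun-ce/mlrun-ce
```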
Opt out of components#
The chart installs many components. You may not need them all in your deployment depending on your use cases. To opt out of some of the components, use the following helm values:
...
--set pipelines.enabled=false \
--set kube-prometheus-stack.enabled=false \
--set sparkOperator.enabled=false \
...
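Put together, a minimal installation that skips these components might look like the following (registry and host-address placeholders as above):

```shell
helm --namespace mlrun \
    install mlrun-ce \
    --wait \
    --timeout 960s \
    --set global.registry.url=<registry-url> \
    --set global.registry.secretName=registry-credentials \
    --set global.externalHostAddress=<host-machine-address> \
    --set pipelines.enabled=false \
    --set kube-prometheus-stack.enabled=false \
    --set sparkOperator.enabled=false \
    mlrun-ce/mlrun-ce
```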
Installing on Docker Desktop#
If you are using Docker Desktop, you can install MLRun CE on your local machine. Docker Desktop is available for Mac and Windows. For download information, system requirements, and installation instructions, see:
Install Docker Desktop on Windows. Note that the WSL 2 backend was tested; Hyper-V was not.
Configuring Docker Desktop#
Docker Desktop includes a standalone Kubernetes server and client, as well as Docker CLI integration that runs on your machine. The
Kubernetes server runs locally within your Docker instance. To enable Kubernetes support and install a standalone instance of Kubernetes
running as a Docker container, go to Preferences > Kubernetes and then press Enable Kubernetes. Press Apply & Restart to
save the settings and then press Install to confirm. This instantiates the images that are required to run the Kubernetes server as
containers, and installs the /usr/local/bin/kubectl
command on your machine. For more information, see the Kubernetes documentation.
It's recommended to limit the amount of memory allocated to Kubernetes. If you're using Windows and WSL 2, you can configure global WSL options by placing a .wslconfig file in the root directory of your user folder: C:\Users\<yourUserName>\.wslconfig. Keep in mind that you might need to run wsl --shutdown to shut down the WSL 2 VM and then restart your WSL instance for these changes to take effect.
[wsl2]
memory=8GB # Limits VM memory in WSL 2 to 8 GB
To learn about the various Docker Desktop UI options and their usage, see the Docker Desktop documentation.
Storage resources#
When installing the MLRun Community Edition, several storage resources are created:
PVs via the default configured storage class: Hold the file systems of the stack's pods, including MLRun's MySQL database, Minio storage for artifacts and pipelines, and more. These are not deleted when the stack is uninstalled, which allows upgrading without losing data.
Container Images in the configured docker-registry: When building and deploying MLRun and Nuclio functions via the MLRun Community Edition, the function images are stored in the given configured docker registry. These images persist in the docker registry and are not deleted.
Uninstalling the chart#
The following command deletes the pods, deployments, config maps, services, and roles and role bindings associated with the chart and release.
helm --namespace mlrun uninstall mlrun-ce
Notes on dangling resources#
The created CRDs are not deleted by default and should be manually cleaned up.
The created PVs and PVCs are not deleted by default and should be manually cleaned up.
As stated above, the images in the docker registry are not deleted either and should be cleaned up manually.
If you installed the chart in its own namespace, it's also possible to delete the entire namespace to clean up all resources (apart from the docker registry images).
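A full cleanup after uninstalling might therefore look like the following sketch (the CRD name filters are assumptions; check what is actually installed in your cluster with `kubectl get crds` before deleting anything):

```shell
# Delete the namespace and everything in it (PVCs included)
kubectl delete namespace mlrun

# List CRDs left behind by the chart's operators before deleting them
kubectl get crds | grep -Ei 'nuclio|spark|mpi|kubeflow'
```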
Note on terminating pods and hanging resources#
This chart creates several persistent volume claims (PVCs) that provide persistence out of the box. Upon uninstallation, any hanging or terminating pods keep holding their PVCs and PVs, preventing their safe removal. Since pods stuck in the Terminating state are a common Kubernetes problem, remember to force-delete them and clean up the remaining PVs and PVCs.
Handling stuck-at-terminating pods:#
kubectl --namespace mlrun delete pod --force --grace-period=0 <pod-name>
Reclaim dangling persistency resources:#
WARNING
This will result in data loss!
# To list PVCs
$ kubectl --namespace mlrun get pvc
...
# To remove a PVC
$ kubectl --namespace mlrun delete pvc <pvc-name>
...
# To list PVs
$ kubectl --namespace mlrun get pv
...
# To remove a PV
$ kubectl --namespace mlrun delete pv <pv-name>
...
Upgrading the chart#
To upgrade to the latest version of the chart, first make sure you have the latest helm repo
helm repo update
Then try to upgrade the chart:
helm upgrade --install --reuse-values mlrun-ce --namespace mlrun mlrun-ce/mlrun-ce
If it fails, you should reinstall the chart:
# remove current mlrun-ce
mkdir ~/tmp
helm get values -n mlrun mlrun-ce > ~/tmp/mlrun-ce-values.yaml
helm -n mlrun uninstall mlrun-ce
# reinstall mlrun-ce, reusing the saved values
helm install -n mlrun --values ~/tmp/mlrun-ce-values.yaml mlrun-ce mlrun-ce/mlrun-ce --devel
Note
If your values pin fixed MLRun service versions (e.g., mlrun:1.3.0), you might want to remove them from the values file to allow newer chart defaults to take effect.
Storing artifacts in AWS S3 storage#
MLRun CE uses a Minio service as shared storage for artifacts, and accesses it using the S3 protocol. This means that any path that begins with s3:// is automatically directed by MLRun to the Minio service. The default artifact path is also configured as s3://mlrun/projects/{{run.project}}/artifacts, which is a path on the mlrun bucket in the Minio service.
To store artifacts in AWS S3 buckets instead of the local Minio service, these configurations need to be overridden to make s3:// paths lead to AWS buckets instead.
Note
These configurations are only required for AWS S3 storage, due to the usage of the same S3 protocol in Minio. For other storage options (such as GCS, Azure blobs etc.) only the artifact path needs to be modified, and credentials need to be provided.
Setting up S3 credentials and endpoint#
Set up the following project secrets (refer to Data stores and Project secrets) for any project used:
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY — S3 credentials
S3_ENDPOINT_URL — the AWS S3 endpoint to use, depending on the region. For example: S3_ENDPOINT_URL = https://s3.us-east-2.amazonaws.com/
Disabling auto-mount#
Before running any MLRun job that writes to S3 bucket, make sure auto-mount is disabled for it, since by default auto-mount adds S3 configurations that point at the Minio service (refer to Function storage for more details on auto-mount). This can be done in one of following ways:
Set the client-side MLRun configuration to disable auto-mount. This disables auto-mount for any function run after this command:
from mlrun.config import config as mlconf
mlconf.storage.auto_mount_type = "none"
If running MLRun from an IDE, the configuration can be overridden using an environment variable. Set the following environment variable for your IDE environment:
MLRUN_STORAGE__AUTO_MOUNT_TYPE = "none"
Disable auto-mount for a specific function. This must be done before running the function for the first time:
function.spec.disable_auto_mount = True
Changing the artifact path#
The artifact path needs to be modified since the bucket name is set to mlrun by default. It is recommended to keep the same path structure as the default while modifying the bucket name. For example:
s3://<bucket name>/projects/{{run.project}}/artifacts
The artifact path can be set in several ways, refer to Artifact path for more details.
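For example, when configuring a client environment through environment variables, the default artifact path can be overridden like this (assuming MLRUN_ARTIFACT_PATH maps to MLRun's artifact_path configuration; the bucket name is a placeholder):

```shell
# Point the default artifact path at your own S3 bucket
export MLRUN_ARTIFACT_PATH="s3://<bucket name>/projects/{{run.project}}/artifacts"
```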