Install MLRun locally using Docker#

You can install and use MLRun and Nuclio locally on your computer. This does not include all the services and elastic scaling capabilities that you get with the Kubernetes-based deployment, but it is much simpler to start with.

Note

The Docker installation supports only the local, Nuclio, and serving runtimes, and local pipelines.

Prerequisites#

  • Memory: 8GB

  • Storage: 7GB

Overview#

Use docker compose to install MLRun. It deploys the MLRun service, the MLRun UI, the Nuclio serverless engine, and optionally a Jupyter server. The MLRun service, MLRun UI, Nuclio, and Jupyter do not have default resource limits, so they run with the environment's default limits; these can be modified.

There are two installation options:

  • Use MLRun with your own client (below)

  • Use MLRun with MLRun Jupyter image (below)

In both cases you need to set the SHARED_DIR environment variable to point to a host path for storing MLRun artifacts and the DB, for example export SHARED_DIR=~/mlrun-data (or set SHARED_DIR=c:\mlrun-data on Windows). Make sure the directory exists.

You also need to set the HOST_IP variable to your computer's IP address (required for the Nuclio dashboard). You can select a specific MLRun version with the TAG variable and a Nuclio version with the NUCLIO_TAG variable.
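For example, to pin specific versions before bringing the stack up (the values below are the compose file's defaults, shown only for illustration):

export TAG=1.3.0
export NUCLIO_TAG=stable-amd64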

Note

Support for running as a non-root user was added in v1.0.5, and the underlying exposed port was changed accordingly. If you want to use an earlier MLRun version, change the mlrun-ui port mapping from 8090 back to 80.

If you are running more than one instance of MLRun, change the exposed ports to avoid collisions.


Use MLRun with your own client#

The following commands install MLRun and Nuclio for work with your own IDE or notebook.

[Download here] the compose.yaml file, save it to the working directory, and type:

The compose.yaml file:
services:
  init_nuclio:
    image: alpine:3.16
    command:
      - "/bin/sh"
      - "-c"
      - |
        mkdir -p /etc/nuclio/config/platform; \
        cat << EOF | tee /etc/nuclio/config/platform/platform.yaml
        runtime:
          common:
            env:
              MLRUN_DBPATH: http://${HOST_IP:?err}:8080
        local:
          defaultFunctionContainerNetworkName: mlrun
          defaultFunctionRestartPolicy:
            name: always
            maxRetryCount: 0
          defaultFunctionVolumes:
            - volume:
                name: mlrun-stuff
                hostPath:
                  path: ${SHARED_DIR:?err}
              volumeMount:
                name: mlrun-stuff
                mountPath: /home/jovyan/data/
        logger:
          sinks:
            myStdoutLoggerSink:
              kind: stdout
          system:
            - level: debug
              sink: myStdoutLoggerSink
          functions:
            - level: debug
              sink: myStdoutLoggerSink
        EOF
    volumes:
      - nuclio-platform-config:/etc/nuclio/config

  mlrun-api:
    image: "mlrun/mlrun-api:${TAG:-1.3.0}"
    ports:
      - "8080:8080"
    environment:
      MLRUN_ARTIFACT_PATH: "${SHARED_DIR}/{{project}}"
      # using local storage, meaning files / artifacts are stored locally, so we want to allow access to them
      MLRUN_HTTPDB__REAL_PATH: /data
      MLRUN_HTTPDB__DATA_VOLUME: "${SHARED_DIR}"
      MLRUN_LOG_LEVEL: DEBUG
      MLRUN_NUCLIO_DASHBOARD_URL: http://nuclio:8070
      MLRUN_HTTPDB__DSN: "sqlite:////data/mlrun.db?check_same_thread=false"
      MLRUN_UI__URL: http://localhost:8060
      # not running on k8s meaning no need to store secrets
      MLRUN_SECRET_STORES__KUBERNETES__AUTO_ADD_PROJECT_SECRETS: "false"
      # let mlrun control nuclio resources
      MLRUN_HTTPDB__PROJECTS__FOLLOWERS: "nuclio"
    volumes:
      - "${SHARED_DIR:?err}:/data"
    networks:
      - mlrun

  mlrun-ui:
    image: "mlrun/mlrun-ui:${TAG:-1.3.0}"
    ports:
      - "8060:8090"
    environment:
      MLRUN_API_PROXY_URL: http://mlrun-api:8080
      MLRUN_NUCLIO_MODE: enable
      MLRUN_NUCLIO_API_URL: http://nuclio:8070
      MLRUN_NUCLIO_UI_URL: http://localhost:8070
    networks:
      - mlrun

  nuclio:
    image: "quay.io/nuclio/dashboard:${NUCLIO_TAG:-stable-amd64}"
    ports:
      - "8070:8070"
    environment:
      NUCLIO_DASHBOARD_EXTERNAL_IP_ADDRESSES: "${HOST_IP:?err}"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - nuclio-platform-config:/etc/nuclio/config
    depends_on:
      - init_nuclio
    networks:
      - mlrun

volumes:
  nuclio-platform-config: {}

networks:
  mlrun:
    name: mlrun
Linux/macOS:

export HOST_IP=<your host IP address>
export SHARED_DIR=~/mlrun-data
mkdir -p $SHARED_DIR
docker-compose -f compose.yaml up -d

Your HOST_IP address can be found using the ip addr or ifconfig commands (do not use localhost or 127.0.0.1). It is recommended to select an address that does not change dynamically (for example the IP of the bridge interface).
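For example, a minimal sketch that captures the Docker bridge address on Linux (docker0 is Docker's default bridge name and an assumption here; adjust to your system):

# capture the IPv4 address of the docker0 bridge interface
export HOST_IP=$(ip -4 addr show docker0 | awk '/inet /{print $2}' | cut -d/ -f1)
echo $HOST_IP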

Windows (CMD):

set HOST_IP=<your host IP address>
set SHARED_DIR=c:\mlrun-data
mkdir %SHARED_DIR%
docker-compose -f compose.yaml up -d

Your HOST_IP address can be found using the ipconfig shell command (do not use localhost or 127.0.0.1). It is recommended to select an address that does not change dynamically (for example the IP of the vEthernet interface).

Windows (PowerShell):

$Env:HOST_IP="<your host IP address>"
$Env:SHARED_DIR="~/mlrun-data"
mkdir $Env:SHARED_DIR
docker-compose -f compose.yaml up -d

Your HOST_IP address can be found using the Get-NetIPConfiguration cmdlet (do not use localhost or 127.0.0.1). It is recommended to select an address that does not change dynamically (for example the IP of the vEthernet interface).

This creates 3 services:

  • MLRun API (at http://localhost:8080)

  • MLRun UI (at http://localhost:8060)

  • Nuclio dashboard (at http://localhost:8070)
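To verify that the stack is up, you can list the services and ping the API. The /api/healthz path below is an assumption based on the MLRun API's health endpoint; adjust it if your version differs:

# list the services started from the compose file
docker-compose -f compose.yaml ps
# optional: check that the MLRun API responds (health path is an assumption)
curl http://localhost:8080/api/healthz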

After installing the MLRun service, set your client environment to work with it, by setting the MLRUN_DBPATH environment variable to http://localhost:8080 or by using .env files (see setting the client environment).
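For example, a minimal client setup from a shell (the pip version pin matches the compose file's default TAG and is shown only as an example):

# install the MLRun client, ideally matching the service version
pip install mlrun==1.3.0
# point the MLRun SDK at the local service
export MLRUN_DBPATH=http://localhost:8080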

Use MLRun with MLRun Jupyter image#

For the quickest experience with MLRun, you can deploy it with a pre-integrated Jupyter server loaded with various ready-to-use MLRun examples.

[Download here] the compose.with-jupyter.yaml file, save it to the working directory, and type:

services:
  init_nuclio:
    image: alpine:3.16
    command:
      - "/bin/sh"
      - "-c"
      - |
        mkdir -p /etc/nuclio/config/platform; \
        cat << EOF | tee /etc/nuclio/config/platform/platform.yaml
        runtime:
          common:
            env:
              MLRUN_DBPATH: http://${HOST_IP:?err}:8080
        local:
          defaultFunctionContainerNetworkName: mlrun
          defaultFunctionRestartPolicy:
            name: always
            maxRetryCount: 0
          defaultFunctionVolumes:
            - volume:
                name: mlrun-stuff
                hostPath:
                  path: ${SHARED_DIR:?err}
              volumeMount:
                name: mlrun-stuff
                mountPath: /home/jovyan/data/
        logger:
          sinks:
            myStdoutLoggerSink:
              kind: stdout
          system:
            - level: debug
              sink: myStdoutLoggerSink
          functions:
            - level: debug
              sink: myStdoutLoggerSink
        EOF
    volumes:
      - nuclio-platform-config:/etc/nuclio/config

  jupyter:
    image: "mlrun/jupyter:${TAG:-1.3.0}"
    ports:
      - "8080:8080"
      - "8888:8888"
    environment:
      MLRUN_ARTIFACT_PATH: "/home/jovyan/data/{{project}}"
      MLRUN_LOG_LEVEL: DEBUG
      MLRUN_NUCLIO_DASHBOARD_URL: http://nuclio:8070
      MLRUN_HTTPDB__DSN: "sqlite:////home/jovyan/data/mlrun.db?check_same_thread=false"
      MLRUN_UI__URL: http://localhost:8060
      # using local storage, meaning files / artifacts are stored locally, so we want to allow access to them
      MLRUN_HTTPDB__REAL_PATH: "/home/jovyan/data"
      # not running on k8s meaning no need to store secrets
      MLRUN_SECRET_STORES__KUBERNETES__AUTO_ADD_PROJECT_SECRETS: "false"
      # let mlrun control nuclio resources
      MLRUN_HTTPDB__PROJECTS__FOLLOWERS: "nuclio"
    volumes:
      - "${SHARED_DIR:?err}:/home/jovyan/data"
    networks:
      - mlrun

  mlrun-ui:
    image: "mlrun/mlrun-ui:${TAG:-1.3.0}"
    ports:
      - "8060:8090"
    environment:
      MLRUN_API_PROXY_URL: http://jupyter:8080
      MLRUN_NUCLIO_MODE: enable
      MLRUN_NUCLIO_API_URL: http://nuclio:8070
      MLRUN_NUCLIO_UI_URL: http://localhost:8070
    networks:
      - mlrun

  nuclio:
    image: "quay.io/nuclio/dashboard:${NUCLIO_TAG:-stable-amd64}"
    ports:
      - "8070:8070"
    environment:
      NUCLIO_DASHBOARD_EXTERNAL_IP_ADDRESSES: "${HOST_IP:?err}"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - nuclio-platform-config:/etc/nuclio/config
    depends_on:
      - init_nuclio
    networks:
      - mlrun

volumes:
  nuclio-platform-config: {}

networks:
  mlrun:
    name: mlrun
Linux/macOS:

export HOST_IP=<your host IP address>
export SHARED_DIR=~/mlrun-data
mkdir -p $SHARED_DIR
docker-compose -f compose.with-jupyter.yaml up -d

Your HOST_IP address can be found using the ip addr or ifconfig commands (do not use localhost or 127.0.0.1). It is recommended to select an address that does not change dynamically (for example the IP of the bridge interface).

Windows (CMD):

set HOST_IP=<your host IP address>
set SHARED_DIR=c:\mlrun-data
mkdir %SHARED_DIR%
docker-compose -f compose.with-jupyter.yaml up -d

Your HOST_IP address can be found using the ipconfig shell command (do not use localhost or 127.0.0.1). It is recommended to select an address that does not change dynamically (for example the IP of the vEthernet interface).

Windows (PowerShell):

$Env:HOST_IP="<your host IP address>"
$Env:SHARED_DIR="~/mlrun-data"
mkdir $Env:SHARED_DIR
docker-compose -f compose.with-jupyter.yaml up -d

Your HOST_IP address can be found using the Get-NetIPConfiguration cmdlet (do not use localhost or 127.0.0.1). It is recommended to select an address that does not change dynamically (for example the IP of the vEthernet interface).

This creates 4 services:

  • Jupyter server (at http://localhost:8888)

  • MLRun API, running inside the Jupyter container (at http://localhost:8080)

  • MLRun UI (at http://localhost:8060)

  • Nuclio dashboard (at http://localhost:8070)

After the installation, access the Jupyter server (at http://localhost:8888) and run through the quick-start tutorial and demos. You can see the projects, tasks, and artifacts in the MLRun UI (at http://localhost:8060).

The Jupyter environment is pre-configured to work with the local MLRun and Nuclio services. You can switch to a remote or managed MLRun cluster by editing the mlrun.env file in the Jupyter files tree.
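For example, a hypothetical mlrun.env entry pointing at a remote cluster (the URL is a placeholder):

# mlrun.env -- replace the URL with your remote MLRun API endpoint
MLRUN_DBPATH=https://mlrun-api.example.com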

The artifacts and DB are stored under /home/jovyan/data inside the container (the data directory in the Jupyter file tree), which is mapped to the host's SHARED_DIR.