Install MLRun on AWS#
For AWS users, the easiest way to install MLRun is to use a native AWS deployment. This option deploys MLRun on an Amazon EKS cluster using a CloudFormation stack.
Note
These instructions install the community edition, which currently includes MLRun v1.4.0. See the release documentation.
Prerequisites#
An AWS account with permissions that include the ability to:
Run a CloudFormation stack
Create an EKS cluster
Create EC2 instances
Create VPC
Create S3 buckets
Deploy and pull images from ECR
For the full set of required permissions, download the IAM policy or copy the IAM policy below:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "BasicServices", "Effect": "Allow", "Action": [ "autoscaling:*", "cloudwatch:*", "elasticloadbalancing:*", "sns:*", "ec2:*", "s3:*", "s3-object-lambda:*", "eks:*", "elasticfilesystem:*", "cloudformation:*", "acm:*", "route53:*" ], "Resource": "*" }, { "Sid": "ServiceLinkedRoles", "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "*", "Condition": { "StringEquals": { "iam:AWSServiceName": [ "autoscaling.amazonaws.com", "ec2scheduled.amazonaws.com", "elasticloadbalancing.amazonaws.com", "spot.amazonaws.com", "spotfleet.amazonaws.com", "transitgateway.amazonaws.com" ] } } }, { "Sid": "IAMPermissions", "Effect": "Allow", "Action": [ "iam:AddRoleToInstanceProfile", "iam:AttachRolePolicy", "iam:TagOpenIDConnectProvider", "iam:CreateInstanceProfile", "iam:CreateOpenIDConnectProvider", "iam:CreateRole", "iam:CreateServiceLinkedRole", "iam:DeleteInstanceProfile", "iam:DeleteOpenIDConnectProvider", "iam:DeleteRole", "iam:DeleteRolePolicy", "iam:DetachRolePolicy", "iam:GenerateServiceLastAccessedDetails", "iam:GetAccessKeyLastUsed", "iam:GetAccountPasswordPolicy", "iam:GetAccountSummary", "iam:GetGroup", "iam:GetInstanceProfile", "iam:GetLoginProfile", "iam:GetOpenIDConnectProvider", "iam:GetPolicy", "iam:GetPolicyVersion", "iam:GetRole", "iam:GetRolePolicy", "iam:GetServiceLastAccessedDetails", "iam:GetUser", "iam:ListAccessKeys", "iam:ListAccountAliases", "iam:ListAttachedGroupPolicies", "iam:ListAttachedRolePolicies", "iam:ListAttachedUserPolicies", "iam:ListGroupPolicies", "iam:ListGroups", "iam:ListGroupsForUser", "iam:ListInstanceProfilesForRole", "iam:ListMFADevices", "iam:ListOpenIDConnectProviders", "iam:ListPolicies", "iam:ListPoliciesGrantingServiceAccess", "iam:ListRolePolicies", "iam:ListRoles", "iam:ListRoleTags", "iam:ListSAMLProviders", "iam:ListSigningCertificates", "iam:ListUserPolicies", "iam:ListUsers", "iam:ListUserTags", "iam:PassRole", "iam:PutRolePolicy", 
"iam:RemoveRoleFromInstanceProfile", "kms:CreateGrant", "kms:CreateKey", "kms:Decrypt", "kms:DescribeKey", "kms:Encrypt", "kms:GenerateDataKeyWithoutPlaintext", "kms:GetKeyPolicy", "kms:GetKeyRotationStatus", "kms:ListResourceTags", "kms:PutKeyPolicy", "kms:ScheduleKeyDeletion", "kms:TagResource" ], "Resource": "*" }, { "Sid": "AllowLanbda", "Effect": "Allow", "Action": [ "lambda:CreateAlias", "lambda:CreateCodeSigningConfig", "lambda:CreateEventSourceMapping", "lambda:CreateFunction", "lambda:CreateFunctionUrlConfig", "lambda:Delete*", "lambda:Get*", "lambda:InvokeAsync", "lambda:InvokeFunction", "lambda:InvokeFunctionUrl", "lambda:List*", "lambda:PublishLayerVersion", "lambda:PublishVersion", "lambda:PutFunctionCodeSigningConfig", "lambda:PutFunctionConcurrency", "lambda:PutFunctionEventInvokeConfig", "lambda:PutProvisionedConcurrencyConfig", "lambda:TagResource", "lambda:UntagResource", "lambda:UpdateAlias", "lambda:UpdateCodeSigningConfig", "lambda:UpdateEventSourceMapping", "lambda:UpdateFunctionCode", "lambda:UpdateFunctionCodeSigningConfig", "lambda:UpdateFunctionConfiguration", "lambda:UpdateFunctionEventInvokeConfig", "lambda:UpdateFunctionUrlConfig" ], "Resource": "*" }, { "Sid": "CertificateService", "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "arn:aws:iam::*:role/aws-service-role/acm.amazonaws.com/AWSServiceRoleForCertificateManager*", "Condition": { "StringEquals": { "iam:AWSServiceName": "acm.amazonaws.com" } } }, { "Sid": "DeleteRole", "Effect": "Allow", "Action": [ "iam:DeleteServiceLinkedRole", "iam:GetServiceLinkedRoleDeletionStatus", "iam:GetRole" ], "Resource": "arn:aws:iam::*:role/aws-service-role/acm.amazonaws.com/AWSServiceRoleForCertificateManager*" }, { "Sid": "SSM", "Effect": "Allow", "Action": [ "logs:*", "ssm:AddTagsToResource", "ssm:GetParameter", "ssm:DeleteParameter", "ssm:PutParameter", "cloudtrail:GetTrail", "cloudtrail:ListTrails" ], "Resource": "*" } ] }
For more information, see how to create a new AWS account and policies and permissions in IAM.
A Route 53 domain configured in the same AWS account, with the full domain name specified in the Route 53 hosted DNS domain field (see Configuration settings below). External domain registration is currently not supported. For more information, see What is Amazon Route 53?.
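Before attaching the IAM policy shown above, it can help to review what it grants. The following is a minimal sketch, not part of the MLRun installer: it assumes the policy JSON has already been loaded into a Python dictionary (for example with `json.load`), and the function names are illustrative only.

```python
from collections import defaultdict


def summarize_policy(policy):
    """Map each AWS service prefix to the sorted actions granted by Allow statements."""
    actions_by_service = defaultdict(set)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a policy may hold a single statement object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        for action in actions:
            service, _, name = action.partition(":")
            actions_by_service[service].add(name)
    return {service: sorted(names) for service, names in actions_by_service.items()}


def wildcard_services(policy):
    """Return the services for which the policy allows every action (service:*)."""
    summary = summarize_policy(policy)
    return sorted(service for service, names in summary.items() if "*" in names)
```

Calling `wildcard_services` on the policy above lists the services that receive full access (for example `ec2` and `s3`), which is useful when the policy has to be reviewed by a security team.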
Notes
The MLRun software is free of charge; however, there is a cost for the AWS infrastructure services such as EKS, EC2, S3, and ECR. The actual pricing depends on many factors, including the region, the number of EC2 instances, the amount of storage consumed, and the data transfer costs. Other factors include reserved instance configuration, savings plans, and AWS credits associated with your account. It is recommended to use the AWS Pricing Calculator to estimate the expected cost, and AWS Cost Explorer to monitor and manage the cost and to set up alerts.
Post deployment expectations#
The key components deployed on your EKS cluster are:
MLRun server (including the feature store and the MLRun graph)
MLRun UI
Kubeflow Pipelines
Real time serverless framework (Nuclio)
Spark operator
JupyterLab
Grafana
Configuration settings#
Make sure you are logged in to the correct AWS account.
Click the button below to deploy MLRun.
After you click the button, the browser directs you to the CloudFormation stack page in your AWS account, or redirects you to the AWS login page if you are not currently logged in.
Note
You must fill in fields marked as mandatory (m) for the configuration to complete. Fields marked as optional (o) can be left blank.
Stack name (m) — the name of the stack. You cannot continue if left blank. This field becomes the logical id of the stack. Stack name can include letters (A-Z and a-z), numbers (0-9), and dashes (-). For example: “John-1”.
Parameters
EKS cluster name (m) — the name of the EKS cluster to create. The EKS cluster is used to run the MLRun services. For example: “John-1”.
VPC network Configuration
Number of Availability Zones (m) — The default is set to 3. Choose from the dropdown to change the number. The minimum is 2.
Availability zones (m) — select the zones from the dropdown. The list is based on the region of the instance. The number of zones selected must match the value of Number of Availability Zones.
Allowed external access CIDR (m) — range of IP addresses allowed to access the cluster. Addresses that are not in this range are not able to access the cluster. Contact your IT manager/network administrator if you are not sure what to fill in here.
Amazon EKS configuration
Additional EKS admin ARN (IAM user) (o) — add an additional admin user to the instance. Users can be added after the stack has been created. For more information see Create a kubeconfig for Amazon EKS.
Instance type (m) — select from the dropdown list. The default is m5.4xlarge. For size considerations see Amazon EC2 Instance Types.
Maximum Number of Nodes (m) — maximum number of nodes in the cluster. The number of nodes combined with the Instance type determines the AWS infrastructure cost.
Amazon EC2 configuration
SSH key name (o) — To access the EC2 instance via SSH, enter an existing key. If left empty, it is possible to access the EC2 instance using the AWS Systems Manager Session Manager. For more information about SSH Keys see Amazon EC2 key pairs and Linux instances.
Provision bastion host (m) — create a bastion host for SSH access to the Kubernetes nodes. The default is enabled. This allows SSH access to your EKS EC2 instances through a public IP.
Iguazio MLRun configuration
Route 53 hosted DNS domain (m) — Enter the name of your registered Route 53 domain. Only Route 53 domains are accepted.
The URL of your REDIS database (o) — This is only required if you’re using Redis with the online feature store. See how to configure the online feature store for more details.
Other parameters
MLRun CE Helm Chart version (m) — the MLRun Community Edition version to install. Leave the default value for the latest CE release.
Capabilities
Check all the capabilities boxes (m).
Press Create Stack to continue the deployment. The stack creates a VPC with an EKS cluster and deploys all the services on top of it.
Note
It could take up to 2 hours for your stack to be created.
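Because a failed stack can take a long time to roll back, it may be worth checking a few values before pressing Create Stack. The sketch below is an illustration only and is not part of the CloudFormation template: it checks the stack name rule, the Availability Zone count, and the CIDR format described above. The function name and argument names are hypothetical. The leading-letter and 128-character limits come from CloudFormation's general stack naming rules rather than from this page.

```python
import ipaddress
import re

# Letters, digits, and dashes, per the Stack name field description.
# CloudFormation additionally requires a leading letter and at most 128 characters.
STACK_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")


def validate_parameters(stack_name, zone_count, zones, access_cidr):
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    if not STACK_NAME_RE.fullmatch(stack_name):
        problems.append(f"invalid stack name: {stack_name!r}")
    if zone_count < 2:
        problems.append("at least 2 Availability Zones are required")
    if len(set(zones)) != zone_count:
        problems.append(f"expected {zone_count} distinct zones, got {len(set(zones))}")
    try:
        # strict=False also accepts a host address with a prefix, e.g. 203.0.113.7/24
        ipaddress.ip_network(access_cidr, strict=False)
    except ValueError:
        problems.append(f"invalid CIDR block: {access_cidr!r}")
    return problems
```

For example, `validate_parameters("John-1", 3, ["us-east-1a", "us-east-1b", "us-east-1c"], "203.0.113.0/24")` returns an empty list. The check only confirms that the CIDR is well formed; choosing the right range is still a decision for your network administrator.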
Getting started#
When the stack is complete, go to the output tab for the stack you created. There are links for the MLRun UI, Jupyter, and the Kubeconfig command.
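The same outputs can also be read programmatically. The following is a minimal sketch, assuming the boto3 library and configured AWS credentials. It relies only on the CloudFormation `describe_stacks` call; the helper name is hypothetical, and the output key names depend on the template, so none are assumed here.

```python
def stack_outputs(cfn_client, stack_name):
    """Return the stack's outputs as an {OutputKey: OutputValue} dictionary."""
    response = cfn_client.describe_stacks(StackName=stack_name)
    stack = response["Stacks"][0]
    # "Outputs" is absent until the stack has produced any outputs.
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


# Usage (requires boto3 and AWS credentials; "John-1" is the example stack name):
#   import boto3
#   for key, value in sorted(stack_outputs(boto3.client("cloudformation"), "John-1").items()):
#       print(f"{key}: {value}")
```

Passing the client in as an argument keeps the helper easy to test with a stub and lets you choose the region and profile when creating the client.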
It’s recommended to go through the quick-start and the other tutorials in the documentation. These tutorials and demos are built in and located in the root folder of Jupyter.
Storage resources#
When installing the MLRun Community Edition via CloudFormation, several storage resources are created:
PVs via AWS storage provider: Used to hold the file system of the stack’s pods, including the MySQL database of MLRun. These are deleted when the stack is uninstalled.
S3 Bucket: A bucket named <EKS cluster name>-<Random string> is created in the AWS account that installs the stack (where <EKS cluster name> is the name of the EKS cluster you chose and <Random string> is part of the CloudFormation stack ID). You can see the bucket name in the output tab of the stack. The bucket is used for MLRun’s artifact storage, and is not deleted when uninstalling the stack. The user must empty the bucket and delete it.
Container Images in ECR: When building and deploying MLRun and Nuclio functions via the MLRun Community Edition, the function images are stored in an ECR belonging to the AWS account that installs the stack. These images persist in the account’s ECR and are not deleted either.
Configuring the online feature store#
The feature store can store data on a fast key-value database table for quick serving. This online feature store capability requires an external key-value database.
Currently the MLRun feature store supports the following options:
Redis
Iguazio key-value database
To use Redis, you must install Redis separately and provide the Redis URL when configuring the AWS CloudFormation stack. Refer to the Redis getting-started page for information about Redis installation.
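A malformed Redis URL is only discovered after the stack has been deployed, so a quick format check beforehand can save time. The sketch below assumes the conventional `redis://` and `rediss://` (TLS) URL schemes and the default Redis port 6379. The function name is hypothetical, and the check does not connect to the database.

```python
from urllib.parse import urlparse


def check_redis_url(url):
    """Return (host, port) if the URL looks like a Redis URL, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):  # rediss:// denotes a TLS connection
        raise ValueError(f"unexpected scheme {parsed.scheme!r}; expected redis:// or rediss://")
    if not parsed.hostname:
        raise ValueError("the URL has no host")
    # Fall back to the default Redis port when none is given.
    return parsed.hostname, parsed.port or 6379
```

This only validates the shape of the value entered in the stack parameter; whether the cluster can actually reach the Redis endpoint depends on your VPC and security group configuration.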
Streaming support#
For online serving, it is often convenient to use MLRun graph with a streaming engine. This allows managing queues between steps and functions. MLRun supports Kafka streams as well as Iguazio V3IO streams. See the examples on how to configure the MLRun serving graph with Kafka and V3IO.
Cleanup#
To free up the resources used by MLRun:
Delete the stack. See instructions for deleting a stack on the AWS CloudFormation console for more details.
Delete the S3 bucket that begins with the same name as your EKS cluster. The S3 bucket name is available in the CloudFormation stack output tab.
Delete any remaining images in ECR.
You may also need to check any external storage that you used.
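The bucket must be emptied before it can be deleted. The following is a minimal sketch of that step, assuming boto3 and an S3 client created with `boto3.client("s3")`. It uses the `list_objects_v2`, `delete_objects`, and `delete_bucket` calls; the helper name is hypothetical. It does not handle buckets with versioning enabled, where object versions and delete markers must also be removed.

```python
def empty_and_delete_bucket(s3_client, bucket_name):
    """Delete every object in the bucket, then the bucket. Returns the number of objects removed."""
    removed = 0
    kwargs = {"Bucket": bucket_name}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            # delete_objects accepts up to 1000 keys, the same limit as one list page
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            removed += len(keys)
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    s3_client.delete_bucket(Bucket=bucket_name)
    return removed
```

This operation is irreversible, so take the bucket name from the CloudFormation stack output tab rather than guessing it from the cluster name prefix.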