Real-time serving

MLRun can produce managed real-time serverless pipelines from various tasks, including MLRun models or standard model files. The pipelines run on Nuclio, a real-time serverless engine that can be deployed anywhere and handles data-intensive, I/O-intensive, and compute-intensive workloads.

Serving a model begins with creating a serving function, which can run one or more models. To load and call a model, you provide a serving class. MLRun has built-in support for commonly used frameworks, so it is often convenient to start with the built-in classes. You can also create your own custom model serving class, and an example notebook shows how to build and run one.
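The relationship between a serving function and its serving classes can be sketched in plain Python. This is a framework-free illustration, not MLRun's implementation: `SketchModelServer`, `SketchServingFunction`, and `MeanThresholdModel` are hypothetical names invented for this example, and the `load`/`predict` split only mirrors the general shape of a model serving class. In MLRun itself, a custom class would typically extend `mlrun.serving.V2ModelServer` rather than stand alone.

```python
from typing import Any, Dict, List


class MeanThresholdModel:
    """Toy stand-in for a trained model: flags rows whose mean exceeds a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, rows: List[List[float]]) -> List[int]:
        return [1 if sum(row) / len(row) > self.threshold else 0 for row in rows]


class SketchModelServer:
    """Hypothetical serving class: one step to load the model, one to call it."""

    def __init__(self, name: str, threshold: float = 0.5) -> None:
        self.name = name
        self.threshold = threshold
        self.model = None  # populated by load()

    def load(self) -> None:
        # A real serving class would fetch and deserialize a model file here.
        self.model = MeanThresholdModel(self.threshold)

    def predict(self, body: Dict[str, Any]) -> List[int]:
        if self.model is None:
            raise RuntimeError("load() must be called before predict()")
        return self.model.predict(body["inputs"])


class SketchServingFunction:
    """Hypothetical serving function: hosts one or more named models."""

    def __init__(self) -> None:
        self.models: Dict[str, SketchModelServer] = {}

    def add_model(self, server: SketchModelServer) -> None:
        # Load eagerly so the first request does not pay the loading cost.
        server.load()
        self.models[server.name] = server

    def infer(self, model_name: str, body: Dict[str, Any]) -> Dict[str, Any]:
        # Route the request to the named model and wrap its predictions.
        outputs = self.models[model_name].predict(body)
        return {"model_name": model_name, "outputs": outputs}


if __name__ == "__main__":
    fn = SketchServingFunction()
    fn.add_model(SketchModelServer("strict", threshold=0.8))
    fn.add_model(SketchModelServer("lenient", threshold=0.2))

    request_body = {"inputs": [[0.1, 0.2], [0.9, 1.0], [0.4, 0.6]]}
    print(fn.infer("strict", request_body))
    print(fn.infer("lenient", request_body))
```

Keeping loading separate from prediction means the model is loaded once and then reused across requests, which is why a single function can host several models side by side.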

MLRun serving supports advanced real-time data processing and model serving pipelines. For more details and examples, see the MLRun serving pipelines documentation.

In this section

Using built-in model serving classes
Build your own model serving class
Test and deploy a model server
Model serving API