Using LLM prompt templates and artifacts#
This tutorial illustrates how easy it is to use LLMs and prompt templates, inside a complete workflow using the llm-prompt artifact.
Whenever an LLM-Prompt artifact is used, there MUST be a definition of:
What is the prompt template
Which LLM is used
What the model’s generation configuration is (if not using the default)
The model we are using is gpt-4o-mini from OpenAI with the default configuration (see section 3 for available model params), this case covers using a remote model directly from the configured datasource without having to download it first.
We use streamlit to create a chat front-end and deploy it as application runtime.
In this tutorial
Define the prompt templates and the prompt artifact template
Define the function graph and add ModelRunnerStep with proxy models for the shared model
!pip install streamlit
Set up the environment#
This section sets up the environment variables required for OpenAI API access, including the base URL and API key.
from dotenv import load_dotenv
import os
# Load environment variables
load_dotenv("ai_gateway.env")
# Validate OpenAI credentials
missing_vars = [
var for var in ("OPENAI_API_KEY", "OPENAI_BASE_URL") if not os.getenv(var)
]
if missing_vars:
raise EnvironmentError(
f"Missing required environment variables: {', '.join(missing_vars)}. "
"Please ensure they are set in 'ai_gateway.env' or your system environment."
)
# Set additional configuration
os.environ["OPENAI_MAX_RETRIES"] = "100"
Import mlrun library and initialize the project#
This initializes the MLRun project
%config Completer.use_jedi = False
import mlrun
from mlrun import get_or_create_project
image = "mlrun/mlrun"
project_name = "llm-openai-bot"
project = get_or_create_project(
project_name, context="./", user_project=True, allow_cross_project=True
)
This section sets up the necessary datastore profiles for time-series database (TSDB) and stream data.
which are essential for monitoring model performance and detecting drift.
You can use a data store profile to manage datastore credentials.
A data store profile holds all the information required to address an external data source, including credentials.
The DatastoreProfileV3io is used for V3IO storages while DatastoreProfileTDEngine, DatastoreProfileKafkaSource are used in community edition.
Notice that recommended base period is 10 minutes, for demo purposes we set base period to 1 minute.
from src.model_monitoring_utils import enable_model_monitoring
enable_model_monitoring(
project=project, deploy_histogram_data_drift_app=False, base_period=1
)
Configure OpenAI profile#
This section sets up an openAI profile (credentials and environment variables), and specifies the model. This tutorial uses the model gpt-4o-mini. You can change it to any model you want to use.
from mlrun.datastore.datastore_profile import OpenAIProfile
open_ai_profile = OpenAIProfile(
name="openai_profile",
api_key=os.environ.get("OPENAI_API_KEY"),
organization=os.environ.get("OPENAI_ORG_ID"),
project=os.environ.get("OPENAI_PROJECT_ID"),
base_url=os.environ.get("OPENAI_BASE_URL"),
timeout=os.environ.get("OPENAI_TIMEOUT"),
max_retries=os.environ.get("OPENAI_MAX_RETRIES"),
)
project.register_datastore_profile(open_ai_profile)
model_url = f"ds://openai_profile/gpt-4o-mini"
Define the prompt templates and the prompt artifact template#
The prompt templates are defined in the src/llm_prompts.py file and include templates for the finance and sport domains.
These templates - finance_prompt_template and sport_prompt_template - are structured to guide the LLM in generating responses based on user queries.
Each template includes a system message that sets the context for the LLM and a user message that includes the user's ID, tone, depth level, and question.
Use the prompt_legend parameter to specify how to map input fields to the corresponding prompt placeholders and to provide descriptive metadata for each placeholder.
For reference, see log_llm_prompt() for how the LLM prompt artifacts are logged as part of the project.
from src.llm_prompts import finance_prompt_template, sport_prompt_template
model_artifact = project.log_model(
"open-ai",
model_url=model_url,
)
# Create and log the finance prompt template as an LLM prompt artifact, capturing its definition and metadata
finance_llm_prompt_artifact = project.log_llm_prompt(
"finance_llm_prompt",
prompt_template=finance_prompt_template,
model_artifact=model_artifact,
invocation_config={
"temperature": 0.7,
"max_tokens": 256,
}, # Invocation config will be add to each invocation
prompt_legend={
"question": {
"field": "user_query",
"description": "The main financial question or request the user is asking.",
},
"depth_level": {
"field": "response_detail_level",
"description": "Indicates the level of detail in the answer (e.g., basic, intermediate, advanced).",
},
"user_id": {
"field": "customer_id",
"description": "Unique identifier of the user, useful for personalization and tracking.",
},
"tone": {
"field": "reply_style",
"description": "The desired style of the response (e.g., formal, friendly, concise, detailed).",
},
},
)
sport_llm_prompt_artifact = project.log_llm_prompt(
"sport_llm_prompt",
prompt_template=sport_prompt_template,
model_artifact=model_artifact,
prompt_legend={
"question": {
"field": "user_query",
"description": "The main sports or fitness-related question from the user.",
},
"depth_level": {
"field": "response_detail_level",
"description": "Indicates how in-depth the explanation should be (e.g., beginner, intermediate, expert).",
},
"user_id": {
"field": "customer_id",
"description": "Unique identifier of the user, used for personalization or tracking.",
},
"tone": {
"field": "reply_style",
"description": "The preferred style or tone of the response (e.g., motivational, professional, casual).",
},
},
)
Enable tracking and deploy the function#
This section enables experiment tracking, deploys the function, and visualizes the workflow of the LLM model using a graph within the Streamlit app.
Note: The deploy_endpoint provides the URL to interact with the Streamlit interface.
function.set_tracking(enable_tracking=True)
graph.plot()
deploy_endpoint = function.deploy()
Deploy the model monitoring application#
This section deploys the model monitoring application, which is responsible for monitoring the performance of the LLMs that were deployed in the previous step. It uses the monitoring_application script to define the monitoring logic. The application is deployed using the deploy_function method, which makes it available for monitoring the LLMs in real time.
llm_monitoring_app = project.set_model_monitoring_function(
func="./src/monitoring_application.py",
application_class="ModelMonitoringApplication",
name="llm-monitoring",
image=image,
)
project.deploy_function(llm_monitoring_app)
import json
payload = {
"model_name": "sport_endpoint",
"user_query": "What can you tell me about finance ?",
"response_detail_level": "basic overview",
"customer_id": 12345,
"reply_style": "casual",
}
function.invoke("", body=json.dumps(payload).encode("utf-8"))
Configure the Streamlit chatbot application#
This section sets up a Streamlit app that enables you to interact with the LLMs deployed in the previous steps. The app provides a user interface for selecting different models, tones, and depth levels, and allows users to submit questions to the LLMs.
!tar -czvf frontend_ui.tar.gz ./src/streamlit_ui.py
# Log the streamlit tar file as project artifact and use it as source archive
frontend_source = project.log_artifact(
"frontend_source", local_path="./frontend_ui.tar.gz", upload=True
)
ui_fn = project.set_function(
name="frontend",
kind="application",
image="mlrun/mlrun",
requirements=["streamlit==1.49.1"],
)
API_URL = function.get_url()
# Set application spec and envs
ui_fn.set_env("API_URL", API_URL)
ui_fn.with_source_archive(frontend_source.target_path, pull_at_runtime=False)
ui_fn.set_internal_application_port(8000)
ui_fn.spec.command = "streamlit"
ui_fn.spec.args = [
"run",
"--server.port",
"8000",
"/home/mlrun_code/src/streamlit_ui.py",
]
Launch the Streamlit Chatbot to Interact with the LLM Model#
This section launches the Streamlit chatbot, providing a user-friendly interface for interacting with the deployed LLM models. Users can select the model, tone, and depth level, submit questions, and view responses in a chat-style format.
ui_fn.deploy(with_mlrun=False, create_default_api_gateway=False)
ui_fn.create_api_gateway(
name="llm-prompt-artifact-ui",
path="/",
direct_port_access=True,
ssl_redirect=True,
set_as_default=False,
authentication_mode="none",
)
print(
f"Use this address to interact with your new chatbot ! https://{ui_fn.status.address}"
)
