Assistants API

Covers Threads, Messages, and Assistants.

LiteLLM currently covers:

  • Get Assistants
  • Create Thread
  • Get Thread
  • Add Messages
  • Get Messages
  • Run Thread

Providers

LiteLLM currently supports three providers for Assistants:

  • OpenAI
  • Azure
  • Astra Assistants (an OpenAI-compatible Assistants API)

Models and Embeddings

OpenAI and Azure support OpenAI models.

Astra Assistants supports all of the models and embeddings available in LiteLLM. When calling Astra Assistants through LiteLLM, set the provider credentials for the models your existing files and assistants use as environment variables, in addition to your Astra credentials.

For example, if your assistant uses the claude-3-haiku-20240307 model, pass your Anthropic key alongside your Astra credentials:

os.environ["ANTHROPIC_API_KEY"] = "sk-.."
os.environ["ASTRA_DB_APPLICATION_TOKEN"] = "AstraCS:..."

Quick Start

Call an existing Assistant.

  • Get the Assistant

  • Create a Thread when a user starts a conversation.

  • Add Messages to the Thread as the user asks questions.

  • Run the Assistant on the Thread to generate a response by calling the model and the tools.

Get the Assistant

from litellm import get_assistants, aget_assistants
import os

# setup env
os.environ["OPENAI_API_KEY"] = "sk-.."

assistants = get_assistants(custom_llm_provider="openai")

### ASYNC USAGE ###
# assistants = await aget_assistants(custom_llm_provider="openai")
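
For the OpenAI provider, the response mirrors the OpenAI list-assistants object, so you can read the returned assistants off its data field, for example:

# Print basic metadata for each assistant (fields follow the OpenAI Assistant object).
for assistant in assistants.data:
    print(assistant.id, assistant.name, assistant.model)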

Create a Thread

from litellm import create_thread, acreate_thread
import os

os.environ["OPENAI_API_KEY"] = "sk-.."

new_thread = create_thread(
    custom_llm_provider="openai",
    messages=[{"role": "user", "content": "Hey, how's it going?"}],  # type: ignore
)

### ASYNC USAGE ###
# new_thread = await acreate_thread(custom_llm_provider="openai",messages=[{"role": "user", "content": "Hey, how's it going?"}])
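
The returned object mirrors the OpenAI Thread object; keep its id around so later calls can address the same conversation:

# Store the thread id for later get_thread / add_message / run_thread calls.
thread_id = new_thread.id
print(f"created thread: {thread_id}")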

Add Messages to the Thread

from litellm import create_thread, get_thread, aget_thread, add_message, a_add_message
import os

os.environ["OPENAI_API_KEY"] = "sk-.."

## CREATE A THREAD
_new_thread = create_thread(
    custom_llm_provider="openai",
    messages=[{"role": "user", "content": "Hey, how's it going?"}],  # type: ignore
)

## OR retrieve existing thread
received_thread = get_thread(
    custom_llm_provider="openai",
    thread_id=_new_thread.id,
)

### ASYNC USAGE ###
# received_thread = await aget_thread(custom_llm_provider="openai", thread_id=_new_thread.id,)

## ADD MESSAGE TO THREAD
message = {"role": "user", "content": "Hey, how's it going?"}
added_message = add_message(
    thread_id=_new_thread.id, custom_llm_provider="openai", **message
)

### ASYNC USAGE ###
# added_message = await a_add_message(thread_id=_new_thread.id, custom_llm_provider="openai", **message)
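
To confirm the message landed, read the thread back. Get Messages is in LiteLLM's covered list; a minimal sketch, assuming it follows the same calling convention as the functions above:

from litellm import get_messages

# List the messages on the thread (assumes the
# get_messages(custom_llm_provider=..., thread_id=...) convention).
messages = get_messages(
    custom_llm_provider="openai",
    thread_id=_new_thread.id,
)
for m in messages.data:
    print(m.role, m.content)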

Run the Assistant on the Thread

from litellm import get_assistants, create_thread, add_message, run_thread, arun_thread
import os

os.environ["OPENAI_API_KEY"] = "sk-.."
assistants = get_assistants(custom_llm_provider="openai")

## GET THE FIRST ASSISTANT
assistant_id = assistants.data[0].id

## CREATE A THREAD
_new_thread = create_thread(
    custom_llm_provider="openai",
    messages=[{"role": "user", "content": "Hey, how's it going?"}],  # type: ignore
)

## ADD MESSAGE
message = {"role": "user", "content": "Hey, how's it going?"}
added_message = add_message(
    thread_id=_new_thread.id, custom_llm_provider="openai", **message
)

## 🚨 RUN THREAD
response = run_thread(
    custom_llm_provider="openai", thread_id=_new_thread.id, assistant_id=assistant_id
)

### ASYNC USAGE ###
# response = await arun_thread(custom_llm_provider="openai", thread_id=_new_thread.id, assistant_id=assistant_id)

print(f"run_thread: {run_thread}")

Streaming

from litellm import run_thread_stream
from openai import AssistantEventHandler
import os

os.environ["OPENAI_API_KEY"] = "sk-.."

message = {"role": "user", "content": "Hey, how's it going?"}

data = {"custom_llm_provider": "openai", "thread_id": _new_thread.id, "assistant_id": assistant_id, **message}

run = run_thread_stream(**data)
with run as run:
    assert isinstance(run, AssistantEventHandler)
    for chunk in run:
        print(f"chunk: {chunk}")
    run.until_done()
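
Each chunk is an Assistants API streaming event. To print only the generated text, you can filter for message deltas; a minimal sketch, assuming the OpenAI event shape (event/data fields on each chunk):

run = run_thread_stream(**data)
with run as stream:
    for event in stream:
        # Text arrives incrementally in thread.message.delta events.
        if event.event == "thread.message.delta":
            for block in event.data.delta.content or []:
                if block.type == "text":
                    print(block.text.value, end="", flush=True)
    stream.until_done()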

👉 Proxy API Reference