LocalRuntime
Overview
LocalRuntime
is the simplest way to connect your own custom backend to assistant-ui. It manages all chat state internally while providing a clean adapter interface to connect with any REST API, OpenAI, or custom language model.
LocalRuntime
provides:
- Built-in state management for messages, threads, and conversation history
- Automatic features like message editing, reloading, and branch switching
- Multi-thread support through Assistant Cloud or your own database using
useRemoteThreadListRuntime
- Simple adapter pattern to connect any backend API
While LocalRuntime manages state in-memory by default, it offers multiple persistence options through adapters - use the history adapter for single-thread persistence, Assistant Cloud for managed multi-thread support, or implement your own storage with useRemoteThreadListRuntime
.
When to Use
Use LocalRuntime
if you need:
- Quick setup with minimal configuration - Get a fully functional chat interface with just a few lines of code
- Built-in state management - No need to manage messages, threads, or conversation history yourself
- Automatic features - Branch switching, message editing, and regeneration work out of the box
- API flexibility - Connect to any REST endpoint, OpenAI, or custom model with a simple adapter
- Multi-thread support - Full thread management with Assistant Cloud or custom database
- Thread persistence - Via history adapter, Assistant Cloud, or custom thread list adapter
Key Features
Built-in State Management
Automatic handling of messages, threads, and conversation history
Multi-Thread Support
Full thread management capabilities with Assistant Cloud or custom database adapter
Adapter System
Extend with attachments, speech, feedback, persistence, and suggestions
Tool Calling
Support for function calling with human-in-the-loop approval
Getting Started
Create a Next.js project
npx create-next-app@latest my-app
cd my-app
Install @assistant-ui/react
npm install @assistant-ui/react
Define a MyRuntimeProvider
component
Update the MyModelAdapter
below to integrate with your own custom API.
See LocalRuntimeOptions
API Reference for available configuration options.
"use client";
import type { ReactNode } from "react";
import {
AssistantRuntimeProvider,
useLocalRuntime,
type ChatModelAdapter,
} from "@assistant-ui/react";
const MyModelAdapter: ChatModelAdapter = {
async run({ messages, abortSignal }) {
// TODO replace with your own API
const result = await fetch("<YOUR_API_ENDPOINT>", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
// forward the messages in the chat to the API
body: JSON.stringify({
messages,
}),
// if the user hits the "cancel" button or escape keyboard key, cancel the request
signal: abortSignal,
});
const data = await result.json();
return {
content: [
{
type: "text",
text: data.text,
},
],
};
},
};
export function MyRuntimeProvider({
children,
}: Readonly<{
children: ReactNode;
}>) {
const runtime = useLocalRuntime(MyModelAdapter);
return (
<AssistantRuntimeProvider runtime={runtime}>
{children}
</AssistantRuntimeProvider>
);
}
Wrap your app in MyRuntimeProvider
import type { ReactNode } from "react";
import { MyRuntimeProvider } from "@/app/MyRuntimeProvider";
export default function RootLayout({
children,
}: Readonly<{
children: ReactNode;
}>) {
return (
<MyRuntimeProvider>
<html lang="en">
<body>{children}</body>
</html>
</MyRuntimeProvider>
);
}
Use the Thread component
import { Thread } from "@assistant-ui/react";
export default function Page() {
return <Thread />;
}
Streaming Responses
Implement streaming by declaring the run
function as an AsyncGenerator
.
const MyModelAdapter: ChatModelAdapter = {
async *run({ messages, abortSignal, context }) {
const stream = await backendApi({ messages, abortSignal, context });
let text = "";
for await (const part of stream) {
text += part.choices[0]?.delta?.content || "";
yield {
content: [{ type: "text", text }],
};
}
},
};
Streaming with Tool Calls
Handle streaming responses that include function calls:
const MyModelAdapter: ChatModelAdapter = {
async *run({ messages, abortSignal, context }) {
const stream = await openai.chat.completions.create({
model: "gpt-4o",
messages: convertToOpenAIMessages(messages),
tools: context.tools,
stream: true,
signal: abortSignal,
});
let content = "";
const toolCalls: any[] = [];
for await (const chunk of stream) {
const delta = chunk.choices[0]?.delta;
// Handle text content
if (delta?.content) {
content += delta.content;
}
// Handle tool calls
if (delta?.tool_calls) {
for (const toolCall of delta.tool_calls) {
if (!toolCalls[toolCall.index]) {
toolCalls[toolCall.index] = {
id: toolCall.id,
type: "function",
function: { name: "", arguments: "" },
};
}
if (toolCall.function?.name) {
toolCalls[toolCall.index].function.name = toolCall.function.name;
}
if (toolCall.function?.arguments) {
toolCalls[toolCall.index].function.arguments +=
toolCall.function.arguments;
}
}
}
// Yield current state
yield {
content: [
...(content ? [{ type: "text" as const, text: content }] : []),
...toolCalls.map((tc) => ({
type: "tool-call" as const,
toolCallId: tc.id,
toolName: tc.function.name,
args: JSON.parse(tc.function.arguments || "{}"),
})),
],
};
}
},
};
Tool Calling
LocalRuntime
supports OpenAI-compatible function calling with automatic or human-in-the-loop execution.
Basic Tool Definition
const tools = [
{
type: "function",
function: {
name: "get_weather",
description: "Get the current weather in a location",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "The city and state, e.g. San Francisco, CA",
},
unit: {
type: "string",
enum: ["celsius", "fahrenheit"],
},
},
required: ["location"],
},
},
},
];
const runtime = useLocalRuntime(MyModelAdapter, {
// Tools are passed via context
context: { tools },
});
Human-in-the-Loop Approval
Require user confirmation before executing certain tools:
const runtime = useLocalRuntime(MyModelAdapter, {
unstable_humanToolNames: ["delete_file", "send_email"],
});
Tool Execution
Tools are executed automatically by the runtime. The model adapter receives tool results in subsequent messages:
// Messages will include tool calls and results:
[
{ role: "user", content: "What's the weather in SF?" },
{
role: "assistant",
content: [
{
type: "tool-call",
toolCallId: "call_123",
toolName: "get_weather",
args: { location: "San Francisco, CA" },
},
],
},
{
role: "tool",
content: [
{
type: "tool-result",
toolCallId: "call_123",
result: { temperature: 72, condition: "sunny" },
},
],
},
{
role: "assistant",
content: "The weather in San Francisco is sunny and 72°F.",
},
];
Multi-Thread Support
LocalRuntime
supports multiple conversation threads through two approaches:
1. Assistant Cloud Integration
import { useLocalRuntime } from "@assistant-ui/react";
import { AssistantCloud } from "assistant-cloud";
const cloud = new AssistantCloud({
apiKey: process.env.ASSISTANT_CLOUD_API_KEY,
});
const runtime = useLocalRuntime(MyModelAdapter, {
cloud, // Enables multi-thread support
});
With Assistant Cloud, you get:
- Multiple conversation threads
- Thread persistence across sessions
- Thread management (create, switch, rename, archive, delete)
- Automatic synchronization across devices
- Built-in user authentication
2. Custom Database with useRemoteThreadListRuntime
For custom thread storage, use useRemoteThreadListRuntime
with your own adapter:
import {
useLocalThreadRuntime,
unstable_useRemoteThreadListRuntime as useRemoteThreadListRuntime,
useThreadListItem,
RuntimeAdapterProvider,
AssistantRuntimeProvider,
type RemoteThreadListAdapter,
type ThreadHistoryAdapter,
} from "@assistant-ui/react";
// Implement your custom adapter with proper message persistence
const myDatabaseAdapter: RemoteThreadListAdapter = {
async list() {
const threads = await db.threads.findAll();
return {
threads: threads.map((t) => ({
status: t.archived ? "archived" : "regular",
remoteId: t.id,
title: t.title,
})),
};
},
async initialize(threadId) {
const thread = await db.threads.create({ id: threadId });
return { remoteId: thread.id };
},
async rename(remoteId, newTitle) {
await db.threads.update(remoteId, { title: newTitle });
},
async archive(remoteId) {
await db.threads.update(remoteId, { archived: true });
},
async unarchive(remoteId) {
await db.threads.update(remoteId, { archived: false });
},
async delete(remoteId) {
// Delete thread and its messages
await db.messages.deleteByThreadId(remoteId);
await db.threads.delete(remoteId);
},
async generateTitle(remoteId, messages) {
// Generate title from messages using your AI
const title = await generateTitle(messages);
await db.threads.update(remoteId, { title });
return new ReadableStream(); // Return empty stream
},
};
// Complete implementation with message persistence using Provider pattern
export function MyRuntimeProvider({ children }) {
const runtime = useRemoteThreadListRuntime({
runtimeHook: () => {
return useLocalThreadRuntime(MyModelAdapter);
},
adapter: {
...myDatabaseAdapter,
// The Provider component adds thread-specific adapters
unstable_Provider: ({ children }) => {
// This runs in the context of each thread
const threadListItem = useThreadListItem();
const remoteId = threadListItem.remoteId;
// Create thread-specific history adapter
const history = useMemo<ThreadHistoryAdapter>(
() => ({
async load() {
if (!remoteId) return { messages: [] };
const messages = await db.messages.findByThreadId(remoteId);
return {
messages: messages.map((m) => ({
role: m.role,
content: m.content,
id: m.id,
createdAt: new Date(m.createdAt),
})),
};
},
async append(message) {
if (!remoteId) {
console.warn("Cannot save message - thread not initialized");
return;
}
await db.messages.create({
threadId: remoteId,
role: message.role,
content: message.content,
id: message.id,
createdAt: message.createdAt,
});
},
}),
[remoteId],
);
const adapters = useMemo(() => ({ history }), [history]);
return (
<RuntimeAdapterProvider adapters={adapters}>
{children}
</RuntimeAdapterProvider>
);
},
},
});
return (
<AssistantRuntimeProvider runtime={runtime}>
{children}
</AssistantRuntimeProvider>
);
}
Understanding the Architecture
Key Insight: The unstable_Provider
component in your adapter runs in the
context of each thread, giving you access to thread-specific information like
remoteId
. This is where you add the history adapter for message persistence.
The complete multi-thread implementation requires:
- RemoteThreadListAdapter - Manages thread metadata (list, create, rename, archive, delete)
- unstable_Provider - Component that provides thread-specific adapters (like history)
- ThreadHistoryAdapter - Persists messages for each thread (load, append)
- runtimeHook - Creates a basic
LocalRuntime
(adapters are added by Provider)
Without the history adapter, threads would have no message persistence, making them effectively useless. The Provider pattern allows you to add thread-specific functionality while keeping the runtime creation simple.
Database Schema Example
// Example database schema for thread persistence
interface ThreadRecord {
id: string;
title: string;
archived: boolean;
createdAt: Date;
updatedAt: Date;
}
interface MessageRecord {
id: string;
threadId: string;
role: "user" | "assistant" | "system";
content: any; // Store as JSON
createdAt: Date;
}
Both approaches provide full multi-thread support. Choose Assistant Cloud for a managed solution or implement your own adapter for custom storage requirements.
Adapters
Extend LocalRuntime
capabilities with adapters. The runtime automatically enables/disables UI features based on which adapters are provided.
Attachment Adapter
Enable file and image uploads:
const attachmentAdapter: AttachmentAdapter = {
accept: "image/*,application/pdf",
async add(file) {
const formData = new FormData();
formData.append("file", file);
const response = await fetch("/api/upload", {
method: "POST",
body: formData,
});
const { id, url } = await response.json();
return {
id,
type: file.type.startsWith("image/") ? "image" : "document",
name: file.name,
url,
};
},
async remove(attachment) {
await fetch(`/api/upload/${attachment.id}`, {
method: "DELETE",
});
},
};
const runtime = useLocalRuntime(MyModelAdapter, {
adapters: { attachments: attachmentAdapter },
});
// For multiple file types, use CompositeAttachmentAdapter:
const runtime = useLocalRuntime(MyModelAdapter, {
adapters: {
attachments: new CompositeAttachmentAdapter([
new SimpleImageAttachmentAdapter(),
new SimpleTextAttachmentAdapter(),
customPDFAdapter,
]),
},
});
Thread History Adapter
Persist and resume conversations:
const historyAdapter: ThreadHistoryAdapter = {
async load() {
// Load messages from your storage
const response = await fetch(`/api/thread/current`);
const { messages } = await response.json();
return { messages };
},
async append(message) {
// Save new message to storage
await fetch(`/api/thread/messages`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ message }),
});
},
// Optional: Resume interrupted conversations
async resume({ messages }) {
const lastMessage = messages[messages.length - 1];
if (lastMessage?.role === "user") {
// Resume generating assistant response
const response = await fetch("/api/chat/resume", {
method: "POST",
body: JSON.stringify({ messages }),
});
return response.body; // Return stream
}
},
};
const runtime = useLocalRuntime(MyModelAdapter, {
adapters: { history: historyAdapter },
});
The history adapter handles persistence for the current thread's messages. For
multi-thread support with custom storage, use either
useRemoteThreadListRuntime
with LocalRuntime
or ExternalStoreRuntime
with a thread list adapter.
Speech Synthesis Adapter
Add text-to-speech capabilities:
const speechAdapter: SpeechSynthesisAdapter = {
speak(text) {
const utterance = new SpeechSynthesisUtterance(text);
utterance.rate = 1.0;
utterance.pitch = 1.0;
speechSynthesis.speak(utterance);
},
stop() {
speechSynthesis.cancel();
},
};
const runtime = useLocalRuntime(MyModelAdapter, {
adapters: { speech: speechAdapter },
});
Feedback Adapter
Collect user feedback on messages:
const feedbackAdapter: FeedbackAdapter = {
async submit(feedback) {
await fetch("/api/feedback", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
messageId: feedback.messageId,
rating: feedback.type, // "positive" or "negative"
}),
});
},
};
const runtime = useLocalRuntime(MyModelAdapter, {
adapters: { feedback: feedbackAdapter },
});
Suggestion Adapter
Provide follow-up suggestions:
const suggestionAdapter: SuggestionAdapter = {
async *get({ messages }) {
// Analyze conversation context
const lastMessage = messages[messages.length - 1];
// Generate suggestions
const suggestions = await generateSuggestions(lastMessage);
yield suggestions.map((text) => ({
id: crypto.randomUUID(),
text,
}));
},
};
const runtime = useLocalRuntime(MyModelAdapter, {
adapters: { suggestion: suggestionAdapter },
});
Advanced Features
Resuming a Run
The unstable_resumeRun
method is experimental and may change in future
releases.
Resume a conversation with a custom stream:
import { useThreadRuntime, type ChatModelRunResult } from "@assistant-ui/react";
// Get the thread runtime
const thread = useThreadRuntime();
// Create a custom stream
async function* createCustomStream(): AsyncGenerator<ChatModelRunResult> {
let text = "Initial response";
yield {
content: [{ type: "text", text }],
};
// Simulate delay
await new Promise((resolve) => setTimeout(resolve, 500));
text = "Initial response. And here's more content...";
yield {
content: [{ type: "text", text }],
};
}
// Resume a run with the custom stream
thread.unstable_resumeRun({
parentId: "message-id", // ID of the message to respond to
stream: createCustomStream(), // The stream to use for resuming
});
Custom Thread Management
Access thread runtime for advanced control with useThreadRuntime
:
import { useThreadRuntime } from "@assistant-ui/react";
function MyComponent() {
const thread = useThreadRuntime();
// Cancel current generation
const handleCancel = () => {
thread.cancelRun();
};
// Switch to a different branch
const handleSwitchBranch = (messageId: string, branchIndex: number) => {
thread.switchToBranch(messageId, branchIndex);
};
// Reload a message
const handleReload = (messageId: string) => {
thread.reload(messageId);
};
return (
// Your UI
);
}
Custom Runtime Implementation
useLocalThreadRuntime
provides the core single-thread runtime for building custom implementations:
import {
useLocalThreadRuntime,
unstable_useRemoteThreadListRuntime as useRemoteThreadListRuntime,
AssistantRuntimeProvider,
} from "@assistant-ui/react";
// Build your own multi-thread runtime
function MyCustomRuntimeProvider({ children }) {
const runtime = useRemoteThreadListRuntime({
runtimeHook: () => useLocalThreadRuntime(MyModelAdapter, options),
adapter: myCustomThreadListAdapter,
});
return (
<AssistantRuntimeProvider runtime={runtime}>
{children}
</AssistantRuntimeProvider>
);
}
useLocalRuntime
internally uses useLocalThreadRuntime
+
useRemoteThreadListRuntime
for multi-thread support.
useThreadRuntime
vs useLocalThreadRuntime
:
useThreadRuntime
- Access the current thread's runtime from within componentsuseLocalThreadRuntime
- Create a new single-thread runtime instance
Integration Examples
OpenAI Integration
import { OpenAI } from "openai";
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
dangerouslyAllowBrowser: true, // Use server-side in production
});
const OpenAIAdapter: ChatModelAdapter = {
async *run({ messages, abortSignal, context }) {
const stream = await openai.chat.completions.create({
model: "gpt-4o",
messages: messages.map((m) => ({
role: m.role,
content: m.content
.filter((c) => c.type === "text")
.map((c) => c.text)
.join("\n"),
})),
stream: true,
signal: abortSignal,
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content;
if (content) {
yield {
content: [{ type: "text", text: content }],
};
}
}
},
};
Custom REST API Integration
const CustomAPIAdapter: ChatModelAdapter = {
async run({ messages, abortSignal }) {
const response = await fetch("/api/chat", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
messages: messages.map((m) => ({
role: m.role,
content: m.content,
})),
}),
signal: abortSignal,
});
if (!response.ok) {
throw new Error(`API error: ${response.statusText}`);
}
const data = await response.json();
return {
content: [{ type: "text", text: data.message }],
};
},
};
Best Practices
-
Error Handling - Always handle API errors gracefully:
async *run({ messages, abortSignal }) { try { const response = await fetchAPI(messages, abortSignal); yield response; } catch (error) { if (error.name === 'AbortError') { // User cancelled - this is normal return; } // Re-throw other errors to display in UI throw error; } }
-
Abort Signal - Always pass the abort signal to fetch requests:
fetch(url, { signal: abortSignal });
-
Memory Management - For long conversations, consider implementing message limits:
const recentMessages = messages.slice(-20); // Keep last 20 messages
-
Type Safety - Use TypeScript for better development experience:
import type { ChatModelAdapter, ThreadMessage } from "@assistant-ui/react";
Comparison with ExternalStoreRuntime
Feature | LocalRuntime | ExternalStoreRuntime |
---|---|---|
State Management | Built-in | You manage |
Setup Complexity | Simple | More complex |
Flexibility | Extensible via adapters | Full control |
Message Editing | Automatic | Requires onEdit handler |
Branch Switching | Automatic | Requires setMessages handler |
Multi-Thread Support | Yes (with Assistant Cloud or custom adapter) | Yes (with thread list adapter) |
Custom Thread Storage | Yes (with useRemoteThreadListRuntime) | Yes |
Persistence | Via history adapter or Assistant Cloud | Your implementation |
Best For | Quick prototypes, standard apps, cloud-based | Complex state requirements, custom storage needs |
Troubleshooting
Common Issues
Messages not appearing: Ensure your adapter returns the correct format:
return {
content: [{ type: "text", text: "response" }]
};
Streaming not working: Make sure to use async *run
(note the asterisk):
async *run({ messages }) { // ✅ Correct
async run({ messages }) { // ❌ Wrong for streaming
Debug Tips
-
Log adapter calls to trace execution:
async *run(options) { console.log("Adapter called with:", options); // ... rest of implementation }
-
Check network requests in browser DevTools
-
Verify message format matches ThreadMessage structure
API Reference
ChatModelAdapter
The main interface for connecting your API to LocalRuntime
.
ChatModelAdapter
run:
Function that sends messages to your API and returns the response
ChatModelRunOptions
Parameters passed to the run
function.
ChatModelRunOptions
messages:
The conversation history to send to your API
abortSignal:
Signal to cancel the request if user interrupts
context:
Additional context including configuration and tools
LocalRuntimeOptions
Configuration options for the LocalRuntime
.
LocalRuntimeOptions
initialMessages?:
Pre-populate the thread with messages
maxSteps:
Maximum number of sequential tool calls before requiring user input
cloud?:
Enable Assistant Cloud integration for multi-thread support and persistence
adapters?:
Additional capabilities through adapters. Features are automatically enabled based on provided adapters
adapters
attachments?:
Enable file/image attachments
speech?:
Enable text-to-speech for messages
feedback?:
Enable message feedback (thumbs up/down)
history?:
Enable thread persistence and resumption
suggestions?:
Enable follow-up suggestions
unstable_humanToolNames?:
Tool names that require human approval before execution (experimental API)
RemoteThreadListAdapter
Interface for implementing custom thread list storage.
RemoteThreadListAdapter
list:
Returns list of all threads (regular and archived)
initialize:
Creates a new thread with the given ID
rename:
Updates the title of a thread
archive:
Archives a thread
unarchive:
Unarchives a thread
delete:
Deletes a thread permanently
generateTitle:
Generates a title for the thread based on the conversation
Related Runtime APIs
- AssistantRuntime API - Core runtime interface and methods
- ThreadRuntime API - Thread-specific operations and state management