# AI SDK

URL: /docs/cloud/ai-sdk

Add cloud persistence to your existing AI SDK app with a single hook.

## Overview \[#overview]

The `@assistant-ui/cloud-ai-sdk` package provides a single hook that adds full message and thread persistence to any [AI SDK](https://sdk.vercel.ai/) application:

* **`useCloudChat`** — wraps `useChat` with automatic cloud persistence and built-in thread management

This hook works with any React UI, so you keep full control of your components. See [AI SDK + assistant-ui](/docs/cloud/ai-sdk-assistant-ui) for the full integration with assistant-ui's primitives and runtime.

## Prerequisites \[#prerequisites]

You need an assistant-cloud account to follow this guide. [Sign up here](https://cloud.assistant-ui.com/) to get started.

## Setup \[#setup]

### Create a Cloud Project \[#create-a-cloud-project]

Create a new project in the [assistant-cloud dashboard](https://cloud.assistant-ui.com/) and, from the settings page, copy your **Frontend API URL** (`https://proj-[ID].assistant-api.com`).

### Configure Environment Variables \[#configure-environment-variables]

```bash title=".env.local"
NEXT_PUBLIC_ASSISTANT_BASE_URL=https://proj-[YOUR-ID].assistant-api.com
```

### Install Dependencies \[#install-dependencies]

Install the `@assistant-ui/cloud-ai-sdk` package with your package manager.

### Integrate \[#integrate]

```tsx title="app/page.tsx"
"use client";

import { useState } from "react";
import { useCloudChat } from "@assistant-ui/cloud-ai-sdk";

export default function Chat() {
  // Zero-config: auto-initializes anonymous cloud from env var with built-in threads.
  // For custom config, pass: { cloud, threads: useThreads(...), onSyncError }
  const { messages, sendMessage, threads } = useCloudChat();
  const [input, setInput] = useState("");

  const handleSubmit = () => {
    if (!input.trim()) return;
    sendMessage({ text: input });
    setInput("");
  };

  return (
    <div>
      {/* Thread list */}
      <div>
        {threads.threads.map((t) => (
          <button key={t.id} onClick={() => threads.selectThread(t.id)}>
            {t.title || "New conversation"}
          </button>
        ))}
        <button onClick={() => threads.selectThread(null)}>New chat</button>
      </div>

      {/* Chat messages */}
      {messages.map((m) => (
        <div key={m.id}>{m.parts.map((p) => p.type === "text" && p.text)}</div>
      ))}

      {/* Composer */}
      <form
        onSubmit={(e) => {
          e.preventDefault();
          handleSubmit();
        }}
      >
        <input value={input} onChange={(e) => setInput(e.target.value)} />
      </form>
    </div>
  );
}
```
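The inline `m.parts.map(...)` render above keeps only `text` parts. The same filtering can live in a small helper; a sketch in plain TypeScript, where the `MessagePart` type and the `messageText` name are our own stand-ins for the AI SDK's `UIMessage` part shape:

```ts
// Minimal stand-in for the AI SDK message-part shape (assumed for illustration)
type MessagePart = { type: "text"; text: string } | { type: string };

// Join only the text parts of a message, skipping tool calls, reasoning, etc.
function messageText(parts: MessagePart[]): string {
  return parts
    .filter((p): p is { type: "text"; text: string } => p.type === "text")
    .map((p) => p.text)
    .join("");
}
```

In the component above, the message body could then render as `{messageText(m.parts)}`.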
That's it. Messages persist automatically as they complete, and switching threads loads the full history.

## API Reference \[#api-reference]

### `useCloudChat(options?)` \[#usecloudchatoptions]

Wraps AI SDK's `useChat` with automatic cloud persistence and built-in thread management. Messages are persisted as they finish streaming. Thread creation is automatic on the first message — the hook auto-creates the thread, selects it, refreshes the thread list, and generates a title after the first response.

#### Configuration Modes \[#configuration-modes]

**1. Zero-config** — set the `NEXT_PUBLIC_ASSISTANT_BASE_URL` env var and call the hook with no arguments:

```tsx
const chat = useCloudChat();
```

**2. Custom cloud instance** — for authenticated users or custom configuration:

```tsx
const cloud = new AssistantCloud({ baseUrl, authToken });
const chat = useCloudChat({ cloud });
```

**3. External thread management** — when threads need to be accessed from a separate component, or when you need custom thread options like `includeArchived`:

```tsx
// In a context provider or parent component
const myThreads = useThreads({ cloud, includeArchived: true });

// Pass to useCloudChat - it will use your thread state
const chat = useCloudChat({ threads: myThreads });
```

#### Parameters \[#parameters]

| Parameter             | Type                     | Description |
| --------------------- | ------------------------ | ----------- |
| `options.cloud`       | `AssistantCloud`         | Cloud instance (optional — auto-creates an anonymous instance from the `NEXT_PUBLIC_ASSISTANT_BASE_URL` env var if not provided) |
| `options.threads`     | `UseThreadsResult`       | External thread management from `useThreads()`. Use when you need thread operations in a separate component or custom thread options like `includeArchived` |
| `options.onSyncError` | `(error: Error) => void` | Callback invoked when a sync error occurs |

A subset of [AI SDK `useChat` options](https://sdk.vercel.ai/docs/reference/ai-sdk-ui/use-chat) is also accepted (those defined on `ChatInit`). Some options available on `useChat`, such as `experimental_throttle` and `resume`, are not supported.

**Returns:** `UseCloudChatResult`

| Value         | Type                             | Description |
| ------------- | -------------------------------- | ----------- |
| `messages`    | `UIMessage[]`                    | Chat messages (from AI SDK) |
| `status`      | `string`                         | Chat status: `"ready"`, `"submitted"`, `"streaming"`, or `"error"` |
| `sendMessage` | `(message, options?) => Promise` | Send a message (auto-creates a thread if needed) |
| `stop`        | `() => void`                     | Stop the current stream |
| `threads`     | `UseThreadsResult`               | Thread management (see below) |

Plus all other properties from AI SDK's [`UseChatHelpers`](https://sdk.vercel.ai/docs/reference/ai-sdk-ui/use-chat).
**Thread management (`threads`):**

| Value                   | Type                                               | Description |
| ----------------------- | -------------------------------------------------- | ----------- |
| `threads.cloud`         | `AssistantCloud`                                   | The cloud instance used for thread operations |
| `threads.threads`       | `CloudThread[]`                                    | Active threads sorted by recency |
| `threads.threadId`      | `string \| null`                                   | Current thread ID (`null` for a new unsaved chat) |
| `threads.selectThread`  | `(id: string \| null) => void`                     | Switch threads or pass `null` for a new chat |
| `threads.isLoading`     | `boolean`                                          | `true` during initial load or refresh |
| `threads.error`         | `Error \| null`                                    | Last error, if any |
| `threads.refresh`       | `() => Promise`                                    | Re-fetch the thread list |
| `threads.get`           | `(id: string) => Promise`                          | Fetch a single thread by ID |
| `threads.create`        | `(options?: \{ externalId?: string \}) => Promise` | Create a new thread |
| `threads.delete`        | `(id: string) => Promise`                          | Delete a thread |
| `threads.rename`        | `(id: string, title: string) => Promise`           | Rename a thread |
| `threads.archive`       | `(id: string) => Promise`                          | Archive a thread |
| `threads.unarchive`     | `(id: string) => Promise`                          | Unarchive a thread |
| `threads.generateTitle` | `(threadId: string) => Promise`                    | Generate a title using AI |

### `useThreads(options)` \[#usethreadsoptions]

Thread list management for use with `useCloudChat`. Call this explicitly and pass the result to `useCloudChat({ threads })` when you need access to thread operations outside the chat context (e.g., in a separate sidebar component).
```tsx
const myThreads = useThreads({ cloud: myCloud });
const { messages, sendMessage } = useCloudChat({ threads: myThreads });
```

**Parameters:**

| Parameter                 | Type             | Description |
| ------------------------- | ---------------- | ----------- |
| `options.cloud`           | `AssistantCloud` | Cloud client instance |
| `options.includeArchived` | `boolean`        | Include archived threads (default: `false`) |
| `options.enabled`         | `boolean`        | Enable thread fetching (default: `true`) |

**Returns:** `UseThreadsResult` — same shape as `threads` from `useCloudChat()`.

## Telemetry \[#telemetry]

The `useCloudChat` hook automatically reports run telemetry to Assistant Cloud after each assistant response. This includes:

**Automatically captured:**

* `status` — `"completed"` or `"incomplete"`, based on response content
* `tool_calls` — Tool invocations with name, arguments, and results. MCP tool calls are explicitly tagged with `tool_source: "mcp"`
* `total_steps` — Number of reasoning/tool steps in the response
* `output_text` — Full response text (truncated at 50K characters)

**Requires route configuration:**

* `model_id` — The model used for the response
* `input_tokens` / `output_tokens` — Token usage statistics
* `reasoning_tokens` — Tokens used for chain-of-thought reasoning (e.g. o1/o3 models)
* `cached_input_tokens` — Input tokens served from the provider's prompt cache

To capture model and usage data, configure the `messageMetadata` callback in your AI SDK route:

```tsx title="app/api/chat/route.ts"
import { convertToModelMessages, streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4o-mini"),
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse({
    messageMetadata: ({ part }) => {
      if (part.type === "finish") {
        return {
          usage: part.totalUsage,
        };
      }
      if (part.type === "finish-step") {
        return {
          modelId: part.response.modelId,
        };
      }
      return undefined;
    },
  });
}
```

The standalone hook captures message metadata when it is JSON-serializable, but it does not capture `duration_ms`, per-step breakdowns (`steps`), or `"error"` status. Those require the full runtime integration available via [`useChatRuntime`](/docs/cloud/ai-sdk-assistant-ui).

### Customizing Reports \[#customizing-reports]

Use the `beforeReport` hook to enrich or filter telemetry:

```tsx
const cloud = new AssistantCloud({
  baseUrl: process.env.NEXT_PUBLIC_ASSISTANT_BASE_URL!,
  anonymous: true,
  telemetry: {
    beforeReport: (report) => ({
      ...report,
      metadata: { environment: "production", version: "1.0.0" },
    }),
  },
});
```

Return `null` from `beforeReport` to skip reporting a specific run. To disable telemetry entirely, pass `telemetry: false`.

### Sub-Agent Model Tracking \[#sub-agent-model-tracking]

In multi-agent setups where tool calls delegate to a different model (e.g., the main run uses GPT but a tool invokes Gemini), you can track the delegated model's usage by passing sampling call data through `messageMetadata`.
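In outline, the data passed through metadata is a record keyed by tool call ID, with one entry per delegated LLM call. The field names below are illustrative assumptions, not the exact `assistant-cloud` types:

```ts
// Hypothetical shape: one entry per tool call that delegated to another model
type SamplingCall = {
  modelId: string;
  usage: { inputTokens: number; outputTokens: number };
};

// Keyed by the AI SDK toolCallId that triggered the delegated call
const samplingCalls: Record<string, SamplingCall[]> = {
  call_1: [
    { modelId: "gemini-2.0-flash", usage: { inputTokens: 812, outputTokens: 64 } },
  ],
};
```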
**Step 1: Collect sampling data on the server**

Use `createSamplingCollector` and `wrapSamplingHandler` from `assistant-cloud` to capture LLM calls made during tool execution:

```ts title="app/api/chat/route.ts"
import { convertToModelMessages, streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import {
  createSamplingCollector,
  wrapSamplingHandler,
} from "assistant-cloud";

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Collect sub-agent sampling calls per tool call
  const samplingCalls: Record<string, unknown> = {};

  const result = streamText({
    model: openai("gpt-4o"),
    messages: convertToModelMessages(messages),
    tools: {
      delegate_to_gemini: tool({
        inputSchema: z.object({ task: z.string() }),
        execute: async ({ task }, { toolCallId }) => {
          const collector = createSamplingCollector();

          // Your sub-agent logic that calls another model
          const result = await runSubAgent(task, {
            onSamplingCall: collector.collect,
          });

          samplingCalls[toolCallId] = collector.getCalls();
          return result;
        },
      }),
    },
  });

  return result.toUIMessageStreamResponse({
    messageMetadata: ({ part }) => {
      if (part.type === "finish") {
        return {
          usage: part.totalUsage,
          samplingCalls, // attach collected sampling data
        };
      }
      if (part.type === "finish-step") {
        return { modelId: part.response.modelId };
      }
      return undefined;
    },
  });
}
```

**Step 2: That's it.** The telemetry reporter automatically reads `samplingCalls` from message metadata and attaches the data to matching tool calls in the report. The Cloud dashboard shows each delegated model in the model distribution chart with its own token and cost breakdown.

For MCP tools that use the sampling protocol, `wrapSamplingHandler` can wrap the MCP client's sampling handler directly to capture all nested LLM calls transparently.
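The collector in Step 1 follows a plain collect-then-read pattern; a self-contained stand-in sketch (our own illustration under assumed call fields, not the actual `assistant-cloud` implementation):

```ts
// Illustrative stand-in for createSamplingCollector (not the real implementation)
type Call = { modelId: string; inputTokens: number; outputTokens: number };

function createCollector() {
  const calls: Call[] = [];
  return {
    // handed to the sub-agent as its onSamplingCall callback
    collect: (call: Call) => {
      calls.push(call);
    },
    // read once the sub-agent finishes; returns a copy so later
    // mutation by the caller cannot alter the recorded data
    getCalls: () => [...calls],
  };
}
```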
**On older versions** that don't yet read `samplingCalls` from metadata, use `beforeReport` to inject the data manually:

```ts
telemetry: {
  beforeReport: (report) => ({
    ...report,
    tool_calls: report.tool_calls?.map((tc) => ({
      ...tc,
      sampling_calls: samplingCalls[tc.tool_call_id],
    })),
  }),
}
```

## Authentication \[#authentication]

The example above uses anonymous mode (a browser session-based user ID) via the env var. For production apps with user accounts, pass an explicit cloud instance:

```tsx
import { useMemo } from "react";
import { useAuth } from "@clerk/nextjs";
import { AssistantCloud } from "assistant-cloud";
import { useCloudChat } from "@assistant-ui/cloud-ai-sdk";

function Chat() {
  const { getToken } = useAuth();

  const cloud = useMemo(
    () =>
      new AssistantCloud({
        baseUrl: process.env.NEXT_PUBLIC_ASSISTANT_BASE_URL!,
        authToken: async () => getToken({ template: "assistant-ui" }),
      }),
    [getToken],
  );

  const { messages, sendMessage, threads } = useCloudChat({ cloud });
  // ...
}
```

See the [Cloud Authorization](/docs/cloud/authorization) guide for other auth providers.

## Next Steps \[#next-steps]

* If you want pre-built UI components, see [AI SDK + assistant-ui](/docs/cloud/ai-sdk-assistant-ui) for the full integration
* Learn about [user authentication](/docs/cloud/authorization) for multi-user applications
* Check out the [complete example](https://github.com/assistant-ui/assistant-ui/tree/main/examples/with-cloud-standalone) on GitHub