# Realtime Voice

URL: /docs/guides/voice

Bidirectional realtime voice conversations with AI agents.

assistant-ui supports realtime bidirectional voice via the `RealtimeVoiceAdapter` interface. This enables live voice conversations where the user speaks into their microphone and the AI agent responds with audio, with transcripts appearing in the thread in real time.

## How It Works [#how-it-works]

Unlike [Speech Synthesis](/docs/guides/speech) (text-to-speech) and [Dictation](/docs/guides/dictation) (speech-to-text), the voice adapter handles **both directions simultaneously** — the user's microphone audio is streamed to the agent, and the agent's audio response is played back, all while transcripts are appended to the message thread.

| Feature                                  | Adapter                  | Direction                            |
| ---------------------------------------- | ------------------------ | ------------------------------------ |
| [Speech Synthesis](/docs/guides/speech)  | `SpeechSynthesisAdapter` | Text → Audio (one message at a time) |
| [Dictation](/docs/guides/dictation)      | `DictationAdapter`       | Audio → Text (into composer)         |
| **Realtime Voice**                       | `RealtimeVoiceAdapter`   | Audio ↔ Audio (bidirectional, live)  |

## Configuration [#configuration]

Pass a `RealtimeVoiceAdapter` implementation to the runtime via `adapters.voice`:

```tsx
const runtime = useChatRuntime({
  adapters: {
    voice: new MyVoiceAdapter({ /* ... */ }),
  },
});
```

When a voice adapter is provided, `capabilities.voice` is automatically set to `true`.

## Hooks [#hooks]

### useVoiceState [#usevoicestate]

Returns the current voice session state, or `undefined` when no session is active.
```tsx
import { useVoiceState, useVoiceVolume } from "@assistant-ui/react";

const voiceState = useVoiceState();
// voiceState?.status.type — "starting" | "running" | "ended"
// voiceState?.isMuted — boolean
// voiceState?.mode — "listening" | "speaking"

const volume = useVoiceVolume();
// volume — number (0–1, real-time audio level via separate subscription)
```

### useVoiceControls [#usevoicecontrols]

Returns methods to control the voice session.

```tsx
import { useVoiceControls } from "@assistant-ui/react";

const { connect, disconnect, mute, unmute } = useVoiceControls();
```

### UI Example [#ui-example]

```tsx
import { useVoiceState, useVoiceControls } from "@assistant-ui/react";
import { PhoneIcon, PhoneOffIcon, MicIcon, MicOffIcon } from "lucide-react";

function VoiceControls() {
  const voiceState = useVoiceState();
  const { connect, disconnect, mute, unmute } = useVoiceControls();

  const isRunning = voiceState?.status.type === "running";
  const isStarting = voiceState?.status.type === "starting";
  const isMuted = voiceState?.isMuted ?? false;

  if (!isRunning && !isStarting) {
    return (
      <button onClick={connect} aria-label="Start voice session">
        <PhoneIcon />
      </button>
    );
  }

  return (
    <>
      <button
        onClick={isMuted ? unmute : mute}
        disabled={isStarting}
        aria-label={isMuted ? "Unmute microphone" : "Mute microphone"}
      >
        {isMuted ? <MicOffIcon /> : <MicIcon />}
      </button>
      <button onClick={disconnect} aria-label="End voice session">
        <PhoneOffIcon />
      </button>
    </>
  );
}
```

## Custom Adapters [#custom-adapters]

Implement the `RealtimeVoiceAdapter` interface to integrate with any voice provider.

### RealtimeVoiceAdapter Interface [#realtimevoiceadapter-interface]

```tsx
import type { RealtimeVoiceAdapter } from "@assistant-ui/react";

class MyVoiceAdapter implements RealtimeVoiceAdapter {
  connect(options: {
    abortSignal?: AbortSignal;
  }): RealtimeVoiceAdapter.Session {
    // Establish connection to your voice service
    return {
      get status() { /* ... */ },
      get isMuted() { /* ... */ },
      disconnect: () => { /* ... */ },
      mute: () => { /* ... */ },
      unmute: () => { /* ... */ },
      onStatusChange: (callback) => {
        // Status: { type: "starting" } → { type: "running" } → { type: "ended", reason }
        return () => {}; // Return unsubscribe
      },
      onTranscript: (callback) => {
        // callback({ role: "user" | "assistant", text: "...", isFinal: true })
        // Transcripts are automatically appended as messages in the thread.
        return () => {};
      },
      // Report who is speaking (drives the VoiceOrb speaking animation)
      onModeChange: (callback) => {
        // callback("listening") — user's turn
        // callback("speaking") — agent's turn
        return () => {};
      },
      // Report real-time audio level (0–1) for visual feedback
      onVolumeChange: (callback) => {
        // callback(0.72) — drives VoiceOrb amplitude and waveform bar heights
        return () => {};
      },
    };
  }
}
```

### Session Lifecycle [#session-lifecycle]

The session status follows the same pattern as other adapters:

```
starting → running → ended
```

The `ended` status includes a `reason`:

* `"finished"` — the session ended normally
* `"cancelled"` — the session was cancelled by the user
* `"error"` — the session ended due to an error (includes an `error` field)

### Mode and Volume [#mode-and-volume]

All adapters must implement `onModeChange` and `onVolumeChange`. If your provider doesn't support these, return a no-op unsubscribe:

* **`onModeChange`** — Reports `"listening"` (user's turn) or `"speaking"` (agent's turn). The `VoiceOrb` switches to the active speaking animation.
* **`onVolumeChange`** — Reports a real-time audio level (`0`–`1`). The `VoiceOrb` modulates its amplitude and glow, and waveform bars scale to match.

When using `createVoiceSession`, these are handled automatically — call `session.emitMode()` and `session.emitVolume()` when your provider delivers data.

### Transcript Handling [#transcript-handling]

Transcripts emitted via `onTranscript` are automatically appended to the message thread:

* **User transcripts** (`role: "user"`, `isFinal: true`) are appended as user messages.
* **Assistant transcripts** (`role: "assistant"`) are streamed into an assistant message. The message shows a "running" status until `isFinal: true` is received.

## Example: ElevenLabs Conversational AI [#example-elevenlabs-conversational-ai]

[ElevenLabs Conversational AI](https://elevenlabs.io/docs/agents-platform/overview) provides realtime voice agents via WebRTC.

### Install Dependencies [#install-dependencies]

```bash
npm install @elevenlabs/client
```

### Adapter [#adapter]

```tsx title="lib/elevenlabs-voice-adapter.ts"
import type { RealtimeVoiceAdapter, Unsubscribe } from "@assistant-ui/react";
import { VoiceConversation } from "@elevenlabs/client";

export class ElevenLabsVoiceAdapter implements RealtimeVoiceAdapter {
  private _agentId: string;

  constructor(options: { agentId: string }) {
    this._agentId = options.agentId;
  }

  connect(options: {
    abortSignal?: AbortSignal;
  }): RealtimeVoiceAdapter.Session {
    const statusCallbacks = new Set<(s: RealtimeVoiceAdapter.Status) => void>();
    const transcriptCallbacks = new Set<
      (t: RealtimeVoiceAdapter.TranscriptItem) => void
    >();
    const modeCallbacks = new Set<(m: RealtimeVoiceAdapter.Mode) => void>();
    const volumeCallbacks = new Set<(v: number) => void>();

    let currentStatus: RealtimeVoiceAdapter.Status = { type: "starting" };
    let isMuted = false;
    let conversation: VoiceConversation | null = null;
    let disposed = false;

    const updateStatus = (status: RealtimeVoiceAdapter.Status) => {
      if (disposed) return;
      currentStatus = status;
      for (const cb of statusCallbacks) cb(status);
    };

    const cleanup = () => {
      disposed = true;
      conversation = null;
      statusCallbacks.clear();
      transcriptCallbacks.clear();
      modeCallbacks.clear();
      volumeCallbacks.clear();
    };

    const session: RealtimeVoiceAdapter.Session = {
      get status() {
        return currentStatus;
      },
      get isMuted() {
        return isMuted;
      },
      disconnect: () => {
        conversation?.endSession();
        cleanup();
      },
      mute: () => {
        conversation?.setMicMuted(true);
        isMuted = true;
      },
      unmute: () => {
        conversation?.setMicMuted(false);
        isMuted = false;
      },
      onStatusChange: (cb): Unsubscribe => {
        statusCallbacks.add(cb);
        return () => statusCallbacks.delete(cb);
      },
      onTranscript: (cb): Unsubscribe => {
        transcriptCallbacks.add(cb);
        return () => transcriptCallbacks.delete(cb);
      },
      onModeChange: (cb): Unsubscribe => {
        modeCallbacks.add(cb);
        return () => modeCallbacks.delete(cb);
      },
      onVolumeChange: (cb): Unsubscribe => {
        volumeCallbacks.add(cb);
        return () => volumeCallbacks.delete(cb);
      },
    };

    if (options.abortSignal) {
      options.abortSignal.addEventListener(
        "abort",
        () => {
          conversation?.endSession();
          cleanup();
        },
        { once: true },
      );
    }

    const doConnect = async () => {
      if (disposed) return;
      try {
        conversation = await VoiceConversation.startSession({
          agentId: this._agentId,
          onConnect: () => updateStatus({ type: "running" }),
          onDisconnect: () => {
            updateStatus({ type: "ended", reason: "finished" });
            cleanup();
          },
          onError: (msg) => {
            updateStatus({
              type: "ended",
              reason: "error",
              error: new Error(msg),
            });
            cleanup();
          },
          onModeChange: ({ mode }) => {
            if (disposed) return;
            for (const cb of modeCallbacks)
              cb(mode === "speaking" ? "speaking" : "listening");
          },
          onMessage: (msg) => {
            if (disposed) return;
            for (const cb of transcriptCallbacks) {
              cb({
                role: msg.role === "user" ? "user" : "assistant",
                text: msg.message,
                isFinal: true,
              });
            }
          },
        });
      } catch (error) {
        updateStatus({
          type: "ended",
          reason: "error",
          error: error instanceof Error ? error : new Error(String(error)),
        });
        cleanup();
      }
    };
    doConnect();

    return session;
  }
}
```

### Usage [#usage]

```tsx
import { ElevenLabsVoiceAdapter } from "@/lib/elevenlabs-voice-adapter";

const runtime = useChatRuntime({
  adapters: {
    voice: new ElevenLabsVoiceAdapter({
      agentId: process.env.NEXT_PUBLIC_ELEVENLABS_AGENT_ID!,
    }),
  },
});
```

## Example: LiveKit [#example-livekit]

[LiveKit](https://livekit.io/) provides realtime voice via WebRTC rooms with transcription support.
### Install Dependencies [#install-dependencies-1]

```bash
npm install livekit-client
```

### Usage [#usage-1]

```tsx
import { LiveKitVoiceAdapter } from "@/lib/livekit-voice-adapter";

const runtime = useChatRuntime({
  adapters: {
    voice: new LiveKitVoiceAdapter({
      url: process.env.NEXT_PUBLIC_LIVEKIT_URL!,
      token: async () => {
        const res = await fetch("/api/livekit-token", { method: "POST" });
        const { token } = await res.json();
        return token;
      },
    }),
  },
});
```

See the `examples/with-livekit` directory in the repository for a complete implementation, including the adapter and token endpoint.
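The adapter examples above each hand-roll a `Set` of callbacks per event (`onStatusChange`, `onTranscript`, `onModeChange`, `onVolumeChange`). If you write your own adapter, that pattern can be factored into a small dependency-free emitter. The sketch below is a hypothetical helper — it is not exported by `@assistant-ui/react` — showing the subscribe/emit/unsubscribe shape those `on*` methods expect:

```typescript
type Unsubscribe = () => void;

// Minimal emitter matching the subscription shape used by the
// RealtimeVoiceAdapter session callbacks. Hypothetical helper; wire
// `subscribe` to the session's on* method and call `emit` when your
// provider delivers data.
function createEmitter<T>() {
  const callbacks = new Set<(value: T) => void>();
  return {
    // Register a listener; returns an unsubscribe function.
    subscribe(cb: (value: T) => void): Unsubscribe {
      callbacks.add(cb);
      return () => {
        callbacks.delete(cb);
      };
    },
    // Fan the value out to all current listeners.
    emit(value: T) {
      for (const cb of callbacks) cb(value);
    },
    // Drop all listeners, e.g. from the session's cleanup path.
    clear() {
      callbacks.clear();
    },
  };
}
```

In an adapter this collapses the four callback sets to four emitters, e.g. `const volume = createEmitter<number>()`, then `onVolumeChange: volume.subscribe` in the session object and `volume.emit(level)` from the provider callback.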