# Realtime Voice

URL: /docs/guides/voice

Bidirectional realtime voice conversations with AI agents.

assistant-ui supports realtime bidirectional voice via the `RealtimeVoiceAdapter` interface. This enables live voice conversations where the user speaks into their microphone and the AI agent responds with audio, with transcripts appearing in the thread in real time.

## How It Works [#how-it-works]

Unlike [Speech Synthesis](/docs/guides/speech) (text-to-speech) and [Dictation](/docs/guides/dictation) (speech-to-text), the voice adapter handles **both directions simultaneously** — the user's microphone audio is streamed to the agent, and the agent's audio response is played back, all while transcripts are appended to the message thread.

| Feature                                  | Adapter                  | Direction                            |
| ---------------------------------------- | ------------------------ | ------------------------------------ |
| [Speech Synthesis](/docs/guides/speech)  | `SpeechSynthesisAdapter` | Text → Audio (one message at a time) |
| [Dictation](/docs/guides/dictation)      | `DictationAdapter`       | Audio → Text (into composer)         |
| **Realtime Voice**                       | `RealtimeVoiceAdapter`   | Audio ↔ Audio (bidirectional, live)  |

## Configuration [#configuration]

Pass a `RealtimeVoiceAdapter` implementation to the runtime via `adapters.voice`:

```tsx
const runtime = useChatRuntime({
  adapters: {
    voice: new MyVoiceAdapter({ /* ... */ }),
  },
});
```

When a voice adapter is provided, `capabilities.voice` is automatically set to `true`.

## Hooks [#hooks]

### useVoiceState [#usevoicestate]

Returns the current voice session state, or `undefined` when no session is active.
```tsx
import { useVoiceState, useVoiceVolume } from "@assistant-ui/react";

const voiceState = useVoiceState();
// voiceState?.status.type — "starting" | "running" | "ended"
// voiceState?.isMuted — boolean
// voiceState?.mode — "listening" | "speaking"

const volume = useVoiceVolume();
// volume — number (0–1, real-time audio level via separate subscription)
```

### useVoiceControls [#usevoicecontrols]

Returns methods to control the voice session.

```tsx
import { useVoiceControls } from "@assistant-ui/react";

const { connect, disconnect, mute, unmute } = useVoiceControls();
```

### UI Example [#ui-example]

```tsx
import { useVoiceState, useVoiceControls } from "@assistant-ui/react";
import { PhoneIcon, PhoneOffIcon, MicIcon, MicOffIcon } from "lucide-react";

function VoiceControls() {
  const voiceState = useVoiceState();
  const { connect, disconnect, mute, unmute } = useVoiceControls();

  const isRunning = voiceState?.status.type === "running";
  const isStarting = voiceState?.status.type === "starting";
  const isMuted = voiceState?.isMuted ?? false;

  if (!isRunning && !isStarting) {
    return (
      <button onClick={connect} aria-label="Start voice session">
        <PhoneIcon />
      </button>
    );
  }

  return (
    <>
      <button
        onClick={isMuted ? unmute : mute}
        disabled={isStarting}
        aria-label={isMuted ? "Unmute microphone" : "Mute microphone"}
      >
        {isMuted ? <MicOffIcon /> : <MicIcon />}
      </button>
      <button onClick={disconnect} aria-label="End voice session">
        <PhoneOffIcon />
      </button>
    </>
  );
}
```

## Custom Adapters [#custom-adapters]

Implement the `RealtimeVoiceAdapter` interface to integrate with any voice provider.

### RealtimeVoiceAdapter Interface [#realtimevoiceadapter-interface]

```tsx
import type { RealtimeVoiceAdapter } from "@assistant-ui/react";

class MyVoiceAdapter implements RealtimeVoiceAdapter {
  connect(options: {
    abortSignal?: AbortSignal;
  }): RealtimeVoiceAdapter.Session {
    // Establish connection to your voice service
    return {
      get status() { /* ... */ },
      get isMuted() { /* ... */ },
      disconnect: () => { /* ... */ },
      mute: () => { /* ... */ },
      unmute: () => { /* ... */ },
      onStatusChange: (callback) => {
        // Status: { type: "starting" } → { type: "running" } → { type: "ended", reason }
        return () => {}; // Return unsubscribe
      },
      onTranscript: (callback) => {
        // callback({ role: "user" | "assistant", text: "...", isFinal: true })
        // Transcripts are automatically appended as messages in the thread.
        return () => {};
      },
      // Report who is speaking (drives the VoiceOrb speaking animation)
      onModeChange: (callback) => {
        // callback("listening") — user's turn
        // callback("speaking") — agent's turn
        return () => {};
      },
      // Report real-time audio level (0–1) for visual feedback
      onVolumeChange: (callback) => {
        // callback(0.72) — drives VoiceOrb amplitude and waveform bar heights
        return () => {};
      },
    };
  }
}
```

### Session Lifecycle [#session-lifecycle]

The session status follows the same pattern as other adapters:

```
starting → running → ended
```

The `ended` status includes a `reason`:

* `"finished"` — the session ended normally
* `"cancelled"` — the session was cancelled by the user
* `"error"` — the session ended due to an error (includes an `error` field)

### Mode and Volume [#mode-and-volume]

All adapters must implement `onModeChange` and `onVolumeChange`. If your provider doesn't support these, return a no-op unsubscribe:

* **`onModeChange`** — Reports `"listening"` (user's turn) or `"speaking"` (agent's turn). The `VoiceOrb` switches to the active speaking animation.
* **`onVolumeChange`** — Reports a real-time audio level (`0`–`1`). The `VoiceOrb` modulates its amplitude and glow, and waveform bars scale to match.

When using `createVoiceSession`, these are handled automatically — call `session.emitMode()` and `session.emitVolume()` when your provider delivers data.

### Transcript Handling [#transcript-handling]

Transcripts emitted via `onTranscript` are automatically appended to the message thread:

* **User transcripts** (`role: "user"`, `isFinal: true`) are appended as user messages.
* **Assistant transcripts** (`role: "assistant"`) are streamed into an assistant message. The message shows a "running" status until `isFinal: true` is received.

## Example: ElevenLabs Conversational AI [#example-elevenlabs-conversational-ai]

[ElevenLabs Conversational AI](https://elevenlabs.io/docs/agents-platform/overview) provides realtime voice agents via WebRTC.

### Install Dependencies [#install-dependencies]

```bash
npm install @elevenlabs/client
```

### Adapter [#adapter]

```tsx title="lib/elevenlabs-voice-adapter.ts"
import type { RealtimeVoiceAdapter, Unsubscribe } from "@assistant-ui/react";
import { VoiceConversation } from "@elevenlabs/client";

export class ElevenLabsVoiceAdapter implements RealtimeVoiceAdapter {
  private _agentId: string;

  constructor(options: { agentId: string }) {
    this._agentId = options.agentId;
  }

  connect(options: {
    abortSignal?: AbortSignal;
  }): RealtimeVoiceAdapter.Session {
    const statusCallbacks = new Set<(s: RealtimeVoiceAdapter.Status) => void>();
    const transcriptCallbacks = new Set<
      (t: RealtimeVoiceAdapter.TranscriptItem) => void
    >();
    const modeCallbacks = new Set<(m: RealtimeVoiceAdapter.Mode) => void>();
    const volumeCallbacks = new Set<(v: number) => void>();

    let currentStatus: RealtimeVoiceAdapter.Status = { type: "starting" };
    let isMuted = false;
    let conversation: VoiceConversation | null = null;
    let disposed = false;

    const updateStatus = (status: RealtimeVoiceAdapter.Status) => {
      if (disposed) return;
      currentStatus = status;
      for (const cb of statusCallbacks) cb(status);
    };

    const cleanup = () => {
      disposed = true;
      conversation = null;
      statusCallbacks.clear();
      transcriptCallbacks.clear();
      modeCallbacks.clear();
      volumeCallbacks.clear();
    };

    const session: RealtimeVoiceAdapter.Session = {
      get status() {
        return currentStatus;
      },
      get isMuted() {
        return isMuted;
      },
      disconnect: () => {
        conversation?.endSession();
        cleanup();
      },
      mute: () => {
        conversation?.setMicMuted(true);
        isMuted = true;
      },
      unmute: () => {
        conversation?.setMicMuted(false);
        isMuted = false;
      },
      onStatusChange: (cb): Unsubscribe => {
        statusCallbacks.add(cb);
        return () => statusCallbacks.delete(cb);
      },
      onTranscript: (cb): Unsubscribe => {
        transcriptCallbacks.add(cb);
        return () => transcriptCallbacks.delete(cb);
      },
      onModeChange: (cb): Unsubscribe => {
        modeCallbacks.add(cb);
        return () => modeCallbacks.delete(cb);
      },
      onVolumeChange: (cb): Unsubscribe => {
        volumeCallbacks.add(cb);
        return () => volumeCallbacks.delete(cb);
      },
    };

    if (options.abortSignal) {
      options.abortSignal.addEventListener(
        "abort",
        () => {
          conversation?.endSession();
          cleanup();
        },
        { once: true },
      );
    }

    const doConnect = async () => {
      if (disposed) return;
      try {
        conversation = await VoiceConversation.startSession({
          agentId: this._agentId,
          onConnect: () => updateStatus({ type: "running" }),
          onDisconnect: () => {
            updateStatus({ type: "ended", reason: "finished" });
            cleanup();
          },
          onError: (msg) => {
            updateStatus({
              type: "ended",
              reason: "error",
              error: new Error(msg),
            });
            cleanup();
          },
          onModeChange: ({ mode }) => {
            if (disposed) return;
            for (const cb of modeCallbacks)
              cb(mode === "speaking" ? "speaking" : "listening");
          },
          onMessage: (msg) => {
            if (disposed) return;
            for (const cb of transcriptCallbacks) {
              cb({
                role: msg.role === "user" ? "user" : "assistant",
                text: msg.message,
                isFinal: true,
              });
            }
          },
        });
      } catch (error) {
        updateStatus({
          type: "ended",
          reason: "error",
          error: error instanceof Error ? error : new Error(String(error)),
        });
        cleanup();
      }
    };
    doConnect();

    return session;
  }
}
```

### Usage [#usage]

```tsx
import { ElevenLabsVoiceAdapter } from "@/lib/elevenlabs-voice-adapter";

const runtime = useChatRuntime({
  adapters: {
    voice: new ElevenLabsVoiceAdapter({
      agentId: process.env.NEXT_PUBLIC_ELEVENLABS_AGENT_ID!,
    }),
  },
});
```

## Example: LiveKit [#example-livekit]

[LiveKit](https://livekit.io/) provides realtime voice via WebRTC rooms with transcription support.
### Install Dependencies [#install-dependencies-1]

```bash
npm install livekit-client
```

### Usage [#usage-1]

```tsx
import { LiveKitVoiceAdapter } from "@/lib/livekit-voice-adapter";

const runtime = useChatRuntime({
  adapters: {
    voice: new LiveKitVoiceAdapter({
      url: process.env.NEXT_PUBLIC_LIVEKIT_URL!,
      token: async () => {
        const res = await fetch("/api/livekit-token", { method: "POST" });
        const { token } = await res.json();
        return token;
      },
    }),
  },
});
```

See the `examples/with-livekit` directory in the repository for a complete implementation, including the adapter and token endpoint.
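The adapter examples above each hand-roll a `Set` of callbacks per event (`onStatusChange`, `onTranscript`, `onModeChange`, `onVolumeChange`). If you write your own adapter, that pattern can be factored into a small dependency-free emitter. The sketch below is a hypothetical helper — it is not exported by `@assistant-ui/react` — showing the subscribe/emit/unsubscribe shape those `on*` methods expect:

```typescript
type Unsubscribe = () => void;

// Minimal emitter matching the subscription shape used by the
// RealtimeVoiceAdapter session callbacks. Hypothetical helper; wire
// `subscribe` to the session's on* method and call `emit` when your
// provider delivers data.
function createEmitter<T>() {
  const callbacks = new Set<(value: T) => void>();
  return {
    // Register a listener; returns an unsubscribe function.
    subscribe(cb: (value: T) => void): Unsubscribe {
      callbacks.add(cb);
      return () => {
        callbacks.delete(cb);
      };
    },
    // Fan the value out to all current listeners.
    emit(value: T) {
      for (const cb of callbacks) cb(value);
    },
    // Drop all listeners, e.g. from the session's cleanup path.
    clear() {
      callbacks.clear();
    },
  };
}
```

In an adapter this collapses the four callback sets to four emitters, e.g. `const volume = createEmitter<number>()`, then `onVolumeChange: volume.subscribe` in the session object and `volume.emit(level)` from the provider callback.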