Message Timing & Token Stats

Display stream performance metrics on assistant messages in AI chat: generation duration, tokens per second, and time to first token (TTFT), rendered via assistant-ui's React components.

This feature is experimental. The useMessageTiming() API and the set of tracked fields may change in future versions.

The MessageTiming registry component provides a ready-made badge + popover UI. This guide covers the underlying useMessageTiming() hook for custom implementations and runtime-specific setup.

Reading Timing Data

Use useMessageTiming() inside a message component to access timing data:

import type { FC } from "react";
import { useMessageTiming } from "@assistant-ui/react";

const MessageTimingDisplay: FC = () => {
  const timing = useMessageTiming();
  if (!timing?.totalStreamTime) return null;

  const formatMs = (ms: number) =>
    ms < 1000 ? `${Math.round(ms)}ms` : `${(ms / 1000).toFixed(2)}s`;

  return (
    <span className="text-xs text-muted-foreground">
      {formatMs(timing.totalStreamTime)}
      {timing.tokensPerSecond !== undefined &&
        ` · ${timing.tokensPerSecond.toFixed(1)} tok/s`}
    </span>
  );
};

Place it inside MessagePrimitive.Root, typically near the action bar:

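import type { FC } from "react";
import { ActionBarPrimitive, MessagePrimitive } from "@assistant-ui/react";

const AssistantMessage: FC = () => {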
  return (
    <MessagePrimitive.Root>
      <MessagePrimitive.Parts>{...}</MessagePrimitive.Parts>
      <ActionBarPrimitive.Root>
        <ActionBarPrimitive.Copy />
        <ActionBarPrimitive.Reload />
        <MessageTimingDisplay />
      </ActionBarPrimitive.Root>
    </MessagePrimitive.Root>
  );
};

useMessageTiming() Return Fields

| Field | Type | Description |
| --- | --- | --- |
| streamStartTime | number | Unix timestamp when stream started |
| firstTokenTime | number? | Time to first text token (ms) |
| totalStreamTime | number? | Total stream duration (ms) |
| tokenCount | number? | Output token count from message metadata usage |
| tokensPerSecond | number? | Throughput (tokens/sec), when token usage is available |
| totalChunks | number | Total stream chunks received |
| toolCallCount | number | Number of tool calls |
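
As a rough illustration of handling the optional fields, the sketch below mirrors the table in a local TimingLike interface and formats whichever values are present. TimingLike and summarizeTiming are hypothetical names used only for this example; the library's own MessageTiming type may differ in detail.

```typescript
// Hypothetical local mirror of the fields listed in the table above.
// It is not imported from the library and may drift from the real type.
interface TimingLike {
  streamStartTime: number;
  firstTokenTime?: number;
  totalStreamTime?: number;
  tokenCount?: number;
  tokensPerSecond?: number;
  totalChunks: number;
  toolCallCount: number;
}

// Same millisecond formatting as the display component earlier in this guide.
const formatMs = (ms: number): string =>
  ms < 1000 ? `${Math.round(ms)}ms` : `${(ms / 1000).toFixed(2)}s`;

// Builds a one-line summary, skipping every optional field that is undefined.
function summarizeTiming(timing: TimingLike): string {
  const parts: string[] = [];
  if (timing.totalStreamTime !== undefined) {
    parts.push(formatMs(timing.totalStreamTime));
  }
  if (timing.firstTokenTime !== undefined) {
    parts.push(`TTFT ${formatMs(timing.firstTokenTime)}`);
  }
  if (timing.tokensPerSecond !== undefined) {
    parts.push(`${timing.tokensPerSecond.toFixed(1)} tok/s`);
  }
  parts.push(`${timing.totalChunks} chunks`);
  return parts.join(" · ");
}
```

Checking each optional field against undefined, rather than relying on truthiness, keeps a legitimate value of 0 from being dropped.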

Runtime Support

| Runtime | Supported | Notes |
| --- | --- | --- |
| Data Stream | Yes | Automatic via AssistantMessageAccumulator |
| AI SDK (useChatRuntime) | Yes | Automatic via client-side tracking |
| Local (useLocalRuntime) | Yes | Pass timing in ChatModelRunResult.metadata |
| ExternalStore | Yes | Pass timing in ThreadMessageLike.metadata |
| LangGraph | Yes | Automatic via client-side tracking |
| AG-UI | Yes | Automatic via client-side tracking |
| OpenCode | Yes | Automatic via client-side tracking |

Data Stream

Timing is tracked automatically inside AssistantMessageAccumulator. No setup required.

import { useDataStreamRuntime } from "@assistant-ui/react-data-stream";

const runtime = useDataStreamRuntime({ api: "/api/chat" });
// useMessageTiming() works out of the box

AI SDK (useChatRuntime)

Timing is tracked automatically on the client side by observing streaming state transitions and content changes. Timing is finalized when each stream completes.

tokenCount and tokensPerSecond require usage metadata from finish or finish-step in your AI SDK route. If usage metadata is not emitted, duration and TTFT metrics still work, but token-based metrics are omitted.

import { useChatRuntime } from "@assistant-ui/react-ai-sdk";

const runtime = useChatRuntime();
// useMessageTiming() works out of the box

Local (useLocalRuntime)

Pass timing in the metadata field of your ChatModelRunResult:

import type { ChatModelAdapter } from "@assistant-ui/react";

// callMyAPI is a placeholder for your own backend call.
const myAdapter: ChatModelAdapter = {
  async run({ messages, abortSignal }) {
    const startTime = Date.now();
    const result = await callMyAPI(messages, abortSignal);
    const totalStreamTime = Date.now() - startTime;
    const completionTokens = result.usage?.completionTokens;

    return {
      content: [{ type: "text", text: result.text }],
      metadata: {
        timing: {
          streamStartTime: startTime,
          totalStreamTime,
          tokenCount: completionTokens,
          // Skip throughput when usage is missing or the duration is zero.
          tokensPerSecond:
            completionTokens && totalStreamTime > 0
              ? completionTokens / (totalStreamTime / 1000)
              : undefined,
          // Non-streaming call: the whole response arrives as one chunk.
          totalChunks: 1,
          toolCallCount: 0,
        },
      },
    };
  },
};
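
The timing arithmetic in the adapter can be pulled into a small pure helper so it is easy to test in isolation. The sketch below is one possible shape: buildTiming and its parameters are hypothetical names, not part of assistant-ui, and it assumes a non-streaming call that yields a single chunk and no tool calls.

```typescript
// Hypothetical helper, not part of assistant-ui: derives the timing object
// for a non-streaming call from start/end timestamps and an optional token count.
function buildTiming(
  startTime: number, // Unix timestamp in ms, e.g. from Date.now()
  endTime: number, // Unix timestamp in ms taken after the call resolves
  completionTokens?: number,
) {
  // Clamp to zero in case the system clock moved backwards between readings.
  const totalStreamTime = Math.max(0, endTime - startTime);

  // Only report throughput when a token count exists and the duration is non-zero.
  const tokensPerSecond =
    completionTokens !== undefined && totalStreamTime > 0
      ? completionTokens / (totalStreamTime / 1000)
      : undefined;

  return {
    streamStartTime: startTime,
    totalStreamTime,
    tokenCount: completionTokens,
    tokensPerSecond,
    totalChunks: 1,
    toolCallCount: 0,
  };
}
```

Inside run, the result could then be returned as metadata: { timing: buildTiming(startTime, Date.now(), result.usage?.completionTokens) }.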

ExternalStore (useExternalStoreRuntime)

Pass timing in the metadata.timing field of your ThreadMessageLike messages:

import type { ThreadMessageLike } from "@assistant-ui/react";

const message: ThreadMessageLike = {
  role: "assistant",
  content: [{ type: "text", text: fullText }],
  metadata: {
    timing: {
      streamStartTime: startTime,
      firstTokenTime,
      totalStreamTime,
      tokenCount,
      tokensPerSecond,
      totalChunks: chunks,
      toolCallCount: 0,
    },
  },
};
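
The message above assumes that values such as startTime, firstTokenTime, and chunks have already been measured. One way to collect them while consuming your own stream is sketched below. StreamTimingTracker is a hypothetical helper, not a library API; it accepts an injectable clock so the logic can be tested deterministically, and it always reports toolCallCount as 0.

```typescript
// Hypothetical helper, not part of assistant-ui.
class StreamTimingTracker {
  private readonly now: () => number;
  private readonly startTime: number;
  private firstTokenTime?: number;
  private chunks = 0;

  // `now` defaults to Date.now but can be replaced with a fake clock in tests.
  constructor(now: () => number = Date.now) {
    this.now = now;
    this.startTime = this.now();
  }

  // Call once per received chunk; the first non-empty text fixes the TTFT.
  onChunk(text: string): void {
    this.chunks += 1;
    if (this.firstTokenTime === undefined && text.length > 0) {
      this.firstTokenTime = this.now() - this.startTime;
    }
  }

  // Call when the stream ends. tokenCount comes from your backend's usage data, if any.
  finish(tokenCount?: number) {
    const totalStreamTime = this.now() - this.startTime;
    return {
      streamStartTime: this.startTime,
      firstTokenTime: this.firstTokenTime,
      totalStreamTime,
      tokenCount,
      tokensPerSecond:
        tokenCount !== undefined && totalStreamTime > 0
          ? tokenCount / (totalStreamTime / 1000)
          : undefined,
      totalChunks: this.chunks,
      toolCallCount: 0,
    };
  }
}
```

The object returned by finish() has the same fields as the timing block above, so it could be assigned directly to metadata.timing once the stream completes.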

LangGraph (useLangGraphRuntime)

Timing is tracked automatically on the client side by observing streaming state transitions and LangChainMessage content changes. No setup required.

import { useLangGraphRuntime } from "@assistant-ui/react-langgraph";

const runtime = useLangGraphRuntime({ stream: myStream });
// useMessageTiming() works out of the box

AG-UI (useAgUiThreadRuntime)

Timing is tracked automatically on the client side by the AG-UI run aggregator. Each emitted message includes timing metadata computed from stream chunk observations.

import { useAgUiThreadRuntime } from "@assistant-ui/react-ag-ui";

const runtime = useAgUiThreadRuntime({ runtimeUrl: "..." });
// useMessageTiming() works out of the box

OpenCode (useOpenCodeRuntime)

Timing is tracked automatically on the client side by observing OpenCodeThreadState transitions and assistant message content deltas. No setup required.

import { useOpenCodeRuntime } from "@assistant-ui/react-opencode";

const runtime = useOpenCodeRuntime();
// useMessageTiming() works out of the box

API Reference

useMessageTiming()

const timing: MessageTiming | undefined = useMessageTiming();

Returns timing metadata for the current assistant message, or undefined for non-assistant messages or when no timing data is available.

Must be used inside a MessagePrimitive.Root context.