In a live voice agent, the end user often starts speaking while the agent is still talking. When your VAD (voice-activity detection) detects this, call cancelCurrent() (JS/Java) / cancel_current() (Python) to stop generation for the current turn immediately, without closing the WebSocket. This differs from endSession() / {"close": true}, which ends the turn gracefully: it flushes whatever text is still buffered and drains the remaining audio (see Turn lifecycle). A barge-in does the opposite and abandons the turn:
  • The actively-generating sentence is cancelled mid-stream.
  • Any text that was buffered or queued but not yet spoken is dropped.
  • No further audio chunks for the cancelled turn are emitted after the acknowledgement. No final (end-of-audio) frame is sent for a cancelled turn — the interrupted ack takes its place.
  • The WebSocket stays open, so you can send() the next user turn immediately (session config is re-sent automatically on that first send).
The call resolves once the server acknowledges with {"interrupted": true} (or after a short quiet timeout if the server has gone silent).
# Assumes `client` is an initialized SDK client, `play_audio` is your own
# playback function, and this code runs inside a coroutine.
async with client.tts.streaming_session(voice_id=1071) as session:
    async for chunk in session.send("This is a very long answer that the user "):
        play_audio(chunk)

    # VAD detected the user speaking over the agent, so barge in:
    await session.cancel_current()

    # Socket is still open, so start the next turn right away:
    async for chunk in session.send("Sure, what would you like instead?", flush=True):
        play_audio(chunk)
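The "resolves on acknowledgement, or after a short quiet timeout" behaviour described above can be pictured with a small sketch. This is not the SDK's implementation: the function name wait_for_interrupt_ack, the asyncio.Queue standing in for the socket's receive side, the 0.5 s default, and the convention that audio arrives as bytes while control messages arrive as JSON text are all assumptions made for illustration.

```python
import asyncio
import json


async def wait_for_interrupt_ack(incoming: asyncio.Queue, quiet_timeout: float = 0.5) -> bool:
    """Return True once an {"interrupted": true} message arrives.

    Return False if nothing at all arrives for `quiet_timeout` seconds.
    (Hypothetical helper; the queue stands in for the WebSocket receive side.)
    """
    while True:
        try:
            # Each received message restarts the quiet timer.
            raw = await asyncio.wait_for(incoming.get(), timeout=quiet_timeout)
        except asyncio.TimeoutError:
            return False  # server has gone silent; stop waiting
        if isinstance(raw, (bytes, bytearray)):
            continue  # assumed: binary frames are late audio; ignore them
        message = json.loads(raw)
        if message.get("interrupted") is True:
            return True


async def demo() -> tuple:
    acked = asyncio.Queue()
    acked.put_nowait(b"\x00\x01")              # audio frame already in transit
    acked.put_nowait('{"interrupted": true}')  # the acknowledgement
    silent = asyncio.Queue()                   # server never answers
    return (
        await wait_for_interrupt_ack(acked, quiet_timeout=0.05),
        await wait_for_interrupt_ack(silent, quiet_timeout=0.05),
    )
```

Running asyncio.run(demo()) exercises both paths: the first queue yields an acknowledgement after one late audio frame, and the second stays silent until the timeout expires.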
Stop your local audio playback as soon as you call cancelCurrent() / cancel_current(); don’t wait for the acknowledgement. A few audio frames already in transit may still arrive before the server confirms the cancel. The onInterrupted callback (JS/Java) marks the point after which no more frames for the cancelled turn will come.
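One way to make sure late frames from a cancelled turn are never played is to tag every chunk with a local turn counter and discard anything that does not belong to the active turn. The sketch below is client-side only and makes no SDK calls; the PlaybackGate class and its method names are hypothetical.

```python
from collections import deque


class PlaybackGate:
    """Queue audio chunks per turn and drop chunks from cancelled turns."""

    def __init__(self) -> None:
        self.active_turn = 0
        self._pending = deque()

    def enqueue(self, turn: int, chunk: bytes) -> bool:
        """Accept a chunk only if it belongs to the active turn."""
        if turn != self.active_turn:
            return False  # late frame from a cancelled turn; discard it
        self._pending.append(chunk)
        return True

    def barge_in(self) -> int:
        """Flush queued audio immediately and start a new turn."""
        self._pending.clear()
        self.active_turn += 1
        return self.active_turn

    def pending(self) -> list:
        """Chunks still waiting to be played, oldest first."""
        return list(self._pending)


gate = PlaybackGate()
cancelled_turn = gate.active_turn
gate.enqueue(cancelled_turn, b"agent audio")
next_turn = gate.barge_in()               # call this alongside cancel_current()
gate.enqueue(cancelled_turn, b"late")     # rejected: belongs to the old turn
gate.enqueue(next_turn, b"new answer")    # accepted
```

Because barge_in() flushes the queue synchronously, playback stops without waiting for the server's acknowledgement, which is the behaviour recommended above.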

Barge-in on multi-context sessions

For multi-context sessions, barge-in is per context: call closeContext(contextId, true) (JS/Java) / close_context(context_id, immediate=True) (Python), or send {"close_context": true, "context_id": "...", "immediate": true} on the raw socket. The targeted context’s in-flight generation is cancelled and its buffered text dropped; other contexts and the connection stay open.
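For raw-socket clients, the per-context barge-in message can be built as shown in this minimal sketch. The field names come from the text above; the helper name barge_in_message and the example context ID are hypothetical, and how you send the resulting string depends on your WebSocket library.

```python
import json


def barge_in_message(context_id: str) -> str:
    """Serialize an immediate close for a single context (hypothetical helper)."""
    payload = {
        "close_context": True,
        "context_id": context_id,
        "immediate": True,  # cancel in-flight generation and drop buffered text
    }
    return json.dumps(payload)


# "ctx-1" is a placeholder context ID.
message = barge_in_message("ctx-1")
```

Only the targeted context is affected, so one message is needed for each context you want to interrupt.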