# LiveKit Integration

> Use KugelAudio TTS with the LiveKit Agents framework

KugelAudio provides an official plugin for the [LiveKit Agents](https://docs.livekit.io/agents/) framework, enabling ultra-low-latency text-to-speech in your voice AI agents.

## Why Use KugelAudio with LiveKit?

* **Native plugin:** Drop-in TTS provider for LiveKit's `AgentSession`
* **Streaming support:** Real-time WebSocket-based audio streaming
* **Ultra-low latency:** streaming TTS built for real-time agents — see [Latency](/latency) for current TTFA figures
* **Simple setup:** Works with the `AgentSession` API in `livekit-agents>=1.0.0`

## Installation

```bash theme={null}
# Quote the extra so shells such as zsh do not expand the brackets
pip install "kugelaudio[livekit]"
```

This installs the KugelAudio SDK along with the required LiveKit Agents dependencies (`livekit-agents>=1.0.0`).

## Quick Start

### Minimal Voice Agent

```python theme={null}
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero
from kugelaudio.livekit import TTS as KugelAudioTTS

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    participant = await ctx.wait_for_participant()

    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=KugelAudioTTS(
            model="kugel-3",
            voice_id=1071,
            sample_rate=24000,
        ),
        vad=silero.VAD.load(),
    )

    agent = Agent(
        instructions="You are a helpful voice assistant."
    )

    await session.start(room=ctx.room, agent=agent)
    await session.say("Hello! How can I help you?")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

<Note>
  Set the `KUGELAUDIO_API_KEY` environment variable or pass `api_key` directly to the `TTS` constructor.
</Note>

## Configuration

### TTS Parameters

| Parameter         | Type                    | Default                      | Description                                                                                   |
| ----------------- | ----------------------- | ---------------------------- | --------------------------------------------------------------------------------------------- |
| `api_key`         | `str`                   | `KUGELAUDIO_API_KEY` env     | Your KugelAudio API key                                                                       |
| `model`           | `str`                   | `kugel-3`                    | TTS model (`kugel-3`)                                                                         |
| `voice_id`        | `int \| None`           | `None`                       | Voice ID to use (server default if `None`)                                                    |
| `sample_rate`     | `int`                   | `24000`                      | Output sample rate in Hz                                                                      |
| `cfg_scale`       | `float`                 | `2.0`                        | CFG scale for generation quality                                                              |
| `max_new_tokens`  | `int`                   | `2048`                       | Maximum tokens to generate                                                                    |
| `normalize`       | `bool`                  | `True`                       | Apply loudness normalization to output audio                                                  |
| `language`        | `str \| None`           | `None`                       | ISO 639-1 language code (e.g. `"de"`, `"en"`). Skips auto-detection — see [Latency](/latency) |
| `base_url`        | `str`                   | `https://api.kugelaudio.com` | API base URL                                                                                  |
| `word_timestamps` | `bool`                  | `False`                      | Enable word-level time alignments (opt-in; required for aligned transcript)                   |
| `http_session`    | `ClientSession \| None` | `None`                       | Optional aiohttp session to reuse                                                             |

### Supported Sample Rates

| Rate    | Notes                     |
| ------- | ------------------------- |
| `24000` | Native rate (recommended) |
| `22050` | Half the CD sample rate   |
| `16000` | Wideband telephony        |
| `8000`  | Narrowband telephony      |

<Tip>
  Use the native `24000` Hz sample rate for best quality and lowest latency. Lower rates use server-side resampling with negligible impact — see [Latency](/latency).
</Tip>
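
To make the trade-off between the supported rates concrete, the sketch below computes the raw size of a short audio frame at each one. It assumes 16-bit mono PCM, which this page does not state, so treat `BYTES_PER_SAMPLE` and `CHANNELS` as placeholders to adjust for the actual output format.

```python
# Assumption: 16-bit (2-byte) mono PCM. This page does not specify the sample format.
BYTES_PER_SAMPLE = 2
CHANNELS = 1

# Rates taken from the table above.
SUPPORTED_RATES = (24000, 22050, 16000, 8000)


def pcm_bytes(sample_rate: int, duration_ms: int) -> int:
    """Return the raw PCM size in bytes for `duration_ms` of audio."""
    samples = sample_rate * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


for rate in SUPPORTED_RATES:
    print(f"{rate} Hz: {pcm_bytes(rate, 20)} bytes per 20 ms frame")
```

Under these assumptions a 20 ms frame is 960 bytes at `24000` Hz and 320 bytes at `8000` Hz.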

### Models

Use `kugel-3`, the current production model for all use cases (voice agents,
narration, brand voices). See [Models](/models) for capabilities and the full
comparison, and [Latency](/latency) for TTFA figures.

## Usage Patterns

### Non-Streaming Synthesis

Use `synthesize()` for one-shot text-to-speech:

```python theme={null}
from kugelaudio.livekit import TTS

tts = TTS(model="kugel-3", voice_id=1071)

# Synthesize a complete text
stream = tts.synthesize("Hello, this is a complete sentence.")
async for event in stream:
    # Process audio frames
    pass
```

### Streaming Synthesis

Use `stream()` for real-time text input (e.g., from an LLM):

```python theme={null}
from kugelaudio.livekit import TTS

tts = TTS(model="kugel-3", voice_id=1071)

# Create a streaming session
stream = tts.stream()

# Send text chunks as they arrive from an LLM
stream.push_text("Hello, ")
stream.push_text("how are you today?")
stream.flush()
stream.end_input()

# Receive audio frames
async for event in stream:
    # Process audio frames
    pass
```
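
LLM output usually arrives as small token fragments. One optional pattern is to group fragments into whole sentences before handing them to `push_text()`. The sketch below uses a hypothetical `SentenceBuffer` helper that is not part of the KugelAudio SDK. Whether this helps depends on how the plugin segments text internally, which this page does not describe.

```python
import re

# Hypothetical helper for illustration; not part of the KugelAudio SDK.
# Split after sentence-ending punctuation that is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


class SentenceBuffer:
    """Accumulate text fragments and release complete sentences."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, fragment: str) -> list[str]:
        """Add a fragment and return any sentences that are now complete."""
        self._pending += fragment
        parts = _SENTENCE_END.split(self._pending)
        # The last part may be an unfinished sentence, so keep it buffered.
        self._pending = parts.pop()
        return [part for part in parts if part]

    def drain(self) -> str:
        """Return whatever text is left once the input has ended."""
        rest, self._pending = self._pending.strip(), ""
        return rest
```

Each sentence returned by `feed()` would be passed to `stream.push_text(...)`, and the result of `drain()` would be pushed just before `stream.end_input()`.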

### Setting the Language

Set `language` to skip server-side auto-detection on every request (see [Latency](/latency)):

```python theme={null}
tts = KugelAudioTTS(
    model="kugel-3",
    voice_id=1071,
    language="de",  # German text normalization (e.g. "123" → "einhundertdreiundzwanzig")
)
```

Supported languages: `de`, `en`, `fr`, `es`, `it`, `pt`, `nl`, `pl`, `sv`, `da`, `no`, `fi`, `cs`, `hu`, `ro`, `el`, `uk`, `bg`, `tr`, `vi`, `ar`, `hi`, `zh`, `ja`, `ko`.

<Tip>
  Always set `language` when you know the output language in advance. This is especially important for real-time voice agents where every millisecond counts.
</Tip>
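
Applications often hold a full locale tag such as `de-DE` rather than a bare language code. The sketch below uses a hypothetical `normalize_language` helper to map a locale tag onto the supported codes listed above. It returns `None` (auto-detect) for anything else. That fallback is an assumption, since this page does not say how the server treats unsupported codes.

```python
from typing import Optional

# Language codes copied from the supported-languages list above.
SUPPORTED_LANGUAGES = frozenset({
    "de", "en", "fr", "es", "it", "pt", "nl", "pl", "sv", "da", "no", "fi", "cs",
    "hu", "ro", "el", "uk", "bg", "tr", "vi", "ar", "hi", "zh", "ja", "ko",
})


def normalize_language(tag: Optional[str]) -> Optional[str]:
    """Map a locale tag such as "de-DE" to a supported code, or None to auto-detect."""
    if not tag:
        return None
    # Keep only the primary subtag: "de-DE" and "de_DE" both become "de".
    code = tag.replace("_", "-").split("-")[0].lower()
    return code if code in SUPPORTED_LANGUAGES else None
```

The result can be passed straight to the constructor, for example `KugelAudioTTS(language=normalize_language("de-DE"))`.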

### Updating Options at Runtime

You can change TTS options dynamically without creating a new instance:

```python theme={null}
tts = KugelAudioTTS(model="kugel-3", voice_id=1071)

# Switch voice mid-conversation
tts.update_options(voice_id=300)

# Set the model explicitly (kugel-3 is the current production model)
tts.update_options(model="kugel-3")

# Set or change language
tts.update_options(language="de")

# Adjust generation parameters
tts.update_options(cfg_scale=1.5, max_new_tokens=4096)
```

### Word-Level Alignment

Word timestamps are **off by default** (including for `kugel-3`), which avoids server-side post-processing errors on models where alignment is not yet supported.

When you set `word_timestamps=True`, the server performs forced alignment on each audio chunk and delivers per-word timing alongside the audio. LiveKit's `AgentSession` uses these timings for barge-in and transcript sync via the `aligned_transcript` capability (advertised only when timestamps are enabled).

```python theme={null}
tts = KugelAudioTTS(
    model="kugel-3",
    voice_id=1071,
    word_timestamps=True,  # opt-in
)

# LiveKit receives TimedString objects with word boundaries automatically
```

<Note>
  Word alignments add no extra audio latency when supported. Timestamps are delivered shortly after each audio chunk — see [Word timestamps](/streaming/word-timestamps).
</Note>

<Tip>
  If synthesis fails with "Audio post-processing failed", keep `word_timestamps=False` (the default) or switch to a model that supports alignment.
</Tip>
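
To illustrate why per-word timing matters for barge-in, the sketch below trims a transcript to the words that had finished playing when the user interrupted. LiveKit handles this internally. The `WordTiming` type, its field names, and the sample timings are all hypothetical and exist only to show the idea.

```python
from typing import NamedTuple


class WordTiming(NamedTuple):
    # Hypothetical shape for illustration; not the SDK's or LiveKit's actual type.
    word: str
    start: float  # seconds from the start of the utterance
    end: float  # seconds from the start of the utterance


def spoken_text(timings: list[WordTiming], interrupted_at: float) -> str:
    """Return the words whose audio had finished before the interruption."""
    return " ".join(t.word for t in timings if t.end <= interrupted_at)


# Made-up timings, purely for demonstration.
timings = [
    WordTiming("Hello", 0.00, 0.35),
    WordTiming("how", 0.40, 0.55),
    WordTiming("can", 0.55, 0.70),
    WordTiming("I", 0.70, 0.78),
    WordTiming("help", 0.78, 1.05),
]
print(spoken_text(timings, interrupted_at=0.72))  # Hello how can
```

Without timings, an agent can only guess how much of a reply the user actually heard before interrupting.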

### Plugin Registration

You can also register KugelAudio as a LiveKit plugin namespace:

```python theme={null}
from kugelaudio.livekit import register_plugin

# Register the plugin
register_plugin()

# Now available via livekit.plugins namespace
from livekit.plugins import kugelaudio
tts = kugelaudio.TTS(model="kugel-3")
```

## Complete Voice Agent Example

Here's a fuller voice agent that reads the voice ID from the environment and logs TTS metrics:

```python theme={null}
import logging
import os
from livekit.agents import (
    Agent, AgentSession, JobContext,
    WorkerOptions, cli, metrics,
)
from livekit.agents.voice import MetricsCollectedEvent
from livekit.plugins import deepgram, openai, silero
from kugelaudio.livekit import TTS as KugelAudioTTS

logger = logging.getLogger("voice-agent")

async def entrypoint(ctx: JobContext):
    await ctx.connect()
    participant = await ctx.wait_for_participant()

    # Initialize components
    tts = KugelAudioTTS(
        voice_id=int(os.environ.get("KUGELAUDIO_VOICE_ID", "280")),
        model="kugel-3",
        sample_rate=24000,
    )

    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=tts,
        vad=silero.VAD.load(),
    )

    # Log TTS metrics
    @session.on("metrics_collected")
    def on_metrics(ev: MetricsCollectedEvent):
        for metric in ev.metrics:
            if hasattr(metric, "ttfb") and hasattr(metric, "characters_count"):
                logger.info(
                    f"TTS: ttfb={metric.ttfb:.3f}s, "
                    f"duration={metric.duration:.3f}s, "
                    f"chars={metric.characters_count}"
                )
        metrics.log_metrics(ev.metrics)

    agent = Agent(
        instructions="""You are a helpful voice assistant.
Keep responses concise (1-3 sentences) for natural conversation."""
    )

    await session.start(room=ctx.room, agent=agent)
    await session.say("Hello! How can I help you today?")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
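
The handler above logs each metric individually. For a quick picture of latency over a whole session, the `ttfb` values can be collected and summarized. The sketch below uses a hypothetical `summarize_ttfb` helper that is not part of the SDK or of LiveKit, and relies only on the standard library.

```python
import math
import statistics


def summarize_ttfb(samples: list[float]) -> dict[str, float]:
    """Summarize time-to-first-byte samples given in seconds."""
    if not samples:
        raise ValueError("no TTFB samples collected")
    ordered = sorted(samples)
    # Nearest-rank 95th percentile: the value at position ceil(0.95 * n).
    rank = math.ceil(0.95 * len(ordered))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

One way to use it is to append `metric.ttfb` to a list inside `on_metrics` and log `summarize_ttfb(...)` when the session ends.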

### Running the Agent

```bash theme={null}
# Set environment variables
export KUGELAUDIO_API_KEY="your-api-key"
export LIVEKIT_URL="wss://your-livekit-server.com"
export LIVEKIT_API_KEY="your-livekit-key"
export LIVEKIT_API_SECRET="your-livekit-secret"

# Run in console mode (for testing)
python voice_agent.py console

# Run as a worker (for production)
python voice_agent.py start
```

## Environment Variables

| Variable              | Required | Description                    |
| --------------------- | -------- | ------------------------------ |
| `KUGELAUDIO_API_KEY`  | Yes      | Your KugelAudio API key        |
| `LIVEKIT_URL`         | Yes      | Your LiveKit server URL        |
| `LIVEKIT_API_KEY`     | Yes      | LiveKit API key                |
| `LIVEKIT_API_SECRET`  | Yes      | LiveKit API secret             |
| `KUGELAUDIO_VOICE_ID` | No       | Default voice ID to use        |
| `DEEPGRAM_API_KEY`    | Yes\*    | Required if using Deepgram STT |
| `OPENAI_API_KEY`      | Yes\*    | Required if using OpenAI LLM   |

\* Only needed when that provider is used, as in the examples on this page.
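
A missing variable typically surfaces as an error deep inside a provider at runtime. A small startup check can report every missing name at once. The sketch below uses a hypothetical `missing_env_vars` helper with the variable names from the table above. It assumes the Deepgram and OpenAI providers from the examples, and that the KugelAudio key comes from the environment rather than the `api_key` argument.

```python
from collections.abc import Mapping

# Names taken from the environment variable table above.
ALWAYS_REQUIRED = (
    "KUGELAUDIO_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)
PROVIDER_KEYS = {"deepgram": "DEEPGRAM_API_KEY", "openai": "OPENAI_API_KEY"}


def missing_env_vars(
    env: Mapping[str, str],
    providers: tuple[str, ...] = ("deepgram", "openai"),
) -> list[str]:
    """Return the required variable names that are unset or empty in `env`."""
    required = list(ALWAYS_REQUIRED) + [PROVIDER_KEYS[p] for p in providers]
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars(os.environ)` before `cli.run_app(...)` and exiting with the returned names gives a clearer failure than a provider error mid-call.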

## Troubleshooting

<AccordionGroup>
  <Accordion title="API key not found">
    Make sure `KUGELAUDIO_API_KEY` is set in your environment or pass `api_key` directly:

    ```python theme={null}
    tts = KugelAudioTTS(api_key="your-api-key")
    ```
  </Accordion>

  <Accordion title="WebSocket connection fails">
    Verify your `base_url` is correct and the KugelAudio API is reachable. The plugin connects via WebSocket (`wss://`) for audio streaming.
  </Accordion>

  <Accordion title="Audio quality issues">
    * Use the native `24000` Hz sample rate for best results
    * Try increasing `cfg_scale` (e.g., `2.5`) for more expressive output
    * Confirm that `model` is set to `kugel-3`, the current production model
  </Accordion>

  <Accordion title="High latency">
    * Set `language` explicitly (e.g. `language="de"`) to skip auto-detection — see [Latency](/latency)
    * Use `kugel-3`, the current production model, for real-time conversations
    * Lower `cfg_scale` (e.g., `1.5`) trades slight quality for speed
    * Reuse `http_session` across requests to avoid connection overhead
  </Accordion>
</AccordionGroup>
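
When a WebSocket connection fails, a malformed `base_url` is worth ruling out before looking at the network. The sketch below uses a hypothetical `base_url_problems` helper to catch common mistakes such as a missing scheme. The set of accepted schemes is an assumption. This page documents an `https://` default and a `wss://` streaming connection, but not everything the plugin accepts.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are expected. Adjust if your deployment differs.
EXPECTED_SCHEMES = ("https", "wss")


def base_url_problems(base_url: str) -> list[str]:
    """Return human-readable problems found in `base_url` (empty if none)."""
    problems = []
    if base_url != base_url.strip():
        problems.append("leading or trailing whitespace")
    parts = urlsplit(base_url.strip())
    if parts.scheme not in EXPECTED_SCHEMES:
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host")
    return problems
```

An empty list only means the URL is well formed. It does not confirm that the API is reachable from your network.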

## Next Steps

<CardGroup cols={2}>
  <Card title="PipeCat Integration" icon="pipe-section" href="/integrations/pipecat">
    Use KugelAudio with PipeCat pipelines
  </Card>

  <Card title="Streaming" icon="wave-pulse" href="/streaming/overview">
    Advanced streaming techniques
  </Card>
</CardGroup>
