Why Use KugelAudio with PipeCat?
- Native service: Drop-in TTSService for PipeCat pipelines
- Persistent WebSocket: Connection reuse eliminates ~100-220ms handshake overhead per request
- Built-in metrics: Automatic TTFB and usage metrics tracking
- Ultra-low latency: ~39ms time-to-first-audio with kugel-1-turbo
Installation
The PipeCat integration requires Python 3.10 or higher and pipecat-ai>=0.0.60.
Quick Start
Basic Pipeline
Set the KUGELAUDIO_API_KEY environment variable or pass api_key directly to the constructor.
Configuration
Service Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | KUGELAUDIO_API_KEY env | Your KugelAudio API key |
| model | str | kugel-1-turbo | TTS model (kugel-1-turbo or kugel-1) |
| voice_id | int | required | Voice ID to use for synthesis |
| sample_rate | int | 24000 | Output sample rate in Hz |
| cfg_scale | float | 2.0 | CFG scale for generation quality |
| max_new_tokens | int | 2048 | Maximum tokens to generate |
| language | str \| None | None | ISO 639-1 language code (e.g., en, de). Skips server-side auto-detection, saving ~60-150ms per request |
| normalize | bool | True | Apply text normalization |
| base_url | str | https://api.kugelaudio.com | API base URL |
Supported Sample Rates
| Rate (Hz) | Notes |
|---|---|
| 24000 | Native rate (recommended) |
| 22050 | Half the CD sampling rate (44.1 kHz) |
| 16000 | Wideband telephony |
| 8000 | Narrowband telephony |
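Checking the rate before the service is constructed can surface a mismatch early. The following is a minimal sketch that only encodes the rates listed in the table above; the helper name validate_sample_rate is hypothetical and is not part of the KugelAudio or PipeCat API.

```python
# Rates taken from the Supported Sample Rates table in this guide.
SUPPORTED_SAMPLE_RATES = (24000, 22050, 16000, 8000)


def validate_sample_rate(rate: int) -> int:
    """Return the rate unchanged if it is supported, otherwise raise ValueError.

    Hypothetical helper for illustration; not a KugelAudio function.
    """
    if rate not in SUPPORTED_SAMPLE_RATES:
        supported = ", ".join(str(r) for r in SUPPORTED_SAMPLE_RATES)
        raise ValueError(
            f"Unsupported sample rate {rate}; choose one of: {supported}"
        )
    return rate
```

A transport configured for 44100 Hz, for example, would fail this check, which suggests either reconfiguring the transport or resampling before playback.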
Models
| Model | Latency | Quality | Use Case |
|---|---|---|---|
| kugel-1-turbo | ~39ms TTFA | High | Real-time conversations |
| kugel-1 | ~77ms TTFA | Exceptional | Premium quality applications |
Performance Optimization
Pre-warming the Connection
Call prewarm() during pipeline setup to establish the WebSocket connection before the first synthesis request. This eliminates ~100-220ms of TCP+TLS+WebSocket handshake latency from the first call.
Setting the Language
When you know the language of your input text, always set the language parameter. Without it, the server auto-detects the language on each request, adding ~60-150ms to time-to-first-audio.
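To see how these overheads might combine, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this guide and assumes, as a simplification not stated by KugelAudio, that the handshake and language-detection delays simply add to the model's time-to-first-audio. The function name first_audio_estimate_ms is hypothetical.

```python
# Figures quoted in this guide, in milliseconds.
TTFA_MS = {"kugel-1-turbo": 39, "kugel-1": 77}  # approximate time-to-first-audio
HANDSHAKE_MS = (100, 220)    # TCP+TLS+WebSocket handshake (low, high)
LANG_DETECT_MS = (60, 150)   # server-side language auto-detection (low, high)


def first_audio_estimate_ms(model: str, prewarmed: bool, language_set: bool):
    """Return a rough (low, high) estimate for the first request.

    Assumption: the overheads add linearly to the model's TTFA figure.
    """
    low = high = TTFA_MS[model]
    if not prewarmed:
        # The first call pays for the connection handshake.
        low += HANDSHAKE_MS[0]
        high += HANDSHAKE_MS[1]
    if not language_set:
        # Every call pays for language auto-detection.
        low += LANG_DETECT_MS[0]
        high += LANG_DETECT_MS[1]
    return low, high
```

Under these assumptions, a cold kugel-1-turbo request with no language set comes out at roughly 199-409ms, while a pre-warmed request with the language set stays near the 39ms figure.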
Connection Reuse
The service automatically reuses a persistent WebSocket connection across run_tts() calls. This avoids the ~100-220ms TCP+TLS+WebSocket handshake overhead on every request. If the connection drops, a new one is established transparently on the next call.
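The reuse-and-reconnect behavior described here follows a common lazy-connection pattern. The sketch below illustrates that pattern with stand-in classes only: FakeConnection and ReusingClient are hypothetical, open no real sockets, and are not KugelAudio's implementation. The method names prewarm and run_tts are borrowed from this guide purely for readability.

```python
import asyncio


class FakeConnection:
    """Stand-in for a WebSocket connection; no network traffic happens."""

    def __init__(self):
        self.open = True

    async def send(self, text: str) -> str:
        if not self.open:
            raise ConnectionError("connection closed")
        return f"audio for {text!r}"


class ReusingClient:
    """Illustrative client that connects lazily and reuses the connection."""

    def __init__(self):
        self._conn = None
        self.handshakes = 0  # counts how often a new connection was made

    async def _ensure_connected(self) -> FakeConnection:
        # Connect only if there is no usable connection.
        if self._conn is None or not self._conn.open:
            self.handshakes += 1
            self._conn = FakeConnection()
        return self._conn

    async def prewarm(self) -> None:
        # Pay the handshake cost up front, before the first request.
        await self._ensure_connected()

    async def run_tts(self, text: str) -> str:
        conn = await self._ensure_connected()
        return await conn.send(text)


async def demo() -> int:
    client = ReusingClient()
    await client.prewarm()
    await client.run_tts("hello")
    await client.run_tts("again")      # reuses the same connection
    client._conn.open = False          # simulate a dropped connection
    await client.run_tts("after drop") # reconnects transparently
    return client.handshakes
```

Running asyncio.run(demo()) in this sketch performs two handshakes in total: one during pre-warming and one after the simulated drop.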
Usage Patterns
Updating Voice and Model at Runtime
You can change the voice or model dynamically during a pipeline session.
Pipeline Frame Flow
The KugelAudioTTSService emits standard PipeCat frames:
- TTSStartedFrame: Audio generation has begun
- TTSAudioRawFrame: Raw PCM audio chunks (16-bit, mono)
- TTSStoppedFrame: Audio generation is complete
- ErrorFrame: Emitted if an error occurs during synthesis
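Because the audio frames carry 16-bit mono PCM, the playback duration of a chunk follows directly from its byte length and the sample rate. A minimal sketch of that arithmetic, using a hypothetical helper name and assuming the default 24000 Hz rate:

```python
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # mono


def chunk_duration_ms(num_bytes: int, sample_rate: int = 24000) -> float:
    """Return the playback duration of a raw PCM chunk in milliseconds."""
    samples = num_bytes / (BYTES_PER_SAMPLE * CHANNELS)
    return 1000.0 * samples / sample_rate
```

For example, a 4800-byte chunk at 24000 Hz holds 2400 samples, which is 100ms of audio.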
Metrics Support
KugelAudio’s PipeCat service automatically tracks performance metrics, including TTFB and usage.
Complete Voice Bot Example
The complete voice bot example uses PipeCat with Daily as the transport.
Running the Bot
Environment Variables
| Variable | Required | Description |
|---|---|---|
| KUGELAUDIO_API_KEY | Yes | Your KugelAudio API key |
| DAILY_ROOM_URL | Yes* | Daily room URL (if using Daily transport) |
| DAILY_TOKEN | Yes* | Daily room token |
| DEEPGRAM_API_KEY | Yes* | Required if using Deepgram STT |
| OPENAI_API_KEY | Yes* | Required if using OpenAI LLM |
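A quick startup check can report missing variables before the bot tries to connect. This is a small sketch using only the standard library; the helper missing_env_vars is hypothetical, and which of the starred variables you check depends on the transport, STT, and LLM services you actually use.

```python
import os

# Always required according to the table above.
REQUIRED = ["KUGELAUDIO_API_KEY"]
# Required only when the corresponding service is used.
CONDITIONAL = ["DAILY_ROOM_URL", "DAILY_TOKEN", "DEEPGRAM_API_KEY", "OPENAI_API_KEY"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the environment.

    Pass a dict as `environ` to check something other than os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]
```

Calling missing_env_vars(REQUIRED + CONDITIONAL) at startup returns a list you can log or use to exit early with a clear message.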
Troubleshooting
API key not found
Make sure KUGELAUDIO_API_KEY is set in your environment, or pass api_key directly to the constructor.
Unsupported sample rate error
KugelAudio supports these sample rates: 24000, 22050, 16000, 8000. Make sure your transport output sample rate matches one of them.
WebSocket connection fails
Verify that your base_url is correct and that the KugelAudio API is reachable. The service connects via WebSocket (wss://) for audio streaming. If a persistent connection drops, the service automatically reconnects on the next run_tts() call.
High latency (~200ms+)
This is usually caused by omitting the language parameter (which triggers auto-detection) or by not calling prewarm(). See the Performance Optimization section for details.
Python version incompatibility
The PipeCat integration requires Python 3.10 or higher. Check your version:
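One way to check from inside Python is to compare sys.version_info against the 3.10 minimum stated above. The helper name python_is_supported is hypothetical and shown only as a sketch:

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in this guide


def python_is_supported(version_info=None) -> bool:
    """Return True if the interpreter meets the minimum version.

    Pass a tuple such as (3, 9, 18) to test a specific version.
    """
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= MIN_PYTHON
```

If python_is_supported() returns False, switch to a newer interpreter before installing the integration.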
Next Steps
- LiveKit Integration: Use KugelAudio with LiveKit Agents
- Streaming: Advanced streaming techniques