The livekit-plugins-kugelaudio package integrates KugelAudio text-to-speech (TTS) with LiveKit Agents, so you can build real-time voice agents with low-latency speech output.

Installation

pip install livekit-plugins-kugelaudio

Quick Start

from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import kugelaudio

# Create TTS instance
tts = kugelaudio.TTS(
    api_key="your-api-key",  # or set KUGELAUDIO_API_KEY env var
    model="kugel-1-turbo",   # Fast model (1.5B params)
    voice_id=123,            # Optional voice ID
)

# Use with VoicePipelineAgent
agent = VoicePipelineAgent(
    tts=tts,
    # ... other options
)

Configuration

Environment Variables

Set your API key via environment variable:
export KUGELAUDIO_API_KEY=your-api-key
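
The lookup order described here (explicit argument first, then the environment variable) can be sketched in your own startup code so that a missing key fails early. The helper name resolve_api_key is hypothetical and not part of the plugin; only the variable name KUGELAUDIO_API_KEY comes from this page.

```python
import os


def resolve_api_key(explicit=None):
    """Return the explicit key if given, else KUGELAUDIO_API_KEY from the environment."""
    key = explicit or os.environ.get("KUGELAUDIO_API_KEY")
    if not key:
        # Fail at startup rather than on the first synthesis request.
        raise RuntimeError("Set KUGELAUDIO_API_KEY or pass api_key explicitly")
    return key
```

The result can then be passed as api_key when constructing the TTS instance.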

TTS Options

Parameter       Type   Default                       Description
api_key         str    None                          API key (or use env var)
model           str    "kugel-1-turbo"               Model: "kugel-1-turbo" (fast) or "kugel-1" (premium)
voice_id        int    None                          Voice ID to use
sample_rate     int    24000                         Output sample rate in Hz
cfg_scale       float  2.0                           CFG scale for generation quality
max_new_tokens  int    2048                          Maximum tokens to generate
speaker_prefix  bool   True                          Whether to add speaker prefix
base_url        str    "https://api.kugelaudio.com"  API base URL
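
One way to keep these options in a single place is a small settings object that mirrors the documented defaults and rejects obviously invalid values. This is only a sketch: the TTSSettings class and its to_kwargs method are hypothetical and not part of the plugin, and the validation rules are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container mirroring the defaults documented above."""
    api_key: Optional[str] = None
    model: str = "kugel-1-turbo"
    voice_id: Optional[int] = None
    sample_rate: int = 24000
    cfg_scale: float = 2.0
    max_new_tokens: int = 2048
    speaker_prefix: bool = True
    base_url: str = "https://api.kugelaudio.com"

    def __post_init__(self):
        # Assumed checks: only the two documented model names, positive sizes.
        if self.model not in ("kugel-1-turbo", "kugel-1"):
            raise ValueError(f"unknown model: {self.model!r}")
        if self.sample_rate <= 0 or self.max_new_tokens <= 0:
            raise ValueError("sample_rate and max_new_tokens must be positive")

    def to_kwargs(self):
        """Drop unset (None) fields so the plugin's own defaults apply."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Assuming the constructor accepts the parameters listed above as keyword arguments, the settings could be applied with kugelaudio.TTS(**TTSSettings(voice_id=123).to_kwargs()).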

Models

  • kugel-1-turbo: Fast model with 1.5B parameters. Best for real-time applications, with ~39 ms time-to-first-audio (TTFA).
  • kugel-1: Premium model with 7B parameters. Higher quality, at roughly twice the latency (~77 ms TTFA).
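
The trade-off in the list above can be turned into a simple selection rule: use the premium model only when its quoted time-to-first-audio fits the latency budget you have for TTS. The function pick_model and the budget parameter are hypothetical, and the figures are the approximate ones quoted above, not measurements.

```python
# Approximate time-to-first-audio figures from the model list, in milliseconds.
TTFA_MS = {"kugel-1-turbo": 39, "kugel-1": 77}


def pick_model(tts_budget_ms):
    """Prefer the premium model when its quoted TTFA fits the budget."""
    if tts_budget_ms >= TTFA_MS["kugel-1"]:
        return "kugel-1"
    # Otherwise fall back to the fast model, even if the budget is very tight.
    return "kugel-1-turbo"
```

Real latency also depends on network round trips and the rest of the pipeline, so treat the budget as a rough guide.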

Features

  • ✅ Streaming TTS via WebSocket
  • ✅ Non-streaming (chunked) TTS
  • ✅ Multiple voice support
  • ✅ Configurable generation parameters
  • ✅ Compatible with LiveKit Agents VoicePipelineAgent

Complete Example

Here’s a complete example of a voice agent using KugelAudio with LiveKit:
import asyncio
from livekit import agents
from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.agents.voice_assistant import VoiceAssistant
from livekit.plugins import kugelaudio, silero, openai

async def entrypoint(ctx: JobContext):
    # Connect to the room
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    
    # Wait for a participant
    participant = await ctx.wait_for_participant()
    
    # Create the voice assistant
    assistant = VoiceAssistant(
        # Voice Activity Detection
        vad=silero.VAD.load(),
        
        # Speech-to-Text
        stt=openai.STT(),
        
        # Language Model
        llm=openai.LLM(model="gpt-4o-mini"),
        
        # Text-to-Speech with KugelAudio
        tts=kugelaudio.TTS(
            model="kugel-1-turbo",
            voice_id=123,
            cfg_scale=2.0,
        ),
    )
    
    # Start the assistant
    assistant.start(ctx.room, participant)
    
    # Initial greeting
    await assistant.say("Hello! I'm your AI assistant. How can I help you today?")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Using with VoicePipelineAgent

For more control, use the VoicePipelineAgent:
from livekit.agents import AutoSubscribe, JobContext
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import kugelaudio, silero, openai

async def entrypoint(ctx: JobContext):
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    participant = await ctx.wait_for_participant()
    
    # Create pipeline agent
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=openai.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=kugelaudio.TTS(
            model="kugel-1-turbo",
            voice_id=123,
        ),
        # Pipeline configuration
        interrupt_speech_duration=0.5,  # Seconds of user speech needed to interrupt the agent
        min_endpointing_delay=0.5,      # Seconds of silence before the user's turn counts as finished
    )
    
    agent.start(ctx.room, participant)
    
    # Handle events
    @agent.on("user_speech_committed")
    def on_user_speech(msg):
        print(f"User said: {msg.content}")
    
    @agent.on("agent_speech_committed")
    def on_agent_speech(msg):
        print(f"Agent said: {msg.content}")

Switching Voices Dynamically

You can switch voices during a conversation:
# Create TTS with initial voice
tts = kugelaudio.TTS(
    model="kugel-1-turbo",
    voice_id=123,
)

# Later, update the voice
tts.voice_id = 456

# Or create a new TTS instance
new_tts = kugelaudio.TTS(
    model="kugel-1-turbo",
    voice_id=789,
)
agent.tts = new_tts
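
When several voices are in play, a small lookup table keeps the IDs out of the conversation logic. The sketch below is an assumption about how you might organize this: the persona names and the voice_for helper are hypothetical, and the IDs are the placeholder values from the snippet above.

```python
# Placeholder IDs from the snippet above; real IDs come from your account.
VOICE_IDS = {"default": 123, "narrator": 456, "support": 789}


def voice_for(persona):
    """Look up a voice ID by persona name, falling back to the default voice."""
    return VOICE_IDS.get(persona, VOICE_IDS["default"])
```

A switch then reads tts.voice_id = voice_for("narrator").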

Performance Tuning

Optimize for Latency

For the lowest possible latency:
tts = kugelaudio.TTS(
    model="kugel-1-turbo",  # Fastest model
    cfg_scale=1.5,          # Lower CFG for faster generation
    max_new_tokens=1024,    # Limit output length
)

Optimize for Quality

For the best quality output:
tts = kugelaudio.TTS(
    model="kugel-1",        # Premium model
    cfg_scale=2.5,          # Higher CFG for more expressiveness
    speaker_prefix=True,    # Better voice consistency
)
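
The two configurations above can be kept as named presets so that the choice is made in one place. This is a sketch only: PRESETS and tts_kwargs are hypothetical names, and the values are copied from the two snippets in this section.

```python
PRESETS = {
    "latency": {"model": "kugel-1-turbo", "cfg_scale": 1.5, "max_new_tokens": 1024},
    "quality": {"model": "kugel-1", "cfg_scale": 2.5, "speaker_prefix": True},
}


def tts_kwargs(profile, **overrides):
    """Return keyword arguments for the chosen profile, with per-call overrides."""
    if profile not in PRESETS:
        raise KeyError(f"unknown profile {profile!r}; expected one of {sorted(PRESETS)}")
    # Build a new dict so the shared presets are never mutated.
    return {**PRESETS[profile], **overrides}
```

For example, kugelaudio.TTS(**tts_kwargs("latency", voice_id=123)) would apply the latency preset with a specific voice.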

Error Handling

from livekit.plugins import kugelaudio
from kugelaudio.exceptions import KugelAudioError, RateLimitError

try:
    tts = kugelaudio.TTS(
        api_key="your-api-key",
        model="kugel-1-turbo",
    )
    
    # Use TTS...
    
except RateLimitError:
    print("Rate limit exceeded, please wait")
except KugelAudioError as e:
    print(f"TTS error: {e}")
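
Rather than only printing a message on a rate limit, a caller may want to wait and retry. The sketch below shows exponential backoff with a little jitter. The call_with_backoff helper is hypothetical, and RateLimitError is redefined locally as a stand-in so the example runs without the SDK installed; in real code you would catch the exception imported above.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for kugelaudio.exceptions.RateLimitError so the sketch runs alone."""


def call_with_backoff(fn, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == retries:
                raise  # out of retries: let the caller handle it
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

The sleep parameter is injectable so the retry logic can be tested without real delays.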

Local Development

For local development with a self-hosted TTS server:
tts = kugelaudio.TTS(
    api_key="your-api-key",
    base_url="http://localhost:8000",  # Local TTS server
    model="kugel-1-turbo",
)
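
To avoid hard-coding the local URL, the base URL can be read from the environment and fall back to the documented default. This is a sketch under assumptions: the variable name KUGELAUDIO_BASE_URL and the helper resolve_base_url are hypothetical and are not read by the plugin itself.

```python
import os

DEFAULT_BASE_URL = "https://api.kugelaudio.com"  # documented default


def resolve_base_url(env=os.environ):
    """Use KUGELAUDIO_BASE_URL (hypothetical variable) if set, else the documented default."""
    url = env.get("KUGELAUDIO_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"base URL must start with http:// or https://, got {url!r}")
    return url
```

The result would be passed as base_url, so the same code runs against a local server or the hosted API depending on the environment.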

Next Steps