KugelAudio offers two models optimized for different use cases.

Model Overview

Model          Parameters  Sample Rate  Max Input   Best For
kugel-1-turbo  1.5B        24kHz        4096 chars  Real-time applications
kugel-1        7B          24kHz        8192 chars  Premium quality content

Kugel 1 Turbo

Recommended for: Voice agents, real-time conversations, interactive applications
The Kugel 1 Turbo model is optimized for ultra-low latency. With 1.5 billion parameters, it delivers excellent quality while maintaining ~39ms time-to-first-audio.

Specifications

  • Parameters: 1.5B
  • Sample Rate: 24kHz
  • Max Input Length: 4096 characters
  • Time to First Audio: ~39ms
  • Real-time Factor: < 0.1x (more than 10x faster than real-time)
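
Text longer than the 4096-character limit has to be split before it is sent. The following is a minimal sketch of one way to do that, preferring sentence boundaries; the helper name split_text and the splitting strategy are illustrative assumptions, not part of the KugelAudio SDK.

```python
import re

# Max Input Length for kugel-1-turbo, taken from the specification list above.
TURBO_MAX_CHARS = 4096


def split_text(text: str, max_chars: int = TURBO_MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: chunks end on sentence boundaries where possible,
    and a single sentence longer than max_chars is cut at the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""

    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue

        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence

    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed as the text argument of a separate generation request. For kugel-1, the same helper works with max_chars=8192.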

When to Use

  • Voice agents and conversational AI
  • Real-time streaming from LLMs
  • Interactive applications requiring immediate feedback
  • High-volume production workloads

Example

# Assumes `client` is an already-initialized KugelAudio API client.
audio = client.tts.generate(
    text="Hello! I'm your AI assistant. How can I help you today?",
    model="kugel-1-turbo",
    cfg_scale=2.0,
)

Kugel 1

Recommended for: Audiobooks, podcasts, premium content, voice cloning
The Kugel 1 model is our premium offering with 7 billion parameters. It produces the highest quality output with more natural prosody, emotion, and expressiveness.

Specifications

  • Parameters: 7B
  • Sample Rate: 24kHz
  • Max Input Length: 8192 characters
  • Time to First Audio: ~77ms
  • Real-time Factor: < 0.3x (more than 3x faster than real-time)
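
To make the real-time factor concrete: it is the ratio of processing time to audio duration, so the bound quoted for each model gives a worst-case synthesis time for a clip of a given length. The sketch below uses only the figures quoted on this page for both models; the ModelSpec class and function name are hypothetical, and real timings will vary with load and network conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    ttfa_ms: float  # approximate time to first audio, in milliseconds
    max_rtf: float  # upper bound on the real-time factor


# Figures copied from the specification lists on this page.
SPECS = {
    "kugel-1-turbo": ModelSpec(ttfa_ms=39, max_rtf=0.1),
    "kugel-1": ModelSpec(ttfa_ms=77, max_rtf=0.3),
}


def worst_case_synthesis_seconds(model: str, audio_seconds: float) -> float:
    """Upper bound on processing time: audio duration times the RTF bound."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return SPECS[model].max_rtf * audio_seconds


if __name__ == "__main__":
    for name in SPECS:
        bound = worst_case_synthesis_seconds(name, 60.0)
        print(f"{name}: under {bound:.1f}s to synthesize 60s of audio")
```

With streaming, playback can begin after roughly the time to first audio, so the bound matters mostly for pre-rendered content.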

When to Use

  • Audiobook production
  • Podcast generation
  • Marketing and promotional content
  • Any application where quality is the top priority

Example

audio = client.tts.generate(
    text="In a hole in the ground there lived a hobbit. Not a nasty, dirty, wet hole...",
    model="kugel-1",
    cfg_scale=2.5,  # Higher CFG for more expressive output
)

Generation Parameters

Both models support the following parameters:
Parameter       Type   Default  Description
cfg_scale       float  2.0      Classifier-free guidance scale (1.0-5.0). Higher values = more expressive
max_new_tokens  int    2048     Maximum tokens to generate
sample_rate     int    24000    Output sample rate in Hz
speaker_prefix  bool   true     Add speaker prefix for better quality
voice_id        int    null     Specific voice to use
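
One way to keep these defaults and the documented cfg_scale range in a single place is a small settings object. This is a sketch only: GenerationParams is a hypothetical helper, not an SDK class, and it assumes the parameters are passed to the generation call as keyword arguments, which the examples on this page show only for cfg_scale.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerationParams:
    """Hypothetical container mirroring the parameter table above."""

    cfg_scale: float = 2.0
    max_new_tokens: int = 2048
    sample_rate: int = 24000
    speaker_prefix: bool = True
    voice_id: Optional[int] = None

    def __post_init__(self) -> None:
        # The table documents 1.0-5.0 as the cfg_scale range.
        if not 1.0 <= self.cfg_scale <= 5.0:
            raise ValueError("cfg_scale must be between 1.0 and 5.0")
        if self.max_new_tokens <= 0:
            raise ValueError("max_new_tokens must be positive")

    def to_kwargs(self) -> dict:
        """Return the settings as a dict, omitting values left as None."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Under that assumption, the result could be unpacked into a request, for example client.tts.generate(text=..., model="kugel-1", **GenerationParams(cfg_scale=2.5).to_kwargs()).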

CFG Scale Guide

The cfg_scale parameter controls how closely the model follows the voice characteristics:
  • 1.0-1.5: More natural, relaxed delivery
  • 2.0: Balanced (default)
  • 2.5-3.0: More expressive, dynamic
  • 3.5-5.0: Maximum expressiveness, best for dramatic content
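
The bands above can be expressed as a small lookup, which is handy for logging or for validating user-supplied values. This is an illustrative sketch: the function name is hypothetical, and because the guide leaves 1.5-2.0 and 3.0-3.5 undescribed, the assignment of those values to a neighboring band is an assumption.

```python
def describe_cfg_scale(cfg_scale: float) -> str:
    """Map a cfg_scale value to the delivery style described in the guide."""
    if not 1.0 <= cfg_scale <= 5.0:
        raise ValueError("cfg_scale must be between 1.0 and 5.0")
    if cfg_scale <= 1.5:
        return "natural"      # 1.0-1.5: more natural, relaxed delivery
    if cfg_scale < 2.5:
        return "balanced"     # 2.0 is the default; 1.5-2.5 grouped here by assumption
    if cfg_scale <= 3.0:
        return "expressive"   # 2.5-3.0: more expressive, dynamic
    return "dramatic"         # 3.5-5.0 per the guide; 3.0-3.5 grouped here by assumption
```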

Listing Models

models = client.models.list()

for model in models:
    print(f"{model.id}: {model.name}")
    print(f"  Description: {model.description}")
    print(f"  Parameters: {model.parameters}")
    print(f"  Max Input: {model.max_input_length} characters")

Choosing the Right Model

Choose Kugel 1 Turbo when...

  • Latency is critical
  • Building real-time applications
  • Processing high volumes
  • Streaming from LLMs

Choose Kugel 1 when...

  • Quality is the top priority
  • Creating pre-recorded content
  • Voice cloning applications
  • Emotional/expressive content
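
The guidance above can be captured in a small selection helper. The sketch below is one possible encoding, not an SDK feature: the function and flag names are hypothetical, and both the tie-break (any real-time criterion selects Turbo) and the fallback to kugel-1 when no criterion applies are assumptions.

```python
TURBO = "kugel-1-turbo"
PREMIUM = "kugel-1"


def choose_model(
    *,
    latency_critical: bool = False,
    high_volume: bool = False,
    streaming_from_llm: bool = False,
) -> str:
    """Return a model ID based on the criteria listed in this section."""
    # Assumption: any real-time or throughput requirement outweighs quality.
    if latency_critical or high_volume or streaming_from_llm:
        return TURBO
    # Assumption: with no such requirement, prefer the higher-quality model.
    return PREMIUM
```

The returned ID can be passed directly as the model argument of a generation request.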