Model Overview
| Model | Generation | Sample Rate | Max Input | Access |
|---|---|---|---|---|
| kugel-1-turbo | Kugel 1 | 24kHz | 5000 chars | Public |
| kugel-1 | Kugel 1 | 24kHz | 5000 chars | Public |
| kugel-2-turbo | Kugel 2 | 24kHz | 10000 chars | Early Access |
| kugel-2 | Kugel 2 | 24kHz | 10000 chars | Early Access |
Kugel 2 models are in early access. They support IPA notation and custom pronunciation dictionaries. Contact us to get access.
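The overview table can be mirrored in code to check a request before sending it. The following is a minimal sketch, not part of any SDK: the `ModelSpec` class, the `MODELS` dictionary, and both helper functions are illustrative names, and the limits are copied from the overview table as listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    generation: str
    sample_rate_hz: int
    max_input_chars: int
    access: str


# Values copied from the overview table; the structure itself is an assumption.
MODELS = {
    "kugel-1-turbo": ModelSpec("Kugel 1", 24_000, 5_000, "Public"),
    "kugel-1": ModelSpec("Kugel 1", 24_000, 5_000, "Public"),
    "kugel-2-turbo": ModelSpec("Kugel 2", 24_000, 10_000, "Early Access"),
    "kugel-2": ModelSpec("Kugel 2", 24_000, 10_000, "Early Access"),
}


def fits_model(model_id: str, text: str) -> bool:
    """Return True if the text is within the model's listed input limit."""
    return len(text) <= MODELS[model_id].max_input_chars


def public_models() -> list[str]:
    """Return the model IDs that do not require early access."""
    return [name for name, spec in MODELS.items() if spec.access == "Public"]
```

Calling `fits_model("kugel-2", text)` before a request makes it possible to fall back to chunking when the text is too long.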
Kugel 1 Turbo
Recommended for: Voice agents, real-time conversations, interactive applications
Specifications
- Sample Rate: 24kHz
- Max Input Length: 4096 characters
- Time to First Audio: ~39ms (at inference)
When to Use
- Voice agents and conversational AI
- Real-time streaming from LLMs
- Interactive applications requiring immediate feedback
- High-volume production workloads
Example
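For real-time streaming from an LLM, one common pattern is to buffer incoming tokens and forward text to the speech model one sentence at a time. The sketch below shows only that buffering step, under the assumption that sentences end with `.`, `!`, or `?` followed by whitespace. The function name is illustrative, and the call that sends each sentence to kugel-1-turbo is left out because it depends on the client in use.

```python
import re
from typing import Iterable, Iterator

# Split after sentence-ending punctuation that is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as the token stream finishes them."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

The splitting rule is deliberately naive (abbreviations such as "Dr. Smith" would be split), so a production pipeline would likely need a more careful segmenter.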
Kugel 1
Recommended for: Audiobooks, podcasts, premium content, voice cloning
Specifications
- Sample Rate: 24kHz
- Max Input Length: 8192 characters
- Time to First Audio: ~77ms
- Real-time Factor: < 0.3x (more than 3x faster than real-time)
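The real-time factor is the ratio of synthesis time to audio duration, so it can be used to estimate an upper bound on how long a job will take. This is a small arithmetic sketch with illustrative function names; it assumes the listed factor holds for the whole job and ignores network and queueing time.

```python
def max_generation_seconds(audio_seconds: float, rtf: float = 0.3) -> float:
    """Upper bound on synthesis time: real-time factor times audio duration."""
    if audio_seconds < 0 or rtf <= 0:
        raise ValueError("audio_seconds must be >= 0 and rtf must be > 0")
    return audio_seconds * rtf


def speedup(rtf: float) -> float:
    """How many times faster than real-time a given real-time factor is."""
    if rtf <= 0:
        raise ValueError("rtf must be > 0")
    return 1.0 / rtf
```

For example, ten minutes of audio (600 seconds) at a factor below 0.3 should take under about 180 seconds to synthesize.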
When to Use
- Audiobook production
- Podcast generation
- Marketing and promotional content
- Any application where quality is the top priority
Example
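Long-form content such as audiobooks usually exceeds the maximum input length, so the text has to be split before synthesis. The following sketch packs whole sentences greedily into chunks that stay under a caller-supplied limit. The function name and the sentence-splitting rule are assumptions, and the synthesis call for each chunk is omitted.

```python
import re


def chunk_text(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

A call such as `chunk_text(chapter, 8192)` would use the Kugel 1 maximum input length from the specifications above; each chunk can then be synthesized separately and the audio concatenated.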
Generation Parameters
Both models support the following parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| cfg_scale | float | 2.0 | Classifier-free guidance scale (1.0-5.0). Higher values = more expressive |
| max_new_tokens | int | 2048 | Maximum tokens to generate |
| sample_rate | int | 24000 | Output sample rate in Hz |
| voice_id | int | null | Specific voice to use |
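The parameter table can be captured as a small validated structure so that out-of-range values are caught before a request is made. This is a sketch rather than an SDK type: the `GenerationParams` class and its `to_payload` method are illustrative, and omitting `voice_id` from the payload when it is unset is an assumption about how a null default would be sent.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerationParams:
    # Defaults mirror the parameter table above.
    cfg_scale: float = 2.0
    max_new_tokens: int = 2048
    sample_rate: int = 24000
    voice_id: Optional[int] = None

    def __post_init__(self) -> None:
        if not 1.0 <= self.cfg_scale <= 5.0:
            raise ValueError("cfg_scale must be between 1.0 and 5.0")
        if self.max_new_tokens <= 0:
            raise ValueError("max_new_tokens must be positive")

    def to_payload(self) -> dict:
        """Return the parameters as a dict, omitting unset optional fields."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

For instance, `GenerationParams(cfg_scale=3.0, voice_id=7).to_payload()` produces a dictionary that could be merged into a request body.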
CFG Scale Guide
The cfg_scale parameter controls how closely the model follows the voice characteristics:
- 1.0-1.5: More natural, relaxed delivery
- 2.0: Balanced (default)
- 2.5-3.0: More expressive, dynamic
- 3.5-5.0: Maximum expressiveness, best for dramatic content
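The guide above can be expressed as a simple lookup, which is handy for logging or for UI hints. In this sketch the function name is illustrative, and values that fall between the listed bands (for example 1.7 or 3.2) are assigned to the band below them, which is an assumption the guide does not state.

```python
def describe_cfg_scale(cfg_scale: float) -> str:
    """Map a cfg_scale value to the delivery style described in the guide."""
    if not 1.0 <= cfg_scale <= 5.0:
        raise ValueError("cfg_scale must be between 1.0 and 5.0")
    if cfg_scale <= 1.5:
        return "natural"
    if cfg_scale < 2.5:
        return "balanced"
    if cfg_scale < 3.5:
        return "expressive"
    return "maximum expressiveness"
```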
Listing Models
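As a rough illustration of what a model-listing call might look like over HTTP, the sketch below builds a request with the Python standard library without sending it. The `/v1/models` path, the bearer-token header, and the function name are all hypothetical placeholders; the actual endpoint and authentication scheme should be taken from the API reference.

```python
import urllib.request


def build_list_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a model-listing endpoint."""
    # Hypothetical path and auth header: replace with the documented values.
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call once the real base URL and credentials are filled in.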
Choosing the Right Model
Choose Kugel 1 Turbo when...
- Latency is critical
- Building real-time applications
- Processing high volumes
- Streaming from LLMs
Choose Kugel 1 when...
- Quality is the top priority
- Creating pre-recorded content
- Voice cloning applications
- Emotional/expressive content
Choose Kugel 2 Turbo when...
- You need IPA notation support
- Custom pronunciation dictionaries are required
- Low latency with next-gen quality
Choose Kugel 2 when...
- Maximum quality with next-gen architecture
- Full IPA and dictionary support
- Longer input texts (up to 10000 characters)
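The selection guidance reduces to two questions: whether latency is critical, and whether Kugel 2 features (IPA notation, pronunciation dictionaries, or longer inputs) are needed. A minimal sketch of that decision, with an illustrative function name and flags, might look like this:

```python
def choose_model(latency_critical: bool, needs_kugel_2_features: bool = False) -> str:
    """Pick a model ID from the two criteria in the selection guide."""
    # Kugel 2 models are in early access, so this branch assumes access was granted.
    generation = "kugel-2" if needs_kugel_2_features else "kugel-1"
    return f"{generation}-turbo" if latency_critical else generation
```

A voice agent that needs custom pronunciations would call `choose_model(latency_critical=True, needs_kugel_2_features=True)`, while an audiobook pipeline would typically pass `latency_critical=False`.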