Quick Start - KugelAudio

Prerequisites

Before you begin, make sure you have:

An API key from kugelaudio.com
Python 3.9+, Node.js 18+, Java 17+, or cURL

Installation

Python
JavaScript/TypeScript
Java
cURL

Install the Python SDK using pip or uv:

pip install kugelaudio

Or with uv (recommended):

uv add kugelaudio

Install the JavaScript SDK using your preferred package manager:

npm install kugelaudio

Or with yarn/pnpm:

yarn add kugelaudio
# or
pnpm add kugelaudio

Add the dependency to your pom.xml (requires Java 17+):

<dependency>
  <groupId>com.kugelaudio</groupId>
  <artifactId>kugelaudio</artifactId>
  <version>0.1.0</version>
</dependency>

Or with Gradle:

implementation 'com.kugelaudio:kugelaudio:0.1.0'

cURL comes pre-installed on macOS, Linux, and Windows 10+. No installation needed.

Set your API key as an environment variable:

export KUGELAUDIO_API_KEY="your_api_key"
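
Reading the key from the environment at startup keeps it out of source code. The following is a minimal Python sketch, assuming the variable name from the export above; the helper `load_api_key` is hypothetical and not part of the SDK.

```python
import os


def load_api_key(var_name: str = "KUGELAUDIO_API_KEY") -> str:
    """Return the API key from the environment, failing early if it is unset."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail at startup rather than on the first request
        raise RuntimeError(f"{var_name} is not set; export it before starting the app")
    return key
```

The returned value can then be passed to the client in place of a hard-coded string, for example `KugelAudio(api_key=load_api_key())`.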

Basic Usage

Initialize the Client

Pre-connect at startup. Without client.connect(), the first TTS request pays the WebSocket handshake; subsequent requests reuse the connection. Pre-connecting moves the handshake cost to application startup, where it doesn’t affect user-perceived latency. See Latency for the numbers.

Python
JavaScript
Java
cURL

from kugelaudio import KugelAudio

# Initialize with your API key
client = KugelAudio(api_key="your_api_key")

# Pre-connect at startup (handshake happens here)
client.connect()

# Confirm connection is ready
print(f"Connected: {client.is_connected()}")

# First request is now fast — no handshake on the hot path

import { KugelAudio } from 'kugelaudio';

// Initialize with your API key
const client = new KugelAudio({ apiKey: 'your_api_key' });

// Pre-connect at startup (handshake happens here)
await client.connect();

// Confirm connection is ready
console.log(`Connected: ${client.isConnected()}`);

// First request is now fast — no handshake on the hot path

import com.kugelaudio.sdk.KugelAudio;
import com.kugelaudio.sdk.KugelAudioOptions;

KugelAudio client = new KugelAudio(
    KugelAudioOptions.builder("your_api_key").build()
);

// Pre-connect at startup (handshake happens here)
client.connect();

// Confirm connection is ready
System.out.println("Connected: " + client.isConnected());

// First request is now fast — no handshake on the hot path

No client initialization needed — just pass the API key in the Authorization header:

curl https://api.kugelaudio.com/v1/models \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY"

Generate Speech

Examples below use kugel-3, the canonical production model. Legacy IDs such as kugel-2.5 and kugel-2-turbo are still accepted for backwards compatibility; see Models for details.

Python
JavaScript
Java
cURL

# Generate speech
audio = client.tts.generate(
    text="Welcome to KugelAudio! This is high-quality text-to-speech.",
    model_id="kugel-3",
)

# Save to file
audio.save("output.wav")

# Or get the raw bytes
wav_bytes = audio.to_wav_bytes()

// Generate speech
const audio = await client.tts.generate({
  text: 'Welcome to KugelAudio! This is high-quality text-to-speech.',
  modelId: 'kugel-3',
});

// audio.audio is an ArrayBuffer with PCM16 data
console.log(`Duration: ${audio.durationMs}ms`);

import com.kugelaudio.sdk.GenerateRequest;
import com.kugelaudio.sdk.AudioResponse;

AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Welcome to KugelAudio! This is high-quality text-to-speech.")
        .modelId("kugel-3")
        .language("en")
        .build()
);

// Save to file
audio.saveWav(java.nio.file.Path.of("output.wav"));

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to KugelAudio! This is high-quality text-to-speech.",
    "model_id": "kugel-3"
  }' \
  --output output.pcm

# Convert to WAV for playback
ffmpeg -f s16le -ar 24000 -ac 1 -i output.pcm output.wav
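
If ffmpeg is not available, the same conversion can be sketched with Python's standard `wave` module. This assumes the format implied by the ffmpeg flags above (16-bit little-endian samples, 24000 Hz, mono); the function name `pcm_to_wav` is illustrative only.

```python
import wave


def pcm_to_wav(pcm_path: str, wav_path: str, sample_rate: int = 24000) -> None:
    """Wrap raw PCM16 mono audio in a WAV container."""
    with open(pcm_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)            # mono, matching -ac 1
        wav.setsampwidth(2)            # 2 bytes per sample, matching s16le
        wav.setframerate(sample_rate)  # matching -ar 24000
        wav.writeframes(pcm)
```

Calling `pcm_to_wav("output.pcm", "output.wav")` would produce a file that standard audio players can open.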

Stream Audio

For lower latency, stream audio chunks as they’re generated:

Python
JavaScript
Java
cURL

# Synchronous streaming
for chunk in client.tts.stream(
    text="Hello, this is streaming audio.",
    model_id="kugel-3",
):
    if hasattr(chunk, 'audio'):
        # Process audio chunk immediately
        print(f"Chunk {chunk.index}: {len(chunk.audio)} bytes")
        # play_audio(chunk.audio)

For async applications:

import asyncio

async def stream_audio():
    async for chunk in client.tts.stream_async(
        text="Async streaming example.",
        model_id="kugel-3",
    ):
        if hasattr(chunk, 'audio'):
            # Process chunk
            pass

asyncio.run(stream_audio())

await client.tts.stream(
  {
    text: 'Hello, this is streaming audio.',
    modelId: 'kugel-3',
  },
  {
    onChunk: (chunk) => {
      console.log(`Chunk ${chunk.index}: ${chunk.samples} samples`);
      // Play or process the audio chunk
    },
    onFinal: (stats) => {
      console.log(`Total duration: ${stats.durationMs}ms`);
      console.log(`Generation time: ${stats.generationMs}ms`);
    },
  }
);

import com.kugelaudio.sdk.StreamCallbacks;
import com.kugelaudio.sdk.AudioChunk;

client.tts().stream(
    GenerateRequest.builder("Hello, this is streaming audio.")
        .modelId("kugel-3")
        .language("en")
        .build(),
    new StreamCallbacks() {
        @Override
        public void onChunk(AudioChunk chunk) {
            System.out.printf("Chunk %d: %d bytes%n",
                chunk.getIndex(), chunk.getAudio().length);
            // playAudio(chunk.getAudio());
        }
    }
);

The REST endpoint streams raw PCM bytes — pipe directly to ffplay for real-time playback:

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is streaming audio.",
    "model_id": "kugel-3"
  }' \
  --no-buffer | ffplay -f s16le -ar 24000 -ac 1 -nodisp -

For advanced streaming (WebSocket-based, or token-by-token input from an LLM), use one of the SDKs (Python, JavaScript, or Java) or the raw WebSocket API.
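
To check how much streaming helps in practice, it can be useful to record how long the first audio chunk takes to arrive. The sketch below works on any iterable of chunk objects and assumes, as in the Python streaming example above, that audio chunks expose a bytes-like `audio` attribute. The helper `collect_stream` is hypothetical, not an SDK function.

```python
import time
from typing import Iterable, Tuple


def collect_stream(chunks: Iterable[object]) -> Tuple[bytes, float]:
    """Join audio chunks and report seconds until the first audio chunk arrived."""
    start = time.perf_counter()
    first_chunk_s = -1.0  # stays negative if no audio chunk was seen
    buffer = bytearray()
    for chunk in chunks:
        audio = getattr(chunk, "audio", None)
        if audio is None:
            continue  # skip non-audio messages, like the hasattr check above
        if first_chunk_s < 0:
            first_chunk_s = time.perf_counter() - start
        buffer.extend(audio)
    return bytes(buffer), first_chunk_s
```

A call might look like `pcm, first_s = collect_stream(client.tts.stream(text="Hello", model_id="kugel-3"))`. The timer starts when iteration begins, so the reported figure is only an approximation of end-to-end latency.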

Working with Voices

Pick your voice deliberately. Different voices have wildly different baseline energy, age, and warmth — a peppy DTC bot and a calm clinical agent should not share the same voice even with the same prompt. Listen to several before locking one in. Building an LLM-driven voice agent? See Voice Agent Prompting for the prompt patterns that matter most.

List Available Voices

Python
JavaScript
Java
cURL

# List all voices
result = client.voices.list()

for voice in result.voices:
    print(f"{voice.id}: {voice.name}")
    print(f"  Languages: {', '.join(voice.supported_languages)}")
print(f"Total: {result.total}")

# Filter by language
result = client.voices.list(language="de")

// List all voices
const result = await client.voices.list();

for (const voice of result.voices) {
  console.log(`${voice.id}: ${voice.name}`);
  console.log(`  Languages: ${voice.supportedLanguages.join(', ')}`);
}
console.log(`Total: ${result.total}`);

// Filter by language
const germanVoices = await client.voices.list({ language: 'de' });

import com.kugelaudio.sdk.Voice;
import com.kugelaudio.sdk.VoiceListResponse;

VoiceListResponse result = client.voices().list();
for (Voice voice : result.getVoices()) {
    System.out.printf("%d: %s%n", voice.getId(), voice.getName());
}
System.out.printf("Total: %d%n", result.getTotal());

// Filter by language
VoiceListResponse germanVoices = client.voices().list("de", null, null, null);

# List all voices
curl https://api.kugelaudio.com/v1/voices \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY"

# Filter by language
curl "https://api.kugelaudio.com/v1/voices?language=de" \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY"

Use a Specific Voice

Python
JavaScript
Java
cURL

audio = client.tts.generate(
    text="Hello with a specific voice!",
    model_id="kugel-3",
    voice_id=1071,  # Use a specific voice ID
)

const audio = await client.tts.generate({
  text: 'Hello with a specific voice!',
  modelId: 'kugel-3',
  voiceId: 1071,  // Use a specific voice ID
});

AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Hello with a specific voice!")
        .modelId("kugel-3")
        .voiceId(1071)
        .language("en")
        .build()
);

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello with a specific voice!",
    "model_id": "kugel-3",
    "voice_id": 1071
  }' \
  --output output.pcm
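
A quick sanity check on the downloaded file is to estimate its duration from its size. The sketch assumes the same raw format used in the ffmpeg and ffplay commands earlier (16-bit samples, 24000 Hz, mono), which works out to 48000 bytes per second; `pcm_duration_seconds` is an illustrative name.

```python
def pcm_duration_seconds(num_bytes: int, sample_rate: int = 24000,
                         channels: int = 1, bytes_per_sample: int = 2) -> float:
    """Duration of raw PCM audio given its size in bytes."""
    return num_bytes / (sample_rate * channels * bytes_per_sample)
```

With these defaults, `pcm_duration_seconds(os.path.getsize("output.pcm"))` gives the clip length in seconds (after `import os`).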

Next Steps

Generate Speech

All generation options and parameters

Streaming

Real-time audio streaming techniques

Using Voices

Browse and filter available voices

Text Processing

Normalization and spell tags
