Python

from kugelaudio import KugelAudio

client = KugelAudio(api_key="your_api_key")

audio = client.tts.generate(
    text="Hello, this is a test of the KugelAudio text-to-speech system.",
    model_id="kugel-1-turbo",
)

# Save to file
audio.save("output.wav")

# Or get WAV bytes
wav_bytes = audio.to_wav_bytes()
JavaScript

import { KugelAudio } from 'kugelaudio';

const client = new KugelAudio({ apiKey: 'your_api_key' });

const audio = await client.tts.generate({
  text: 'Hello, this is a test of the KugelAudio text-to-speech system.',
  modelId: 'kugel-1-turbo',
});

// audio.audio is an ArrayBuffer with PCM16 data
console.log(`Duration: ${audio.durationMs}ms`);
Java

import com.kugelaudio.sdk.KugelAudio;
import com.kugelaudio.sdk.KugelAudioOptions;
import com.kugelaudio.sdk.GenerateRequest;
import com.kugelaudio.sdk.AudioResponse;

KugelAudio client = new KugelAudio(
    KugelAudioOptions.builder("your_api_key").build()
);

AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Hello, this is a test of the KugelAudio text-to-speech system.")
        .modelId("kugel-1-turbo")
        .language("en")
        .build()
);

// Save to WAV file
audio.saveWav(java.nio.file.Path.of("output.wav"));

// Or get raw PCM bytes
byte[] pcmData = audio.getAudio();
cURL

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is a test of the KugelAudio text-to-speech system.",
    "model_id": "kugel-1-turbo"
  }' \
  --output output.pcm

# The response is raw PCM16 audio (signed 16-bit LE, mono, 24kHz)
# Convert to WAV for playback:
ffmpeg -f s16le -ar 24000 -ac 1 -i output.pcm output.wav
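If ffmpeg is not available, the raw PCM returned by the REST endpoint can also be wrapped in a WAV header with the Python standard library. The sketch below assumes the format stated above (signed 16-bit little-endian, mono, 24 kHz); the helper name `pcm16_to_wav_bytes` is hypothetical and not part of any KugelAudio SDK.

```python
import io
import wave


def pcm16_to_wav_bytes(pcm: bytes, sample_rate: int = 24000, channels: int = 1) -> bytes:
    """Wrap raw signed 16-bit little-endian PCM in a WAV container.

    Hypothetical helper: the defaults mirror the format described for the
    REST response (mono, 24 kHz) and should be adjusted if you request a
    different sample rate.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

To convert the file written by the cURL example, read `output.pcm` in binary mode, pass the bytes to the helper, and write the result to `output.wav`.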
The speed parameter adjusts playback rate using pitch-preserving time-stretching (WSOLA), so the voice pitch stays natural even at different speeds. Range: 0.8 (20% slower) to 1.2 (20% faster).
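When the speed value comes from user input, it can help to clamp it to the documented range before building the request. This is a minimal sketch: `clamp_speed` is a hypothetical helper, and this page does not state whether the API rejects or clamps out-of-range values on its own.

```python
MIN_SPEED = 0.8  # documented lower bound (20% slower)
MAX_SPEED = 1.2  # documented upper bound (20% faster)


def clamp_speed(speed: float) -> float:
    """Return speed limited to the documented 0.8 to 1.2 range."""
    return max(MIN_SPEED, min(MAX_SPEED, speed))
```

The clamped value can then be passed as the `speed` argument of a generate call.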
Dashboard: The playground in the KugelAudio dashboard includes a Slow / Normal / Fast speed toggle next to the model selector. Changes are reflected live in the SDK code snippet shown below the generator.
For fine-grained control, use inline <prosody rate="..."> tags to slow down only specific parts of the text — useful for phone numbers, addresses, or other content that benefits from slower delivery:
Python

# Global speed: whole sentence at 80% speed
audio = client.tts.generate(
    text="Bitte rufen Sie uns an unter: 0 30 12 34 56 78.",
    language="de",
    speed=0.8,
)

# Inline prosody: only the phone number is slowed down
audio = client.tts.generate(
    text='Bitte rufen Sie uns an unter: <prosody rate="slow">0 30 12 34 56 78.</prosody>',
    language="de",
)

JavaScript

// Global speed
const audio = await client.tts.generate({
  text: 'Bitte rufen Sie uns an unter: 0 30 12 34 56 78.',
  language: 'de',
  speed: 0.8,
});

// Inline prosody tag: only the number is slowed down
const audio2 = await client.tts.generate({
  text: 'Bitte rufen Sie uns an unter: <prosody rate="slow">0 30 12 34 56 78.</prosody>',
  language: 'de',
});

Java

// Global speed
AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Bitte rufen Sie uns an unter: 0 30 12 34 56 78.")
        .language("de")
        .speed(0.8)
        .build()
);

// Inline prosody tag: only the number is slowed down
AudioResponse audio2 = client.tts().generate(
    GenerateRequest.builder(
        "Bitte rufen Sie uns an unter: <prosody rate=\"slow\">0 30 12 34 56 78.</prosody>"
    ).language("de").build()
);

cURL

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Bitte rufen Sie uns an unter: 0 30 12 34 56 78.",
    "language": "de",
    "speed": 0.8
  }' \
  --output output.pcm
| speed value | Rate | Typical use |
| --- | --- | --- |
| 0.8 | 20% slower | Phone numbers, addresses, medical terms |
| 1.0 | Normal (default) | General purpose |
| 1.2 | 20% faster | Notifications, fast-paced content |
Use <prosody rate="slow"> or <prosody rate="fast"> inline tags to vary speed within a single sentence without needing multiple API calls.
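Wrapping a fragment in a prosody tag can be automated when the slow part (for example a phone number) is known at runtime. The sketch below uses a hypothetical helper, `wrap_prosody`, and accepts only the two rate values shown on this page (`slow` and `fast`); whether other rate values are accepted is not covered here.

```python
ALLOWED_RATES = {"slow", "fast"}  # the only rate values shown in this guide


def wrap_prosody(text: str, fragment: str, rate: str = "slow") -> str:
    """Wrap the first occurrence of fragment in a <prosody rate="..."> tag."""
    if rate not in ALLOWED_RATES:
        raise ValueError(f"unsupported rate: {rate!r}")
    if fragment not in text:
        raise ValueError("fragment does not occur in text")
    tagged = f'<prosody rate="{rate}">{fragment}</prosody>'
    return text.replace(fragment, tagged, 1)
```

The returned string can be passed as the `text` argument of a single generate call, so the rest of the sentence keeps the default speed.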
KugelAudio supports a subset of SSML focused on <prosody rate> and <spell>. Full SSML is not supported — the following tags are silently stripped or will produce unexpected output:
| Tag / Attribute | Status | Alternative |
| --- | --- | --- |
| <speak> wrapper | Not supported | Omit; plain text is assumed |
| <prosody pitch="..."> | Not supported | No pitch control available |
| <prosody volume="..."> | Not supported | No volume control available |
| <prosody duration="..."> | Not supported | Use speed parameter instead |
| <emphasis> | Not supported | No emphasis tag processing |
| <break> | Not supported | Add punctuation (., ,) for natural pauses |
| <say-as> | Not supported | Use <spell> for character-by-character output |
| <audio>, <p>, <s>, <w> | Not supported | None |
Unsupported tags are not validated — they are stripped from the text before synthesis. If you pass <prosody pitch="high"> the pitch attribute is ignored and the inner text is synthesized at the default pitch. Always test output when migrating from a full-SSML TTS provider.
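Because unsupported markup is stripped without an error, a small client-side check can flag it before a request is sent, which is useful when migrating existing SSML. The sketch below is an assumption-laden illustration: `find_unsupported_ssml` is a hypothetical helper, its tag and attribute lists are copied from the table above, and it uses a simple regular expression rather than a real XML parser.

```python
import re

# Tags and prosody attributes listed as "Not supported" in the table above.
UNSUPPORTED_TAGS = {"speak", "emphasis", "break", "say-as", "audio", "p", "s", "w"}
UNSUPPORTED_PROSODY_ATTRS = {"pitch", "volume", "duration"}

# Matches opening (or self-closing) tags only; closing tags start with "</".
TAG_PATTERN = re.compile(r"<\s*([A-Za-z][\w-]*)([^<>]*)>")
ATTR_PATTERN = re.compile(r"([\w-]+)\s*=")


def find_unsupported_ssml(text: str) -> list[str]:
    """Return unsupported tags and prosody attributes in order of appearance."""
    findings = []
    for match in TAG_PATTERN.finditer(text):
        name = match.group(1).lower()
        if name in UNSUPPORTED_TAGS:
            findings.append(f"<{name}>")
        elif name == "prosody":
            for attr in ATTR_PATTERN.findall(match.group(2)):
                if attr.lower() in UNSUPPORTED_PROSODY_ATTRS:
                    findings.append(f"<prosody {attr.lower()}>")
    return findings
```

An empty list means the text only uses markup that this page describes as supported; any other result is worth reviewing and testing by ear.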
Python

audio = client.tts.generate(
    text="Hello, this is a test of the KugelAudio text-to-speech system.",
    model_id="kugel-1-turbo",
    voice_id=123,
    cfg_scale=2.0,
    max_new_tokens=2048,
    sample_rate=24000,
    normalize=True,
    language="en",
    word_timestamps=False,
    speed=1.0,
)

# Inspect the response
print(f"Duration: {audio.duration_seconds:.2f}s")
print(f"Samples: {audio.samples}")
print(f"Sample rate: {audio.sample_rate} Hz")
print(f"Generation time: {audio.generation_ms:.0f}ms")
print(f"RTF: {audio.rtf:.2f}")

# Save to WAV file
audio.save("output.wav")

# Get raw PCM bytes
pcm_data = audio.audio

# Get WAV bytes (with header)
wav_bytes = audio.to_wav_bytes()
JavaScript

const audio = await client.tts.generate({
  text: 'Hello, this is a test of the KugelAudio text-to-speech system.',
  modelId: 'kugel-1-turbo',
  voiceId: 123,
  cfgScale: 2.0,
  maxNewTokens: 2048,
  sampleRate: 24000,
  normalize: true,
  language: 'en',
  wordTimestamps: false,
  speed: 1.0,
});

// Inspect the response
console.log(`Duration: ${audio.durationMs}ms`);
console.log(`Samples: ${audio.samples}`);
console.log(`Sample rate: ${audio.sampleRate} Hz`);
console.log(`Generation time: ${audio.generationMs}ms`);
console.log(`RTF: ${audio.rtf}`);

// audio.audio is an ArrayBuffer with PCM16 data
Java

AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Hello, this is a test of the KugelAudio text-to-speech system.")
        .modelId("kugel-1-turbo")
        .voiceId(123)
        .cfgScale(2.0)
        .maxNewTokens(2048)
        .sampleRate(24000)
        .normalize(true)
        .language("en")
        .wordTimestamps(false)
        .speed(1.0)
        .build()
);

// Inspect the response
System.out.printf("Duration: %.2fs%n", audio.getDurationSeconds());
System.out.printf("Samples: %d%n", audio.getTotalSamples());
System.out.printf("Sample rate: %d Hz%n", audio.getSampleRate());
System.out.printf("Generation time: %.0fms%n", audio.getGenerationMs());
System.out.printf("RTF: %.2f%n", audio.getRtf());

// Save to WAV file
audio.saveWav(java.nio.file.Path.of("output.wav"));
cURL

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is a test of the KugelAudio text-to-speech system.",
    "model_id": "kugel-1-turbo",
    "voice_id": 123,
    "cfg_scale": 2.0,
    "max_new_tokens": 2048,
    "sample_rate": 24000,
    "normalize": true,
    "language": "en",
    "word_timestamps": false,
    "speed": 1.0
  }' \
  --output output.pcm
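The cURL response carries only raw audio, so the figures the SDKs expose (samples, duration, RTF) have to be derived by hand. The sketch below shows one way to do that for mono PCM16. It assumes the conventional definition of real-time factor, generation time divided by audio duration; this page does not confirm that the SDK's `rtf` field is computed the same way. The function name `pcm16_mono_stats` is hypothetical.

```python
def pcm16_mono_stats(pcm: bytes, sample_rate: int, generation_ms: float) -> dict:
    """Derive sample count, duration, and real-time factor from mono PCM16.

    Assumes RTF = generation time / audio duration (values below 1.0 mean
    the audio was generated faster than it plays back).
    """
    samples = len(pcm) // 2  # each PCM16 sample occupies two bytes
    duration_seconds = samples / sample_rate
    if duration_seconds > 0:
        rtf = (generation_ms / 1000.0) / duration_seconds
    else:
        rtf = float("inf")
    return {
        "samples": samples,
        "duration_seconds": duration_seconds,
        "rtf": rtf,
    }
```

The generation time has to be measured on the client, for example by timing the HTTP request, so it includes network overhead that a server-side figure would not.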
For latency-sensitive applications, pre-establish the WebSocket connection at startup to eliminate cold start latency (~500ms) from your first request.
Python (Async)

import asyncio

from kugelaudio import KugelAudio


async def main():
    # Create a pre-connected client (~500ms happens here)
    client = await KugelAudio.create(api_key="your_api_key")

    # First request is now fast (~100-150ms TTFA instead of ~600ms)
    audio = await client.tts.generate_async(
        text="Hello, world!",
        model_id="kugel-1-turbo",
    )
    audio.save("output.wav")

    await client.aclose()


asyncio.run(main())

Python (Sync)

from kugelaudio import KugelAudio

client = KugelAudio(api_key="your_api_key")

# Pre-connect at startup (~500ms happens here)
client.connect()

# First request is now fast
audio = client.tts.generate(
    text="Hello, world!",
    model_id="kugel-1-turbo",
)

JavaScript

import { KugelAudio } from 'kugelaudio';

// Create a pre-connected client (~500ms happens here)
const client = await KugelAudio.create({ apiKey: 'your_api_key' });

// First request is now fast (~100-150ms TTFA instead of ~500ms)
const audio = await client.tts.generate({
  text: 'Hello, world!',
  modelId: 'kugel-1-turbo',
});

Java

import com.kugelaudio.sdk.AudioResponse;
import com.kugelaudio.sdk.GenerateRequest;
import com.kugelaudio.sdk.KugelAudio;
import com.kugelaudio.sdk.KugelAudioOptions;

// autoConnect warms the WebSocket in the background during construction
KugelAudio client = new KugelAudio(
    KugelAudioOptions.builder("your_api_key")
        .autoConnect(true)
        .build()
);

// First request is now fast because the connection is already established
AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Hello, world!")
        .modelId("kugel-1-turbo")
        .language("en")
        .build()
);

client.close();
Without pre-connecting, the first TTS request includes WebSocket connection setup (~500ms).
Subsequent requests reuse the connection and are fast (~100-150ms TTFA).
Pre-connecting moves this overhead to application startup.
JavaScript

const audio = await client.tts.generate({
  text: 'Hello, how are you today?',
  modelId: 'kugel-1-turbo',
  wordTimestamps: true,
});

for (const ts of audio.wordTimestamps) {
  console.log(`${ts.word}: ${ts.startMs}ms - ${ts.endMs}ms (score: ${ts.score.toFixed(2)})`);
}
Java

import com.kugelaudio.sdk.WordTimestamp;

AudioResponse audio = client.tts().generate(
    GenerateRequest.builder("Hello, how are you today?")
        .modelId("kugel-1-turbo")
        .language("en")
        .wordTimestamps(true)
        .build()
);

for (WordTimestamp ts : audio.getWordTimestamps()) {
    System.out.printf("%s: %dms - %dms (score: %.2f)%n",
        ts.getWord(), ts.getStartMs(), ts.getEndMs(), ts.getScore());
}
cURL

curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, how are you today?",
    "model_id": "kugel-1-turbo",
    "word_timestamps": true
  }' \
  --output output.pcm
Word timestamps are included in the response headers or as a JSON preamble before the audio bytes. Use an SDK for convenient access to parsed timestamp objects.
Word timestamps add no extra audio latency. For streaming use cases, see the Streaming Guide.
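A common use of word timestamps is caption generation. The sketch below groups words into SubRip (SRT) cues. It works on plain dictionaries with the keys `word`, `start_ms`, and `end_ms`; those key names are an assumption that mirrors the `word`, `startMs`, and `endMs` fields shown in the JavaScript example, so adapt them to whatever your SDK returns. Both function names are hypothetical.

```python
def format_srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words: list[dict], max_words: int = 4) -> str:
    """Group word timestamps into SRT cues of at most max_words words each.

    Each dict is assumed to have "word", "start_ms", and "end_ms" keys.
    """
    cues = []
    for index in range(0, len(words), max_words):
        chunk = words[index:index + max_words]
        start = format_srt_time(chunk[0]["start_ms"])
        end = format_srt_time(chunk[-1]["end_ms"])
        text = " ".join(item["word"] for item in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    if not cues:
        return ""
    return "\n\n".join(cues) + "\n"
```

Grouping by a fixed word count keeps the example short; a production version would more likely split on punctuation or on a maximum cue duration.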