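KugelAudio exposes an ElevenLabs-compatible HTTP API, so an existing ElevenLabs integration can be pointed at KugelAudio by changing the client's base_url. Beyond that, you only need a KugelAudio API key, KugelAudio voice IDs, and a PCM output format (see Migrating from ElevenLabs).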

Quick Start

Python SDK

from elevenlabs import ElevenLabs

client = ElevenLabs(
    api_key="your-kugelaudio-api-key",
    base_url="https://api.kugelaudio.com/11labs",
)

audio = client.text_to_speech.convert(
    voice_id="480",  # use client.voices.get_all() to list available voices
    text="Hello from KugelAudio!",
    model_id="kugel-1-turbo",
    output_format="pcm_24000",
)

with open("output.pcm", "wb") as f:
    for chunk in audio:
        f.write(chunk)

Node.js SDK

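import { createWriteStream } from "node:fs";
import { ElevenLabsClient } from "elevenlabs";

const client = new ElevenLabsClient({
  apiKey: "your-kugelaudio-api-key",
  baseUrl: "https://api.kugelaudio.com/11labs",
});

const stream = await client.textToSpeech.convertAsStream("480", {
  text: "Hello from KugelAudio!",
  modelId: "kugel-1-turbo",
  outputFormat: "pcm_24000",
});

// Write the streamed PCM chunks to disk
const file = createWriteStream("output.pcm");
for await (const chunk of stream) {
  file.write(chunk);
}
file.end();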

Migrating from ElevenLabs

The only changes needed:
  1. Replace api_key and base_url: use your KugelAudio key and point the client at your KugelAudio server
  2. Update voice_id: use KugelAudio voice IDs (not ElevenLabs IDs)
  3. Update output_format: use a PCM format (see Output Formats)

# Before
client = ElevenLabs(api_key="your-elevenlabs-key")

# After
client = ElevenLabs(
    api_key="your-kugelaudio-key",
    base_url="https://api.kugelaudio.com/11labs",
)

List your available voices to get the right IDs:

voices = client.voices.get_all()
for v in voices.voices:
    print(f"{v.voice_id}: {v.name}")

Output Formats

KugelAudio generates audio natively at 24 kHz PCM16. Lower sample rates use server-side resampling.

| Format | Status | Notes |
| --- | --- | --- |
| pcm_24000 | ✅ Recommended | Native rate, zero conversion cost |
| pcm_22050 | ✅ Supported | |
| pcm_16000 | ✅ Supported | Common for telephony |
| pcm_8000 | ✅ Supported | |
| mp3_* | ⚠️ Not yet | Convert client-side (see below) |
| ulaw_8000 | ⚠️ Not yet | Convert client-side (see below) |
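
The pcm_* formats are headerless PCM16, so most audio players will not open a file like output.pcm directly. Below is a minimal sketch of wrapping the bytes in a WAV container with Python's standard wave module. It assumes mono, 16-bit, little-endian audio (the sample width and channel count used in the conversion examples on this page); pcm_to_wav and pcm_duration_seconds are hypothetical helper names, not part of any SDK.

```python
import io
import wave


def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = 24000) -> bytes:
    """Wrap raw PCM16 bytes in a WAV container (assumes mono, 16-bit)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono (assumption)
        wav.setsampwidth(2)            # 16-bit samples = 2 bytes each
        wav.setframerate(sample_rate)  # must match the requested pcm_* rate
        wav.writeframes(pcm_bytes)
    return buf.getvalue()


def pcm_duration_seconds(pcm_bytes: bytes, sample_rate: int = 24000) -> float:
    """Duration of mono PCM16 audio: 2 bytes per sample."""
    return len(pcm_bytes) / (2 * sample_rate)
```

To convert a saved file, read output.pcm as bytes, pass it through pcm_to_wav with the sample rate you requested, and write the result to output.wav.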

Converting PCM to MP3 or µ-law client-side

If your downstream system requires MP3 or µ-law (e.g. telephony platforms like Twilio), convert after receiving the PCM stream:

# MP3: using pydub + ffmpeg
from pydub import AudioSegment

# audio_stream is the chunk iterator returned by client.text_to_speech.convert(...)
pcm_bytes = b"".join(audio_stream)
segment = AudioSegment(
    data=pcm_bytes,
    sample_width=2,   # 16-bit
    frame_rate=24000,
    channels=1,
)
segment.export("output.mp3", format="mp3", bitrate="128k")

# µ-law (G.711): using audioop (stdlib through Python 3.12; removed in 3.13)
import audioop

pcm_bytes = b"".join(audio_stream)
# Downsample to 8 kHz first if the audio was requested at 24 kHz
pcm_8k = audioop.ratecv(pcm_bytes, 2, 1, 24000, 8000, None)[0]
ulaw_bytes = audioop.lin2ulaw(pcm_8k, 2)
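
Because audioop is no longer part of the standard library from Python 3.13 onward, a dependency-free alternative may be useful. The sketch below is a plain-Python G.711 µ-law encoder, not KugelAudio code; the function names are illustrative. It assumes little-endian PCM16 input that was already requested as pcm_8000 (listed as supported above), so no client-side resampling is needed.

```python
import struct

BIAS = 0x84   # 132, added to the magnitude before the segment search
CLIP = 32635  # largest magnitude that still fits in 15 bits after adding BIAS


def encode_ulaw_sample(sample: int) -> int:
    """Encode one signed 16-bit sample as one G.711 µ-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS

    # Find the segment (exponent): position of the highest set bit, from bit 14 down.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # µ-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def pcm16_to_ulaw(pcm_bytes: bytes) -> bytes:
    """Convert little-endian PCM16 bytes to µ-law, one byte per sample."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack(f"<{count}h", pcm_bytes[: count * 2])
    return bytes(encode_ulaw_sample(s) for s in samples)
```

The per-sample loop is slow for long recordings; it is meant to show the encoding, and a vectorised or native implementation is a better fit for production traffic.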

Supported Endpoints

Text-to-Speech

| Endpoint | Method | Status |
| --- | --- | --- |
| /v1/text-to-speech/{voice_id} | POST | ✅ Supported |
| /v1/text-to-speech/{voice_id}/stream | POST | ✅ Supported |
| /v1/text-to-speech/{voice_id}/stream-input | WebSocket | ✅ Supported |

About stream-input: feed text tokens as they arrive from an LLM. Synthesis starts as soon as a sentence boundary is detected, which minimizes time-to-first-audio. The server sends ElevenLabs-format audio frames ({"audio": "<base64>", "isFinal": false}) and closes with {"audio": "", "isFinal": true}.

import asyncio, base64, json
import websockets

async def stream_tts():
    url = "wss://api.kugelaudio.com/11labs/v1/text-to-speech/480/stream-input?model_id=eleven_turbo_v2&output_format=pcm_24000"
    # Note: websockets >= 14 names this keyword additional_headers instead of extra_headers
    async with websockets.connect(url, extra_headers={"xi-api-key": "your-api-key"}) as ws:
        # Send text tokens one by one (e.g. from an LLM stream)
        for token in ["Hello, ", "this is ", "streamed ", "speech."]:
            await ws.send(json.dumps({"text": token}))

        # Signal end of stream
        await ws.send(json.dumps({"text": ""}))

        # Receive audio frames
        with open("output.pcm", "wb") as f:
            async for msg in ws:
                frame = json.loads(msg)
                if frame.get("isFinal"):
                    break
                if audio := frame.get("audio"):
                    f.write(base64.b64decode(audio))

asyncio.run(stream_tts())
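
The example above sends every token before it reads any audio. When tokens really do arrive incrementally, sending and receiving can run concurrently so that playback can begin while the LLM is still generating. The following is a sketch under stated assumptions: ws is an open connection like the one in the example above, tokens is an async iterator of text fragments, and on_audio is a callback you supply. The helper names send_tokens, receive_audio, and pump are hypothetical.

```python
import asyncio
import base64
import json


async def send_tokens(ws, tokens):
    """Forward text fragments as they arrive, then signal end of stream."""
    async for token in tokens:
        await ws.send(json.dumps({"text": token}))
    await ws.send(json.dumps({"text": ""}))  # empty text marks end of input


async def receive_audio(ws, on_audio):
    """Decode audio frames until the server sends its final frame."""
    async for msg in ws:
        frame = json.loads(msg)
        if frame.get("isFinal"):
            break
        if audio := frame.get("audio"):
            on_audio(base64.b64decode(audio))


async def pump(ws, tokens, on_audio):
    """Run the sender and the receiver concurrently on one connection."""
    await asyncio.gather(send_tokens(ws, tokens), receive_audio(ws, on_audio))
```

Passing a callback keeps the sketch independent of where the audio goes: on_audio could write to a file, push to a playback queue, or forward to a telephony bridge.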

Voices

| Endpoint | Method | Status |
| --- | --- | --- |
| /v1/voices | GET | ✅ Supported |
| /v1/voices/{voice_id} | GET | ✅ Supported |
| /v1/voices/add | POST | ❌ Not supported |
| /v1/voices/{voice_id}/edit | POST | ❌ Not supported |

Other

| Endpoint | Method | Status |
| --- | --- | --- |
| /v1/models | GET | ✅ Supported |
| /v1/user | GET | ⚠️ Stub |
| /v1/user/subscription | GET | ⚠️ Stub |
| /v1/history | GET | ⚠️ Stub |

Available Models

| Model ID (ElevenLabs alias) | KugelAudio model | Description |
| --- | --- | --- |
| eleven_turbo_v2, eleven_turbo_v2_5 | kugel-1-turbo | Fast, low-latency |
| eleven_multilingual_v2 | kugel-1 | High quality, multilingual |

You can also pass KugelAudio model IDs directly: kugel-1-turbo and kugel-1 are both accepted.

Parameter Mapping

| ElevenLabs | KugelAudio | Notes |
| --- | --- | --- |
| voice_id | voice_id | Use KugelAudio voice IDs |
| model_id | model | See model table above |
| similarity_boost | cfg_scale | cfg_scale = 1.0 + (similarity_boost × 2.0) |
| stability | | Not used |
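
To see what a given ElevenLabs setting turns into, the similarity_boost formula from the table can be evaluated directly. This is a small illustrative sketch; the function name is hypothetical, and whether the server clamps out-of-range values is not stated here, so the function does not clamp either.

```python
def similarity_boost_to_cfg_scale(similarity_boost: float) -> float:
    """Apply the documented mapping: cfg_scale = 1.0 + (similarity_boost * 2.0)."""
    return 1.0 + (similarity_boost * 2.0)
```

With ElevenLabs' usual 0.0 to 1.0 range for similarity_boost, the resulting cfg_scale spans 1.0 to 3.0, and a similarity_boost of 0.5 maps to 2.0.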

Troubleshooting

# Check server health
curl https://api.kugelaudio.com/11labs/health

# List voices
curl -H "xi-api-key: your-api-key" https://api.kugelaudio.com/11labs/v1/voices | jq '.voices[:5]'

# Test TTS
curl -X POST https://api.kugelaudio.com/11labs/v1/text-to-speech/480 \
  -H "xi-api-key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "model_id": "kugel-1-turbo"}' \
  --output test.pcm

Python SDK

Native KugelAudio SDK with full feature access

JavaScript SDK

Native KugelAudio SDK with full feature access