Self-hosted migration: monolithic image to compose stack
KugelAudio self-hosted deployments previously ran as a single monolithic container. That image is retired. The new bundle is a Docker Compose stack: `ingress`, `normalizer`, and `tts-turbo` start by default; `tts-standard` is available as an opt-in high-quality engine via the `with-standard` profile. The stack is defined in `backend/docker-compose.selfhosted.yml`. The migration is a one-time cutover; the old and new stacks cannot run on the same host at the same time on the default ports.
What’s changing
| Old | New |
|---|---|
| `kugelaudio/kugelaudio-tts-selfhosted:<version>` | `kugelaudio/ingress` + `kugelaudio/normalizer` + `kugelaudio/tts` |
| Single container, single process tree | Compose services orchestrated by Docker Compose |
| Ray Serve as the in-process router | gRPC between ingress and TTS engines |
| One `KUGEL_*` env block | Per-service env block, required values fail-fast |
| One model cache mount | Per-engine cache volumes plus a shared voice store |
| Restart by `docker restart <id>` | `docker compose -f <file> up -d` / `down` |
The public endpoint on `:8000` (HTTP and WebSocket) is unchanged. Existing SDK and client integrations continue to work without code changes.
Prerequisites
- Docker Engine 24+ with the Compose plugin (`docker compose version`).
- For the TTS engines: an NVIDIA GPU host with the NVIDIA Container Toolkit installed.
- `KUGEL_LICENSE_KEY` and `KUGEL_INSTANCE_ID` issued by KugelAudio.
- A Hugging Face access token (`HF_TOKEN`) with read access to the KugelAudio model repositories.
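The prerequisites above can be checked before the cutover with a small preflight helper. This is a minimal sketch, not something shipped with the KugelAudio bundle: the function name `preflight` is hypothetical, it assumes the three credentials are exported in the current shell, and it uses `nvidia-smi` only as a proxy for a working GPU driver. It does not verify the Docker Engine version or the NVIDIA Container Toolkit itself.

```shell
#!/bin/sh
# Hypothetical preflight helper (not part of the KugelAudio bundle).
# Prints one line per missing prerequisite; returns non-zero if any is missing.
preflight() {
    failed=0

    # Docker with the Compose plugin installed.
    if ! docker compose version >/dev/null 2>&1; then
        echo "missing: docker compose plugin"
        failed=1
    fi

    # GPU driver visible on the host (assumption: nvidia-smi is on PATH).
    if ! nvidia-smi >/dev/null 2>&1; then
        echo "missing: NVIDIA GPU driver (nvidia-smi failed)"
        failed=1
    fi

    # Credentials named in the prerequisites list.
    for name in KUGEL_LICENSE_KEY KUGEL_INSTANCE_ID HF_TOKEN; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            failed=1
        fi
    done

    return "$failed"
}
```

Calling `preflight` on the target host before starting the migration gives a quick list of what still needs to be installed or exported.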
Migration steps
1. Back up customer state.
   - Voice references uploaded to the old container’s voice store.
   - Any local config or `.env` files mounted into the old container.
   - License key and instance ID, which are required by both stacks.
2. Stop the old container.
3. Fetch the new compose file. It lives in the repo at `backend/docker-compose.selfhosted.yml`. Copy it to a working directory on the host, e.g. `/opt/kugelaudio/`.
4. Resolve image tags. The compose file ships with `:TBD-<service>` placeholders. Replace each one with the published per-service tag you were given by KugelAudio support.
5. Create the `.env` file next to the compose file, containing at least the required values. Optional knobs (license-server URL for support-directed staging/private deployments, Sentry DSN, neural TN, port overrides) are documented inline at the top of the compose file.
6. Restore the voice store. Create a named volume and copy the previous voice references into `/data/voices` inside it. Both TTS services mount this same volume read-write.
7. Bring up the stack. The `ingress` service waits for `normalizer` and `tts-turbo` to be healthy before accepting traffic, so the first start can take several minutes while the TTS engine downloads model weights. On multi-GPU hosts, add `--profile with-standard` to start the optional `tts-standard` engine.
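The migration steps above, from stopping the old container through bringing up the stack, can be sketched as a handful of shell helpers. Everything here is illustrative rather than the official procedure: the helper names, the default working directory, the `alpine` helper image, and the volume name passed to `restore_voices` are assumptions. The sketch also assumes the `.env` minimum consists of the three credentials listed under Prerequisites and that each placeholder is literally `:TBD-` followed by a service suffix.

```shell
#!/bin/sh
# Sketch only: helper names and defaults are assumptions, not part of the bundle.
COMPOSE_FILE="${COMPOSE_FILE:-/opt/kugelaudio/docker-compose.selfhosted.yml}"

# Stop the old monolithic container. $1 = container ID or name.
stop_old_container() {
    docker stop "$1"
}

# Replace one ":TBD-<suffix>" placeholder with a published tag.
# $1 = placeholder suffix, $2 = tag provided by KugelAudio support.
# Longer suffixes that merely start with $1 are left alone; the previous
# version of the file is kept as <compose file>.bak.
resolve_tag() {
    sed -i.bak \
        -e "s|:TBD-$1\$|:$2|" \
        -e "s|:TBD-$1\([^A-Za-z0-9_.-]\)|:$2\1|g" \
        "$COMPOSE_FILE"
}

# Write a .env next to the compose file from exported variables.
# Assumption: these three values are the required minimum.
write_env() {
    : "${KUGEL_LICENSE_KEY:?must be set}" "${KUGEL_INSTANCE_ID:?must be set}" "${HF_TOKEN:?must be set}"
    env_file="$(dirname "$COMPOSE_FILE")/.env"
    printf 'KUGEL_LICENSE_KEY=%s\nKUGEL_INSTANCE_ID=%s\nHF_TOKEN=%s\n' \
        "$KUGEL_LICENSE_KEY" "$KUGEL_INSTANCE_ID" "$HF_TOKEN" > "$env_file"
    chmod 600 "$env_file"
}

# Copy backed-up voice references into a named volume.
# $1 = volume name (must match the compose file), $2 = absolute backup dir.
restore_voices() {
    docker volume create "$1" >/dev/null
    docker run --rm -v "$1":/data/voices -v "$2":/backup:ro alpine \
        cp -a /backup/. /data/voices/
}

# Start the stack; extra arguments (e.g. --profile with-standard) go before "up".
start_stack() {
    docker compose -f "$COMPOSE_FILE" "$@" up -d
}
```

A cutover would call these in order: `stop_old_container <id>`, one `resolve_tag` call per placeholder, `write_env`, `restore_voices <volume> <backup-dir>`, and finally `start_stack` (or `start_stack --profile with-standard` on multi-GPU hosts).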
How to verify
Run each of these against the freshly started stack:

1. Ingress health: expect a 200 response.
2. Per-service health: the default services should report `healthy`. If a TTS engine is still `starting` after 5 minutes, check `docker compose logs tts-turbo`; it is most likely still pulling weights.
3. End-to-end TTS request: you should get a non-empty `audio/pcm` response. If the request returns a 502, check the ingress logs for the upstream error from `tts-turbo`.
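The first two checks above can be scripted roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the health route is passed in as an argument because its path is not given here, and `service_health` assumes each service defines a container health check that `docker inspect` can report. The end-to-end TTS request is left out because the route and payload are not specified in this guide.

```shell
#!/bin/sh
# Sketch only: helper names and defaults are assumptions.
COMPOSE_FILE="${COMPOSE_FILE:-/opt/kugelaudio/docker-compose.selfhosted.yml}"
BASE_URL="${BASE_URL:-http://localhost:8000}"

# Ingress health: succeeds only on HTTP 200. $1 = health route (assumed path).
ingress_ok() {
    code=$(curl -s -o /dev/null -w '%{http_code}' "$BASE_URL$1")
    [ "$code" = "200" ]
}

# Print the health status (starting, healthy, unhealthy) of one service.
service_health() {
    cid=$(docker compose -f "$COMPOSE_FILE" ps -q "$1")
    docker inspect --format '{{.State.Health.Status}}' "$cid"
}

# Poll a service until it is healthy. $1 = service, $2 = timeout in seconds.
# On timeout, print the last log lines and return non-zero.
wait_healthy() {
    waited=0
    while [ "$waited" -lt "$2" ]; do
        [ "$(service_health "$1")" = "healthy" ] && return 0
        sleep 5
        waited=$((waited + 5))
    done
    docker compose -f "$COMPOSE_FILE" logs --tail 50 "$1"
    return 1
}
```

For example, `wait_healthy tts-turbo 300` mirrors the five-minute threshold mentioned above and dumps recent logs if the engine is still not healthy by then.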
Rollback
If the new stack misbehaves, the old monolithic image is still pullable from Docker Hub until the deprecation window closes. To roll back, bring the compose stack down and start the old container again from that image.
Internal references
- Teardown plan: `.claude/plans/rayserve-teardown.md`
- Compose file: `backend/docker-compose.selfhosted.yml`
- Per-service Dockerfiles: `backend/ingress/Dockerfile`, `backend/normalizer/Dockerfile`, `backend/tts/Dockerfile`