# Voice Cloning

> Create custom voices from audio samples

Voice cloning allows you to create a synthetic voice that sounds like a specific person from 10-30 seconds of reference audio.

## How It Works

1. **Upload reference audio** - Provide 10-30 seconds of clean speech
2. **Processing** - Our AI analyzes the voice characteristics
3. **Voice created** - Use your new voice in any TTS request

## Requirements

### Audio Quality

For best results, your reference audio should be:

* **Duration:** 10-30 seconds of speech
* **Format:** WAV, MP3, OGG, M4A, or FLAC
* **Sample rate:** 16 kHz or higher
* **Channels:** Mono preferred
* **Quality:** Clean, no background noise

### Content Guidelines

✅ **Good audio:**

* Clear speech with natural pacing
* Single speaker only
* Minimal background noise
* Natural emotional range
* Free of filler words (um, uh, ah, hmm) unless you want them in the output

❌ **Avoid:**

* Multiple speakers
* Background music
* Heavy reverb or echo
* Whispered or shouted speech
* Heavily compressed audio
* Recordings with frequent filler sounds or hesitations
* Long gaps or extended silence between sentences, unless you want the cloned voice to reproduce those pauses

**Your samples define the voice.** The cloned voice will reproduce everything present in your reference audio, including filler sounds like "um", "ah", "hmm", long sentence gaps, breathing patterns, and any other speech habits. If your reference audio contains these sounds or pauses, they will appear in the generated output and cannot be removed after cloning.

For the most controllable results, use **clean recordings without fillers or long silences**. You can then add natural-sounding hesitations through your text prompts when needed (e.g., writing "um" or "..." in the input text).
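The measurable items in the Audio Quality list (duration, sample rate, channel count) can be checked locally before uploading. The following is a minimal sketch that uses only Python's standard `wave` module, so it handles WAV files only. The helper name `check_reference_wav` and its thresholds are illustrative and not part of the KugelAudio SDK. It cannot judge noise, reverb, or filler words.

```python
import wave

# Thresholds taken from the Audio Quality list above.
MIN_SECONDS = 10.0
MAX_SECONDS = 30.0
MIN_SAMPLE_RATE = 16_000


def check_reference_wav(path):
    """Return a list of problems found in a WAV file; an empty list means it passes.

    Hypothetical helper for illustration only; not part of the KugelAudio SDK.
    """
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / sample_rate

    problems = []
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration is {duration:.1f}s, expected 10-30s")
    if sample_rate < MIN_SAMPLE_RATE:
        problems.append(f"sample rate is {sample_rate} Hz, expected 16000 Hz or higher")
    if channels != 1:
        problems.append(f"{channels} channels found, mono is preferred")
    return problems
```

Calling `check_reference_wav("reference.wav")` before `client.voices.create(...)` catches files that are too short or sampled too low. MP3, OGG, M4A, and FLAC files would need a third-party decoder, which this sketch leaves out.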
## Creating a Voice Clone

### Via Dashboard

1. Go to **Dashboard** → **Voices** → **Create Voice**
2. Upload your reference audio
3. Enter a name and description
4. Click **Create Voice**
5. Wait for processing (usually 2-5 minutes)

### Via SDK

```python
from kugelaudio import KugelAudio

client = KugelAudio(api_key="YOUR_API_KEY")

# Create a voice with reference audio
voice = client.voices.create(
    name="My Custom Voice",
    sex="female",
    description="Cloned from reference audio",
    category="cloned",
    reference_files=["reference.wav"],
)

print(f"Created voice: {voice.id}")
print(f"Name: {voice.name}")
```

```typescript
import { KugelAudio } from 'kugelaudio';

const client = new KugelAudio({ apiKey: 'YOUR_API_KEY' });

// Create a voice with reference audio (browser)
const fileInput = document.getElementById('audio-upload') as HTMLInputElement;
const file = fileInput.files![0];

const voice = await client.voices.create({
  name: 'My Custom Voice',
  sex: 'female',
  description: 'Cloned from reference audio',
  category: 'cloned',
  referenceFiles: [file],
});

console.log(`Created voice: ${voice.id}`);
```

```bash
curl -X POST https://api.kugelaudio.com/v1/voices \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -F 'metadata={"name":"My Custom Voice","sex":"female","description":"Cloned from reference audio","category":"cloned"};type=application/json' \
  -F "files=@reference.wav"
```

## Using Cloned Voices

Once created, use your cloned voice like any other:

```python
from kugelaudio import KugelAudio

client = KugelAudio(api_key="YOUR_API_KEY")

# Use your cloned voice
audio = client.tts.generate(
    text="Hello, this is my cloned voice speaking!",
    model_id="kugel-3",
    voice_id=YOUR_CLONED_VOICE_ID,  # placeholder: replace with your voice's ID, e.g. 1071
)

audio.save("cloned_output.wav")
```

```typescript
import { KugelAudio } from 'kugelaudio';

const client = new KugelAudio({ apiKey: 'YOUR_API_KEY' });

const audio = await client.tts.generate({
  text: 'Hello, this is my cloned voice speaking!',
  modelId: 'kugel-3',
  voiceId: YOUR_CLONED_VOICE_ID, // placeholder: replace with your voice's ID, e.g. 1071
});
```

```bash
# YOUR_CLONED_VOICE_ID is a placeholder: replace it with your voice's ID, e.g. 1071
curl -X POST https://api.kugelaudio.com/v1/tts/generate \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is my cloned voice speaking!",
    "model_id": "kugel-3",
    "voice_id": YOUR_CLONED_VOICE_ID
  }' \
  --output cloned_output.pcm
```

## Best Practices

### Optimizing Voice Quality

* The quality of your cloned voice depends heavily on the source audio. Use professional recordings when possible.
* Include a range of intonations, emotions, and sentence types in your reference audio for a more natural clone.
* Experiment with different `cfg_scale` values. Cloned voices often benefit from slightly lower values (1.5-2.0) for more natural output.
* The `kugel-3` model generally produces better results for voice cloning due to its larger capacity.
* If your output contains unwanted "um"s, "ah"s, or hesitations, re-record or edit your reference audio to remove them. The model faithfully reproduces what it hears in the samples: clean input produces clean, controllable output. You can always add fillers via your text prompts later.
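The advice to experiment with `cfg_scale` can be scripted by preparing one request body per candidate value and comparing the resulting audio by ear. The sketch below only builds the JSON payloads for the `/v1/tts/generate` request shown earlier. It assumes that `cfg_scale` is accepted as a top-level field next to `text`, `model_id`, and `voice_id`; this page names the parameter but does not show where it goes, so check the API reference before relying on it. `build_cfg_sweep` is a hypothetical helper, not an SDK function.

```python
import json


def build_cfg_sweep(text, voice_id, start=1.5, stop=2.0, step=0.1, model_id="kugel-3"):
    """Build one TTS request body per cfg_scale value in [start, stop].

    Assumption: the endpoint accepts a top-level "cfg_scale" field.
    """
    # Count steps with integers to avoid accumulating floating-point error.
    count = int(round((stop - start) / step)) + 1
    payloads = []
    for i in range(count):
        payloads.append({
            "text": text,
            "model_id": model_id,
            "voice_id": voice_id,
            "cfg_scale": round(start + i * step, 2),
        })
    return payloads


# Print each body; send them with the HTTP client of your choice.
for payload in build_cfg_sweep("Hello, this is my cloned voice speaking!", voice_id=1071):
    print(json.dumps(payload))
```

Keeping the text and voice fixed while varying only `cfg_scale` makes it easier to attribute differences in the output to that one setting.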
### Troubleshooting

| Issue | Solution |
| ----- | -------- |
| Voice sounds robotic | Use higher quality source audio, try a lower `cfg_scale` |
| Voice sounds different | Ensure source audio is clean, try different text samples |
| Accent not preserved | Include more diverse samples, use longer reference audio |
| Inconsistent output | Try different `cfg_scale` values (2.0-3.0) |
| Unwanted filler sounds (um, ah, hmm) | Re-record or edit reference audio to remove fillers; see [Content Guidelines](#content-guidelines) |
| Unexpected long pauses | Re-record or edit reference audio to remove long gaps between sentences; the model can learn and reproduce these pauses |

## Managing Cloned Voices

### List Your Voices

```python
voices = client.voices.list()
for voice in voices:
    print(f"{voice.id}: {voice.name} ({voice.category})")
```

```typescript
const voices = await client.voices.list();
for (const voice of voices) {
  console.log(`${voice.id}: ${voice.name} (${voice.category})`);
}
```

```bash
curl "https://api.kugelaudio.com/v1/voices" \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY"
```

### Update Voice

```python
voice = client.voices.update(
    voice_id=1071,
    name="Updated Name",
    description="Updated description",
)
print(f"Updated: {voice.name}")
```

```typescript
const voice = await client.voices.update(1071, {
  name: 'Updated Name',
  description: 'Updated description',
});
console.log(`Updated: ${voice.name}`);
```

```bash
curl -X PATCH https://api.kugelaudio.com/v1/voices/1071 \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Updated Name",
    "description": "Updated description"
  }'
```

### Delete Voice

```python
client.voices.delete(voice_id=1071)
```

```typescript
await client.voices.delete(1071);
```

```bash
curl -X DELETE https://api.kugelaudio.com/v1/voices/1071 \
  -H "Authorization: Bearer $KUGELAUDIO_API_KEY"
```

## Managing Reference Audio

You can add and remove reference audio files after creating a voice.

### List References

```python
refs = client.voices.list_references(voice_id=1071)
for ref in refs:
    print(f"{ref.id}: {ref.name}")
```

```typescript
const refs = await client.voices.listReferences(1071);
for (const ref of refs) {
  console.log(`${ref.id}: ${ref.name}`);
}
```

### Add Reference

```python
ref = client.voices.add_reference(
    voice_id=1071,
    file_path="new_reference.wav",
    reference_text="Optional transcript of the audio.",
)
print(f"Added reference: {ref.id}")
```

```typescript
// audioBuffer holds the raw audio bytes (for example, an ArrayBuffer)
const file = new File([audioBuffer], 'new_reference.wav', { type: 'audio/wav' });
const ref = await client.voices.addReference(1071, file, 'Optional transcript.');
console.log(`Added reference: ${ref.id}`);
```

### Delete Reference

```python
client.voices.delete_reference(voice_id=1071, reference_id=456)
```

```typescript
await client.voices.deleteReference(1071, 456);
```

## Publishing Voices

Request that your voice be made public. It will be marked as pending verification until reviewed by an admin.

```python
voice = client.voices.publish(voice_id=1071)
print(f"Pending verification: {voice.pending_verification}")
```

```typescript
const voice = await client.voices.publish(1071);
console.log(`Pending verification: ${voice.pendingVerification}`);
```

## Generating Voice Samples

Trigger sample audio generation for a voice. This is done automatically on creation, but you can re-trigger it manually.
```python
voice = client.voices.generate_sample(voice_id=1071)
print(f"Sample URL: {voice.sample_url}")
```

```typescript
const voice = await client.voices.generateSample(1071);
console.log(`Sample URL: ${voice.sampleUrl}`);
```

## AI Transparency & Watermarking

All audio generated by KugelAudio, including voice-cloned output, is automatically watermarked using **AudioSeal**, an imperceptible neural watermarking technique.

This watermarking is required under **EU AI Act Article 50** (Regulation (EU) 2024/1689), which mandates that AI-generated audio content be marked in a machine-detectable way. The watermark is inaudible to humans and survives common post-processing operations (re-encoding, light compression).

The watermark encodes:

* A KugelAudio-issued identifier linking the audio to the originating API key
* A generation timestamp

This allows KugelAudio and auditors to verify whether a piece of audio was generated by the system, supporting abuse detection and regulatory compliance.

**What this means for you as an API customer:**

* You do not need to do anything: watermarking is applied automatically on every synthesis request.
* If you redistribute AI-generated audio, you are responsible for complying with applicable disclosure obligations in your jurisdiction (e.g. labelling synthetic media in advertising or public communications).
* The watermark does **not** affect audio quality at perceptible levels.

## Privacy & Ethics

Only clone voices you have permission to use. Misuse of voice cloning technology may violate laws and our Terms of Service.

### Guidelines

1. **Get consent** - Always obtain permission before cloning someone's voice
2. **Disclose synthetic speech** - Be transparent when using cloned voices in public-facing contexts
3. **No impersonation** - Don't use cloned voices to deceive or defraud
4. **Respect rights** - Don't clone voices of public figures without authorization

### Verification

For Business and Enterprise plans, we offer voice verification to ensure ethical use:

1. Upload proof of consent
2. Our team reviews the submission
3. Voice is marked as "verified"
4. Verified voices have no usage restrictions

## Next Steps

* Browse and use available voices
* Generate audio with your cloned voice
* Learn about available models