# Voice Cloning

> Create custom voices from audio samples

Voice cloning creates a synthetic voice that sounds like a specific person from 10-30 seconds of reference audio.

## How It Works

1. **Upload reference audio** - Provide 10-30 seconds of clean speech
2. **Processing** - Our AI analyzes the voice characteristics
3. **Voice created** - Use your new voice in any TTS request

## Requirements

### Audio Quality

For best results, your reference audio should be:

* **Duration:** 10-30 seconds of speech
* **Format:** WAV, MP3, OGG, M4A, or FLAC
* **Sample rate:** 16kHz or higher
* **Channels:** Mono preferred
* **Quality:** Clean, no background noise
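
The measurable items in this list can be checked locally before uploading. The following is a minimal sketch that uses only Python's standard `wave` module, so it covers WAV files only. The `check_reference_wav` helper and the `AudioCheck` type are hypothetical names introduced for illustration and are not part of the KugelAudio SDK.

```python
import wave
from dataclasses import dataclass


@dataclass
class AudioCheck:
    duration_s: float
    sample_rate: int
    channels: int
    problems: list[str]


def check_reference_wav(path: str) -> AudioCheck:
    """Compare a WAV file against the duration, sample rate, and channel recommendations."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        duration_s = wav.getnframes() / sample_rate

    problems = []
    if not 10 <= duration_s <= 30:
        problems.append(f"duration is {duration_s:.1f}s, expected 10-30s")
    if sample_rate < 16_000:
        problems.append(f"sample rate is {sample_rate} Hz, expected 16000 Hz or higher")
    if channels != 1:
        problems.append(f"{channels} channels found, mono is preferred")
    return AudioCheck(duration_s, sample_rate, channels, problems)
```

An empty `problems` list means the file meets the three numeric recommendations. Background noise and overall recording quality still need to be judged by ear.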

### Content Guidelines

✅ **Good audio:**

* Clear speech with natural pacing
* Single speaker only
* Minimal background noise
* Natural emotional range
* Free of filler words (um, uh, ah, hmm) unless you want them in the output

❌ **Avoid:**

* Multiple speakers
* Background music
* Heavy reverb or echo
* Whispered or shouted speech
* Heavily compressed audio
* Recordings with frequent filler sounds or hesitations
* Long gaps or extended silence between sentences, unless you want the cloned voice to reproduce those pauses

<Warning>
  **Your samples define the voice.** The cloned voice will reproduce everything present in your reference audio — including filler sounds like "um", "ah", "hmm", long sentence gaps, breathing patterns, and any other speech habits. If your reference audio contains these sounds or pauses, they will appear in the generated output and cannot be removed after cloning.

  For the most controllable results, use **clean recordings without fillers or long silences**. You can then add natural-sounding hesitations through your text prompts when needed (e.g., writing "um" or "..." in the input text).
</Warning>
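
Long gaps in a recording can be located before upload with a simple amplitude scan. The sketch below assumes a 16-bit PCM WAV file and uses only the standard library. The `find_long_silences` helper is a hypothetical name, and its default threshold, window, and gap length are illustrative guesses rather than values recommended by KugelAudio.

```python
import array
import sys
import wave


def find_long_silences(path, min_gap_s=1.0, threshold=500, window_s=0.05):
    """Return (start_s, end_s) spans where the peak amplitude stays below `threshold`.

    Assumes 16-bit PCM WAV input. The audio is scanned in fixed windows of
    `window_s` seconds, and consecutive quiet windows are merged into one span.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    window = max(1, int(rate * window_s)) * channels  # interleaved samples per window
    gaps = []
    gap_start = None
    for offset in range(0, len(samples), window):
        chunk = samples[offset:offset + window]
        quiet = max(abs(s) for s in chunk) < threshold
        t = offset / (rate * channels)  # window start time in seconds
        if quiet and gap_start is None:
            gap_start = t
        elif not quiet and gap_start is not None:
            if t - gap_start >= min_gap_s:
                gaps.append((gap_start, t))
            gap_start = None

    # Close a quiet span that runs to the end of the file.
    end = len(samples) / (rate * channels)
    if gap_start is not None and end - gap_start >= min_gap_s:
        gaps.append((gap_start, end))
    return gaps
```

Spans reported here are candidates for trimming in an audio editor. Filler sounds such as "um" are not silent, so this scan will not find them.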

## Creating a Voice Clone

### Via Dashboard

1. Go to **Dashboard** → **Voices** → **Create Voice**
2. Upload your reference audio
3. Enter a name and description
4. Click **Create Voice**
5. Wait for processing (usually 2-5 minutes)

### Via SDK

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from kugelaudio import KugelAudio

    client = KugelAudio(api_key="YOUR_API_KEY")

    # Create a voice with reference audio
    voice = client.voices.create(
        name="My Custom Voice",
        sex="female",
        description="Cloned from reference audio",
        category="cloned",
        reference_files=["reference.wav"],
    )

    print(f"Created voice: {voice.id}")
    print(f"Name: {voice.name}")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    import { KugelAudio } from 'kugelaudio';

    const client = new KugelAudio({ apiKey: 'YOUR_API_KEY' });

    // Create a voice with reference audio (browser)
    const fileInput = document.getElementById('audio-upload') as HTMLInputElement;
    const file = fileInput.files![0];

    const voice = await client.voices.create({
      name: 'My Custom Voice',
      sex: 'female',
      description: 'Cloned from reference audio',
      category: 'cloned',
      referenceFiles: [file],
    });

    console.log(`Created voice: ${voice.id}`);
    ```
  </Tab>

  <Tab title="cURL">
    ```bash theme={null}
    curl -X POST https://api.kugelaudio.com/v1/voices \
      -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
      -F 'metadata={"name":"My Custom Voice","sex":"female","description":"Cloned from reference audio","category":"cloned"};type=application/json' \
      -F "files=@reference.wav"
    ```
  </Tab>
</Tabs>

## Using Cloned Voices

Once created, use your cloned voice like any other:

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from kugelaudio import KugelAudio

    client = KugelAudio(api_key="YOUR_API_KEY")

    # Use your cloned voice
    audio = client.tts.generate(
        text="Hello, this is my cloned voice speaking!",
        model_id="kugel-3",
        voice_id=1071,  # replace with the ID of your cloned voice
    )

    audio.save("cloned_output.wav")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    import { KugelAudio } from 'kugelaudio';

    const client = new KugelAudio({ apiKey: 'YOUR_API_KEY' });

    const audio = await client.tts.generate({
      text: 'Hello, this is my cloned voice speaking!',
      modelId: 'kugel-3',
      voiceId: 1071, // replace with the ID of your cloned voice
    });
    ```
  </Tab>

  <Tab title="cURL">
    ```bash theme={null}
    curl -X POST https://api.kugelaudio.com/v1/tts/generate \
      -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "text": "Hello, this is my cloned voice speaking!",
        "model_id": "kugel-3",
        "voice_id": 1071
      }' \
      --output cloned_output.pcm
    ```
  </Tab>
</Tabs>

## Best Practices

### Optimizing Voice Quality

<AccordionGroup>
  <Accordion title="Use high-quality source audio" icon="microphone">
    The quality of your cloned voice depends heavily on the source audio. Use professional recordings when possible.
  </Accordion>

  <Accordion title="Provide diverse samples" icon="list">
    Include a range of intonations, emotions, and sentence types in your reference audio for a more natural clone.
  </Accordion>

  <Accordion title="Adjust CFG scale" icon="sliders">
    Experiment with different `cfg_scale` values. Cloned voices often benefit from slightly lower values (1.5-2.0) for more natural output.
  </Accordion>

  <Accordion title="Use the right model" icon="microchip">
    The `kugel-3` model generally produces better results for voice cloning due to its larger capacity.
  </Accordion>

  <Accordion title="Remove filler sounds from samples" icon="eraser">
    If your output contains unwanted "um"s, "ah"s, or hesitations, re-record or edit your reference audio to remove them. The model faithfully reproduces what it hears in the samples: clean input produces clean, controllable output. You can always add fillers via your text prompts later.
  </Accordion>
</AccordionGroup>

### Troubleshooting

| Issue                                | Solution                                                                                                                 |
| ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------ |
| Voice sounds robotic                 | Use higher quality source audio, try lower CFG scale                                                                     |
| Voice sounds different               | Ensure source audio is clean, try different text samples                                                                 |
| Accent not preserved                 | Include more diverse samples, use longer reference audio                                                                 |
| Inconsistent output                  | Try different CFG values (2.0–3.0)                                                                                       |
| Unwanted filler sounds (um, ah, hmm) | Re-record or edit reference audio to remove fillers — see [Content Guidelines](#content-guidelines)                      |
| Unexpected long pauses               | Re-record or edit reference audio to remove long gaps between sentences — the model can learn and reproduce these pauses |
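
One way to act on the CFG suggestions above is to generate the same sentence at several values and compare the results by ear. The sketch below is written against a generic `generate` callable so that it makes few claims about the SDK. It assumes that `cfg_scale` is accepted as a keyword argument alongside the `text`, `model_id`, and `voice_id` parameters shown earlier, which this page does not confirm. `sweep_cfg_scale` is a hypothetical helper name.

```python
from typing import Any, Callable, Iterable


def sweep_cfg_scale(
    generate: Callable[..., Any],
    text: str,
    voice_id: int,
    scales: Iterable[float] = (1.5, 2.0, 2.5, 3.0),
) -> dict[float, Any]:
    """Generate the same text once per CFG value and return the results keyed by value."""
    results = {}
    for scale in scales:
        # Assumption: `generate` accepts `cfg_scale` as a keyword argument.
        results[scale] = generate(
            text=text,
            model_id="kugel-3",
            voice_id=voice_id,
            cfg_scale=scale,
        )
    return results
```

If the assumption holds, passing `client.tts.generate` as the `generate` argument and saving each result to its own file would give a set of clips to compare side by side.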

## Managing Cloned Voices

### List Your Voices

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    voices = client.voices.list()

    for voice in voices:
        print(f"{voice.id}: {voice.name} ({voice.category})")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    const voices = await client.voices.list();

    for (const voice of voices) {
      console.log(`${voice.id}: ${voice.name} (${voice.category})`);
    }
    ```
  </Tab>

  <Tab title="cURL">
    ```bash theme={null}
    curl "https://api.kugelaudio.com/v1/voices" \
      -H "Authorization: Bearer $KUGELAUDIO_API_KEY"
    ```
  </Tab>
</Tabs>
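
If the list also contains voices you did not clone, the `category` field printed above can be used to narrow it down. This is a small sketch that assumes cloned voices carry the category value `"cloned"`, matching the value passed at creation time. `cloned_voices` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Iterable


def cloned_voices(voices: Iterable[Any]) -> list[Any]:
    """Keep only the voices whose `category` attribute equals "cloned"."""
    return [v for v in voices if getattr(v, "category", None) == "cloned"]
```

It would typically be called with the result of `client.voices.list()`.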

### Update Voice

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    voice = client.voices.update(
        voice_id=1071,
        name="Updated Name",
        description="Updated description",
    )
    print(f"Updated: {voice.name}")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    const voice = await client.voices.update(1071, {
      name: 'Updated Name',
      description: 'Updated description',
    });
    console.log(`Updated: ${voice.name}`);
    ```
  </Tab>

  <Tab title="cURL">
    ```bash theme={null}
    curl -X PATCH https://api.kugelaudio.com/v1/voices/1071 \
      -H "Authorization: Bearer $KUGELAUDIO_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Updated Name",
        "description": "Updated description"
      }'
    ```
  </Tab>
</Tabs>

### Delete Voice

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    client.voices.delete(voice_id=1071)
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    await client.voices.delete(1071);
    ```
  </Tab>

  <Tab title="cURL">
    ```bash theme={null}
    curl -X DELETE https://api.kugelaudio.com/v1/voices/1071 \
      -H "Authorization: Bearer $KUGELAUDIO_API_KEY"
    ```
  </Tab>
</Tabs>

## Managing Reference Audio

You can add and remove reference audio files after creating a voice.

### List References

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    refs = client.voices.list_references(voice_id=1071)
    for ref in refs:
        print(f"{ref.id}: {ref.name}")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    const refs = await client.voices.listReferences(1071);
    for (const ref of refs) {
      console.log(`${ref.id}: ${ref.name}`);
    }
    ```
  </Tab>
</Tabs>

### Add Reference

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    ref = client.voices.add_reference(
        voice_id=1071,
        file_path="new_reference.wav",
        reference_text="Optional transcript of the audio.",
    )
    print(f"Added reference: {ref.id}")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    const file = new File([audioBuffer], 'new_reference.wav', { type: 'audio/wav' });
    const ref = await client.voices.addReference(1071, file, 'Optional transcript.');
    console.log(`Added reference: ${ref.id}`);
    ```
  </Tab>
</Tabs>
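
When several recordings are available, the Python call above can be wrapped in a loop. The sketch below uploads every `.wav` file in a directory and, where a `.txt` file with the same name exists, sends its content as `reference_text`. The file-naming convention and the `add_references_from_dir` helper are assumptions made for this example, as is the idea that `reference_text` may be omitted.

```python
from pathlib import Path
from typing import Any


def add_references_from_dir(client: Any, voice_id: int, directory: str) -> list[Any]:
    """Upload each .wav file in `directory` as a reference for `voice_id`.

    A sibling .txt file with the same stem, if present, is sent as the transcript.
    """
    added = []
    for wav_path in sorted(Path(directory).glob("*.wav")):
        kwargs = {"voice_id": voice_id, "file_path": str(wav_path)}
        transcript = wav_path.with_suffix(".txt")
        if transcript.exists():
            kwargs["reference_text"] = transcript.read_text(encoding="utf-8").strip()
        # Uses the same call and parameter names as the single-file example.
        added.append(client.voices.add_reference(**kwargs))
    return added
```

Files are processed in sorted order so that repeated runs upload them in the same sequence.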

### Delete Reference

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    client.voices.delete_reference(voice_id=1071, reference_id=456)
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    await client.voices.deleteReference(1071, 456);
    ```
  </Tab>
</Tabs>

## Publishing Voices

Request that your voice be made public. It will be marked as pending verification until reviewed by an admin.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    voice = client.voices.publish(voice_id=1071)
    print(f"Pending verification: {voice.pending_verification}")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    const voice = await client.voices.publish(1071);
    console.log(`Pending verification: ${voice.pendingVerification}`);
    ```
  </Tab>
</Tabs>

## Generating Voice Samples

Trigger sample audio generation for a voice. This is done automatically on creation, but you can re-trigger it manually.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    voice = client.voices.generate_sample(voice_id=1071)
    print(f"Sample URL: {voice.sample_url}")
    ```
  </Tab>

  <Tab title="JavaScript">
    ```typescript theme={null}
    const voice = await client.voices.generateSample(1071);
    console.log(`Sample URL: ${voice.sampleUrl}`);
    ```
  </Tab>
</Tabs>

## AI Transparency & Watermarking

All audio generated by KugelAudio — including voice-cloned output — is automatically watermarked using **AudioSeal**, an imperceptible neural watermarking technique.

<Info>
  This watermarking is required under **EU AI Act Article 50** (Regulation (EU) 2024/1689), which mandates that AI-generated audio content be marked in a machine-detectable way. The watermark is inaudible to humans and survives common post-processing operations (re-encoding, light compression).
</Info>

The watermark encodes:

* A KugelAudio-issued identifier linking the audio to the originating API key
* A generation timestamp

This allows KugelAudio and auditors to verify whether a piece of audio was generated by the system, supporting abuse detection and regulatory compliance.

**What this means for you as an API customer:**

* You do not need to do anything — watermarking is applied automatically on every synthesis request.
* If you redistribute AI-generated audio, you are responsible for complying with applicable disclosure obligations in your jurisdiction (e.g. labelling synthetic media in advertising or public communications).
* The watermark does **not** affect audio quality at perceptible levels.

## Privacy & Ethics

<Warning>
  Only clone voices you have permission to use. Misuse of voice cloning technology may violate laws and our Terms of Service.
</Warning>

### Guidelines

1. **Get consent** - Always obtain permission before cloning someone's voice
2. **Disclose synthetic speech** - Be transparent when using cloned voices in public-facing contexts
3. **No impersonation** - Don't use cloned voices to deceive or defraud
4. **Respect rights** - Don't clone voices of public figures without authorization

### Verification

For Business and Enterprise plans, we offer voice verification to ensure ethical use:

1. Upload proof of consent
2. Our team reviews the submission
3. Voice is marked as "verified"
4. Verified voices have no usage restrictions

## Next Steps

<CardGroup cols={2}>
  <Card title="Using Voices" icon="microphone" href="/features/voices">
    Browse and use available voices
  </Card>

  <Card title="Generate Speech" icon="play" href="/features/generate">
    Generate audio with your cloned voice
  </Card>

  <Card title="Models" icon="microchip" href="/models">
    Learn about available models
  </Card>
</CardGroup>
