Text normalization converts numbers, dates, times, and other non-verbal text into spoken words:
  • “I have 3 apples” → “I have three apples”
  • “The meeting is at 2:30 PM” → “The meeting is at two thirty PM”
  • “€50.99” → “fifty euros and ninety-nine cents”
// With explicit language (recommended - fastest)
const audio = await client.tts.generate({
  text: 'I bought 3 items for €50.99 on 01/15/2024.',
  normalize: true,
  language: 'en',  // Specify language for best performance
});

// With auto-detection (may cause incorrect normalizations)
const autoDetectedAudio = await client.tts.generate({
  text: 'Ich habe 3 Artikel für 50,99€ gekauft.',
  normalize: true,
  // language not specified - will auto-detect
});
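
To make the idea of normalization concrete, here is a toy sketch that spells out standalone integers from 0 to 99 in English. It is not how the service normalizes text. The names `numberToWords` and `normalizeSmallIntegers` are hypothetical, and dates, times, currencies, and other languages are deliberately left out.

```typescript
// Toy illustration only: NOT the service's normalizer.
// Spells out standalone one- or two-digit integers in English.
const ONES = [
  'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine',
  'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen', 'sixteen',
  'seventeen', 'eighteen', 'nineteen',
];
const TENS = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety'];

function numberToWords(n: number): string {
  if (!Number.isInteger(n) || n < 0 || n > 99) {
    throw new RangeError(`unsupported number: ${n}`);
  }
  if (n < 20) return ONES[n];
  const tens = TENS[Math.floor(n / 10)];
  const ones = n % 10;
  return ones === 0 ? tens : `${tens}-${ONES[ones]}`;
}

function normalizeSmallIntegers(text: string): string {
  // \b\d{1,2}\b matches only standalone runs of one or two digits,
  // so longer numbers such as "123" are left untouched.
  return text.replace(/\b\d{1,2}\b/g, (digits) => numberToWords(Number(digits)));
}

console.log(normalizeSmallIntegers('I have 3 apples')); // "I have three apples"
```

A real normalizer also has to handle context (times, currencies, dates) and language-specific rules, which is why passing `language` explicitly matters.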

Supported Languages

Code  Language      Code  Language
de    German        nl    Dutch
en    English       pl    Polish
fr    French        sv    Swedish
es    Spanish       da    Danish
it    Italian       no    Norwegian
pt    Portuguese    fi    Finnish
cs    Czech         hu    Hungarian
ro    Romanian      el    Greek
uk    Ukrainian     bg    Bulgarian
tr    Turkish       vi    Vietnamese
ar    Arabic        hi    Hindi
zh    Chinese       ja    Japanese
ko    Korean
Using normalize: true without specifying language may cause incorrect normalizations, especially for short texts or languages that share similar vocabulary. Always specify language when you know it.
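
One way to follow this advice is to validate the language code before building a request. The sketch below is an illustration, not part of the SDK: `SUPPORTED_LANGUAGES`, `isSupportedLanguage`, and `buildNormalizedRequest` are hypothetical names, and the code list is copied from the table above. Lowercasing the input is an assumption about what is convenient, not documented behavior.

```typescript
// Hypothetical helper, not part of the SDK.
// Language codes copied from the Supported Languages table.
const SUPPORTED_LANGUAGES = [
  'de', 'en', 'fr', 'es', 'it', 'pt', 'cs', 'ro', 'uk', 'tr', 'ar', 'zh', 'ko',
  'nl', 'pl', 'sv', 'da', 'no', 'fi', 'hu', 'el', 'bg', 'vi', 'hi', 'ja',
] as const;

type SupportedLanguage = (typeof SUPPORTED_LANGUAGES)[number];

function isSupportedLanguage(code: string): code is SupportedLanguage {
  return (SUPPORTED_LANGUAGES as readonly string[]).includes(code);
}

interface NormalizedRequest {
  text: string;
  normalize: true;
  language: SupportedLanguage;
}

// Builds request options that always carry an explicit, supported language.
function buildNormalizedRequest(text: string, language: string): NormalizedRequest {
  const code = language.toLowerCase();
  if (!isSupportedLanguage(code)) {
    throw new Error(`Unsupported normalization language: ${language}`);
  }
  return { text, normalize: true, language: code };
}
```

The returned object has the same shape as the options shown in the examples above, so it could be passed to `client.tts.generate(...)` directly.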

Spell Tags

Use <spell> tags to spell out text letter by letter. This is useful for email addresses, codes, acronyms, or any text that should be pronounced character by character:
// Spell out an email address
const audio = await client.tts.generate({
  text: 'Contact me at <spell>kajo@kugelaudio.com</spell>',
  normalize: true,
  language: 'en',
});
// Output: "Contact me at K, A, J, O, at, K, U, G, E, L, A, U, D, I, O, dot, C, O, M"

// Spell out an acronym
const acronymAudio = await client.tts.generate({
  text: 'The <spell>API</spell> is easy to use.',
  normalize: true,
  language: 'en',
});
// Output: "The A, P, I is easy to use."

// German example with language-specific translations
const germanAudio = await client.tts.generate({
  text: 'Meine E-Mail ist <spell>test@beispiel.de</spell>',
  normalize: true,
  language: 'de',
});
// Output: "Meine E-Mail ist T, E, S, T, ät, B, E, I, S, P, I, E, L, Punkt, D, E"
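
When the text to spell comes from a variable, a small wrapper keeps the tags balanced. This is a sketch only: `spell` is a hypothetical helper, not an SDK function, and rejecting nested tags is an assumption, since nesting behavior is not described here.

```typescript
// Hypothetical helper, not part of the SDK: wraps a value in <spell> tags.
function spell(value: string): string {
  // Assumption: nested spell tags are not supported, so reject them early.
  if (value.includes('<spell>') || value.includes('</spell>')) {
    throw new Error('value must not already contain spell tags');
  }
  return `<spell>${value}</spell>`;
}

const contactText = `Contact me at ${spell('kajo@kugelaudio.com')}`;
console.log(contactText); // "Contact me at <spell>kajo@kugelaudio.com</spell>"
```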
Spell tags also work with streaming:
await client.tts.stream(
  {
    // Even if the tag is split across the stream, it works correctly
    text: 'My verification code is <spell>ABC-123-XYZ</spell>.',
    normalize: true,
    language: 'en',
  },
  {
    onChunk: (chunk) => playAudio(chunk.audio),
  }
);
Special Characters: Characters such as @, ., and - are replaced with language-specific words. For example, @ becomes “at” in English, “ät” in German, and “arobase” in French.
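
The expansion can be pictured with a small local sketch that reproduces the spoken forms shown in the examples above. It only illustrates the mapping idea and is not the service's implementation. `SPECIAL_WORDS`, `spellOut`, and `expandSpellTags` are hypothetical names. Only the English and German words that appear on this page are included, and any other character, including -, is passed through unchanged.

```typescript
// Illustration only: NOT the service's implementation.
type SpellLanguage = 'en' | 'de';

// Only the words documented on this page; everything else passes through.
const SPECIAL_WORDS: Record<SpellLanguage, Partial<Record<string, string>>> = {
  en: { '@': 'at', '.': 'dot' },
  de: { '@': 'ät', '.': 'Punkt' },
};

// Turns "ab@c" into "A, B, at, C" for English.
function spellOut(value: string, language: SpellLanguage): string {
  const words = SPECIAL_WORDS[language];
  return Array.from(value)
    .map((ch) => words[ch] ?? ch.toUpperCase())
    .join(', ');
}

// Replaces every <spell>...</spell> span with its spelled-out form.
function expandSpellTags(text: string, language: SpellLanguage): string {
  return text.replace(/<spell>(.*?)<\/spell>/g, (_match: string, inner: string) =>
    spellOut(inner, language),
  );
}

console.log(expandSpellTags('The <spell>API</spell> is easy to use.', 'en'));
// "The A, P, I is easy to use."
```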
Model recommendation: use kugel-3 for the cleanest letter-by-letter pronunciation of spelled-out text.

For per-project pronunciation overrides (brand names, acronyms, domain vocabulary), see Dictionaries.