KugelAudio provides text processing features to ensure your text is spoken naturally. This includes automatic normalization of numbers, dates, and currencies, as well as the ability to spell out text letter by letter.

Text Normalization

Text normalization converts numbers, dates, times, and other non-verbal text into spoken words:
  • “I have 3 apples” → “I have three apples”
  • “The meeting is at 2:30 PM” → “The meeting is at two thirty PM”
  • “€50.99” → “fifty euros and ninety-nine cents”
Enable normalization by setting normalize=True (Python) or normalize: true (JavaScript):
# With explicit language (recommended - fastest)
audio = client.tts.generate(
    text="I bought 3 items for €50.99 on 01/15/2024.",
    normalize=True,
    language="en",
)

# With auto-detection (adds ~150ms latency)
audio = client.tts.generate(
    text="Ich habe 3 Artikel für 50,99€ gekauft.",
    normalize=True,
    # language not specified - will auto-detect
)
Using normalize without specifying language adds approximately 150ms latency for language auto-detection. For best performance in latency-sensitive applications, always specify the language parameter.

Supported Languages

Code  Language     Code  Language
de    German       nl    Dutch
en    English      pl    Polish
fr    French       sv    Swedish
es    Spanish      da    Danish
it    Italian      no    Norwegian
pt    Portuguese   fi    Finnish
cs    Czech        hu    Hungarian
ro    Romanian     el    Greek
uk    Ukrainian    bg    Bulgarian
tr    Turkish      vi    Vietnamese
ar    Arabic       hi    Hindi
zh    Chinese      ja    Japanese
ko    Korean
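
Because auto-detection costs roughly 150ms, it can help to pass a language code whenever one is known and supported. The following is a minimal sketch, not part of the KugelAudio SDK: `SUPPORTED_LANGUAGES` and `normalization_kwargs` are hypothetical names, and the codes are copied from the table above.

```python
from typing import Optional

# Language codes copied from the Supported Languages table.
SUPPORTED_LANGUAGES = frozenset({
    "de", "en", "fr", "es", "it", "pt", "cs", "ro", "uk", "tr", "ar", "zh", "ko",
    "nl", "pl", "sv", "da", "no", "fi", "hu", "el", "bg", "vi", "hi", "ja",
})


def normalization_kwargs(language: Optional[str]) -> dict:
    """Build keyword arguments for a generate() call (hypothetical helper).

    A supported code is passed through so auto-detection can be skipped.
    Anything else is left out, which falls back to auto-detection.
    """
    kwargs: dict = {"normalize": True}
    code = (language or "").strip().lower()
    if code in SUPPORTED_LANGUAGES:
        kwargs["language"] = code
    return kwargs
```

The result can be unpacked into the call shown earlier, for example client.tts.generate(text=..., **normalization_kwargs("en")).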

Spell Tags

Use <spell> tags to spell out text letter by letter. This is useful for email addresses, codes, acronyms, or any text that should be pronounced character by character.
Spell tags require normalize to be enabled.
# Spell out an email address
audio = client.tts.generate(
    text="Contact me at <spell>kajo@kugelaudio.com</spell>",
    normalize=True,
    language="en",
)
# Output: "Contact me at K, A, J, O, at, K, U, G, E, L, A, U, D, I, O, dot, C, O, M"

# Spell out an acronym
audio = client.tts.generate(
    text="The <spell>API</spell> is easy to use.",
    normalize=True,
    language="en",
)
# Output: "The A, P, I is easy to use."

# German example with language-specific translations
audio = client.tts.generate(
    text="Meine E-Mail ist <spell>test@beispiel.de</spell>",
    normalize=True,
    language="de",
)
# Output: "Meine E-Mail ist T, E, S, T, ät, B, E, I, S, P, I, E, L, Punkt, D, E"

Language-Specific Character Translations

Special characters within <spell> tags are translated based on the language:
Character  English     German       French      Spanish
@          at          ät           arobase     arroba
.          dot         Punkt        point       punto
-          dash        Strich       tiret       guión
_          underscore  Unterstrich  underscore  guión bajo
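
To make the mapping concrete, here is a small local sketch of the spell-out transformation described above. It is an illustration only, not KugelAudio's implementation: `SPECIAL_CHARACTERS` and `spell_out` are hypothetical names, and the handling of whitespace (skipped) and digits (left unchanged) is an assumption.

```python
# Spoken names for special characters, copied from the table above.
SPECIAL_CHARACTERS = {
    "en": {"@": "at", ".": "dot", "-": "dash", "_": "underscore"},
    "de": {"@": "ät", ".": "Punkt", "-": "Strich", "_": "Unterstrich"},
    "fr": {"@": "arobase", ".": "point", "-": "tiret", "_": "underscore"},
    "es": {"@": "arroba", ".": "punto", "-": "guión", "_": "guión bajo"},
}


def spell_out(text: str, language: str = "en") -> str:
    """Return a character-by-character rendering of text (illustrative only)."""
    names = SPECIAL_CHARACTERS[language]
    parts = []
    for char in text:
        if char.isspace():
            continue  # assumption: whitespace is not spoken
        # Special characters use the language-specific name; every other
        # character is upper-cased (digits are unaffected by upper()).
        parts.append(names.get(char, char.upper()))
    return ", ".join(parts)
```

For example, spell_out("API") returns "A, P, I", and spell_out("a-b_c", "es") returns "A, guión, B, guión bajo, C".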

Spell Tags with Streaming

Spell tags also work in streaming sessions. When text is streamed token by token (e.g., from an LLM), a tag that is split across multiple chunks is handled automatically:
async with client.tts.streaming_session(
    voice_id=123,
    normalize=True,
    language="en",
) as session:
    # Even if the tag is split across tokens, it works correctly
    async for chunk in session.send("My code is <spell>"):
        play_audio(chunk.audio)
    async for chunk in session.send("ABC123</spell>"):
        play_audio(chunk.audio)
    async for chunk in session.flush():
        play_audio(chunk.audio)
Streaming Safety: The system buffers text until the closing </spell> tag arrives before generating audio. If the stream ends unexpectedly, incomplete tags are auto-closed so the content still gets spelled out.
Model recommendation: For clearer letter-by-letter pronunciation, use kugel-1 instead of kugel-1-turbo.
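
The buffering behaviour described under Streaming Safety can be pictured with a small local model. This is a simplified sketch under stated assumptions, not the service's actual code: `SpellTagBuffer` is a hypothetical class that holds text back while a <spell> tag is open (or may be starting) and auto-closes an unfinished tag on flush. It does not handle a partially received closing tag.

```python
OPEN_TAG = "<spell>"
CLOSE_TAG = "</spell>"


class SpellTagBuffer:
    """Holds back streamed text while a <spell> tag is open or possibly starting."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, chunk: str) -> str:
        """Add a chunk and return the text that is safe to synthesize now."""
        self._pending += chunk
        ready = ""
        while True:
            start = self._pending.find(OPEN_TAG)
            if start == -1:
                # No complete opening tag: keep only a trailing partial like "<sp".
                keep = self._partial_open_len(self._pending)
                cut = len(self._pending) - keep
                ready += self._pending[:cut]
                self._pending = self._pending[cut:]
                return ready
            end = self._pending.find(CLOSE_TAG, start)
            if end == -1:
                # Tag opened but not closed: release the text before it only.
                ready += self._pending[:start]
                self._pending = self._pending[start:]
                return ready
            # Complete tag: release everything up to and including the close.
            stop = end + len(CLOSE_TAG)
            ready += self._pending[:stop]
            self._pending = self._pending[stop:]

    def flush(self) -> str:
        """Return whatever is left, auto-closing an unfinished tag."""
        rest, self._pending = self._pending, ""
        if OPEN_TAG in rest and CLOSE_TAG not in rest:
            rest += CLOSE_TAG
        return rest

    @staticmethod
    def _partial_open_len(text: str) -> int:
        """Length of the longest suffix of text that is a proper prefix of OPEN_TAG."""
        for size in range(min(len(OPEN_TAG) - 1, len(text)), 0, -1):
            if OPEN_TAG.startswith(text[-size:]):
                return size
        return 0
```

Feeding "My code is <spell>" releases only "My code is "; the spelled content is released once "ABC123</spell>" arrives, which mirrors the two send() calls in the example above.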

Using Spell Tags with LLMs

When integrating with language models, add instructions to your system prompt so the LLM wraps appropriate text in spell tags:
SYSTEM_PROMPT = """You are a helpful assistant. When you need to spell out text 
(like email addresses, codes, or acronyms), wrap it in <spell> tags.

Examples:
- "My email is <spell>[email protected]</spell>"
- "The code is <spell>ABC123</spell>"
- "That stands for <spell>API</spell>, Application Programming Interface"
"""
For more details, see the LLM Integration guide.

Next Steps