
AudioChunk

A single chunk of audio delivered by a streaming request:
AudioChunk chunk;
byte[]  bytes = chunk.getAudio();      // Raw bytes in chunk.getEncoding() format
String  enc   = chunk.getEncoding();   // "pcm_s16le", "mulaw", or "alaw"
int     index = chunk.getIndex();      // 0-based chunk index
int     rate  = chunk.getSampleRate(); // 24000, or 8000 for G.711
int     n     = chunk.getSamples();    // Number of samples; G.711 uses 1 byte/sample
float[] f32   = chunk.toFloat32();     // PCM-only; throws for "mulaw"/"alaw"
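
For readers who need the same conversion outside the SDK, here is a minimal sketch of how "pcm_s16le" bytes map to normalised floats. It assumes little-endian signed 16-bit samples and a divisor of 32768; the class and method names are hypothetical, and the SDK's own toFloat32() may scale differently.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Pcm16ToFloat {
    /** Converts little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0). */
    static float[] pcm16leToFloat32(byte[] pcm) {
        if (pcm.length % 2 != 0) {
            throw new IllegalArgumentException("PCM16 data must have an even byte count");
        }
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        float[] out = new float[pcm.length / 2];
        for (int i = 0; i < out.length; i++) {
            // Assumed scaling: divide by 2^15 so -32768 maps to exactly -1.0
            out[i] = buf.getShort() / 32768.0f;
        }
        return out;
    }

    public static void main(String[] args) {
        // Three samples: 0, 16384 (0x4000), and -32768 (0x8000)
        byte[] pcm = {0x00, 0x00, 0x00, 0x40, 0x00, (byte) 0x80};
        for (float v : pcm16leToFloat32(pcm)) {
            System.out.println(v);
        }
    }
}
```

G.711 chunks would first need decoding to PCM16 before a conversion like this applies, which is consistent with toFloat32() being PCM-only.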

AudioResponse

Complete audio result from generate():
AudioResponse audio;
byte[]              bytes      = audio.getAudio();          // Raw bytes in audio.getEncoding() format
String              enc        = audio.getEncoding();       // "pcm_s16le", "mulaw", or "alaw"
int                 rate       = audio.getSampleRate();      // 24000, or 8000 for G.711
int                 samples    = audio.getTotalSamples();
double              durationMs = audio.getDurationMs();
double              genMs      = audio.getGenerationMs();
double              rtf        = audio.getRtf();             // Real-time factor
List<WordTimestamp> timestamps = audio.getWordTimestamps();  // empty unless requested

audio.saveWav(Path.of("output.wav")); // PCM-only; throws for "mulaw"/"alaw"
byte[] wav = audio.toWavBytes();      // PCM-only WAV bytes with 44-byte header
float[] f32 = audio.toFloat32();      // PCM-only normalised [-1.0, 1.0]
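
The timing fields relate to each other in a simple way. The sketch below assumes the conventional definitions: duration is the sample count divided by the sample rate, and the real-time factor is generation time divided by audio duration, so values below 1.0 mean faster than real time. The source does not spell out how the SDK computes getRtf(), and the helper names are hypothetical.

```java
final class AudioMetrics {
    /** Duration in milliseconds of totalSamples mono samples at sampleRate Hz. */
    static double durationMs(int totalSamples, int sampleRate) {
        return totalSamples * 1000.0 / sampleRate;
    }

    /** Assumed definition: time spent generating divided by length of audio produced. */
    static double realTimeFactor(double generationMs, double durationMs) {
        return generationMs / durationMs;
    }

    public static void main(String[] args) {
        double duration = durationMs(48_000, 24_000);   // 2 seconds of audio
        double rtf = realTimeFactor(500.0, duration);   // generated in half a second
        System.out.printf("duration=%.0fms rtf=%.2f%n", duration, rtf);
    }
}
```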

WordTimestamp

Word-level time alignment:
WordTimestamp ts;
String word      = ts.getWord();      // The aligned word
long   startMs   = ts.getStartMs();   // Start time in milliseconds
long   endMs     = ts.getEndMs();     // End time in milliseconds
int    charStart = ts.getCharStart(); // Start character offset in original text
int    charEnd   = ts.getCharEnd();   // End character offset in original text
double score     = ts.getScore();     // Alignment confidence (0.0 – 1.0)
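
One way these fields might be combined is to build caption-style cues from the original text. The sketch below uses a hypothetical Word class as a stand-in for WordTimestamp and assumes charEnd is exclusive, as in String.substring; the source does not say whether the SDK's end offset is inclusive, so check that before relying on the slicing.

```java
import java.util.Locale;

final class CaptionSketch {
    /** Hypothetical stand-in carrying the same timing and offset fields as WordTimestamp. */
    static final class Word {
        final long startMs;
        final long endMs;
        final int charStart;
        final int charEnd;

        Word(long startMs, long endMs, int charStart, int charEnd) {
            this.startMs = startMs;
            this.endMs = endMs;
            this.charStart = charStart;
            this.charEnd = charEnd;
        }
    }

    /** Formats milliseconds as mm:ss.SSS. */
    static String formatMs(long ms) {
        return String.format(Locale.ROOT, "%02d:%02d.%03d",
            ms / 60_000, (ms / 1000) % 60, ms % 1000);
    }

    /** Slices the word out of the original text (end offset assumed exclusive). */
    static String cue(String text, Word w) {
        String slice = text.substring(w.charStart, w.charEnd);
        return "[" + formatMs(w.startMs) + " - " + formatMs(w.endMs) + "] " + slice;
    }

    public static void main(String[] args) {
        String text = "Hello world";
        System.out.println(cue(text, new Word(0, 420, 0, 5)));
        System.out.println(cue(text, new Word(480, 900, 6, 11)));
    }
}
```

Slicing by offset rather than printing the aligned word keeps the original casing and punctuation of the input text.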

Model

Model metadata, as returned by models().list():
Model model;
String id          = model.getId();             // e.g. "kugel-3"
String name        = model.getName();
String description = model.getDescription();
String parameters  = model.getParameters();     // Parameter-count label (e.g. "7B")
int    maxInput    = model.getMaxInputLength();
int    sampleRate  = model.getSampleRate();

VoiceListResponse

Paginated response from voices().list():
VoiceListResponse result;
List<Voice> voices = result.getVoices();  // Voices on this page
int total          = result.getTotal();   // Total matching voices
int limit          = result.getLimit();   // Page size used
int offset         = result.getOffset();  // Offset used
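
The total, limit, and offset fields are enough to walk every page. The sketch below shows the loop against a hypothetical in-memory Page type and fetch function, so it runs without the SDK. In real code the fetch function would presumably wrap voices().list(...), whose exact parameter order is not confirmed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

final class PagingSketch {
    /** Hypothetical stand-in with the same paging fields as VoiceListResponse. */
    static final class Page<T> {
        final List<T> items;
        final int total;
        final int limit;
        final int offset;

        Page(List<T> items, int total, int limit, int offset) {
            this.items = items;
            this.total = total;
            this.limit = limit;
            this.offset = offset;
        }
    }

    /** Requests pages of size limit until the offset reaches the reported total. */
    static <T> List<T> fetchAll(BiFunction<Integer, Integer, Page<T>> fetchPage, int limit) {
        List<T> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            Page<T> page = fetchPage.apply(limit, offset);
            all.addAll(page.items);
            offset += page.items.size();
            // Stop on an empty page too, so a shrinking total cannot loop forever
            if (page.items.isEmpty() || offset >= page.total) {
                break;
            }
        }
        return all;
    }

    /** Fake data source: 12 integers served in pages. */
    static Page<Integer> fakePage(int limit, int offset) {
        int total = 12;
        List<Integer> items = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, total); i++) {
            items.add(i);
        }
        return new Page<>(items, total, limit, offset);
    }

    public static void main(String[] args) {
        List<Integer> all = fetchAll(PagingSketch::fakePage, 5);
        System.out.println(all.size() + " items fetched");
    }
}
```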

Voice

A single voice entry, as found in VoiceListResponse.getVoices():
Voice voice;
int     id        = voice.getId();
String  name      = voice.getName();
String  sex       = voice.getSex();       // "male", "female", or "neutral"
String  language  = voice.getLanguage();
String  sampleUrl = voice.getSampleUrl();
boolean isPublic  = voice.isPublic();

Audio Utilities

The AudioFormats utility class provides WAV, duration, and G.711 codec helpers:
import com.kugelaudio.sdk.AudioFormats;

// Write PCM16 data directly to a WAV file
AudioFormats.writePcm16Wav(Path.of("out.wav"), pcmBytes, 24000, (short) 1);

// Get audio duration in milliseconds
int durationMs = AudioFormats.durationMs(pcmBytes, 24000, 16, 1);

// Convert PCM16 to G.711 (for telephony)
byte[] ulaw = AudioFormats.pcm16ToUlaw(pcmBytes);
byte[] alaw = AudioFormats.pcm16ToAlaw(pcmBytes);

// Convert G.711 back to PCM16
byte[] pcm = AudioFormats.ulawToPcm16(ulawBytes);
byte[] pcmFromAlaw = AudioFormats.alawToPcm16(alawBytes);
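
To illustrate what a PCM16 to G.711 conversion involves, here is a sketch of the widely used segment-based mu-law companding routine. It is not taken from the SDK: the class and method names are hypothetical, the input is assumed to be little-endian PCM16, and AudioFormats.pcm16ToUlaw may be implemented differently.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MuLawSketch {
    private static final int BIAS = 0x84;   // 132, added before locating the segment
    private static final int CLIP = 32635;  // largest magnitude that still fits after the bias

    /** Compresses one signed 16-bit sample into one mu-law byte. */
    static byte encodeSample(short sample) {
        int sign = (sample >> 8) & 0x80;    // 0x80 when the input is negative
        int magnitude = Math.min(Math.abs((int) sample), CLIP) + BIAS;
        int exponent = 7;
        // Find the highest set bit between bit 14 (segment 7) and bit 7 (segment 0)
        for (int mask = 0x4000; exponent > 0 && (magnitude & mask) == 0; mask >>= 1) {
            exponent--;
        }
        int mantissa = (magnitude >> (exponent + 3)) & 0x0F;
        return (byte) ~(sign | (exponent << 4) | mantissa);  // mu-law bytes are stored inverted
    }

    /** Expands one mu-law byte back to a signed 16-bit sample (lossy round trip). */
    static short decodeSample(byte ulaw) {
        int u = ~ulaw & 0xFF;
        int exponent = (u >> 4) & 0x07;
        int mantissa = u & 0x0F;
        int magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS;
        return (short) ((u & 0x80) != 0 ? -magnitude : magnitude);
    }

    /** Encodes a little-endian PCM16 buffer; the output has one byte per sample. */
    static byte[] pcm16leToUlaw(byte[] pcm) {
        ByteBuffer in = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        byte[] out = new byte[pcm.length / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = encodeSample(in.getShort());
        }
        return out;
    }

    public static void main(String[] args) {
        short original = 1000;
        short roundTrip = decodeSample(encodeSample(original));
        System.out.println(original + " -> " + roundTrip);
    }
}
```

The round trip is lossy by design: quantisation steps grow with amplitude, which is why G.711 fits each sample into a single byte.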

Complete Example

import com.kugelaudio.sdk.*;
import java.nio.file.Path;
import java.util.List;

public class KugelAudioExample {
    public static void main(String[] args) throws Exception {
        try (KugelAudio client = new KugelAudio(
                KugelAudioOptions.builder("your_api_key").build())) {

            // List available models
            System.out.println("Available Models:");
            for (Model model : client.models().list()) {
                System.out.printf("  - %s: %s (%s)%n",
                    model.getId(), model.getName(), model.getParameters());
            }

            // List available voices
            System.out.println("\nAvailable Voices:");
            VoiceListResponse voiceResult = client.voices().list("en", true, 5, null);
            for (Voice voice : voiceResult.getVoices()) {
                System.out.printf("  - %d: %s%n", voice.getId(), voice.getName());
            }

            // Generate audio
            System.out.println("\nGenerating audio...");
            AudioResponse audio = client.tts().generate(
                GenerateRequest.builder(
                    "Welcome to KugelAudio. This is an example of high-quality text-to-speech synthesis."
                )
                .modelId("kugel-3")
                .language("en")
                .build()
            );

            System.out.printf("Generated %.2fs of audio in %.0fms (RTF: %.2f)%n",
                audio.getDurationMs() / 1000.0, audio.getGenerationMs(), audio.getRtf());

            // Save to file
            audio.saveWav(Path.of("example.wav"));
            System.out.println("Saved to example.wav");
        }
    }
}

Back to Quickstart, or see the API Reference for the raw HTTP contract.