Changelog

A record of all SAUTI platform releases, API changes, and model updates.

v1.0.0 — March 2026

SAUTI TTS generally available.

POST /v1/text-to-speech/{voice_id} is live. Accepts Kiswahili text and returns base64-encoded WAV audio in a JSON response.
Model: VITS fine-tuned on the Google WAXAL swa_tts split (1,387 training utterances; 1,778 utterances in total across all splits). It achieves a higher mean opinion score (MOS) than multilingual baselines on Swahili test sentences.
Voices API: GET /v1/voices to list available voices and retrieve metadata.
Async Jobs API: POST /v1/tts/jobs for long-text synthesis. Texts over 2,000 characters are automatically queued. Poll with GET /v1/tts/jobs/{job_id} and download with GET /v1/tts/jobs/{job_id}/audio.
Authentication via xi-api-key header.
Rate limiting: per-minute sliding windows, with limits of 10 requests/min for synthesis, 60 requests/min for polling, and 30 requests/min by default.
Output: 16 kHz mono WAV. Latency is under 500 ms for inputs of up to 200 characters.
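
As a rough illustration of the response format and rate limits listed above, the following Python sketch decodes a base64-encoded WAV payload and paces calls on the client with a sliding window. The JSON field name `audio_base64`, the helper `decode_wav`, and the class `SlidingWindow` are hypothetical and are not part of the SAUTI API; the actual response schema may differ, and the server remains the authority on rate limits.

```python
import base64
import io
import time
import wave
from collections import deque


def decode_wav(payload, field="audio_base64"):
    """Decode a base64 WAV from a parsed JSON response.

    `field` is an assumed key name; check the real response schema.
    Returns (raw_wav_bytes, sample_rate_hz, channel_count).
    """
    raw = base64.b64decode(payload[field])
    with wave.open(io.BytesIO(raw), "rb") as wav:
        return raw, wav.getframerate(), wav.getnchannels()


class SlidingWindow:
    """Client-side pacing: allow at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of calls still inside the window

    def try_acquire(self):
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False


# One limiter per documented bucket (10/min synthesis, 60/min polling).
synthesis_limiter = SlidingWindow(limit=10)
polling_limiter = SlidingWindow(limit=60)
```

A caller would check `synthesis_limiter.try_acquire()` before each synthesis request and wait when it returns `False`. For a successful response, `decode_wav` should report 16000 Hz and 1 channel if the output matches the format stated above.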

FiniFlow Labs platform foundation.

Initial infrastructure: API gateway, authentication, rate limiting, and RFC 7807 error handling in place.
Research pipeline established: WAXAL dataset download, preprocessing, and training harness for VITS-based models.
FiniFlow Labs developer site launched at finiflowlabs.com — introducing the SAUTI platform and research proposition.
Dual-model roadmap published: TTS (Kiswahili) → ASR (Kiswahili) → Voice Agent → Hausa, Yoruba, Amharic.
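
The RFC 7807 error handling mentioned in the infrastructure entry above implies that error responses carry a "problem details" JSON object. The sketch below parses such a body into the standard RFC 7807 members (`type`, `title`, `status`, `detail`, `instance`) and keeps anything else as extensions. The `Problem` class, the `parse_problem` helper, and the sample body are illustrative assumptions; the changelog does not specify which members or extensions SAUTI actually returns.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Member names defined by RFC 7807 itself.
STANDARD_MEMBERS = ("type", "title", "status", "detail", "instance")


@dataclass
class Problem:
    type: str = "about:blank"  # RFC 7807 default when "type" is absent
    title: Optional[str] = None
    status: Optional[int] = None
    detail: Optional[str] = None
    instance: Optional[str] = None
    extensions: Dict[str, Any] = field(default_factory=dict)


def parse_problem(body: str) -> Problem:
    """Parse an application/problem+json body into a Problem."""
    data = json.loads(body)
    known = {k: data[k] for k in STANDARD_MEMBERS if k in data}
    extra = {k: v for k, v in data.items() if k not in STANDARD_MEMBERS}
    return Problem(extensions=extra, **known)


# Hypothetical error body, for illustration only.
sample = '{"title": "Too Many Requests", "status": 429, "retry_after": 12}'
problem = parse_problem(sample)
```

Keeping non-standard members in `extensions` lets a client log or act on service-specific fields without failing when new ones appear.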