Changelog

A record of all SAUTI platform releases, API changes, and model updates.

v1.0.0 — March 2026

SAUTI TTS generally available.

  • POST /v1/text-to-speech/{voice_id} is live. Accepts Kiswahili text and returns base64-encoded WAV audio in a JSON response.
  • Model: VITS fine-tuned on the Google WAXAL swa_tts split (1,387 training utterances, 1,778 total across splits). It achieves a higher mean opinion score (MOS) than multilingual baselines on Swahili test sentences.
  • Voices API: GET /v1/voices lists the available voices and returns their metadata.
  • Async Jobs API: POST /v1/tts/jobs for long-text synthesis. Texts over 2,000 characters are automatically queued. Poll with GET /v1/tts/jobs/{job_id} and download with GET /v1/tts/jobs/{job_id}/audio.
  • Authentication via xi-api-key header.
  • Rate limiting: per-minute sliding windows of 10 requests/min for synthesis, 60/min for polling, and 30/min as the default for other endpoints.
  • Output: 16 kHz mono WAV. Latency is under 500 ms for inputs of up to 200 characters.
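
As a rough illustration of the synthesis endpoint listed above, the sketch below builds the request and decodes the base64 WAV from the JSON response. Several details are assumptions rather than documented behavior: the host in BASE_URL is a placeholder, the request body is assumed to carry a single "text" field, and the response field name "audio_base64" is hypothetical. The sketch makes no network call; it only shows the shape of the exchange.

```python
import base64
import io
import json
import wave

BASE_URL = "https://api.example.com"  # placeholder host (assumption, not from the changelog)


def build_tts_request(voice_id: str, text: str, api_key: str) -> dict:
    """Describe the HTTP request for POST /v1/text-to-speech/{voice_id}."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/v1/text-to-speech/{voice_id}",
        # The xi-api-key header is the documented authentication mechanism.
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        # Assumed body shape: a single "text" field holding the Kiswahili input.
        "body": json.dumps({"text": text}, ensure_ascii=False),
    }


def needs_async_job(text: str, limit: int = 2000) -> bool:
    """Apply the documented 2,000-character threshold on the client side."""
    return len(text) > limit


def decode_wav(response_json: str, field: str = "audio_base64") -> dict:
    """Decode the base64 WAV from a JSON response and read its header.

    The field name is hypothetical; adjust it to the real response schema.
    """
    payload = json.loads(response_json)
    wav_bytes = base64.b64decode(payload[field])
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
            "wav_bytes": wav_bytes,
        }
```

A caller would send the dictionary from build_tts_request with any HTTP client, pass the response text to decode_wav, and check that sample_rate is 16000 and channels is 1 before writing wav_bytes to disk. Whether the synchronous endpoint itself reroutes long text is not stated here, so needs_async_job only helps a client decide when to use POST /v1/tts/jobs instead.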

v0.1.0 — January 2026

FiniFlow Labs platform foundation.

  • Initial infrastructure: API gateway, authentication, rate limiting, and RFC 7807 error handling in place.
  • Research pipeline established: WAXAL dataset download, preprocessing, and training harness for VITS-based models.
  • FiniFlow Labs developer site launched at finiflowlabs.com — introducing the SAUTI platform and research proposition.
  • Dual-model roadmap published: TTS (Kiswahili) → ASR (Kiswahili) → Voice Agent → Hausa, Yoruba, Amharic.
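
For clients handling the RFC 7807 error responses mentioned in the v0.1.0 notes, a minimal parsing sketch might look like the following. The five standard members (type, title, status, detail, instance) and the "about:blank" default for type come from RFC 7807 itself. The ProblemDetails class, the parse_problem helper, and any extension members the platform might return are illustrative assumptions, not part of the documented API.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Standard members defined by RFC 7807 (application/problem+json).
STANDARD_MEMBERS = ("type", "title", "status", "detail", "instance")


@dataclass
class ProblemDetails:
    """Hypothetical container for an RFC 7807 problem document."""
    type: str = "about:blank"  # RFC 7807 default when "type" is absent
    title: Optional[str] = None
    status: Optional[int] = None
    detail: Optional[str] = None
    instance: Optional[str] = None
    # Any non-standard members are kept here untouched.
    extensions: Dict[str, Any] = field(default_factory=dict)


def parse_problem(body: str) -> ProblemDetails:
    """Split a problem+json body into standard members and extensions."""
    data = json.loads(body)
    known = {k: data[k] for k in STANDARD_MEMBERS if k in data}
    extras = {k: v for k, v in data.items() if k not in STANDARD_MEMBERS}
    return ProblemDetails(**known, extensions=extras)
```

A client could call parse_problem on any response whose Content-Type is application/problem+json and branch on the status member, for example backing off when it is 429.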