African Language AI
Voice AI built for Africa's languages
FiniFlow Labs trains, evaluates, and deploys speech AI systems grounded in African linguistic data. Our first system, SAUTI, brings natural Swahili voice synthesis and recognition to production APIs.
Our Mission
African languages deserve first-class AI
The majority of AI speech systems are trained on English and a handful of European languages. African languages — spoken by over a billion people — are largely absent from mainstream model research.
FiniFlow Labs closes that gap. We source native-speaker data, build language-specific training pipelines, and publish models and benchmarks that the broader research community can build on.
Kiswahili: 200M+ speakers (Live)
Hausa: 150M+ speakers (In Development)
Yoruba: 50M+ speakers (In Development)
Amharic: 60M+ speakers (In Development)
What We Build
The SAUTI platform
SAUTI TTS
Convert written Swahili into natural-sounding speech. Fine-tuned on the Google WAXAL dataset using the VITS architecture with LoRA adapters.
SAUTI ASR
Transcribe spoken Swahili audio with a low word error rate (WER). Built on Meta's MMS-300M checkpoint, distributed via Hugging Face, with Swahili-specific fine-tuning.
Voice Agent API
A full turn-based IVR voice agent for Swahili, combining SAUTI TTS and ASR with a language model backend for real telephony deployments.
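For readers unfamiliar with the LoRA adapters mentioned under SAUTI TTS, the short sketch below shows why they keep fine-tuning cheap: the pretrained weight matrix stays frozen and only two low-rank factors are trained. The layer size and rank are illustrative assumptions, not SAUTI's actual configuration.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    LoRA learns B (d_out x rank) and A (rank x d_in) and adds B @ A to the
    frozen weight, so only rank * (d_out + d_in) values are trained.
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 768, 768, 8
full = full_finetune_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
```

With these assumed numbers the adapter trains roughly 2% of the parameters of the full matrix, which is one reason the approach suits small datasets.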
Products
Platform roadmap
SAUTI TTS
v1.0 — Swahili
Production-grade Swahili text-to-speech. Serves synthesized audio via a low-latency REST API backed by a fine-tuned VITS model.
SAUTI ASR
v0.5 — Swahili
Swahili automatic speech recognition with competitive word error rate (WER) on conversational audio. Optimised for telephony-quality 8 kHz input.
Voice Agent
Roadmap — Q3 2026
End-to-end Swahili voice agent for IVR and telephony. Combines TTS, ASR, and an LLM backbone for natural turn-by-turn conversation.
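The SAUTI ASR entry above is described in terms of WER. As a minimal sketch of how that metric is commonly computed (word-level edit distance divided by the number of reference words), the function below uses whitespace tokenisation; this is an assumption for illustration, not a description of the evaluation pipeline FiniFlow Labs uses, and the example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("ya" -> "za") and one deletion ("yangu") over 5 reference words.
print(word_error_rate("habari ya asubuhi rafiki yangu", "habari za asubuhi rafiki"))  # 0.4
```

Real evaluations usually normalise casing and punctuation before scoring, so reported WER depends on those choices as well as on the model.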
Research
Grounded in African linguistic data
We build on open datasets and pretrained multilingual models, then apply targeted fine-tuning to close the performance gap between high-resource and African language AI systems. All model weights, training configs, and evaluation results are published openly.
Collaborators & acknowledgements
Google WAXAL / African Next Voices
Open Swahili speech dataset
HuggingFace MMS
Massively multilingual pretrained models
Mozilla Common Voice
Community speech data pipeline
Latest Updates
From the lab
SAUTI TTS v1: Training a VITS model on Swahili from 1,400 utterances
We fine-tuned a VITS TTS model on the WAXAL swa_tts dataset — just 1,387 training samples — and achieved a 3× naturalness improvement over the multilingual MMS-TTS baseline. Here is what we learned.
Inside the WAXAL dataset: structure, quirks, and what to watch for
The Google WAXAL Swahili TTS split is small (1,778 utterances), has extreme duration outliers, and uses 48 kHz audio. We document every gotcha we hit building our preprocessing pipeline.
Introducing FiniFlow Labs: building African language AI from the ground up
African languages are spoken by over a billion people yet remain largely absent from mainstream AI research. FiniFlow Labs is our answer — a research lab and API platform dedicated to closing that gap.
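The WAXAL post above mentions extreme duration outliers. One common way to handle them, shown here as a hedged sketch and not as the preprocessing the lab actually ships, is to drop clips whose duration falls outside Tukey fences built from the interquartile range. The function name, the `k` multiplier, and the sample durations are all hypothetical.

```python
from statistics import quantiles


def keep_typical_durations(durations_s, k=1.5):
    """Return indices of clips whose duration lies within Tukey fences.

    durations_s: clip lengths in seconds (at least two values).
    k: fence multiplier; 1.5 is the conventional default, assumed here.
    """
    q1, _, q3 = quantiles(durations_s, n=4)  # quartiles of the durations
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, d in enumerate(durations_s) if low <= d <= high]


# Made-up durations: six ordinary clips and one 48-second outlier.
durations = [3.1, 4.0, 4.2, 5.0, 5.5, 6.1, 48.0]
kept = keep_typical_durations(durations)
print(f"kept {len(kept)} of {len(durations)} clips")
```

On a dataset this small, every dropped clip matters, so it is worth inspecting the rejected utterances by hand before discarding them.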