Audio model · kokoro
Kokoro-82M requirements
Text to speech · 82M params · fp32 / ONNX int8 · released Jan 2025. Light enough to run on CPU, no GPU required.
Run it
Kokoro-82M runs at fp32 in kokoro (the Python package), ONNX Runtime, Kokoro-FastAPI, or CoreML. It needs no GPU: CPU-only inference works, and at 82M parameters the model is light enough for real-time use on a laptop or phone.
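"Real-time" can be checked with a real-time factor: synthesis time divided by the duration of the audio produced, where a value below 1.0 means faster than real time. The sketch below is one way to measure it and uses only the standard library. The `synthesize` callable is a hypothetical stand-in for whichever runtime you use, `fake_synthesize` exists only to make the example runnable, and the 24 kHz sample rate is an assumption you should replace with your runtime's actual output rate.

```python
import time
from typing import Callable, List, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate: int = 24_000,  # assumption: set this to your runtime's output rate
) -> float:
    """Return synthesis time / audio duration (below 1.0 is faster than real time)."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start

    audio_seconds = len(samples) / sample_rate
    if audio_seconds == 0:
        raise ValueError("synthesize() returned no audio")
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> List[float]:
    # Hypothetical stand-in for a real TTS call: 1,200 silent samples per character.
    return [0.0] * (len(text) * 1_200)


rtf = real_time_factor(fake_synthesize, "Hello from a laptop CPU.")
print(f"real-time factor: {rtf:.4f}")
```

To measure a real runtime, pass a function that takes text and returns the generated samples in place of `fake_synthesize`, and average over several sentences, since the first call usually includes model loading.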
Which devices can run Kokoro-82M?
Apple Silicon Macs
- Apple M1 (8GB) Yes
- Apple M2 (16GB) Yes
- Apple M4 (16GB) Yes
- Apple M5 (16GB) Yes
- Apple M3 Pro (18GB) Yes
- Apple M4 (24GB) Yes
- Apple M4 Pro (24GB) Yes
- Apple M5 (32GB) Yes
- Apple M4 Pro (48GB) Yes
- Apple M5 Pro (48GB) Yes
- Apple M4 Max (64GB) Yes
- Apple M4 Max (128GB) Yes
- Apple M5 Max (128GB) Yes
- Apple M3 Ultra (256GB) Yes
RAM-only laptops
iPhone & iPad
Android
NVIDIA GPUs
AMD GPUs
FAQ
How much memory does Kokoro-82M need?
At fp32, Kokoro-82M needs about 1 GB of memory for CPU inference; the weights themselves are ~326 MB. It runs on CPU, so a GPU is optional.
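As a rough cross-check on the memory figures on this page, weight size follows from parameter count times bytes per parameter. The sketch below assumes exactly 82,000,000 parameters (the "82M" in the name is rounded) and counts weights only; runtime buffers, activations, and the framework itself account for the rest of the peak. The helper name `weight_size_mb` is illustrative, not part of any library.

```python
# Bytes needed to store one parameter at each precision listed for this model.
BYTES_PER_PARAM = {"fp32": 4, "int8": 1}


def weight_size_mb(num_params: int, precision: str) -> float:
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e6


PARAMS = 82_000_000  # assumption: rounded parameter count taken from the model name

for precision in ("fp32", "int8"):
    print(f"{precision}: ~{weight_size_mb(PARAMS, precision):.0f} MB")
# fp32: ~328 MB
# int8: ~82 MB
```

The fp32 estimate lands close to the reported ~326 MB. The int8 number is a lower bound, since quantized files can keep some tensors at higher precision.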
Can Kokoro-82M run on a phone or CPU?
Yes to both. It runs CPU-only with no GPU, it is light enough for real-time use, and on-device runtimes exist for iPhone/iPad (CoreML) and Android (ONNX int8).
Can I use Kokoro-82M commercially?
Yes. Kokoro-82M is licensed Apache-2.0, which permits commercial use.
Tiny 82M-parameter, StyleTTS2-based voice model with surprisingly good quality. Weights are ~326 MB at fp32, and CPU inference stays under 1 GB; the CoreML process peaks at ~1.5 GB on Apple Silicon, and a CUDA GPU run uses ~2-3 GB. On-device use is confirmed on iPhone/iPad (CoreML, Apple Neural Engine) and Android (ONNX int8). Licensed Apache-2.0, so commercial use is allowed. Sources: Hugging Face model card, FluidInference CoreML port, Kokoro-FastAPI, Android demo.
Sources
The memory figure is a sourced peak-usage estimate at fp32, composed from reported sizes rather than taken from a single measurement, and validated 2026-06-15. See methodology.