
Kokoro-82M requirements

Text to speech · 82M params · fp32 / ONNX int8 · released Jan 2025. Light enough to run on CPU, no GPU required.

Apache-2.0 · Commercial use OK

Peak memory (fp32): ~1 GB

Runs on CPU: Yes

Parameters: 82M

Type: Text to speech

Run it

Runtime tools (fp32)

Kokoro-82M runs at fp32 in kokoro (Python), ONNX Runtime, Kokoro-FastAPI, or CoreML. It needs no GPU, and the smaller ONNX int8 build is fast enough for real-time use on a laptop or phone.
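As an illustration of the kokoro (Python) route, here is a minimal sketch that follows the usage pattern shown on the model's Hugging Face card. It assumes the `kokoro` and `soundfile` packages are installed; the voice name `af_heart`, the language code `a`, and the 24 kHz sample rate are taken from that card's example rather than from this page, and `synthesize`, `segment_path`, and the output prefix are hypothetical names chosen for this example.

```python
SAMPLE_RATE = 24000  # assumed: the rate used in the model card's example


def segment_path(prefix, index):
    """Build the output filename for one generated audio segment."""
    return f"{prefix}_{index}.wav"


def synthesize(text, voice="af_heart", lang_code="a", out_prefix="kokoro_out"):
    """Synthesize `text` and write one WAV file per generated segment.

    Returns the list of file paths written.
    """
    # Imported lazily so this module loads even without the optional packages.
    from kokoro import KPipeline
    import soundfile as sf

    pipeline = KPipeline(lang_code=lang_code)
    paths = []
    # The pipeline yields (graphemes, phonemes, audio) for each text segment.
    for i, (_graphemes, _phonemes, audio) in enumerate(pipeline(text, voice=voice)):
        path = segment_path(out_prefix, i)
        sf.write(path, audio, SAMPLE_RATE)
        paths.append(path)
    return paths
```

Calling `synthesize("Hello from Kokoro.")` should write `kokoro_out_0.wav` to the working directory; longer inputs may be split into several segments and therefore several files.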

kokoro (Python) · ONNX Runtime · Kokoro-FastAPI · CoreML

Which devices can run Kokoro-82M?

Apple Silicon Macs

RAM-only laptops

iPhone & iPad

Android

NVIDIA GPUs

AMD GPUs

AMD Radeon RX 7900 XTX (24GB): Yes

FAQ

How much memory does Kokoro-82M need?

At fp32, peak memory is about 1 GB. Inference runs on CPU, so a GPU is optional.
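The weights account for only part of that peak. As a rough sketch, weight size can be estimated as parameter count times bytes per parameter; the rest of the ~1 GB is runtime overhead such as activations and the framework itself. The rounded count of 82M parameters is an assumption here, so the fp32 result lands slightly off the ~326MB reported in the notes.

```python
# Bytes per parameter for the two precisions this page lists.
BYTES_PER_PARAM = {"fp32": 4, "int8": 1}

PARAMS = 82_000_000  # rounded parameter count; the exact count may differ slightly


def weight_size_mb(num_params, precision):
    """Estimate raw weight size in decimal megabytes (1 MB = 1e6 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e6


print(weight_size_mb(PARAMS, "fp32"))  # 328.0
print(weight_size_mb(PARAMS, "int8"))  # 82.0
```

The estimate ignores file-format overhead and any layers a quantized export keeps at higher precision, so treat it as a lower bound on file size rather than a measurement.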

Can Kokoro-82M run on a phone or CPU?

Yes to both. It runs on CPU without a GPU, and on-device runtimes are available for phones: CoreML on iPhone and iPad, and ONNX int8 on Android.

Can I use Kokoro-82M commercially?

Yes. Kokoro-82M is licensed Apache-2.0, which permits commercial use.

Notes

A small 82M-parameter voice model based on StyleTTS2, with quality that is strong for its size. Weights are ~326MB at fp32, and CPU inference stays under 1GB. The CoreML process peaks at ~1.5GB on Apple Silicon; on a CUDA GPU, usage is ~2-3GB. On-device use is confirmed on iPhone/iPad (CoreML, Apple Neural Engine) and Android (ONNX int8). Apache-2.0, commercial use OK. Sources: HF card, FluidInference CoreML port, Kokoro-FastAPI, Android demo.
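The per-runtime peaks in these notes can be turned into a quick fit check. The sketch below takes the upper end of each reported range; the 0.5 GB headroom for the OS and other processes is an assumption of this example, not a figure from the sources, and `fits` is a hypothetical helper.

```python
# Peak memory per runtime in GB, using the upper end of each figure in the notes.
PEAK_GB = {
    "cpu": 1.0,     # "under 1GB for CPU inference", treated as an upper bound
    "coreml": 1.5,  # CoreML process peak on Apple Silicon
    "cuda": 3.0,    # upper end of "~2-3GB" on a CUDA GPU
}


def fits(runtime, available_gb, headroom_gb=0.5):
    """Return True if `available_gb` covers the runtime's peak plus headroom.

    `headroom_gb` is an assumed safety margin, not a measured value.
    """
    return available_gb >= PEAK_GB[runtime] + headroom_gb


print(fits("cuda", 24))     # True: a 24GB card has ample room
print(fits("coreml", 1.5))  # False: no margin left after the 1.5GB peak
```

Passing `headroom_gb=0` gives the bare comparison against the reported peak when the margin is accounted for elsewhere.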

Sources

The memory figure is a peak-usage estimate at fp32, composed from reported sizes rather than taken from a single measurement, and validated 2026-06-15. See methodology.