Whisper large-v3-turbo requirements
Speech to text · 809M params · int8 (faster-whisper) / fp16 (whisper.cpp) · released Oct 2024. Light enough to run on CPU, no GPU required.
Run it
Whisper large-v3-turbo runs in whisper.cpp, faster-whisper, MacWhisper, or WhisperKit, and the int8 build needs no GPU. The smaller Whisper tiers are fast enough for real-time use on a laptop or phone.
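As a rough illustration of the CPU-only int8 path, here is a minimal sketch using faster-whisper. It assumes `pip install faster-whisper` and a release that recognizes the `large-v3-turbo` model name; the helper names `format_segment` and `transcribe_file` are hypothetical, chosen for this example only.

```python
# Sketch: CPU-only int8 transcription with faster-whisper.
# Assumption: the installed faster-whisper release resolves "large-v3-turbo".

MODEL_NAME = "large-v3-turbo"
DEVICE = "cpu"          # no GPU required
COMPUTE_TYPE = "int8"   # the quantization this page measures


def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcribed segment as a timestamped line."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_file(audio_path: str) -> list[str]:
    """Transcribe one audio file and return timestamped lines."""
    # Imported lazily so format_segment works without the library installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(MODEL_NAME, device=DEVICE, compute_type=COMPUTE_TYPE)
    segments, _info = model.transcribe(audio_path)
    # `segments` is a generator; decoding happens as it is consumed.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe_file` with a path to a local audio file should download the model on first use and return one line per segment. The first run is slower because of that download.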
Which devices can run Whisper large-v3-turbo?
Apple Silicon Macs
- Apple M1 (8GB) Yes
- Apple M2 (16GB) Yes
- Apple M4 (16GB) Yes
- Apple M5 (16GB) Yes
- Apple M3 Pro (18GB) Yes
- Apple M4 (24GB) Yes
- Apple M4 Pro (24GB) Yes
- Apple M5 (32GB) Yes
- Apple M4 Pro (48GB) Yes
- Apple M5 Pro (48GB) Yes
- Apple M4 Max (64GB) Yes
- Apple M4 Max (128GB) Yes
- Apple M5 Max (128GB) Yes
- Apple M3 Ultra (256GB) Yes
RAM-only laptops
iPhone & iPad
Android
NVIDIA GPUs
AMD GPUs
FAQ
How much memory does Whisper large-v3-turbo need?
At int8 it peaks at about 1.5 GB (1,545 MB measured with faster-whisper). It runs on CPU, so a GPU is optional.
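To see how the parameter count relates to the memory figure, the sketch below does the back-of-the-envelope arithmetic. It uses the 809M parameter count and the 1,545 MB measured peak reported on this page. It assumes decimal megabytes and that every weight is stored at the stated precision, which real runtimes may not do, so treat the result as a rough estimate only.

```python
# Rough weight-size estimate from the parameter count (not a measurement).
PARAMS = 809_000_000                        # from this page
BYTES_PER_WEIGHT = {"int8": 1, "fp16": 2}
MEASURED_PEAK_MB = 1545                     # faster-whisper int8 peak, from this page


def weight_megabytes(params: int, precision: str) -> float:
    """Approximate weight storage in decimal megabytes."""
    return params * BYTES_PER_WEIGHT[precision] / 1_000_000


int8_weights_mb = weight_megabytes(PARAMS, "int8")   # about 809 MB
fp16_weights_mb = weight_megabytes(PARAMS, "fp16")   # about 1,618 MB
# Whatever the peak holds beyond the weights: activations, buffers, runtime.
overhead_mb = MEASURED_PEAK_MB - int8_weights_mb     # about 736 MB
```

Under these assumptions the int8 weights account for roughly half of the measured peak. The remainder is working memory and runtime overhead.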
Can Whisper large-v3-turbo run on a phone or CPU?
Yes. It runs on CPU without a GPU, and on-device phone runtimes such as CoreML are available. The smaller Whisper tiers are light enough for real-time use.
Can I use Whisper large-v3-turbo commercially?
Yes. Whisper large-v3-turbo is released under the MIT license, which permits commercial use.
Whisper large-v3-turbo is a pruned large-v3: the decoder is cut from 32 layers to 4, leaving ~809M params, and it runs roughly 5x faster at near-large-v3 accuracy. Via faster-whisper int8 it peaks at ~1.5 GB VRAM (1,545 MB measured). It is light enough for phones (CoreML) and CPU. License: MIT. Sources: HF model card, faster-whisper benchmark, whisper.cpp README.
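The link between the layer cut and the ~809M figure can be checked with a quick estimate. The sketch below relies on two values that are not from this page and are assumed from OpenAI's published Whisper model table: a model width of 1280 and roughly 1,550M parameters for large-v3. It counts only the large weight matrices in each decoder layer (self-attention, cross-attention, MLP) and ignores biases and layer norms.

```python
# Estimate parameters removed by pruning the decoder from 32 layers to 4.
D_MODEL = 1280                     # assumption: Whisper large width
LARGE_V3_PARAMS = 1_550_000_000    # assumption: approximate large-v3 size
LAYERS_REMOVED = 32 - 4            # from this page

# Per decoder layer: self-attention (4 d^2) + cross-attention (4 d^2)
# + MLP with 4x expansion (8 d^2) = 16 d^2 weights.
params_per_decoder_layer = 16 * D_MODEL ** 2
params_removed = LAYERS_REMOVED * params_per_decoder_layer

turbo_estimate = LARGE_V3_PARAMS - params_removed
print(f"removed about {params_removed / 1e6:.0f}M, "
      f"leaving about {turbo_estimate / 1e6:.0f}M")
```

Under these assumptions the estimate lands within about 1% of the stated ~809M, which is consistent with the decoder cut accounting for nearly all of the size reduction.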
Sources
The memory figure is a sourced peak-usage measurement at int8, validated 2026-06-15. See methodology.