Can I run Whisper small on Apple M3 Ultra (256GB)?
Yes. Whisper small runs on the Apple M3 Ultra (256GB) at fp16 in whisper.cpp, peaking at ~0.85 GB of the ~192 GB usable memory.
- Peak memory: ~0.85 GB
- Usable on device: ~192 GB
- Device memory: 256 GB
- Quant: fp16 (whisper.cpp)
How to run it
Use whisper.cpp or faster-whisper; the memory figures on this page are for fp16 in whisper.cpp. The model is light enough to run on CPU; a GPU only makes it faster.
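As a rough illustration of the faster-whisper route, here is a minimal Python sketch. It assumes the `faster-whisper` package is installed (`pip install faster-whisper`) and that you pass your own audio file path on the command line. The `int8` compute type on CPU is an assumption made for this sketch, not the fp16 whisper.cpp configuration the figures on this page describe, and `format_segment` is a hypothetical helper added only for readability.

```python
# Sketch: transcribe an audio file with faster-whisper on CPU.
# Assumes `pip install faster-whisper`; the audio path comes from the command line.
import sys


def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcript segment as '[start -> end] text' (hypothetical helper)."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe(path: str, model_size: str = "small") -> list[str]:
    # Imported lazily so the helper above works even without the package installed.
    from faster_whisper import WhisperModel

    # Assumption: int8 on CPU. The fp16 figure on this page refers to whisper.cpp.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(path, beam_size=5)
    print(f"Detected language: {info.language}")
    # `segments` is a generator; the audio is decoded as it is consumed.
    return [format_segment(s.start, s.end, s.text) for s in segments]


if __name__ == "__main__" and len(sys.argv) > 1:
    for line in transcribe(sys.argv[1]):
        print(line)
```

Run it as `python transcribe.py your_audio.wav`. The model weights are downloaded the first time the model is loaded, so the first call needs network access.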
- Type: Speech to text
- Parameters: 244M
- Peak memory: ~0.85 GB at fp16 (whisper.cpp)
- License: MIT
- Memory: 256 GB unified
- Usable for weights: ~192 GB
- Best runtime: MLX direct / Ollama (MLX backend)
FAQ
Can Apple M3 Ultra (256GB) run Whisper small?
Yes. Whisper small runs on the Apple M3 Ultra (256GB) at fp16 in whisper.cpp, peaking at ~0.85 GB of the ~192 GB usable memory.
How much memory does Whisper small need?
At fp16 in whisper.cpp, the realistic peak is ~0.85 GB of memory. The Apple M3 Ultra (256GB), with ~192 GB usable, has room to spare.
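The figures above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below uses only numbers from this page (244M parameters, ~0.85 GB peak, ~192 GB usable, 256 GB total) and assumes 2 bytes per parameter at fp16 and decimal gigabytes. Attributing the remainder to runtime buffers and activations is an assumption, not a measured breakdown.

```python
# Rough memory arithmetic for Whisper small at fp16, using figures from this page.
PARAMS = 244e6              # parameter count (244M)
BYTES_PER_PARAM_FP16 = 2    # assumption: 2 bytes per fp16 weight
PEAK_GB = 0.85              # reported peak memory
USABLE_GB = 192             # reported usable memory on the device
DEVICE_GB = 256             # total unified memory

# Weights alone, in decimal GB.
weights_gb = PARAMS * BYTES_PER_PARAM_FP16 / 1e9

# Whatever the peak adds on top of the weights (assumed: buffers, activations).
overhead_gb = PEAK_GB - weights_gb

# Share of usable memory the model occupies at peak.
peak_share = PEAK_GB / USABLE_GB

# Fraction of total device memory that counts as usable.
usable_fraction = USABLE_GB / DEVICE_GB

print(f"weights:  {weights_gb:.3f} GB")
print(f"overhead: {overhead_gb:.3f} GB")
print(f"peak share of usable memory: {peak_share:.2%}")
print(f"usable fraction of device memory: {usable_fraction:.0%}")
```

Under these assumptions the weights account for a bit more than half of the reported peak, and the peak itself is well under 1% of usable memory.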
What do I use to run Whisper small locally?
Whisper small runs in whisper.cpp or faster-whisper, among other runtimes. It runs on CPU, so no GPU is required.
Sources
VRAM figures are peak-usage reference points taken from sources at the noted quantization, validated 2026-06-15. See methodology.