localmodel.run

Audio model · whisper

Whisper large-v3 requirements

Speech to text · 1.55B params · int8 (faster-whisper) / fp16 (whisper.cpp) · released Nov 2023. Light enough to run on CPU, no GPU required.

MIT · Commercial use OK

The license is MIT per the OpenAI Whisper repo. The HuggingFace card labels large-v3 as Apache-2.0, but the repo is treated as the source of truth.

Peak memory (int8): ~2.5 GB
Runs on CPU: Yes
Parameters: 1.55B
Type: Speech to text

Run it

Runtime tools · int8

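Whisper large-v3 runs in whisper.cpp, faster-whisper, MacWhisper, or WhisperX. The int8 figures on this page refer to faster-whisper; whisper.cpp uses the fp16 ggml model. The model runs CPU-only, and the smaller Whisper models are fast enough for real-time use on a laptop or phone.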
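As a rough illustration of the faster-whisper route, the sketch below loads large-v3 on CPU with int8 weights and joins the transcribed segments into one string. It assumes the faster-whisper package is installed and that the model can be downloaded on first use; the `transcribe` helper and the `SETTINGS` dictionary are names made up for this example, and the beam size of 5 simply mirrors the measurement quoted in the Notes.

```python
# Minimal sketch: Whisper large-v3 on CPU at int8 via faster-whisper.
# MODEL_NAME, SETTINGS and transcribe() are illustrative names, not part of the library.
MODEL_NAME = "large-v3"
SETTINGS = {"device": "cpu", "compute_type": "int8"}


def transcribe(audio_path: str, beam_size: int = 5) -> str:
    """Return the transcript of the audio file at audio_path as one string."""
    # Imported lazily so this module still loads where faster-whisper is not installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(MODEL_NAME, **SETTINGS)
    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding only happens as the generator is consumed.
    segments, _info = model.transcribe(audio_path, beam_size=beam_size)
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe("meeting.wav")` (a hypothetical file name) would then return the text. Loading the model inside the function keeps the sketch short; a long-running service would more likely load it once and reuse it.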

whisper.cpp · faster-whisper · MacWhisper · WhisperX

FAQ

How much memory does Whisper large-v3 need?

At int8 (faster-whisper), peak memory is about 2.5 GB. The model runs on CPU, so a GPU is optional.

Can Whisper large-v3 run on a phone or CPU?

Yes to both. It runs CPU-only, and whisper.cpp provides an on-device runtime for phones. The smaller Whisper models are light enough for real-time use.

Can I use Whisper large-v3 commercially?

Yes. Whisper large-v3 is licensed MIT, which permits commercial use.

Notes

Most accurate Whisper model (1.55B parameters). With faster-whisper at int8 it peaks at ~2.5 GB VRAM (2,953 MB measured, beam=5); with whisper.cpp the ggml model file is 2.9 GB and runtime RAM is ~3.9 GB. It runs CPU-only, on Apple Silicon (Metal), and on phones (whisper.cpp). The ~10 GB figure in the openai/whisper README refers to the original fp32 PyTorch path, not to whisper.cpp or faster-whisper. The license is MIT per the OpenAI repo. Sources: OpenAI Whisper repo, faster-whisper benchmark issue, whisper.cpp README.
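The gap between the int8, fp16, and fp32 figures follows largely from bytes per weight. The back-of-the-envelope sketch below multiplies the 1.55B parameter count by the storage size of each precision. It covers weights only, so activations, decoding buffers, and runtime overhead come on top, and real quantized files may keep some tensors at higher precision. The function name is made up for this example.

```python
# Weights-only size estimate for a 1.55B-parameter model at different precisions.
PARAMS = 1.55e9  # parameter count quoted on this page

BYTES_PER_WEIGHT = {"int8": 1, "fp16": 2, "fp32": 4}


def weights_gib(precision: str) -> float:
    """Approximate size of the weights alone, in GiB (2**30 bytes)."""
    return PARAMS * BYTES_PER_WEIGHT[precision] / 2**30


for precision in BYTES_PER_WEIGHT:
    print(f"{precision}: {weights_gib(precision):.2f} GiB")
```

Under these assumptions the fp16 estimate comes out just under 2.9 GiB, close to the reported ggml file size, while fp32 weights alone are roughly twice that, before any PyTorch runtime overhead.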

Sources

The memory figure is a sourced peak-usage estimate at int8, composed from reported sizes rather than a single measurement, and validated 2026-06-15. See methodology.