localmodel.run

audio model · whisper · Windows

Can I run Whisper small on Nvidia GeForce RTX 5090 (32GB)?

Compatibility verdict (memory check)
Yes, it runs fast on this GPU

Yes. Whisper small runs on Nvidia GeForce RTX 5090 (32GB) at fp16 in whisper.cpp, using ~0.85 GB of ~31 GB usable memory.

Peak memory: ~0.85 GB
Usable on device: ~31 GB
Device memory: 32 GB
Quant: fp16 (whisper.cpp)
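
The memory check above comes down to comparing peak usage with usable device memory. A minimal sketch in Python, using the figures from this page; the function names `fits` and `headroom_gb` are hypothetical and chosen only for illustration:

```python
# Figures taken from this page (approximate, in GB).
PEAK_GB = 0.85      # Whisper small peak memory at fp16 (whisper.cpp)
USABLE_GB = 31.0    # usable memory on the RTX 5090 (32 GB total)


def fits(peak_gb: float, usable_gb: float) -> bool:
    """Return True when the model's peak memory fits in usable device memory."""
    return peak_gb <= usable_gb


def headroom_gb(peak_gb: float, usable_gb: float) -> float:
    """Return the memory left over after loading the model, in GB."""
    return usable_gb - peak_gb


print(f"fits: {fits(PEAK_GB, USABLE_GB)}")
print(f"headroom: ~{headroom_gb(PEAK_GB, USABLE_GB):.2f} GB")
print(f"share of usable memory: ~{PEAK_GB / USABLE_GB:.1%}")
```

With these inputs the model takes under 3% of usable memory, which is why the verdict is an easy yes.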

How to run it

Use whisper.cpp or faster-whisper at fp16. The model is light enough to run on CPU; a GPU just makes it faster.
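
As a rough illustration of the faster-whisper route, the sketch below loads the small model at fp16 on the GPU and falls back to CPU when asked. It assumes the `faster-whisper` package is installed; the helper names `pick_settings` and `transcribe` are hypothetical, and the int8 choice for CPU is an assumption of this sketch, not a recommendation from this page.

```python
def pick_settings(use_gpu: bool) -> dict:
    """Choose faster-whisper device settings: fp16 on GPU, int8 on CPU (assumed)."""
    if use_gpu:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def transcribe(audio_path: str, use_gpu: bool = True) -> str:
    """Transcribe one audio file with Whisper small and return the text."""
    # Imported lazily so the settings helper works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel("small", **pick_settings(use_gpu))
    segments, _info = model.transcribe(audio_path)
    # `segments` is a generator; transcription runs as it is consumed.
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe("audio.wav")` (a placeholder file name) would return the transcript as one string; passing `use_gpu=False` runs the same model on CPU, only slower.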

Model · whisper
Type: Speech to text
Parameters: 244M
Peak memory: ~0.85 GB at fp16 (whisper.cpp)
License: MIT
Full Whisper small requirements →
Device · Windows
Memory: 32 GB VRAM
Usable for weights: ~31 GB
Best runtime: vLLM (Linux) / Ollama (CUDA)
Best models for Nvidia GeForce RTX 5090 (32GB) →

FAQ

Can Nvidia GeForce RTX 5090 (32GB) run Whisper small?

Yes. Whisper small runs on Nvidia GeForce RTX 5090 (32GB) at fp16 in whisper.cpp, using ~0.85 GB of ~31 GB usable memory.

How much memory does Whisper small need?

At fp16 in whisper.cpp, the realistic peak is ~0.85 GB of memory. Nvidia GeForce RTX 5090 (32GB), with ~31 GB usable, has room to spare.
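
For a sense of where the ~0.85 GB goes, the weights can be estimated from the parameter count. This sketch assumes every one of the 244M parameters is stored at fp16 (2 bytes each) and that the page's figures use decimal gigabytes; the page does not break the peak down, so reading the remainder as working buffers is an assumption.

```python
PARAMS = 244_000_000        # Whisper small parameter count (from this page)
BYTES_PER_PARAM_FP16 = 2    # fp16 stores each value in 16 bits
PEAK_GB = 0.85              # peak memory reported on this page


def weights_gb(params: int, bytes_per_param: int) -> float:
    """Estimate weight storage in GB (1 GB = 10**9 bytes)."""
    return params * bytes_per_param / 1e9


weights = weights_gb(PARAMS, BYTES_PER_PARAM_FP16)
print(f"weights alone: ~{weights:.2f} GB")
print(f"rest of peak:  ~{PEAK_GB - weights:.2f} GB")
```

Under these assumptions the weights account for a little over half of the reported peak, which is consistent with the peak being a whole-process figure and not a weights-only one.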

What do I use to run Whisper small locally?

Whisper small runs in whisper.cpp or faster-whisper, among others. It also runs on CPU, so no GPU is required.

Sources

VRAM figures are sourced peak-usage anchors at the noted quantization level, validated 2026-06-15. See methodology.