# Can I run Llama 3.2 3B on Nvidia GeForce RTX 5090 (32GB)?

Updated: 2026-06-15

**Yes, it runs.** At Q4_K_M it needs ~3.2 GB of the ~31 GB usable, which leaves room to run FP16 for higher quality.

- Model: 3B parameters; Q4_K_M weights are 2.02 GB
- Device: 32 GB VRAM, ~31 GB usable for weights
- Needs ~3.2 GB at Q4_K_M; recommended quant: Q4_K_M
- Best tool on Windows: LM Studio
- Command (Ollama): `ollama run llama3.2:3b`

These figures are estimates. Method: weights + KV cache + ~0.8 GB overhead. Sources: https://ollama.com/library/llama3.2, https://ollama.com/library/llama3.2/tags, https://huggingface.co/unsloth/Llama-3.2-3B-Instruct-GGUF, https://lmarena.ai/leaderboard.
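
The estimate method above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual calculator: the function names are hypothetical, and the KV cache figure of ~0.38 GB is an assumption backed out of the page's own numbers (3.2 GB total minus 2.02 GB weights minus 0.8 GB overhead). The real KV cache size depends on context length and cache precision.

```python
# Rough VRAM estimate: weights + KV cache + fixed overhead.
# Function names are hypothetical; figures come from this page unless noted.

def estimate_vram_gb(weights_gb: float, kv_cache_gb: float, overhead_gb: float = 0.8) -> float:
    """Return the estimated memory needed to run the model, in GB."""
    return weights_gb + kv_cache_gb + overhead_gb


def fits(required_gb: float, usable_gb: float) -> bool:
    """True if the estimate fits in the usable VRAM."""
    return required_gb <= usable_gb


WEIGHTS_Q4_K_M_GB = 2.02  # from this page
USABLE_VRAM_GB = 31.0     # from this page
KV_CACHE_GB = 0.38        # assumption: implied by 3.2 - 2.02 - 0.8

required = estimate_vram_gb(WEIGHTS_Q4_K_M_GB, KV_CACHE_GB)
print(f"Estimated need: ~{required:.1f} GB")
print(f"Fits in {USABLE_VRAM_GB:.0f} GB usable: {fits(required, USABLE_VRAM_GB)}")
```

Swapping in a larger weights size (for example, an FP16 file) and a larger KV cache figure shows how much headroom remains on the same card.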

More: https://localmodel.run/can-i-run/llama-3.2-3b/nvidia-rtx-5090-32gb