Text model · Llama 3.2 Vision
Llama 3.2 Vision 11B requirements
Llama 3.2 Vision family · 10.7B params · released Sep 2024 · 4.6M Ollama pulls. Minimum to run at Q4_K_M: Nvidia GeForce RTX 3060 (12GB).
Quantization sizes
| Quantization | Size on disk |
|---|---|
| Q2_K | 4.5 GB est |
| Q3_K_M | 5.2 GB est |
| Q4_K_M (default) | 7.36 GB |
| Q5_K_M | 7.6 GB est |
| Q6_K | 8.8 GB est |
| Q8_0 | 11.49 GB |
| FP16 | 21.4 GB est |
Lower quant = smaller and faster, slightly lower quality. Q4_K_M is the common default.
Run it
ollama run llama3.2-vision:11b llama-cli -hf leafspark/Llama-3.2-11B-Vision-Instruct-GGUF:Q4_K_M lms get leafspark/Llama-3.2-11B-Vision-Instruct-GGUF Which devices can run Llama 3.2 Vision 11B?
Apple Silicon Macs
RAM-only laptops
iPhone & iPad
Android
NVIDIA GPUs
AMD GPUs
FAQ
How much VRAM or RAM does Llama 3.2 Vision 11B need?
At Q4_K_M, Llama 3.2 Vision 11B needs about 9 GB (weights ~7.36 GB + KV cache + overhead) at a 4k context. At Q8_0 budget ~13.1 GB.
Can Llama 3.2 Vision 11B run on a laptop?
Llama 3.2 Vision 11B is large; you need a 24 GB+ GPU or a 32-48 GB Mac at Q4_K_M.
Can I use Llama 3.2 Vision 11B commercially?
Conditionally. Llama 3.2 Community License: free under 700M MAU.
Meta Llama 3.2 Vision 11B: image understanding plus text. Q4 ~5.55GB plus a ~1.81GB vision projector, folded into the totals (matches the ~7.3GB Ollama llama3.2-vision tag). Vision needs an mmproj-capable runtime.
Sources
Memory figures are estimates. See methodology.