
Llama 3.2 Vision 11B requirements

Llama 3.2 Vision family · 10.7B params · released Sep 2024 · 4.6M Ollama pulls. Minimum to run at Q4_K_M: Nvidia GeForce RTX 3060 (12GB).

Vision · accepts images · License: Llama 3.2 Community (Conditional) · ↓ 117.1K/mo · ♥ 1.6K on HuggingFace

Q4_K_M: 7.36 GB
Q8_0: 11.49 GB
Total @ Q4 (4k): ~9 GB
Context: 128k

Quantization sizes

GGUF quants on disk

Quantization	Size on disk
Q2_K	4.5 GB est
Q3_K_M	5.2 GB est
Q4_K_M (default)	7.36 GB
Q5_K_M	7.6 GB est
Q6_K	8.8 GB est
Q8_0	11.49 GB
FP16	21.4 GB est

Lower-bit quantizations are smaller and faster, at slightly lower quality. Q4_K_M is the common default.
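As a rough illustration of how the table above might be used, the sketch below picks the largest quantization that fits a given memory budget. The sizes are copied from the table; the 1.6 GB headroom for KV cache and runtime overhead is an assumption, taken from the gap between the ~9 GB total and the 7.36 GB of Q4_K_M weights at a 4k context, and will grow with longer contexts. The function name is hypothetical.

```python
# Sizes on disk in GB, copied from the table above ("est" rows are estimates).
QUANT_SIZES_GB = {
    "Q2_K": 4.5,
    "Q3_K_M": 5.2,
    "Q4_K_M": 7.36,
    "Q5_K_M": 7.6,
    "Q6_K": 8.8,
    "Q8_0": 11.49,
    "FP16": 21.4,
}

# Assumption: headroom for KV cache and runtime overhead at a 4k context.
HEADROOM_GB = 1.6


def largest_quant_that_fits(memory_gb, headroom_gb=HEADROOM_GB):
    """Return the largest quantization whose weights plus headroom fit, or None."""
    fitting = [
        (size, name)
        for name, size in QUANT_SIZES_GB.items()
        if size + headroom_gb <= memory_gb
    ]
    if not fitting:
        return None
    # max() compares by size first, so this is the largest file that still fits.
    return max(fitting)[1]


if __name__ == "__main__":
    for memory in (8, 12, 24):
        print(memory, "GB ->", largest_quant_that_fits(memory))
```

Treat the result as a starting point only: the figures are estimates, and a larger context window needs more headroom than assumed here.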

Run it

Ollama

$ ollama run llama3.2-vision:11b
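Once the Ollama command above has pulled the model, an image can also be sent programmatically. The following is a minimal sketch, assuming Ollama is serving on its default local port (11434) and using its `/api/generate` endpoint, which accepts base64-encoded images in an `images` list. The helper names `build_request` and `describe_image` are hypothetical, as is the file name in the usage line.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(image_bytes, prompt, model="llama3.2-vision:11b"):
    """Build the JSON payload; images are sent as base64-encoded strings."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe_image(path, prompt="Describe this image."):
    """Send one image to the local Ollama server and return the text reply."""
    with open(path, "rb") as f:
        payload = build_request(f.read(), prompt)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, `describe_image("photo.jpg")` would return the model's description as a string.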

llama.cpp

$ llama-cli -hf leafspark/Llama-3.2-11B-Vision-Instruct-GGUF:Q4_K_M

LM Studio

$ lms get leafspark/Llama-3.2-11B-Vision-Instruct-GGUF

Which devices can run Llama 3.2 Vision 11B?

Apple Silicon Macs

RAM-only laptops

iPhone & iPad

Android

NVIDIA GPUs

AMD GPUs

AMD Radeon RX 7900 XTX (24GB): Yes

FAQ

How much VRAM or RAM does Llama 3.2 Vision 11B need?

At Q4_K_M, Llama 3.2 Vision 11B needs about 9 GB at a 4k context (weights ~7.36 GB plus KV cache and overhead). At Q8_0, budget about 13.1 GB.

Can Llama 3.2 Vision 11B run on a laptop?

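Only on a well-equipped one. At Q4_K_M the model needs about 9 GB at a 4k context, so plan on a 12 GB-class GPU (the stated minimum is an Nvidia GeForce RTX 3060 12GB) or a Mac with enough unified memory to spare that amount.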

Can I use Llama 3.2 Vision 11B commercially?

Conditionally. The Llama 3.2 Community License allows free commercial use for products and services with up to 700 million monthly active users.

Meta Llama 3.2 Vision 11B handles image understanding plus text. The Q4 weights are ~5.55 GB plus a ~1.81 GB vision projector; both are folded into the totals above, which matches the ~7.3 GB Ollama llama3.2-vision tag. Vision requires an mmproj-capable runtime.
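The figures quoted on this page can be cross-checked with a few lines of arithmetic. The sketch below only adds and subtracts the page's own numbers; reading the remainder as "KV cache plus overhead" follows the FAQ's description, and all values are estimates.

```python
# Figures quoted on this page, in GB.
TEXT_WEIGHTS_GB = 5.55      # Q4 language-model weights
VISION_PROJECTOR_GB = 1.81  # vision projector (mmproj)
TOTAL_AT_4K_GB = 9.0        # approximate total at Q4_K_M with a 4k context

# Weights on disk are the text model plus the vision projector.
weights_gb = round(TEXT_WEIGHTS_GB + VISION_PROJECTOR_GB, 2)

# Whatever remains of the quoted total is KV cache plus runtime overhead.
kv_and_overhead_gb = round(TOTAL_AT_4K_GB - weights_gb, 2)

print(f"weights on disk: {weights_gb} GB")
print(f"KV cache + overhead at 4k: ~{kv_and_overhead_gb} GB")
```

The sum reproduces the 7.36 GB listed for Q4_K_M, which is why the Q4_K_M row is larger than the text weights alone would suggest.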

Sources

Memory figures are estimates. See methodology.