Skip to content
localmodel.run

Text model · Llama 3.2 Vision

Llama 3.2 Vision 11B requirements

Llama 3.2 Vision family · 10.7B params · released Sep 2024 · 4.6M Ollama pulls. Minimum to run at Q4_K_M: Nvidia GeForce RTX 3060 (12GB).

Vision · accepts imagesLicenseLlama 3.2 Community· Conditional↓ 117.1K/mo♥ 1.6Kon HuggingFace
Q4_K_M
7.36 GB
Q8_0
11.49 GB
Total @ Q4 (4k)
~9 GB
Context
128 k

Quantization sizes

GGUF quantson disk
QuantizationSize on disk
Q2_K4.5 GB est
Q3_K_M5.2 GB est
Q4_K_M (default)7.36 GB
Q5_K_M7.6 GB est
Q6_K8.8 GB est
Q8_011.49 GB
FP1621.4 GB est

Lower quant = smaller and faster, slightly lower quality. Q4_K_M is the common default.

Run it

Ollama
$ ollama run llama3.2-vision:11b
llama.cpp
$ llama-cli -hf leafspark/Llama-3.2-11B-Vision-Instruct-GGUF:Q4_K_M
LM Studio
$ lms get leafspark/Llama-3.2-11B-Vision-Instruct-GGUF

Which devices can run Llama 3.2 Vision 11B?

FAQ

How much VRAM or RAM does Llama 3.2 Vision 11B need?

At Q4_K_M, Llama 3.2 Vision 11B needs about 9 GB (weights ~7.36 GB + KV cache + overhead) at a 4k context. At Q8_0 budget ~13.1 GB.

Can Llama 3.2 Vision 11B run on a laptop?

Llama 3.2 Vision 11B is large; you need a 24 GB+ GPU or a 32-48 GB Mac at Q4_K_M.

Can I use Llama 3.2 Vision 11B commercially?

Conditionally. Llama 3.2 Community License: free under 700M MAU.

Meta Llama 3.2 Vision 11B: image understanding plus text. Q4 ~5.55GB plus a ~1.81GB vision projector, folded into the totals (matches the ~7.3GB Ollama llama3.2-vision tag). Vision needs an mmproj-capable runtime.

Sources

Memory figures are estimates. See methodology.