Llama 3.3 70B requirements
llama family · 70B params · released Dec 2024 · 3.4M Ollama pulls · LMArena Elo 1318. Minimum to run at Q4_K_M: Apple M4 Max (64GB).
Quantization sizes
| Quantization | Size on disk |
|---|---|
| Q2_K | 29.3 GB est |
| Q3_K_M | 34.2 GB est |
| Q4_K_M (default) | 42.52 GB |
| Q5_K_M | 49.9 GB est |
| Q6_K | 57.4 GB est |
| Q8_0 | 74.98 GB |
| FP16 | 140 GB est |
Lower-bit quantizations are smaller and faster but lose some quality, and the loss grows as the bit width drops. Q4_K_M is the common default.
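The sizes in the table follow roughly from parameter count times bits per weight. As a rough sanity check, the sketch below back-computes the effective bits per weight from the two measured sizes. It assumes exactly 70e9 parameters (the real count is slightly higher) and 1 GB = 1e9 bytes; the helper names are illustrative, not part of any tool.

```python
PARAMS = 70e9  # assumption: the "70B params" figure taken as exactly 70e9


def bits_per_weight(file_gb: float, params: float = PARAMS) -> float:
    """Effective bits per weight for a file of file_gb (1 GB = 1e9 bytes)."""
    return file_gb * 1e9 * 8 / params


def size_gb(bpw: float, params: float = PARAMS) -> float:
    """Approximate file size in GB for a given bits-per-weight."""
    return params * bpw / 8 / 1e9


if __name__ == "__main__":
    print(round(bits_per_weight(42.52), 2))  # Q4_K_M, about 4.86
    print(round(bits_per_weight(74.98), 2))  # Q8_0, about 8.57
    print(size_gb(16))                       # FP16, 140.0
```

Under these assumptions Q4_K_M works out to a little under 5 bits per weight rather than exactly 4, because K-quants store scales and keep some tensors at higher precision.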
Run it
ollama run llama3.3:70b
llama-cli -hf bartowski/Llama-3.3-70B-Instruct-GGUF:Q4_K_M
lms get bartowski/Llama-3.3-70B-Instruct-GGUF
Which devices can run Llama 3.3 70B?
Apple Silicon Macs
RAM-only laptops
iPhone & iPad
Android
NVIDIA GPUs
AMD GPUs
FAQ
How much VRAM or RAM does Llama 3.3 70B need?
At Q4_K_M, Llama 3.3 70B needs about 45.3 GB at a 4k context (weights ~42.52 GB + KV cache + overhead). At Q8_0, budget about 77.8 GB.
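The weights + KV cache + overhead breakdown can be sketched as follows. This is not the page's methodology: the architecture values (80 layers, 8 KV heads, head dimension 128) come from the published Llama 3 70B configuration and are not stated on this page, the KV cache is assumed to be FP16, and `overhead_gb` is a hypothetical allowance chosen only for illustration.

```python
# Assumed Llama 3 70B architecture values (grouped-query attention).
N_LAYERS = 80
N_KV_HEADS = 8
HEAD_DIM = 128


def kv_cache_gb(context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache size in GB (1e9 bytes): keys and values for every layer."""
    per_token_bytes = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value
    return per_token_bytes * context_tokens / 1e9


def total_memory_gb(weights_gb: float,
                    context_tokens: int = 4096,
                    overhead_gb: float = 1.4) -> float:
    """Weights + KV cache + a flat overhead (overhead_gb is hypothetical)."""
    return weights_gb + kv_cache_gb(context_tokens) + overhead_gb


if __name__ == "__main__":
    print(round(kv_cache_gb(4096), 2))        # KV cache at a 4k context
    print(round(total_memory_gb(42.52), 1))   # Q4_K_M estimate
```

With these assumptions the 4k-context KV cache is about 1.34 GB and the Q4_K_M total lands near the 45.3 GB quoted above. The KV cache term grows linearly with context length, so longer contexts need proportionally more headroom.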
Can Llama 3.3 70B run on a laptop?
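Only on a high-memory one. At Q4_K_M the model needs about 45.3 GB, so in practice you need a high-memory Mac (the minimum listed is an Apple M4 Max with 64GB) or a multi-GPU setup.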
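A simple way to reason about whether a given machine fits is to compare its memory against each quantization's size plus some headroom. The sketch below uses the sizes from the quantization table (several of which are estimates) and a headroom of 2.8 GB, inferred from the 45.3 GB versus 42.52 GB figures for Q4_K_M. The function name and the `usable_fraction` parameter are hypothetical; the latter stands in for memory the OS or other applications keep for themselves.

```python
# Sizes in GB from the quantization table (several are estimates).
QUANT_SIZES_GB = {
    "Q2_K": 29.3,
    "Q3_K_M": 34.2,
    "Q4_K_M": 42.52,
    "Q5_K_M": 49.9,
    "Q6_K": 57.4,
    "Q8_0": 74.98,
    "FP16": 140.0,
}


def largest_quant_that_fits(memory_gb, headroom_gb=2.8, usable_fraction=1.0):
    """Return the largest quantization whose size plus headroom fits, or None.

    headroom_gb and usable_fraction are illustrative assumptions.
    """
    budget = memory_gb * usable_fraction
    fitting = [
        (size, name)
        for name, size in QUANT_SIZES_GB.items()
        if size + headroom_gb <= budget
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    print(largest_quant_that_fits(24))                        # None
    print(largest_quant_that_fits(48))                        # Q4_K_M
    print(largest_quant_that_fits(64, usable_fraction=0.75))  # Q4_K_M
```

This is a naive check: it ignores context lengths beyond 4k and how a runtime splits layers across multiple GPUs, so treat the result as a starting point rather than a guarantee.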
Can I use Llama 3.3 70B commercially?
Conditionally. The Llama 3.3 Community License allows free commercial use for products with fewer than 700 million monthly active users (MAU).
Released December 6, 2024. Delivers performance close to the 405B model at the cost of a 70B model. The Q4_K_M and Q8_0 sizes come from the bartowski Hugging Face repo and were cross-validated against the Ollama tags page (which displays 43GB and 75GB).
Sources
Memory figures are estimates. See methodology.