Text model · Qwen3
Qwen3 235B A22B requirements
Qwen3 family · 235B params (Mixture-of-Experts, 22B active) · released Apr 2025. Minimum to run at Q4_K_M: Apple M3 Ultra (256GB).
Quantization sizes
| Quantization | Size on disk |
|---|---|
| Q2_K | 98.4 GB (est.) |
| Q3_K_M | 114.9 GB (est.) |
| Q4_K_M (default) | 132.39 GB |
| Q5_K_M | 167.4 GB (est.) |
| Q6_K | 192.7 GB (est.) |
| Q8_0 | 249.7 GB (est.) |
| FP16 | 470 GB (est.) |
A lower-bit quantization is smaller and faster, at the cost of some quality. Q4_K_M is the common default.
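To make the size and quality trade-off concrete, the table sizes can be converted into an average number of bits stored per weight. This is a rough sketch, assuming 1 GB = 1e9 bytes and a parameter count of exactly 235e9 (the nominal figure, not an exact count); the helper name `bits_per_weight` is made up for illustration.

```python
# Sizes on disk in GB, copied from the quantization table.
SIZES_GB = {
    "Q2_K": 98.4,
    "Q3_K_M": 114.9,
    "Q4_K_M": 132.39,
    "Q5_K_M": 167.4,
    "Q6_K": 192.7,
    "Q8_0": 249.7,
    "FP16": 470.0,
}

# Assumption: the nominal 235B parameter count is treated as exact.
TOTAL_PARAMS = 235e9


def bits_per_weight(size_gb, params=TOTAL_PARAMS):
    """Average bits stored per parameter (assumes 1 GB = 1e9 bytes)."""
    return size_gb * 1e9 * 8 / params


for name, size_gb in SIZES_GB.items():
    print(f"{name:8s} {size_gb:7.2f} GB  {bits_per_weight(size_gb):5.2f} bits/weight")
```

Under these assumptions FP16 comes out at 16 bits per weight and Q4_K_M at roughly 4.5, which is why the Q4_K_M file is a bit under a third of the FP16 size.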
Run it
ollama run qwen3:235b
llama-cli -hf unsloth/Qwen3-235B-A22B-GGUF:Q4_K_M
lms get unsloth/Qwen3-235B-A22B-GGUF
Which devices can run Qwen3 235B A22B?
Apple Silicon Macs
RAM-only laptops
iPhone & iPad
Android
NVIDIA GPUs
AMD GPUs
FAQ
How much VRAM or RAM does Qwen3 235B A22B need?
At Q4_K_M, Qwen3 235B A22B needs about 136.9 GB (weights ~132.39 GB + KV cache + overhead) at a 4k context. At Q8_0, budget about 254.2 GB.
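The arithmetic behind these budgets can be sketched as weights plus a fixed allowance for KV cache and overhead. The allowance here is inferred from the Q4_K_M figures (136.9 GB total minus 132.39 GB of weights) and assumed to be the same for every quantization at a 4k context, which matches the Q8_0 figure quoted. The function names and the `usable_fraction` parameter are hypothetical, added to model memory that the OS keeps for itself.

```python
# Weight sizes in GB, copied from the quantization table.
WEIGHTS_GB = {
    "Q2_K": 98.4,
    "Q3_K_M": 114.9,
    "Q4_K_M": 132.39,
    "Q5_K_M": 167.4,
    "Q6_K": 192.7,
    "Q8_0": 249.7,
    "FP16": 470.0,
}

# Inferred from the Q4_K_M figures: 136.9 GB total - 132.39 GB weights.
# Assumption: KV cache + overhead at 4k context is the same for every quant.
OVERHEAD_4K_GB = 136.9 - 132.39


def memory_budget_gb(quant, overhead_gb=OVERHEAD_4K_GB):
    """Estimated memory needed: weights plus KV cache and overhead."""
    return WEIGHTS_GB[quant] + overhead_gb


def quants_that_fit(memory_gb, usable_fraction=1.0):
    """Quantizations whose budget fits in the usable share of memory."""
    limit = memory_gb * usable_fraction
    return [q for q in WEIGHTS_GB if memory_budget_gb(q) <= limit]


for quant in WEIGHTS_GB:
    print(f"{quant:8s} {memory_budget_gb(quant):7.1f} GB")

# Naive check against total memory; lower usable_fraction to leave OS headroom.
print(quants_that_fit(256))
print(quants_that_fit(256, usable_fraction=0.75))
```

Longer contexts grow the KV cache, so the fixed allowance is only meaningful near the 4k context the figures were quoted for.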
Can Qwen3 235B A22B run on a laptop?
Not on a typical laptop. Qwen3 235B A22B is large: at Q4_K_M you need a high-memory Mac or a multi-GPU setup.
Is Qwen3 235B A22B cheaper to run because it is a MoE model?
It is faster, not lighter. Qwen3 235B A22B activates only 22B of 235B params per token (so it runs quickly), but all experts must stay in memory, so it still needs memory for the full 235B.
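A back-of-the-envelope sketch of that distinction: resident memory follows the total parameter count, while the weights read per token follow the active count. It assumes every parameter is stored at the same average bit width, which is only approximately true for mixed-precision K-quants, and the function name `weights_touched_per_token_gb` is invented for this example.

```python
TOTAL_PARAMS = 235e9      # all experts, must stay resident in memory
ACTIVE_PARAMS = 22e9      # parameters used for each generated token
Q4_K_M_WEIGHTS_GB = 132.39  # from the quantization table


def active_fraction(active=ACTIVE_PARAMS, total=TOTAL_PARAMS):
    """Share of the model's parameters used for a single token."""
    return active / total


def weights_touched_per_token_gb(weights_gb=Q4_K_M_WEIGHTS_GB):
    """Rough GB of weights read per token.

    Assumption: uniform average bit width across all parameters.
    """
    return weights_gb * active_fraction()


print(f"resident weights:      {Q4_K_M_WEIGHTS_GB:.2f} GB")
print(f"active fraction:       {active_fraction():.1%}")
print(f"weights read per token: {weights_touched_per_token_gb():.2f} GB (rough)")
```

So under these assumptions roughly 12 GB of weights are read per token at Q4_K_M, while the full 132.39 GB still has to be held in memory.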
Can I use Qwen3 235B A22B commercially?
Yes. Qwen3 235B A22B is licensed Apache-2.0, which permits commercial use.
Qwen3 235B is a sparse MoE model (22B active). At Q4_K_M the weights are about 132 GB, which fits a 192GB Mac. The size comes from the unsloth GGUF repo and is exact-matched to the Ollama 235B tag.
Sources
Memory figures are estimates. See methodology.