FLUX.1 dev requirements
DiT image model · 12B params · 1024×1024 · 20-50 steps · released Aug 2024. Realistic minimum to run: NVIDIA GeForce RTX 3060 (12GB) at Q4 GGUF.
Weights are non-commercial. Generated images may be used commercially, but using the model inside a product needs a separate Black Forest Labs license.
Backbone size by precision
| Precision | Size |
|---|---|
| fp16 / bf16 | 23.8 GB |
| fp8 | 11.9 GB |
| Q8 GGUF | 12.7 GB |
| Q4 GGUF (recommended) | 6.81 GB |
| Q2 GGUF | 4.03 GB |
Sizes are for the backbone weights only. The per-device verdicts use peak VRAM consumed at Q4 GGUF, not the file size.
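The table sizes follow roughly from parameter count times bits per weight. As a rough check, the sketch below back-derives the parameter count from the fp16 row and then computes the effective bits per weight for every row. It assumes exactly 2 bytes per fp16 weight and decimal gigabytes, neither of which is stated above. The function names are illustrative only.

```python
# Sizes in GB, copied from the table above.
SIZES_GB = {
    "fp16 / bf16": 23.8,
    "fp8": 11.9,
    "Q8 GGUF": 12.7,
    "Q4 GGUF": 6.81,
    "Q2 GGUF": 4.03,
}


def implied_params(fp16_size_gb: float) -> float:
    """Parameter count implied by an fp16 file (assumption: 2 bytes per weight)."""
    return fp16_size_gb * 1e9 / 2


def bits_per_weight(size_gb: float, params: float) -> float:
    """Effective bits stored per parameter for a file of the given size."""
    return size_gb * 1e9 * 8 / params


params = implied_params(SIZES_GB["fp16 / bf16"])
for name, size in SIZES_GB.items():
    bpw = bits_per_weight(size, params)
    print(f"{name:12s} {size:6.2f} GB  ~{bpw:.2f} bits/weight")
```

Under these assumptions the GGUF rows come out above their nominal bit width (Q4 at roughly 4.6 bits per weight), which is expected because GGUF block formats store scale data alongside the quantized values.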
Pipeline components
| Component | Size |
|---|---|
| CLIP-L text encoder | 0.25 GB |
| T5-XXL text encoder (offloaded) | 2.9 GB |
| VAE | 0.34 GB |
Encoders marked “offloaded” move to CPU before denoising, so they do not count toward peak VRAM.
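To illustrate the offload rule, here is a small file-size tally using the Q4 GGUF backbone and the components listed above. It is bookkeeping only: the `Component` class and helper names are hypothetical, and file sizes are not the same thing as measured peak VRAM, which depends on the runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    size_gb: float
    offloaded: bool  # True if moved to CPU before denoising


# Q4 GGUF backbone plus the pipeline components from the table.
COMPONENTS = [
    Component("Q4 GGUF backbone", 6.81, False),
    Component("CLIP-L text encoder", 0.25, False),
    Component("T5-XXL text encoder", 2.9, True),
    Component("VAE", 0.34, False),
]


def total_gb(components) -> float:
    """Sum of every file, regardless of where it lives."""
    return sum(c.size_gb for c in components)


def resident_gb(components) -> float:
    """Sum of files that are not marked as offloaded."""
    return sum(c.size_gb for c in components if not c.offloaded)


print(f"all files:         {total_gb(COMPONENTS):.2f} GB")
print(f"without offloaded: {resident_gb(COMPONENTS):.2f} GB")
```

The gap between the two totals is the T5-XXL encoder, which is the component the offload step removes from the GPU.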
Run it
FLUX.1 dev runs in ComfyUI, Draw Things, or diffusers. Load the Q4 GGUF backbone together with its text encoders and VAE; unlike a text LLM, there is no single chat command.
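For the diffusers route, a minimal sketch might look like the following. It assumes a recent diffusers release with GGUF support, the `gguf` package installed, and access to the gated base repository. The GGUF file name and URL are assumptions about the city96 repository layout, so verify them before running. The text encoders and VAE here are loaded from the base repository at bf16, not as the quantized files sized above. The helper names `build_pipeline` and `generate` are hypothetical.

```python
# Assumed file location; check the repository for the exact Q4 file name.
GGUF_URL = "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q4_K_S.gguf"
BASE_REPO = "black-forest-labs/FLUX.1-dev"  # gated: accept the license first


def build_pipeline(gguf_url: str = GGUF_URL, base_repo: str = BASE_REPO):
    """Build a FLUX.1 dev pipeline around a GGUF-quantized backbone."""
    # Imported lazily so this module loads without the GPU stack installed.
    import torch
    from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

    transformer = FluxTransformer2DModel.from_single_file(
        gguf_url,
        quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
        torch_dtype=torch.bfloat16,
    )
    pipe = FluxPipeline.from_pretrained(
        base_repo,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
    )
    # Keep only the component currently in use on the GPU.
    pipe.enable_model_cpu_offload()
    return pipe


def generate(prompt: str, steps: int = 28, size: int = 1024):
    """Generate one image; steps and size follow the ranges quoted on this page."""
    pipe = build_pipeline()
    result = pipe(prompt, num_inference_steps=steps, height=size, width=size)
    return result.images[0]
```

Calling `generate("a lighthouse at dusk").save("out.png")` would then write a single 1024×1024 image, provided the downloads succeed and the GPU has enough memory.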
Which devices can run FLUX.1 dev?
Apple Silicon Macs
- Apple M1 (8GB) No
- Apple M2 (16GB) Yes
- Apple M4 (16GB) Yes
- Apple M5 (16GB) Yes
- Apple M3 Pro (18GB) Yes
- Apple M4 (24GB) Yes
- Apple M4 Pro (24GB) Yes
- Apple M5 (32GB) Yes
- Apple M4 Pro (48GB) Yes
- Apple M5 Pro (48GB) Yes
- Apple M4 Max (64GB) Yes
- Apple M4 Max (128GB) Yes
- Apple M5 Max (128GB) Yes
- Apple M3 Ultra (256GB) Yes
RAM-only laptops
No mainstream local runtime for a 12B image model on RAM-only laptops yet.
iPhone & iPad
Android
No mainstream local runtime for a 12B image model on Android yet.
NVIDIA GPUs
AMD GPUs
FAQ
How much VRAM does FLUX.1 dev need?
At Q4 GGUF the realistic peak is ~6.5 GB, versus ~33 GB with every component resident at bf16. With aggressive (sequential) CPU offload it drops to ~3 GB, but generation is much slower.
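As a rough illustration, the three figures in this answer can be read as tiers. The sketch below treats the approximate numbers as hard cutoffs with no headroom, which is a simplification; the function name and return strings are hypothetical.

```python
# Approximate thresholds in GB, taken from the answer above.
ALL_RESIDENT_GB = 33.0
Q4_ENCODER_OFFLOAD_GB = 6.5
SEQUENTIAL_OFFLOAD_GB = 3.0


def pick_mode(free_vram_gb: float) -> str:
    """Map free VRAM to the least restrictive tier it satisfies."""
    if free_vram_gb >= ALL_RESIDENT_GB:
        return "every component resident"
    if free_vram_gb >= Q4_ENCODER_OFFLOAD_GB:
        return "Q4 GGUF with text encoder offloaded"
    if free_vram_gb >= SEQUENTIAL_OFFLOAD_GB:
        return "aggressive CPU offload (much slower)"
    return "not enough VRAM"


for vram in (48, 12, 4, 2):
    print(f"{vram:>3} GB -> {pick_mode(vram)}")
```

In practice a card sitting right at a threshold may still run out of memory, since other processes and the runtime itself also use VRAM.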
Why is peak VRAM lower than the sum of the files?
The T5-XXL text encoder runs once to encode your prompt and is then offloaded to CPU before the denoising steps, so it is not resident at the memory peak.
Can I use FLUX.1 dev commercially?
Not the model itself. The weights are non-commercial. Generated images may be used commercially, but using the model inside a product requires a separate Black Forest Labs license.
FLUX.1 dev is a 12B DiT with CLIP-L and a 9.8GB T5-XXL text encoder. T5 is offloaded to CPU after prompt encoding, so the Q4 GGUF backbone (6.8GB) drives a denoising peak of ~6.5GB, measured at 6.4GB on an RTX 2080 8GB in Forge. With all components resident at bf16, the peak is ~33GB. Sequential CPU offload runs in ~3GB, but very slowly. The license is non-commercial. Sources: BFL model card, city96 FLUX.1-dev GGUF repo and 8GB discussion, diffusers memory docs.
Sources
VRAM is a sourced peak-usage anchor at Q4 GGUF, validated 2026-06-15. See methodology.