Image model · DiT

FLUX.1 schnell requirements

DiT image model · 12B params · 1024×1024 · 1-4 steps · released Aug 2024. Realistic minimum to run: NVIDIA GeForce RTX 3060 (12 GB) at Q4 GGUF.

License: Apache-2.0 (commercial use OK). This is the only FLUX.1 variant licensed for commercial products.

Metric	Value
Peak VRAM (Q4 GGUF)	~6.5 GB
All resident	~33 GB
Offload floor	~3 GB
Resolution	1024×1024

Backbone size by precision

Precision	Size on disk
fp16 / bf16	23.8 GB
fp8	11.9 GB
Q8 GGUF	12.7 GB
Q4 GGUF (recommended)	6.78 GB
Q2 GGUF	4.01 GB

Backbone weights only. The verdict uses peak VRAM consumed at Q4 GGUF, not the file size.
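
As a rough cross-check of the table, the file sizes can be turned into effective bits per weight. This is a sketch under two assumptions that are not stated on this page: that "GB" means 10^9 bytes, and that the parameter count can be estimated from the fp16 / bf16 file at 2 bytes per weight. GGUF formats store per-block scales alongside the quantized weights, which is why the Q8 and Q4 files come out slightly above 8 and 4 bits per weight.

```python
# Sizes copied from the table above (backbone weights only).
SIZES_GB = {
    "fp16 / bf16": 23.8,
    "fp8": 11.9,
    "Q8 GGUF": 12.7,
    "Q4 GGUF": 6.78,
    "Q2 GGUF": 4.01,
}


def estimate_params(bf16_size_gb, bytes_per_param=2.0):
    """Estimate the parameter count from the 16-bit file size.

    Assumption: 1 GB = 1e9 bytes and every weight takes 2 bytes.
    """
    return bf16_size_gb * 1e9 / bytes_per_param


def bits_per_weight(size_gb, n_params):
    """Effective storage cost per weight, including any block metadata."""
    return size_gb * 1e9 * 8 / n_params


N_PARAMS = estimate_params(SIZES_GB["fp16 / bf16"])

for name, size in SIZES_GB.items():
    bpw = bits_per_weight(size, N_PARAMS)
    print(f"{name:12s} {size:6.2f} GB  ~{bpw:.2f} bits/weight")
```

Under these assumptions the backbone works out to roughly 11.9 billion weights, consistent with the "12B" label.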

Pipeline components

Component	Size
CLIP-L text encoder	0.25 GB
T5-XXL text encoder (offloaded)	2.9 GB
VAE	0.34 GB

Encoders marked “offloaded” move to CPU before denoising, so they do not count toward peak VRAM.
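
The effect of offloading can be illustrated by adding up which weights are resident in each phase. This is a simplified sketch, not the methodology behind the published figures: the three-phase split and the assumption that only one phase's weights sit in VRAM at a time are illustrative, and the numbers are file sizes only (no activations or runtime overhead), so the result is not expected to reproduce the ~6.5 GB anchor exactly.

```python
# Sizes copied from the tables above, in GB.
BACKBONE_Q4_GB = 6.78
CLIP_L_GB = 0.25
T5_XXL_GB = 2.9
VAE_GB = 0.34

# Hypothetical phase layout: which weights are on the GPU during each phase.
PHASES = {
    "encode prompt": [CLIP_L_GB, T5_XXL_GB],
    "denoise": [BACKBONE_Q4_GB],
    "decode latents": [VAE_GB],
}


def resident_weights(phases):
    """Sum the weights that are resident during each phase."""
    return {name: round(sum(parts), 2) for name, parts in phases.items()}


per_phase = resident_weights(PHASES)
peak_phase = max(per_phase, key=per_phase.get)
all_files = round(BACKBONE_Q4_GB + CLIP_L_GB + T5_XXL_GB + VAE_GB, 2)

for name, gb in per_phase.items():
    print(f"{name:15s} {gb:5.2f} GB of weights resident")
print(f"peak phase: {peak_phase}; sum of all files: {all_files} GB")
```

In this sketch the denoising phase sets the peak, and it holds only the backbone, which is why the peak sits well below the sum of the files.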

Run it

FLUX.1 schnell runs in ComfyUI, Draw Things or diffusers. Load the Q4 GGUF backbone with its text encoder and VAE; there is no single chat command like a text LLM.
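
For the diffusers route, a minimal sketch might look like the following. The specifics are assumptions to verify before use: the GGUF file name inside the city96 repo, a diffusers version with GGUF support, and the `gguf` package being installed. The text encoders and VAE are loaded here from the base repository at their default precision, so memory use will not necessarily match the figures on this page.

```python
# Sketch only: repo ID, GGUF file name and settings below are assumptions.
BASE_REPO = "black-forest-labs/FLUX.1-schnell"
GGUF_URL = (
    "https://huggingface.co/city96/FLUX.1-schnell-gguf/"
    "blob/main/flux1-schnell-Q4_K_S.gguf"
)


def build_pipeline(gguf_url=GGUF_URL, base_repo=BASE_REPO):
    """Load a GGUF-quantized backbone plus the remaining pipeline components."""
    # Imported lazily so the module can be read without torch/diffusers installed.
    import torch
    from diffusers import (
        FluxPipeline,
        FluxTransformer2DModel,
        GGUFQuantizationConfig,
    )

    transformer = FluxTransformer2DModel.from_single_file(
        gguf_url,
        quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
        torch_dtype=torch.bfloat16,
    )
    pipe = FluxPipeline.from_pretrained(
        base_repo,
        transformer=transformer,
        torch_dtype=torch.bfloat16,
    )
    # Moves each component to the GPU only while it is needed.
    pipe.enable_model_cpu_offload()
    return pipe


def generate(pipe, prompt, steps=4, size=1024):
    """Run a few-step generation; schnell is distilled, so guidance is disabled."""
    result = pipe(
        prompt,
        height=size,
        width=size,
        num_inference_steps=steps,
        guidance_scale=0.0,
        max_sequence_length=256,
    )
    return result.images[0]
```

A typical call would be `generate(build_pipeline(), "a lighthouse at dusk").save("out.png")`; the first run downloads the weights.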

Which devices can run FLUX.1 schnell?

Apple Silicon Macs

RAM-only laptops

No mainstream local runtime for a 12B image model on RAM-only laptops yet.

iPhone & iPad

Android

No mainstream local runtime for a 12B image model on Android yet.

NVIDIA GPUs

AMD GPUs

AMD Radeon RX 7900 XTX (24 GB): Yes

FAQ

How much VRAM does FLUX.1 schnell need?

At Q4 GGUF the realistic peak is ~6.5 GB, versus ~33 GB with every component resident. With aggressive CPU offload it drops to ~3 GB, but generation is much slower.
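
The three anchors above can be read as a simple tiering rule. The sketch below is illustrative only: the thresholds are this page's approximate figures, the function name and tier labels are made up for the example, and `headroom_gb` is a hypothetical knob for VRAM already used by the desktop or other applications.

```python
ALL_RESIDENT_GB = 33.0   # every component kept in VRAM
Q4_PEAK_GB = 6.5         # Q4 GGUF backbone, encoders offloaded
OFFLOAD_FLOOR_GB = 3.0   # aggressive CPU offload


def vram_tier(vram_gb, headroom_gb=0.0):
    """Map a card's VRAM to the most comfortable mode it can sustain."""
    usable = vram_gb - headroom_gb
    if usable >= ALL_RESIDENT_GB:
        return "all resident"
    if usable >= Q4_PEAK_GB:
        return "Q4 GGUF with encoder offload"
    if usable >= OFFLOAD_FLOOR_GB:
        return "aggressive CPU offload (much slower)"
    return "below the offload floor"


for card, vram in [("RTX 3060", 12), ("RX 7900 XTX", 24), ("4 GB card", 4)]:
    print(f"{card:12s} {vram:3d} GB -> {vram_tier(vram)}")
```

Both the 12 GB and the 24 GB cards land in the Q4 GGUF tier; neither reaches the ~33 GB needed to keep everything resident.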

Why is peak VRAM lower than the sum of the files?

The T5-XXL text encoder runs once to encode your prompt and is then offloaded to CPU before the denoising steps, so it is not resident at the memory peak.

Can I use FLUX.1 schnell commercially?

Yes. FLUX.1 schnell is licensed Apache-2.0, which permits commercial use.

The same 12B DiT as FLUX.1 dev, distilled for 1-4 step generation and licensed Apache-2.0 (the only commercially usable FLUX.1 variant). T5-XXL is offloaded after prompt encoding, so the Q4 GGUF backbone peaks at ~6.5 GB during denoising. All-resident bf16 is ~33 GB. Sources: BFL schnell model card, city96 FLUX.1-schnell GGUF repo, diffusers memory docs.

Sources

VRAM is a sourced peak-usage anchor at Q4 GGUF (composed from component sizes, not a single measurement), validated 2026-06-15. See methodology.