Image model · UNET

Stable Diffusion XL 1.0 requirements

UNet image model · 2.6B params · 1024×1024 · 25-40 steps · released Jul 2023. Realistic minimum to run: NVIDIA GeForce RTX 3060 (12GB) at fp16.

CreativeML OpenRAIL++-M · Commercial use OK

Use-based restrictions; no revenue cap.

Metric	Value
Peak VRAM (fp16)	~7.5 GB
All resident	~8.5 GB
Offload floor	~4 GB
Resolution	1024×1024

Backbone size by precision

Precision	Size on disk
fp16 / bf16 (recommended)	5.1 GB

Backbone weights only. The verdict uses peak VRAM consumed at fp16, not the file size.
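The on-disk figure can be sanity-checked from the parameter count. The sketch below assumes 2 bytes per parameter for fp16 / bf16 and treats 1 GB as 10^9 bytes; `weights_gb` is an illustrative helper, not part of any tool named on this page.

```python
def weights_gb(params_billions, bytes_per_param=2):
    """Approximate weight size in GB (10^9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    bytes_per_param=2 is the assumption for fp16 / bf16.
    """
    return params_billions * bytes_per_param


# 2.6B parameters at fp16, using the rounded count from the summary line
estimate = weights_gb(2.6)
print(f"{estimate:.1f} GB")
```

The estimate of 5.2 GB sits close to the listed 5.1 GB; the small gap is plausibly due to the parameter count being rounded to 2.6B.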

Pipeline components

Component	Size
CLIP-L text encoder	0.25 GB
OpenCLIP-G text encoder	1.39 GB
VAE	0.34 GB

Encoders marked “offloaded” move to CPU before denoising, so they do not count toward peak VRAM. None are marked here: both CLIP encoders stay resident.
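As a rough cross-check, the listed file sizes can be added up. This is a sketch that assumes the component sizes are directly comparable to the fp16 backbone figure (the table does not state their precision); the dictionary keys are illustrative labels.

```python
# Sizes in GB, copied from the two tables on this page
SIZES_GB = {
    "backbone_fp16": 5.1,
    "clip_l": 0.25,
    "openclip_g": 1.39,
    "vae": 0.34,
}


def total_gb(sizes):
    """Sum of all listed weight files, rounded to two decimals."""
    return round(sum(sizes.values()), 2)


def encoders_gb(sizes):
    """Combined size of the two text encoders."""
    return round(sizes["clip_l"] + sizes["openclip_g"], 2)


print(total_gb(SIZES_GB), encoders_gb(SIZES_GB))
```

Under these assumptions the weights alone come to about 7.08 GB, of which the two text encoders account for about 1.64 GB.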

Run it

Stable Diffusion XL 1.0 runs in ComfyUI, AUTOMATIC1111 / Forge, Draw Things or diffusers. Load the fp16 backbone with its text encoder and VAE; there is no single chat command like a text LLM.
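For the diffusers route, loading the fp16 weights might look like the minimal sketch below. It assumes the `diffusers` and `torch` packages, a CUDA GPU, and the Hugging Face repository id `stabilityai/stable-diffusion-xl-base-1.0`, none of which are specified on this page; `sdxl_load_kwargs` and `generate` are illustrative names.

```python
# Assumed Hugging Face repo id for the base model (not stated on this page)
MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"


def sdxl_load_kwargs():
    """Loading options for the fp16 weights; the dtype is resolved later."""
    return {"torch_dtype": "float16", "variant": "fp16", "use_safetensors": True}


def generate(prompt, steps=30, out_path="sdxl.png"):
    """Generate one 1024x1024 image and save it to out_path."""
    # Local imports so this file can be read without the GPU stack installed
    import torch
    from diffusers import StableDiffusionXLPipeline

    kwargs = sdxl_load_kwargs()
    kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])

    # The pipeline bundles the UNet, both text encoders and the VAE
    pipe = StableDiffusionXLPipeline.from_pretrained(MODEL_ID, **kwargs)
    pipe.to("cuda")  # keeps every component resident on the GPU

    result = pipe(prompt, num_inference_steps=steps, height=1024, width=1024)
    result.images[0].save(out_path)
    return out_path
```

Calling `generate("a lighthouse at dusk")` would download the weights on first use; the default of 30 steps is simply a value inside the 25-40 range quoted above.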


Which devices can run Stable Diffusion XL 1.0?

Apple Silicon Macs

RAM-only laptops

No mainstream local runtime for a 2.6B image model on RAM-only laptops yet.

iPhone & iPad

Android

No mainstream local runtime for a 2.6B image model on Android yet.

NVIDIA GPUs

AMD GPUs

AMD Radeon RX 7900 XTX (24GB): Yes

FAQ

How much VRAM does Stable Diffusion XL 1.0 need?

At fp16 the realistic peak is ~7.5 GB, versus ~8.5 GB with every component resident. Aggressive CPU offload lowers it to ~4 GB, at the cost of much slower generation.
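These three figures can be read as rough thresholds for a given card. The sketch below treats the approximate values as hard cut-offs, which is a simplification; the mode names are hypothetical labels, not settings in any of the runtimes listed on this page.

```python
def memory_mode(vram_gb):
    """Map available VRAM (GB) to a coarse SDXL fp16 memory mode.

    Thresholds are the approximate figures quoted above:
    ~8.5 GB all resident, ~7.5 GB realistic peak, ~4 GB offload floor.
    """
    if vram_gb >= 8.5:
        return "all-resident"
    if vram_gb >= 7.5:
        return "realistic-peak"
    if vram_gb >= 4.0:
        return "cpu-offload"
    return "insufficient"


for card_gb in (12, 8, 4, 3):
    print(card_gb, memory_mode(card_gb))
```

In practice the usable VRAM on a card is somewhat below its nominal size, so a card that lands right on a threshold may still need the next mode down.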

Why is peak VRAM lower than the sum of the files?

With sequential CPU offload, the pipeline moves each stage off the GPU between passes, so peak VRAM stays near the size of the active stage rather than the sum of every file.
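In diffusers, offloading is switched on per pipeline. The sketch below assumes a pipeline object that exposes `enable_sequential_cpu_offload()` and `enable_model_cpu_offload()`, as diffusers pipelines do when the `accelerate` package is installed; `apply_offload` and its mode strings are illustrative.

```python
def apply_offload(pipe, mode="sequential"):
    """Configure where a diffusers pipeline keeps its weights.

    mode="sequential": stream submodules to the GPU one at a time
                       (lowest VRAM, slowest).
    mode="model":      move whole components (text encoders, UNet, VAE)
                       on and off the GPU as each is needed.
    mode="none":       keep everything resident on the GPU.
    """
    if mode == "sequential":
        pipe.enable_sequential_cpu_offload()
    elif mode == "model":
        pipe.enable_model_cpu_offload()
    elif mode == "none":
        pipe.to("cuda")
    else:
        raise ValueError(f"unknown offload mode: {mode!r}")
    return pipe
```

With either offload mode the pipeline should not also be moved with `pipe.to("cuda")`, since the offload hooks manage device placement themselves.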

Can I use Stable Diffusion XL 1.0 commercially?

Yes. Stable Diffusion XL 1.0 is licensed CreativeML OpenRAIL++-M, which permits commercial use.

UNet 2.6B (3.5B with CLIP-L + OpenCLIP-G; no T5). The two CLIP encoders total ~1.6 GB and stay resident, so peak VRAM is the sum: ~7.5 GB measured at 1024×1024 on an 8GB card, matching Stability's stated 8GB minimum. Runs on 4GB with AUTOMATIC1111 --lowvram, slowly. An optional refiner adds a second ~6 GB UNet. Sources: Stability SDXL announcement, ComfyUI 8GB measurement, AUTOMATIC1111 Optimum SDXL wiki.

Sources

VRAM is a sourced peak-usage anchor at fp16, validated 2026-06-15. See methodology.