Can I run FLUX.1 schnell on iPhone 16 Pro?
Not without offloading: FLUX.1 schnell needs ~6.5 GB at Q4 GGUF, but only ~4.5 GB is usable on iPhone 16 Pro. With aggressive CPU offload it can run in as little as ~3 GB, though much more slowly.
- Peak VRAM: ~6.5 GB
- Usable on device: ~4.5 GB
- Device memory: 8 GB
- Quant: Q4 GGUF
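The verdict follows from comparing two of the figures above. A minimal sketch of that check, where the function and constant names are illustrative rather than part of any tool:

```python
def fits(peak_gb: float, usable_gb: float) -> bool:
    """Return True when the model's peak memory fits in the usable budget."""
    return peak_gb <= usable_gb


PEAK_Q4_GB = 6.5  # peak VRAM at Q4 GGUF, from the figures above
USABLE_GB = 4.5   # usable memory on iPhone 16 Pro, from the figures above

# How far over budget the Q4 GGUF configuration is.
shortfall_gb = max(0.0, PEAK_Q4_GB - USABLE_GB)

print(fits(PEAK_Q4_GB, USABLE_GB), shortfall_gb)  # prints: False 2.0
```

The ~3 GB offloaded configuration passes the same check, which is why it is listed as the fallback.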
- Type: image (DiT)
- Parameters: 12B
- Peak VRAM: ~6.5 GB at Q4 GGUF
- Resolution: 1024×1024
- License: Apache-2.0
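As a rough sanity check on the parameter count, the weights alone can be estimated as parameters × bits per weight ÷ 8. The sketch below assumes a nominal 4 bits per weight for Q4 and 16 bits for an unquantized checkpoint. Real GGUF quantization formats also store per-block scale factors, so actual files are somewhat larger, and peak usage adds activations and the other pipeline components on top.

```python
def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


PARAMS = 12e9  # 12B parameters, from the spec list above

print(weight_gb(PARAMS, 4))   # nominal 4-bit weights  -> 6.0
print(weight_gb(PARAMS, 16))  # nominal 16-bit weights -> 24.0
```

The 4-bit estimate of about 6 GB is a lower bound for the weights and sits just under the listed ~6.5 GB peak.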
- Memory: 8 GB unified
- Usable for weights: ~4.5 GB
- Best runtime: llama.cpp + Metal (via PocketPal or Off Grid app) for language models; for FLUX.1 schnell, use a diffusion app such as Draw Things (see FAQ)
FAQ
Can iPhone 16 Pro run FLUX.1 schnell?
Not within its normal memory budget. The model needs ~6.5 GB at Q4 GGUF, but only ~4.5 GB is usable on iPhone 16 Pro. With aggressive CPU offload it can run in as little as ~3 GB, though much more slowly.
How much VRAM does FLUX.1 schnell need?
At Q4 GGUF the realistic peak is ~6.5 GB of VRAM, versus ~33 GB with every component kept resident (no offload). Aggressive CPU offload lowers the peak to ~3 GB at the cost of much slower generation. iPhone 16 Pro, with ~4.5 GB usable, does not have enough memory for the ~6.5 GB configuration.
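The three figures in this answer can be read as a small decision table. The sketch below picks the highest-memory configuration that fits a given budget, on the assumption (not stated by the source beyond "much slower" for offload) that less offloading means faster generation. The labels and function name are illustrative.

```python
from typing import Optional

# (label, peak GB), ordered from most to least memory; figures from the answer above.
MODES = [
    ("all components resident (no offload)", 33.0),
    ("Q4 GGUF", 6.5),
    ("Q4 GGUF + aggressive CPU offload", 3.0),
]


def largest_mode_that_fits(usable_gb: float) -> Optional[str]:
    """Return the first (highest-memory) mode whose peak fits, or None."""
    for label, peak_gb in MODES:
        if peak_gb <= usable_gb:
            return label
    return None


print(largest_mode_that_fits(4.5))  # iPhone 16 Pro's usable ~4.5 GB
```

With ~4.5 GB usable, only the offloaded configuration qualifies. A device with 8 GB usable would select plain Q4 GGUF.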
What do I use to run FLUX.1 schnell locally?
FLUX.1 schnell runs in ComfyUI or Draw Things (among others). It loads as a diffusion checkpoint plus its text encoder and VAE, not a single chat command.
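For readers who script rather than use a GUI, the same multi-component layout shows up in code. This is a sketch under stated assumptions: it uses the Hugging Face diffusers library (not one of the tools named above), targets a desktop GPU rather than an iPhone, and loads the full-precision checkpoint, so it does not reflect the Q4 GGUF figures on this page. The function name and output path are hypothetical.

```python
def generate(prompt: str, out_path: str = "flux_schnell.png") -> str:
    """Generate one 1024x1024 image with FLUX.1 schnell and save it."""
    # Local imports so this module can be imported without the heavy dependencies.
    import torch
    from diffusers import FluxPipeline

    # One call loads the transformer, both text encoders, and the VAE.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    # Keep components in CPU RAM and move each to the GPU only while it runs.
    pipe.enable_model_cpu_offload()

    image = pipe(
        prompt,
        guidance_scale=0.0,       # schnell is distilled and runs without guidance
        num_inference_steps=4,    # schnell targets very few steps
        max_sequence_length=256,
        height=1024,
        width=1024,
    ).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate("a lighthouse at dusk")` would download the checkpoint on first use. After loading, the separate parts are visible as `pipe.transformer`, `pipe.text_encoder`, `pipe.text_encoder_2`, and `pipe.vae`.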
Sources
VRAM figures are peak-usage values taken from cited sources at the noted quantization, validated 2026-06-15. See methodology.