Skip to content
localmodel.run

Video model · wan

WA Wan 2.2 T2V A14B requirements

DIT video model · 27B params (MoE, 14B active) · 1280×720 (720p), 81f (~5s) · released Jul 2025. Realistic minimum to run: Nvidia GeForce RTX 4090 (24GB) at Q4 GGUF.

Apache-2.0 Commercial use OK
Peak VRAM (Q4 GGUF)
~16 GB
All resident
~80 GB
Offload floor
~8 GB
Clip
81f / ~5s

Backbone size by precision

PrecisionOn disk
PrecisionSize
fp16 / bf16 57.2 GB
fp8 28.6 GB
Q8 GGUF 30.8 GB
Q4 GGUF (recommended) 19.3 GB
Q2 GGUF 10.6 GB

Backbone weights only. Peak VRAM is dominated by the activation memory for 81 frames at 1280×720 (720p), not the file size.

Pipeline components

ComponentSize
ComponentSize
umT5-XXL text encoder offloaded 3.66 GB
VAE (3D) 0.51 GB

Video VAEs are larger than image VAEs because they decode a temporal stack of frames.

Run it

Wan 2.2 T2V A14B runs in ComfyUI or Diffusers. Generating more frames or higher resolution raises peak VRAM sharply; the Q4 GGUF figure is for the default 81-frame clip.

ComfyUIDiffusers

Which devices can run Wan 2.2 T2V A14B?

Apple Silicon Macs

No mainstream local runtime for a 27B video model on Apple Silicon Macs yet.

RAM-only laptops

No mainstream local runtime for a 27B video model on RAM-only laptops yet.

iPhone & iPad

No mainstream local runtime for a 27B video model on iPhone & iPad yet.

Android

No mainstream local runtime for a 27B video model on Android yet.

NVIDIA GPUs

AMD GPUs

FAQ

How much VRAM does Wan 2.2 T2V A14B need?

At Q4 GGUF the realistic peak is ~16 GB, versus ~80 GB with every component resident. With aggressive CPU offload it drops to ~8 GB, much slower.

Why is peak VRAM lower than the sum of the files?

The text encoder is run once to encode your prompt, then offloaded to CPU before the frames are generated, so it is not resident at the memory peak.

Can I use Wan 2.2 T2V A14B commercially?

Yes. Wan 2.2 T2V A14B is licensed Apache-2.0, which permits commercial use.

Wan 2.2's flagship: a Mixture-of-Experts video model with two 14B experts (high-noise + low-noise), 27B total / 14B active per step. Both experts must be available, so GGUF Q4_K_M is ~19GB across the pair; full bf16 needs ~80GB. With ComfyUI sequential expert loading + T5 on CPU, peak lands around 12-20GB on a 24GB card (10-15 min per clip). Apache-2.0, commercial OK. Anchor is the GGUF Q4 path (synthesis). Sources: Wan-AI card, QuantStack GGUF, Wan2.2 repo.

Sources

VRAM is a sourced peak-usage anchor at Q4 GGUF (composed from component sizes, not a single measurement) for the default clip length, validated 2026-06-15. See methodology.