Wan 2.1 T2V 1.3B requirements

DiT video model · 1.3B params · 832×480 (480p), 81f (~5s) · released Feb 2025. Realistic minimum to run: NVIDIA GeForce RTX 3060 (12GB) at Q4 GGUF.

Apache-2.0 · Commercial use OK

Peak VRAM (Q4 GGUF): ~6 GB

All resident: ~20 GB

Offload floor: ~5 GB

Clip: 81f / ~5s

Backbone size by precision

Precision	Size on disk
fp16 / bf16	2.84 GB
Q8 GGUF	1.54 GB
Q4 GGUF (recommended)	0.98 GB

Backbone weights only. Peak VRAM is dominated by the activation memory for 81 frames at 832×480 (480p), not the file size.
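The table can be read as effective bits per weight. The sketch below is a rough back-of-the-envelope calculation, not a statement about the GGUF file layout. It assumes the sizes are decimal gigabytes and that the bf16 file stores exactly 2 bytes per parameter with negligible metadata; both assumptions are mine, not the page's.

```python
# Effective bits per weight implied by the on-disk sizes in the table above.
# Assumptions (not from the page): 1 GB = 1e9 bytes, and the bf16 file holds
# exactly 2 bytes per parameter with negligible metadata overhead.

SIZES_GB = {"fp16 / bf16": 2.84, "Q8 GGUF": 1.54, "Q4 GGUF": 0.98}


def param_count_from_bf16(size_gb: float) -> float:
    """Estimate the parameter count from the 16-bit file size."""
    return size_gb * 1e9 / 2


def bits_per_weight(size_gb: float, n_params: float) -> float:
    """Average number of bits stored per parameter for a file of this size."""
    return size_gb * 1e9 * 8 / n_params


n_params = param_count_from_bf16(SIZES_GB["fp16 / bf16"])

for name, size in SIZES_GB.items():
    print(f"{name}: {bits_per_weight(size, n_params):.2f} bits/weight")
```

Under these assumptions the quantized files come out somewhat above a flat 8 or 4 bits per weight, which is consistent with quantization formats that store per-block scales or keep some tensors at higher precision.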

Pipeline components

Component	Size
umT5-XXL text encoder (offloaded)	3.66 GB
VAE (3D)	0.51 GB

Video VAEs are larger than image VAEs because they decode a temporal stack of frames.

Run it

Wan 2.1 T2V 1.3B runs in ComfyUI or Diffusers. Generating more frames or higher resolution raises peak VRAM sharply; the Q4 GGUF figure is for the default 81-frame clip.
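To get a feel for how clip length and resolution drive the workload, the sketch below counts the latent tokens the DiT would process. It is only a proxy for activation memory, not a VRAM predictor. The compression factors are assumptions that this page does not state: a video VAE with 4× temporal and 8× spatial downsampling (first frame kept separately) and a 1×2×2 patch embedding. The 1280×720 case is included purely to illustrate scaling.

```python
# Rough token-count proxy for activation cost.
# Assumptions (not stated on this page): the VAE downsamples 4x in time
# (keeping the first frame) and 8x in each spatial dimension, and the DiT
# groups latents into 1x2x2 patches.

def latent_tokens(frames: int, width: int, height: int,
                  t_stride: int = 4, s_stride: int = 8, patch: int = 2) -> int:
    """Number of transformer tokens for one clip under the assumed strides."""
    latent_frames = (frames - 1) // t_stride + 1
    latent_w = width // s_stride
    latent_h = height // s_stride
    return latent_frames * (latent_w // patch) * (latent_h // patch)


base = latent_tokens(81, 832, 480)            # default clip from this page
longer = latent_tokens(161, 832, 480)         # roughly twice the frames
larger = latent_tokens(81, 1280, 720)         # higher resolution, same length

print(f"default: {base} tokens")
print(f"161 frames: {longer / base:.2f}x the tokens")
print(f"1280x720: {larger / base:.2f}x the tokens")
```

Activation memory grows at least in proportion to this count, so doubling the frame count or moving up a resolution tier roughly doubles the working set or more.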

Which devices can run Wan 2.1 T2V 1.3B?

Apple Silicon Macs

RAM-only laptops

No mainstream local runtime for a 1.3B video model on RAM-only laptops yet.

iPhone & iPad

No mainstream local runtime for a 1.3B video model on iPhone & iPad yet.

Android

No mainstream local runtime for a 1.3B video model on Android yet.

NVIDIA GPUs

AMD GPUs

AMD Radeon RX 7900 XTX (24GB): Yes

FAQ

How much VRAM does Wan 2.1 T2V 1.3B need?

At Q4 GGUF the realistic peak is ~6 GB, versus ~20 GB with every component resident. With aggressive CPU offload the peak drops to ~5 GB, at the cost of much slower generation.
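The three figures in this answer can be turned into a simple lookup. The sketch below is a hypothetical helper, not part of any runtime. It treats the approximate values as hard thresholds and ignores the headroom a real system needs for the desktop, CUDA context and fragmentation.

```python
from typing import Optional

# Thresholds in GB taken from this page's approximate figures, largest first.
# The mode names and the helper itself are illustrative, not a real API.
MODES = [
    (20.0, "all components resident"),
    (6.0, "Q4 GGUF, text encoder offloaded"),
    (5.0, "Q4 GGUF, aggressive CPU offload (much slower)"),
]


def pick_mode(free_vram_gb: float) -> Optional[str]:
    """Return the least restrictive mode that fits, or None if nothing fits."""
    for needed_gb, mode in MODES:
        if free_vram_gb >= needed_gb:
            return mode
    return None


for vram in (24, 12, 5.5, 4):
    print(f"{vram} GB -> {pick_mode(vram)}")
```

A 12 GB card such as the RTX 3060 lands in the Q4 GGUF tier, which matches the realistic minimum quoted at the top of the page.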

Why is peak VRAM lower than the sum of the files?

The text encoder is run once to encode your prompt, then offloaded to CPU before the frames are generated, so it is not resident at the memory peak.
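Diffusers' model CPU offload follows the same encode-then-offload pattern: each component is moved to the GPU only while it runs. The sketch below is a minimal example and makes several assumptions. The Hub repo id is assumed, it loads bf16 weights rather than a Q4 GGUF (so its peak will not match the Q4 figure on this page), and it needs `torch`, `diffusers` and `accelerate` installed along with a GPU.

```python
# Clip settings taken from this page: 832x480, 81 frames.
GENERATION_KWARGS = {"height": 480, "width": 832, "num_frames": 81}


def generate(prompt: str, out_path: str = "output.mp4") -> str:
    """Generate one clip with per-component CPU offload and save it as MP4."""
    # Heavy imports live inside the function so this module can be imported
    # on a machine without a GPU or without the libraries installed.
    import torch
    from diffusers import AutoencoderKLWan, WanPipeline
    from diffusers.utils import export_to_video

    model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"  # assumed Hub repo id
    # The VAE is kept in float32 here for decode quality (a common choice,
    # not a requirement stated on this page).
    vae = AutoencoderKLWan.from_pretrained(
        model_id, subfolder="vae", torch_dtype=torch.float32
    )
    pipe = WanPipeline.from_pretrained(
        model_id, vae=vae, torch_dtype=torch.bfloat16
    )
    # Moves text encoder, transformer and VAE onto the GPU one at a time,
    # so the text encoder is back on the CPU while frames are denoised.
    pipe.enable_model_cpu_offload()

    frames = pipe(prompt=prompt, **GENERATION_KWARGS).frames[0]
    export_to_video(frames, out_path, fps=16)
    return out_path


if __name__ == "__main__":
    generate("A cat walking through tall grass, cinematic lighting")
```

Because whole components are swapped in and out, this mode costs some transfer time per stage but keeps the text encoder out of VRAM during denoising.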

Can I use Wan 2.1 T2V 1.3B commercially?

Yes. Wan 2.1 T2V 1.3B is licensed Apache-2.0, which permits commercial use.

The accessible Wan tier (1.3B DiT). The bf16 backbone is 2.84 GB; the umT5-XXL encoder (11.4 GB) is offloaded to CPU via --t5_cpu, so the native bf16 GPU peak is ~8.2 GB (Wan-AI's own figure), and GGUF Q4 with T5 on CPU drops it to ~5-6 GB. Generates 832×480, 81 frames (~5 s) at 16 fps. Apache-2.0, commercial use OK. The anchor is the community GGUF Q4 path (a synthesis of sources). Sources: Wan-AI model card, samuelchristlie GGUF, city96 umT5 GGUF, Wan2.1 repo.

Sources

The VRAM figure is a sourced peak-usage anchor at Q4 GGUF for the default clip length. It is composed from component sizes rather than taken from a single measurement, and was validated 2026-06-15. See methodology.