Can I run Mochi 1 on Nvidia GeForce RTX 4060 Ti (16GB)?
No. Mochi 1 needs ~20 GB of VRAM at fp8 + offload, but only ~15 GB is usable on the Nvidia GeForce RTX 4060 Ti (16GB). Aggressive CPU offload lowers the peak to ~18 GB at a much slower speed, which is still more than this card can provide.
- Peak VRAM: ~20 GB
- Usable on device: ~15 GB
- Device memory: 16 GB
- Quant: fp8 + offload
- Type: video (DiT)
- Parameters: 10B
- Peak VRAM: ~20 GB at fp8 + offload
- Resolution: 480×848
- License: Apache-2.0
- Memory: 16 GB VRAM
- Usable for weights: ~15 GB
- Best runtime: ComfyUI or Diffusers (CUDA)
FAQ
Can Nvidia GeForce RTX 4060 Ti (16GB) run Mochi 1?
No. The card exposes ~15 GB of usable VRAM, while Mochi 1 peaks at ~20 GB at fp8 + offload. Aggressive CPU offload lowers the peak to ~18 GB and is much slower, but that still exceeds the usable memory.
How much VRAM does Mochi 1 need?
At fp8 + offload the realistic peak is ~20 GB of VRAM, versus ~60 GB with every component kept resident (no offload). Aggressive CPU offload lowers the peak to ~18 GB at a much slower speed. The Nvidia GeForce RTX 4060 Ti (16GB), with ~15 GB usable, falls short in every case.
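The comparison in this answer is plain arithmetic. As a minimal sketch, the Python below works out the shortfall for each mode using only the approximate figures quoted on this page. The names `PEAK_VRAM_GB`, `USABLE_GB`, `shortfall_gb`, and `report` are illustrative and do not come from any tool.

```python
# Approximate peak VRAM figures quoted on this page, in GB.
PEAK_VRAM_GB = {
    "no offload": 60,
    "fp8 + offload": 20,
    "aggressive CPU offload": 18,
}
USABLE_GB = 15  # usable VRAM on the RTX 4060 Ti (16GB), per this page


def shortfall_gb(peak_gb: float, usable_gb: float = USABLE_GB) -> float:
    """Return how many GB the peak exceeds usable VRAM (0 if it fits)."""
    return max(0.0, peak_gb - usable_gb)


def report() -> dict:
    """Map each mode to its shortfall in GB."""
    return {mode: shortfall_gb(peak) for mode, peak in PEAK_VRAM_GB.items()}


if __name__ == "__main__":
    for mode, gap in report().items():
        status = "fits" if gap == 0 else f"short by ~{gap:.0f} GB"
        print(f"{mode}: {status}")
```

Swapping in a different `USABLE_GB` value gives the same check for another card, on the assumption that the peak figures carry over unchanged.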
What do I use to run Mochi 1 locally?
Mochi 1 runs in ComfyUI or Diffusers. It loads as a video diffusion checkpoint plus its text encoder and VAE, not a single chat command.
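For the Diffusers route, a hedged sketch of loading and running the model might look like the following. It assumes a recent `diffusers` release that ships `MochiPipeline`, the Hugging Face repo id `genmo/mochi-1-preview`, and a bf16 checkpoint variant. It does not reproduce the fp8 setup behind this page's figures, so its memory use will differ. The function name `generate_clip`, the frame count, and the fps are illustrative choices.

```python
def generate_clip(prompt: str, out_path: str = "mochi.mp4") -> str:
    """Generate a short clip with Mochi 1 via Diffusers and save it as MP4."""
    # Imported lazily so this module loads without torch/diffusers installed.
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video

    # Assumed repo id and variant; adjust to the checkpoint you actually use.
    pipe = MochiPipeline.from_pretrained(
        "genmo/mochi-1-preview", variant="bf16", torch_dtype=torch.bfloat16
    )
    # Memory savers: keep idle components on the CPU and decode the VAE in tiles.
    pipe.enable_model_cpu_offload()
    pipe.enable_vae_tiling()

    # The pipeline loads the transformer, text encoder, and VAE together.
    frames = pipe(prompt, num_frames=84).frames[0]
    export_to_video(frames, out_path, fps=30)
    return out_path
```

Calling `generate_clip("a koi pond at dawn")` would download the checkpoint on first use. Given the memory figures above, this is expected to exceed what a 16 GB card can hold.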
Sources
VRAM figures are sourced peak-usage anchors at the noted quant, validated 2026-06-15. See methodology.