Stable Audio Open 1.0 requirements
Music and audio generation · 1.3B params · fp32 / fp16 · released Jun 2024. Realistic minimum to run: NVIDIA GeForce RTX 4060 Ti (16 GB).
Free for commercial use under $1M annual revenue; enterprise license above. Cannot be used to train other generative models.
Run it
Stable Audio Open 1.0 runs in stable-audio-tools, diffusers or ComfyUI at fp32. It needs a GPU (or Apple Silicon with enough unified memory).
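As a rough illustration of the diffusers route, the sketch below wraps `StableAudioPipeline` in a small helper. It assumes a CUDA GPU, the `torch`, `diffusers` and `soundfile` packages, and the `stabilityai/stable-audio-open-1.0` checkpoint on Hugging Face (which may require accepting the license first). The helper names `clamp_seconds` and `generate_clip` and the default step count are illustrative choices, not part of any library.

```python
MAX_SECONDS = 47.0  # upper bound on clip length quoted on this page


def clamp_seconds(seconds: float) -> float:
    """Keep the requested duration within (0, MAX_SECONDS]."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return min(float(seconds), MAX_SECONDS)


def generate_clip(prompt: str, out_path: str, seconds: float = 10.0,
                  steps: int = 100, seed: int = 0, half: bool = False) -> str:
    """Generate one clip and write it as a WAV file (hypothetical helper)."""
    # Heavy imports stay inside the function so this module loads
    # even on a machine without the GPU stack installed.
    import torch
    import soundfile as sf
    from diffusers import StableAudioPipeline

    # fp32 matches the memory figures on this page; fp16 is the lighter option.
    dtype = torch.float16 if half else torch.float32
    pipe = StableAudioPipeline.from_pretrained(
        "stabilityai/stable-audio-open-1.0", torch_dtype=dtype
    ).to("cuda")

    generator = torch.Generator("cuda").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=steps,
        audio_end_in_s=clamp_seconds(seconds),
        generator=generator,
    )

    # audios has shape (batch, channels, samples); soundfile expects
    # (samples, channels), hence the transpose.
    waveform = result.audios[0].T.float().cpu().numpy()
    sf.write(out_path, waveform, pipe.vae.sampling_rate)
    return out_path
```

On a suitable machine, `generate_clip("warm analog synth arpeggio", "clip.wav", seconds=10.0)` would write a stereo WAV at the model's native sample rate. Requests longer than 47 seconds are clamped rather than rejected.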
Which devices can run Stable Audio Open 1.0?
Apple Silicon Macs
No mainstream local runtime for Stable Audio Open 1.0 on Apple Silicon Macs yet.
RAM-only laptops
No mainstream local runtime for Stable Audio Open 1.0 on RAM-only laptops yet.
iPhone & iPad
No mainstream local runtime for Stable Audio Open 1.0 on iPhone & iPad yet.
Android
No mainstream local runtime for Stable Audio Open 1.0 on Android yet.
NVIDIA GPUs
AMD GPUs
FAQ
How much memory does Stable Audio Open 1.0 need?
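At fp32, peak usage is about 15 GB: the diffusion phase uses ~5.9 GB and the decoder peaks at ~14.5 GB. It needs a GPU or Apple Silicon with enough unified memory.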
Can Stable Audio Open 1.0 run on a phone or CPU?
Not practically. Stable Audio Open 1.0 expects a GPU; CPU inference is too slow for real use.
Can I use Stable Audio Open 1.0 commercially?
Conditionally. Free for commercial use under $1M annual revenue; enterprise license above. Cannot be used to train other generative models.
Latent-diffusion music model (1.06B-parameter DiT + autoencoder + T5-base text encoder, ~1.3B parameters total) that generates up to 47 s of 44.1 kHz stereo audio. The diffusion phase uses ~5.9 GB of VRAM; the decoder peaks at ~14.5 GB (measured on an RTX 3090), so budget ~15 GB unless you use chunked decoding. Stability Community License: free commercial use under $1M annual revenue. Sources: Hugging Face model card, the Stable Audio Open paper, the license.
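The memory figures above can be turned into a quick back-of-the-envelope check. The sketch below computes raw weight size from the parameter count and compares a card's VRAM against the quoted peaks. The 0.5 GB headroom is an assumed margin, and treating the diffusion phase as the floor under chunked decoding is an optimistic assumption, since the page does not give a measured chunked-decode peak.

```python
# Figures quoted on this page (fp32, measured on an RTX 3090).
PARAMS_TOTAL = 1.3e9
DIFFUSION_PEAK_GB = 5.9
DECODER_PEAK_GB = 14.5

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_size_gb(params: float, precision: str) -> float:
    """Raw weight storage in decimal GB (ignores activations and buffers)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def fits(vram_gb: float, chunked_decode: bool = False,
         headroom_gb: float = 0.5) -> bool:
    """Rough check of whether a card covers the peak phase.

    headroom_gb is an assumed safety margin, not a measured value.
    With chunked decoding the true peak is unknown here, so the
    diffusion-phase figure is used as an optimistic lower bound.
    """
    if chunked_decode:
        peak = DIFFUSION_PEAK_GB
    else:
        peak = max(DIFFUSION_PEAK_GB, DECODER_PEAK_GB)
    return vram_gb >= peak + headroom_gb


if __name__ == "__main__":
    print(f"fp32 weights: {weight_size_gb(PARAMS_TOTAL, 'fp32'):.1f} GB")
    print(f"fp16 weights: {weight_size_gb(PARAMS_TOTAL, 'fp16'):.1f} GB")
    print("16 GB card, full decode:", fits(16))
    print("12 GB card, full decode:", fits(12))
    print("8 GB card, chunked decode:", fits(8, chunked_decode=True))
```

The weights alone come to roughly 5.2 GB at fp32, which is in line with the ~5.9 GB diffusion-phase figure. The gap up to ~14.5 GB comes from decoder activations rather than parameters, which is why a 16 GB card is the practical minimum for a full, unchunked decode.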
Sources
The memory figure is a sourced peak-usage value at fp32, validated 2026-06-15. See methodology.