localmodel.run

Audio model · stable-audio

Stable Audio Open 1.0 requirements

Music and audio generation · 1.3B params · fp32 / fp16 · released Jun 2024. Realistic minimum to run: NVIDIA GeForce RTX 4060 Ti (16 GB).

Stability Community License · Commercial use: conditional

Free for commercial use under $1M annual revenue; enterprise license above. Cannot be used to train other generative models.

Peak memory (fp32): ~15 GB
Runs on CPU: No
Parameters: 1.3B
Type: Music and audio generation

Run it

Runtime tools (fp32)

Stable Audio Open 1.0 runs in stable-audio-tools, diffusers or ComfyUI at fp32. It needs a GPU (or Apple Silicon with enough unified memory).

stable-audio-tools · diffusers · ComfyUI
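As a rough illustration of the diffusers route, the sketch below wraps generation in one function. It assumes the `StableAudioPipeline` class from diffusers, the Hugging Face model id `stabilityai/stable-audio-open-1.0` (a gated repository, so the license must be accepted and a login token configured first), a CUDA device, and the `soundfile` package for writing the WAV file. The helper names `clamp_duration` and `generate_clip`, the default step count, and the output path are hypothetical choices for this example, not settings taken from this page.

```python
MAX_SECONDS = 47.0  # longest clip length quoted on this page


def clamp_duration(seconds: float) -> float:
    """Keep a requested clip length within (0, MAX_SECONDS]."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return min(float(seconds), MAX_SECONDS)


def generate_clip(prompt: str, seconds: float = 10.0, out_path: str = "clip.wav",
                  use_fp16: bool = False, steps: int = 100, seed: int = 0) -> str:
    """Generate one clip and write it to out_path (assumes a CUDA GPU)."""
    # Heavy imports live inside the function so this file can be imported
    # on a machine without torch/diffusers installed.
    import torch
    import soundfile as sf
    from diffusers import StableAudioPipeline

    dtype = torch.float16 if use_fp16 else torch.float32
    pipe = StableAudioPipeline.from_pretrained(
        "stabilityai/stable-audio-open-1.0",  # assumed model id, gated on the Hub
        torch_dtype=dtype,
    ).to("cuda")

    generator = torch.Generator("cuda").manual_seed(seed)
    audio = pipe(
        prompt,
        num_inference_steps=steps,
        audio_end_in_s=clamp_duration(seconds),
        generator=generator,
    ).audios

    # audios[0] has shape (channels, samples); soundfile wants (samples, channels).
    waveform = audio[0].T.float().cpu().numpy()
    sf.write(out_path, waveform, pipe.vae.sampling_rate)
    return out_path
```

Calling `generate_clip("warm analog synth arpeggio", seconds=20)` on a suitable GPU should write `clip.wav`. Passing `use_fp16=True` loads half-precision weights, which the page lists as a supported format, though the ~15 GB figure here applies to fp32 only.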

Which devices can run Stable Audio Open 1.0?

Apple Silicon Macs

No mainstream local runtime for Stable Audio Open 1.0 on Apple Silicon Macs yet.

RAM-only laptops

No mainstream local runtime for Stable Audio Open 1.0 on RAM-only laptops yet.

iPhone & iPad

No mainstream local runtime for Stable Audio Open 1.0 on iPhone & iPad yet.

Android

No mainstream local runtime for Stable Audio Open 1.0 on Android yet.

NVIDIA GPUs

AMD GPUs

FAQ

How much memory does Stable Audio Open 1.0 need?

At fp32 it peaks at ~15 GB, reached while decoding; the diffusion phase on its own uses ~5.9 GB. It needs a GPU or Apple Silicon.
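A quick way to apply that figure is a fit check against a card's memory. This is a minimal sketch: the ~15 GB peak comes from this page, while the `reserve_gb` parameter (memory held back for the desktop or other processes) and the example card sizes are illustrative assumptions.

```python
PEAK_FP32_GB = 15.0  # approximate fp32 peak quoted on this page


def fits_in_vram(vram_gb: float, peak_gb: float = PEAK_FP32_GB,
                 reserve_gb: float = 0.0) -> bool:
    """Return True if the card has at least peak_gb free after the reserve."""
    return vram_gb - reserve_gb >= peak_gb


# Hypothetical card sizes, for illustration only.
for vram in (8.0, 12.0, 16.0, 24.0):
    verdict = "fits" if fits_in_vram(vram) else "too small"
    print(f"{vram:>4.0f} GB card: {verdict}")
```

With no reserve, a 16 GB card clears the bar and a 12 GB card does not, which matches the 16 GB minimum recommended above. A reserve of more than 1 GB would already rule out a 16 GB card, so the margin is thin.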

Can Stable Audio Open 1.0 run on a phone or CPU?

Not practically. Stable Audio Open 1.0 expects a GPU; CPU inference is too slow for real use, and there is no mainstream local runtime for iPhone, iPad or Android yet.

Can I use Stable Audio Open 1.0 commercially?

Conditionally. Free for commercial use under $1M annual revenue; enterprise license above. Cannot be used to train other generative models.

Notes

Latent-diffusion music model (1.06B DiT + autoencoder + T5-base, ~1.3B total) generating up to 47 s of 44.1 kHz stereo audio. The diffusion phase uses ~5.9 GB of VRAM; the decoder peaks at ~14.5 GB (measured on an RTX 3090), so budget ~15 GB unless you use chunked decoding. Stability Community License: free commercial use under $1M annual revenue. Sources: HF card, the Stable Audio Open paper, the license.
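The arithmetic behind those notes can be sketched as follows. The phase peaks, parameter count, sample rate and clip length are the figures quoted above; the function names are hypothetical, and the weight sizes assume decimal gigabytes (1 GB = 1e9 bytes) and count parameters only, not activations.

```python
# Figures quoted in the notes (fp32, measured on an RTX 3090).
PHASE_PEAK_GB = {"diffusion": 5.9, "decode": 14.5}
TOTAL_PARAMS = 1.3e9
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2
MAX_SECONDS = 47


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in decimal GB (activations not included)."""
    return params * bytes_per_param / 1e9


def peak_phase(phases: dict) -> tuple:
    """Return (name, GB) of the phase with the highest memory use."""
    name = max(phases, key=phases.get)
    return name, phases[name]


def output_samples(seconds: int, rate_hz: int, channels: int) -> int:
    """Total number of audio samples across all channels."""
    return seconds * rate_hz * channels


name, peak = peak_phase(PHASE_PEAK_GB)
print(f"peak phase: {name} at ~{peak} GB")
print(f"fp32 weights: ~{weights_gb(TOTAL_PARAMS, 4):.1f} GB")
print(f"fp16 weights: ~{weights_gb(TOTAL_PARAMS, 2):.1f} GB")
print(f"samples in a full clip: {output_samples(MAX_SECONDS, SAMPLE_RATE_HZ, CHANNELS):,}")
```

The fp32 weights alone come to roughly 5.2 GB, which is consistent with the ~5.9 GB diffusion phase. The decoder, not the weights, sets the overall budget, which is why chunked decoding is the lever mentioned for lowering it.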

Sources

The memory figure is a sourced peak-usage value at fp32, validated 2026-06-15. See methodology.