localmodel.run

Real memory math

Can I run this AI model locally?

Pick your device. Pick a model. Get a yes, a tight, or a no, plus the exact command to run it. No guessing. No signup.

Models: 95 · Devices: 39 · Modalities: 4 · Free: Always
Yes · Needs 6.4 GB · Usable 10.5 GB

Runs at Q4_K_M using ~6.4 GB of ~10.5 GB usable. You have room for Q8_0 for higher quality.

$ ollama run llama3.1:8b

Best on macOS: LM Studio · Q4_K_M recommended

Covers: Text generation · Image models · Video models · Audio models · Apple Silicon · NVIDIA / AMD

Popular models

All 95 models →

How it works

1. Real sizes

Quant sizes come from Hugging Face GGUF repos, Ollama (for text models), and vendor specs, not guesses. Every number is sourced.

2. Honest memory math

We add KV cache and runtime overhead, and use realistic usable memory (Apple unified ~66-75% depending on chip, GPU VRAM minus driver).
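The arithmetic behind this step can be sketched roughly as follows. This is an illustration, not the site's actual calculator: the function names, the 0.5 GB runtime overhead, and the 4.9 GB weights figure are assumptions chosen for the example. The KV cache formula is the standard one for transformer decoders, and the layer and head counts are those of Llama 3.1 8B (32 layers, 8 KV heads, head dimension 128).

```python
# Sketch of the memory math; all default values are illustrative assumptions.

def kv_cache_gb(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Size of the KV cache in GB (1 GB = 1e9 bytes).

    Keys and values (the factor 2) are stored for every layer,
    KV head, head dimension and token in the context window.
    """
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / 1e9


def needed_gb(weights_gb, kv_gb, overhead_gb=0.5):
    """Weights + KV cache + runtime overhead (0.5 GB is an assumed value)."""
    return weights_gb + kv_gb + overhead_gb


def usable_gb(total_gb, usable_fraction):
    """Memory a model can realistically use, as a fraction of the total."""
    return total_gb * usable_fraction


# Llama 3.1 8B shape, 8192-token context, 16-bit cache entries.
kv = kv_cache_gb(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192)

# 4.9 GB is a placeholder for the Q4_K_M file size; look up the real figure.
need = needed_gb(weights_gb=4.9, kv_gb=kv)

# 0.66 is the low end of the Apple unified-memory range quoted above.
have = usable_gb(total_gb=16, usable_fraction=0.66)

print(f"KV cache: {kv:.2f} GB, needed: {need:.2f} GB, usable: {have:.2f} GB")
```

Longer contexts grow the KV cache linearly, so the same model can move from a comfortable fit to a tight one just by raising `context_len`.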

3. The right tool

Every platform has a different winner: MLX on Mac, CUDA on Windows, vLLM on Linux, PocketPal on phones. Pick the wrong one and you can lose as much as half your tokens per second.

Frequently asked

How do I know if my computer can run a local AI model?

Compare the model's memory needs to your usable memory. A 7-8B text model at Q4_K_M needs about 6-7 GB, so it runs on a 16 GB Mac or a 12 GB GPU. Image and video diffusion models typically need 4-12 GB of GPU VRAM or Apple Silicon unified memory. Audio models (Whisper, Kokoro) can run on CPU and need 1-4 GB. localmodel.run does this math for 95 models across 39 devices.
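As a rough sketch of that comparison, the yes / tight / no verdict could be computed like this. The `verdict` function and its 10% "tight" margin are hypothetical; the site does not state its actual thresholds.

```python
def verdict(needed_gb, usable_gb, tight_margin=0.10):
    """Classify a model/device pair as "yes", "tight" or "no".

    tight_margin is an assumed cutoff: a model that fits but leaves less
    than this fraction of usable memory free is reported as "tight".
    """
    if needed_gb > usable_gb:
        return "no"
    if needed_gb > usable_gb * (1 - tight_margin):
        return "tight"
    return "yes"


print(verdict(6.4, 10.5))   # yes   (the Q4_K_M example at the top of the page)
print(verdict(10.0, 10.5))  # tight
print(verdict(11.0, 10.5))  # no
```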

Can I run AI models locally on a Mac?

Yes. Apple Silicon shares unified memory between CPU and GPU, so a 16 GB Mac runs 7-8B models and a 64 GB+ Mac runs 70B models. Use LM Studio (which ships MLX) for a GUI, or mlx-lm for the most speed. vLLM is not a Mac tool; it is a Linux/CUDA serving engine.

Can I run AI models on my phone?

Yes, within limits. iPhones and Android flagships realistically run 1B-4B text models. For text: PocketPal AI works on both iOS and Android; Apple Foundation Models is built into iOS 26. For images on iPhone: Draw Things supports diffusion models locally. For audio: Whisper (speech-to-text) runs on both iOS and Android.

Which is the best tool to run models locally?

On Mac, start with LM Studio: it ships MLX and has a GUI. On Linux, use Ollama for quick chat, or vLLM if you are serving traffic. On phones: for text, PocketPal AI (iOS and Android) or Apple Foundation Models (iOS 26); for images on iPhone, Draw Things; for audio, whisper.cpp. Each device page links the right tool so you do not have to guess.

Estimates, not guarantees. See how we calculate and our sources.