Real memory math
Can I run this AI model locally?
Pick your device. Pick a model. Get a yes, a tight, or a no, plus the exact command to run it. No guessing. No signup.
- Models: 95
- Devices: 39
- Modalities: 4
- Free: Always
Runs at Q4_K_M using ~6.4 GB of ~10.5 GB usable. You have room for Q8_0 for higher quality.
ollama run llama3.1:8b
Best on macOS: LM Studio · Q4_K_M recommended
Popular models
All 95 models →
Llama 3.1 8B
8B · 4.92 GB at Q4_K_M · 128k context
111M Ollama pulls
DeepSeek-R1-Distill-Qwen 7B
7B · 4.68 GB at Q4_K_M · 128k context
79.3M Ollama pulls
Gemma 3 4B
4B · 2.49 GB at Q4_K_M · 128k context
32.8M Ollama pulls
Mistral 7B
7B · 4.37 GB at Q4_K_M · 32k context
26.1M Ollama pulls
Qwen2.5 7B
7B · 4.68 GB at Q4_K_M · 128k context
23.2M Ollama pulls
Qwen3 8B
8B · 5.03 GB at Q4_K_M · 32k context
23M Ollama pulls
Popular hardware
All 39 devices →
How it works
Real sizes
Quant sizes come from Hugging Face GGUF repos, Ollama (for text models), and vendor specs, not guesses. Every number is sourced.
Honest memory math
We add KV cache and runtime overhead to the model weights, and compare against realistic usable memory: about 66-75% of Apple unified memory depending on the chip, or GPU VRAM minus driver overhead.
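The memory math above can be sketched in a few lines. This is an illustration, not the site's actual calculator: the `ModelSpec` type, the 0.5 GB runtime overhead, the 8k-token context, the 0.9 "tight" threshold, and the 16 GB machine at 66% usable are all assumptions picked for the example. The Llama 3.1 8B layer and head counts are that model's published architecture.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    """Hypothetical container for the numbers the estimate needs."""
    name: str
    weights_gb: float  # quantized file size, e.g. the Q4_K_M GGUF
    n_layers: int
    n_kv_heads: int
    head_dim: int


def kv_cache_gb(model: ModelSpec, context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache size: one K and one V vector per layer, per KV head, per token.

    bytes_per_value=2 assumes an fp16 cache.
    """
    total_bytes = (2 * model.n_layers * model.n_kv_heads * model.head_dim
                   * context_tokens * bytes_per_value)
    return total_bytes / 1e9


def required_gb(model: ModelSpec, context_tokens: int, overhead_gb: float = 0.5) -> float:
    """Weights + KV cache + runtime overhead (overhead_gb is an assumed constant)."""
    return model.weights_gb + kv_cache_gb(model, context_tokens) + overhead_gb


def verdict(required: float, usable: float, tight_ratio: float = 0.9) -> str:
    """Return "yes", "tight", or "no". tight_ratio is an assumed threshold."""
    if required > usable:
        return "no"
    if required > tight_ratio * usable:
        return "tight"
    return "yes"


# Llama 3.1 8B at Q4_K_M: 4.92 GB weights, 32 layers, 8 KV heads, head dim 128.
llama = ModelSpec("Llama 3.1 8B", 4.92, n_layers=32, n_kv_heads=8, head_dim=128)

usable = 16 * 0.66               # assumed: 16 GB Mac at the low end of 66-75%
need = required_gb(llama, 8192)  # assumed: 8k-token context

print(f"{llama.name}: {verdict(need, usable)} ({need:.1f} GB of {usable:.1f} GB usable)")
```

With these assumed constants the estimate lands close to the ~6.4 GB of ~10.5 GB shown in the example result at the top of the page, though not exactly on it. The KV cache term grows linearly with context length, so the same model can move from "yes" to "tight" as the context gets longer.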
The right tool
Every platform has a different winner: MLX on Mac, CUDA on Windows, vLLM on Linux, PocketPal on phones. Pick the wrong one and you can lose as much as half your tokens per second.
Frequently asked
How do I know if my computer can run a local AI model?
Compare the model's memory needs to your usable memory. A 7-8B text model at Q4_K_M needs about 6-7 GB including KV cache and overhead, so it runs on a 16 GB Mac or a 12 GB GPU. Image and video diffusion models typically need 4-12 GB of GPU VRAM or Apple Silicon unified memory. Audio models (Whisper, Kokoro) can run on CPU and need 1-4 GB. localmodel.run does this math for 95 models across 39 devices.
Can I run AI models locally on a Mac?
Yes. Apple Silicon shares unified memory between the CPU and GPU, so a 16 GB Mac runs 7-8B models and a 64 GB+ Mac runs 70B. Use LM Studio (which ships MLX) for a GUI, or mlx-lm for the most speed. vLLM is not a Mac tool; it is a Linux/CUDA serving engine.
Can I run AI models on my phone?
Yes, within limits. iPhones and Android flagships realistically run 1B-4B text models. For text: PocketPal AI works on both iOS and Android; Apple Foundation Models is built into iOS 26. For images on iPhone: Draw Things supports diffusion models locally. For audio: Whisper (speech-to-text) runs on both iOS and Android.
Which is the best tool to run models locally?
On Mac, start with LM Studio: it ships MLX and has a GUI. On Linux, use Ollama for quick chat, or vLLM if you are serving traffic. On phones: for text, PocketPal AI (iOS and Android) or Apple Foundation Models (iOS 26); for images on iPhone, Draw Things; for audio, whisper.cpp. Each device page links the right tool so you do not have to guess.
Estimates, not guarantees. See how we calculate and our sources.