Text model · mistral

Mistral Nemo 12B requirements

mistral family · 12.2B params · released Jul 2024 · 4.9M Ollama pulls. Minimum to run at Q4_K_M: NVIDIA GeForce RTX 3060 (12 GB).

Q4_K_M: 6.96 GB
Q8_0: 12.13 GB
Total @ Q4 (4k): ~8.6 GB
Context: 128k

Quantization sizes

GGUF quants on disk

Lower-bit quants are smaller and faster, at slightly lower quality. Q4_K_M is the common default.
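To make the size difference concrete, the file sizes quoted on this page can be turned into an average number of bits stored per parameter. This is a minimal sketch, assuming the sizes are decimal gigabytes (1 GB = 1e9 bytes) and using the 12.2B parameter count from above; `bits_per_weight` is a hypothetical helper name, not part of any tool.

```python
# Effective bits per weight implied by the GGUF file sizes quoted on this page.
PARAMS = 12.2e9  # parameter count from the page (12.2B)

# File sizes from the page; assumed to be decimal gigabytes (1 GB = 1e9 bytes).
FILE_SIZE_GB = {"Q4_K_M": 6.96, "Q8_0": 12.13}


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Average number of bits stored per parameter for a file of this size."""
    return size_gb * 1e9 * 8 / params


if __name__ == "__main__":
    for quant, size_gb in FILE_SIZE_GB.items():
        print(f"{quant}: {bits_per_weight(size_gb):.2f} bits/weight")
```

Under these assumptions Q4_K_M works out to roughly 4.6 bits per weight and Q8_0 to roughly 8, which is why the Q8_0 file is close to twice the size.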

Ollama

$ ollama run mistral-nemo:12b

llama.cpp

$ llama-cli -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q4_K_M

LM Studio

$ lms get bartowski/Mistral-Nemo-Instruct-2407-GGUF

How much VRAM or RAM does Mistral Nemo 12B need?

At Q4_K_M, Mistral Nemo 12B needs about 8.6 GB at a 4k context: roughly 6.96 GB of weights, plus the KV cache and runtime overhead. At Q8_0, budget about 13.7 GB.
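The breakdown into weights, KV cache, and overhead can be sketched as a small calculation. The architecture values below (40 layers, 8 KV heads, head dimension 128) are assumptions taken from the model's published configuration rather than from this page, the KV cache is assumed to be stored in fp16, and the 1 GB overhead is a guess chosen to land near the page's totals. The page's own methodology may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    # Assumed Mistral Nemo 12B attention layout (not stated on this page).
    n_layers: int = 40
    n_kv_heads: int = 8
    head_dim: int = 128


def kv_cache_gb(context_tokens: int,
                shape: ModelShape = ModelShape(),
                bytes_per_value: int = 2) -> float:
    """KV cache size in decimal GB; bytes_per_value=2 assumes fp16."""
    # Factor 2: each layer stores one key and one value tensor per token.
    values = 2 * shape.n_layers * shape.n_kv_heads * shape.head_dim * context_tokens
    return values * bytes_per_value / 1e9


def total_gb(weights_gb: float, context_tokens: int,
             overhead_gb: float = 1.0) -> float:
    """Weights + KV cache + a flat overhead guess (assumed, not measured)."""
    return weights_gb + kv_cache_gb(context_tokens) + overhead_gb


if __name__ == "__main__":
    print(f"KV cache @ 4k:   {kv_cache_gb(4096):.2f} GB")
    print(f"Q4_K_M total:    {total_gb(6.96, 4096):.2f} GB")
    print(f"Q8_0 total:      {total_gb(12.13, 4096):.2f} GB")
    print(f"KV cache @ 128k: {kv_cache_gb(131072):.2f} GB")
```

With these assumptions the 4k totals come out within roughly 0.1 GB of the figures above. The same formula also shows why the full 128k context is far more expensive: the fp16 KV cache alone grows to over 21 GB.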

Can Mistral Nemo 12B run on a laptop?

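Yes, if the laptop has enough memory. At Q4_K_M, Mistral Nemo 12B needs about 8.6 GB at a 4k context, so a laptop GPU with 12 GB of VRAM, or a Mac with roughly 16 GB or more of unified memory, should be able to run it. Q8_0 and longer contexts need more.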
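Whether a particular machine has room can be checked against the 4k-context totals quoted on this page. The sketch below is illustrative only: `quants_that_fit` is a hypothetical helper, and the 1 GB of headroom reserved for the operating system and other applications is an assumption.

```python
# Totals quoted on this page at a 4k context, in GB.
TOTAL_GB_AT_4K = {"Q4_K_M": 8.6, "Q8_0": 13.7}


def quants_that_fit(memory_gb: float, headroom_gb: float = 1.0) -> list[str]:
    """Return the quants whose estimated total fits in the available memory.

    headroom_gb is an assumed reserve for the OS, display, and other apps.
    """
    usable_gb = memory_gb - headroom_gb
    return [quant for quant, need_gb in TOTAL_GB_AT_4K.items() if need_gb <= usable_gb]


if __name__ == "__main__":
    for memory_gb in (8, 12, 16):
        print(f"{memory_gb} GB: {quants_that_fit(memory_gb)}")
```

For example, a 12 GB budget leaves room for Q4_K_M only, while 16 GB fits both quants at a 4k context under these assumptions.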

Can I use Mistral Nemo 12B commercially?

Yes. Mistral Nemo 12B is licensed Apache-2.0, which permits commercial use.

Mistral Nemo 12B was built with NVIDIA and supports a 128K context. Q4_K_M and Q8_0 sizes are taken from the bartowski GGUF repo.

Memory figures are estimates. See methodology.