Text model · Sarvam

Sarvam-1 2B requirements

Sarvam family · 2B params · released Oct 2024. Minimum to run at Q4_K_M: Apple M1 (8GB).

License: Sarvam non-commercial · 3.8K downloads/mo · 139 likes on HuggingFace

Q4_K_M: 1.55 GB
Q8_0: 2.69 GB
Total @ Q4 (4k): ~2.7 GB
Context: 8k

Quantization sizes

GGUF quants on disk

Quantization	Size on disk
Q2_K	0.8 GB est
Q3_K_M	1 GB est
Q4_K_M (default)	1.55 GB
Q5_K_M	1.4 GB est
Q6_K	1.6 GB est
Q8_0	2.69 GB
FP16	5.05 GB

Lower-bit quants are smaller and faster, at slightly lower quality. Q4_K_M is the common default.
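As a rough illustration of how the table might be used, the sketch below picks the highest-precision quant whose weights fit a given memory budget. The function name `best_quant` and the 1.15 GB headroom are assumptions, not part of the page: the headroom is simply the page's ~2.7 GB total at Q4_K_M (4k context) minus the 1.55 GB of weights. Rows marked "est" in the table are used as-is.

```python
# (quant, size on disk in GB), copied from the table in precision order.
# Rows marked "est" on the page are estimates, not measured file sizes.
QUANTS = [
    ("Q2_K", 0.8),
    ("Q3_K_M", 1.0),
    ("Q4_K_M", 1.55),
    ("Q5_K_M", 1.4),
    ("Q6_K", 1.6),
    ("Q8_0", 2.69),
    ("FP16", 5.05),
]


def best_quant(budget_gb, headroom_gb=1.15):
    """Return the highest-precision quant that fits, or None if none does.

    headroom_gb is an assumed allowance for KV cache and runtime overhead
    at a 4k context (2.7 GB total minus 1.55 GB of Q4_K_M weights).
    """
    fitting = [name for name, size in QUANTS if size + headroom_gb <= budget_gb]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for budget in (1.0, 3.0, 4.0, 8.0):
        print(f"{budget} GB budget -> {best_quant(budget)}")
```

A larger context needs more KV cache, so the headroom would have to grow with it; the constant here only matches the page's 4k figure.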

Run it

llama.cpp

$ llama-cli -hf bartowski/sarvam-1-GGUF:Q4_K_M
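If the command is launched from a script, a small helper can assemble it for a chosen quant and context size. This is only a sketch: `build_command` is a hypothetical name, the repo string is the one from the command above, and `-c` is llama.cpp's context-size flag. The helper builds and prints the command without running it.

```python
import shlex

# Repo name taken from the llama.cpp command shown on this page.
REPO = "bartowski/sarvam-1-GGUF"


def build_command(quant="Q4_K_M", ctx=4096):
    """Assemble a llama-cli argument list for the given quant and context size."""
    return ["llama-cli", "-hf", f"{REPO}:{quant}", "-c", str(ctx)]


if __name__ == "__main__":
    # Print a shell-safe version of the command; pass the list to
    # subprocess.run(...) instead if llama-cli is installed and on PATH.
    print(shlex.join(build_command()))
```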

LM Studio

$ lms get bartowski/sarvam-1-GGUF

Which devices can run Sarvam-1 2B?

Apple Silicon Macs

RAM-only laptops

iPhone & iPad

Android

NVIDIA GPUs

AMD GPUs

AMD Radeon RX 7900 XTX (24GB): Yes

FAQ

How much VRAM or RAM does Sarvam-1 2B need?

At Q4_K_M, Sarvam-1 2B needs about 2.7 GB at a 4k context (weights ~1.55 GB, plus KV cache and overhead). At Q8_0, budget about 3.8 GB.
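The arithmetic behind these figures can be sketched as weights plus a fixed allowance for KV cache and overhead. The allowance below is an assumption backed out of the page's own Q4_K_M numbers (2.7 GB total minus 1.55 GB of weights); the page does not state it directly, and `total_memory_gb` is a hypothetical helper name.

```python
# Weight sizes in GB, from the quantization table on this page.
WEIGHTS_GB = {"Q4_K_M": 1.55, "Q8_0": 2.69, "FP16": 5.05}

# Assumption: KV cache + runtime overhead at a 4k context, derived from the
# page's Q4_K_M figures rather than measured.
NON_WEIGHT_GB_4K = 2.7 - 1.55


def total_memory_gb(quant):
    """Estimate total memory at a 4k context: weights + KV cache + overhead."""
    return WEIGHTS_GB[quant] + NON_WEIGHT_GB_4K


if __name__ == "__main__":
    for quant in ("Q4_K_M", "Q8_0"):
        print(f"{quant}: ~{total_memory_gb(quant):.1f} GB")
```

With the same allowance, the Q8_0 estimate comes out at 2.69 + 1.15 = 3.84 GB, which rounds to the ~3.8 GB quoted above.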

Can Sarvam-1 2B run on a laptop?

Yes. At Q4_K_M, Sarvam-1 2B needs about 2.7 GB at a 4k context, so it fits on an 8 GB Apple M1 (the stated minimum) and runs comfortably on a 16 GB machine or a 12 GB+ GPU.

Can I use Sarvam-1 2B commercially?

No. Non-commercial use only per the Sarvam license. Check the current HuggingFace model card for updates.

2B dense model trained from scratch, optimized for 10 Indic languages plus English. Released 2024-10-24 under the Sarvam non-commercial license (not Apache). GGUF sizes from bartowski: Q4_K_M 1.55GB, Q8_0 2.69GB. Context 8K from the model card.

Sources

Memory figures are estimates. See methodology.