Llama 3.2 3B Instruct
3B | Meta Llama 3.2
Tiny but capable. Runs in 4 GB of VRAM or 8 GB of RAM, and even on phones via llama.cpp.
Consumer GPU | Mac / Apple Silicon | CPU / VPS
Max Context: 131K
Quant Variants: 3
Best Quality: GGUF Q8_0
Accuracy Retained: 99.8%
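As a rough sanity check on the 4 GB VRAM figure, the sketch below estimates the size of the Q8_0 weights alone. It assumes about 3.2 billion parameters (an approximation, not a figure from this page) and uses the GGUF Q8_0 block layout of 32 int8 values plus one 16-bit scale, which works out to 8.5 bits per weight. The helper name `estimate_weight_gib` is made up for illustration, and the estimate ignores the KV cache and runtime buffers, which grow with context length.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: roughly 3.2e9 parameters for a "3B" model.
N_PARAMS = 3.2e9

# Q8_0 stores blocks of 32 int8 weights plus one 16-bit scale:
# 34 bytes per 32 weights = 8.5 bits per weight.
Q8_0_BITS_PER_WEIGHT = 8.5

size_gib = estimate_weight_gib(N_PARAMS, Q8_0_BITS_PER_WEIGHT)
print(f"Estimated Q8_0 weights: {size_gib:.2f} GiB")
```

Under these assumptions the weights come in under 4 GiB, leaving limited headroom for the KV cache, so long contexts may need a smaller quant variant or partial CPU offload.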