Stable LM 2 12B Chat

12B

Stability AI

Stability AI's 12B chat model. A solid general-purpose option for 16 GB GPUs.

Consumer GPU | Mac / Apple Silicon

Max Context: 4K
Quant Variants: 2
Best Quality: GGUF Q4_K_M
Accuracy Retained: 96.8%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM    PPL Loss  Speed
GGUF    Q4_K_M  4.85  8.2 GB  3.2%      108 tok/s
AWQ     INT4    4     7.2 GB  4.5%      142 tok/s
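
As a rough sanity check on the VRAM column, the size of the quantized weights alone can be estimated from the bits-per-weight (BPW) figure. The sketch below assumes a nominal parameter count of exactly 12 billion (the page only says "12B") and decimal gigabytes; the function name and the variants dictionary are illustrative, not part of any tool. Whatever the listed VRAM figure exceeds this estimate by would be taken up by things like the KV cache, activations, and runtime overhead, though the page does not say how its VRAM numbers were measured.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal "12B" from the page, not an exact parameter count.
N_PARAMS = 12e9

# BPW and VRAM values copied from the table above.
variants = {
    "GGUF Q4_K_M": {"bpw": 4.85, "vram_gb": 8.2},
    "AWQ INT4": {"bpw": 4.0, "vram_gb": 7.2},
}

for name, v in variants.items():
    weights = weight_memory_gb(N_PARAMS, v["bpw"])
    remainder = v["vram_gb"] - weights  # listed VRAM not explained by weights
    print(f"{name}: weights ~{weights:.2f} GB, remainder ~{remainder:.2f} GB")
```

Under these assumptions the weights account for most of each listed VRAM figure, which is consistent with the lower-BPW AWQ variant needing less memory than Q4_K_M.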