DeepSeek-R1-Distill-Qwen-7B

7B

DeepSeek

R1 reasoning in a 7B footprint. Best value for chain-of-thought (CoT) experiments on 8–12 GB of VRAM.

51.5K HF downloads · 131 likes · bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 131K

Quant Variants: 2

Best Quality: EXL2 4.65bpw

Accuracy Retained: 97.8%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level    BPW   VRAM    PPL Loss  Speed
GGUF    Q4_K_M   4.85  5.4 GB  3.0%      152 tok/s
EXL2    4.65bpw  4.65  5.2 GB  2.2%      210 tok/s
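
The columns above are related by simple arithmetic, sketched roughly below. Two things here are assumptions, not statements from this page: the parameter count (about 7.6 billion, a commonly cited figure for this model family) and the reading of "Accuracy Retained" as 100% minus PPL loss, which happens to match the 97.8% and 2.2% figures for EXL2. The helper names `weight_gb` and `accuracy_retained` are hypothetical.

```python
# Rough sketch relating BPW, VRAM, and PPL loss from the table above.
# ASSUMPTION: ~7.6e9 parameters; this page does not state the exact count.
ASSUMED_PARAMS = 7.6e9


def weight_gb(bpw: float, n_params: float = ASSUMED_PARAMS) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bpw / 8 / 1e9


def accuracy_retained(ppl_loss_pct: float) -> float:
    """ASSUMPTION: 'Accuracy Retained' is read as 100% minus PPL loss."""
    return 100.0 - ppl_loss_pct


# Values copied from the table (measured on an RTX 4090 per the page).
quants = {
    "GGUF Q4_K_M": {"bpw": 4.85, "vram_gb": 5.4, "ppl_loss_pct": 3.0},
    "EXL2 4.65bpw": {"bpw": 4.65, "vram_gb": 5.2, "ppl_loss_pct": 2.2},
}

for name, q in quants.items():
    weights = weight_gb(q["bpw"])
    # Whatever VRAM is left over goes to non-weight overhead
    # (for example KV cache and runtime buffers).
    overhead = q["vram_gb"] - weights
    print(
        f"{name}: weights ~{weights:.2f} GB, "
        f"other VRAM ~{overhead:.2f} GB, "
        f"retained ~{accuracy_retained(q['ppl_loss_pct']):.1f}%"
    )
```

Under these assumptions, the weights account for most of the listed VRAM, and the remainder is the part that would grow with longer contexts.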