Qwen2.5 7B Instruct

7B

Alibaba Qwen2.5

Alibaba's highly optimized 7B model. Punches well above its weight, especially in coding.

173.2K HF downloads · 157 likes · Qwen/Qwen2.5-7B-Instruct-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 131K
Quant Variants: 4
Best Quality: GGUF Q6_K
Accuracy Retained: 99.3%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level    BPW   VRAM    PPL Loss  Speed
GGUF    Q4_K_M   4.85  5.4 GB  3.0%      155 tok/s
GGUF    Q6_K     6.56  7.0 GB  0.7%      132 tok/s
AWQ     INT4     4     4.8 GB  4.2%      222 tok/s
EXL2    4.65bpw  4.65  5.2 GB  2.2%      245 tok/s
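
To make the table easier to act on, here is a minimal Python sketch that estimates raw weight size from bits per weight (BPW) and picks the lowest-PPL-loss variant that fits a VRAM budget. The rows are copied from the table above. The parameter count (about 7.61 billion) is an assumption not stated on this page, and the helper names `weights_gb` and `best_fit` are hypothetical, chosen only for illustration.

```python
# Rows copied from the table above: VRAM in GB, PPL loss in percent.
VARIANTS = [
    {"format": "GGUF", "level": "Q4_K_M", "bpw": 4.85, "vram_gb": 5.4, "ppl_loss": 3.0, "tok_s": 155},
    {"format": "GGUF", "level": "Q6_K", "bpw": 6.56, "vram_gb": 7.0, "ppl_loss": 0.7, "tok_s": 132},
    {"format": "AWQ", "level": "INT4", "bpw": 4.0, "vram_gb": 4.8, "ppl_loss": 4.2, "tok_s": 222},
    {"format": "EXL2", "level": "4.65bpw", "bpw": 4.65, "vram_gb": 5.2, "ppl_loss": 2.2, "tok_s": 245},
]

# Assumption: roughly 7.61e9 parameters. This figure is not given on the page.
ASSUMED_PARAMS = 7.61e9


def weights_gb(bpw, params=ASSUMED_PARAMS):
    """Approximate weight size in decimal GB: params * bits per weight / 8 bits per byte."""
    return params * bpw / 8 / 1e9


def best_fit(vram_budget_gb):
    """Return the lowest-PPL-loss variant whose listed VRAM fits the budget, or None."""
    fitting = [v for v in VARIANTS if v["vram_gb"] <= vram_budget_gb]
    if not fitting:
        return None
    return min(fitting, key=lambda v: v["ppl_loss"])


if __name__ == "__main__":
    for v in VARIANTS:
        est = weights_gb(v["bpw"])
        print(f'{v["format"]} {v["level"]}: ~{est:.1f} GB weights, {v["vram_gb"]} GB listed VRAM')
    print(best_fit(6.0))
```

Under these assumptions, a 6 GB budget selects EXL2 4.65bpw and an 8 GB budget selects GGUF Q6_K. The estimated weight size comes out below the listed VRAM in every row; the difference is presumably KV cache and runtime overhead, which this sketch does not model.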