Qwen2.5 14B Instruct

14B

Alibaba Qwen2.5

The sweet spot between performance and resource usage. At Q4 it fits on a 16 GB GPU.

93.3K HF downloads · 57 likes · Qwen/Qwen2.5-14B-Instruct-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon

Max Context: 131K tokens
Quant Variants: 4
Best Quality: GGUF Q5_K_M
Accuracy Retained: 98.6%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

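| Format | Level | BPW | VRAM | PPL Loss | Speed |
|--------|---------|------|---------|----------|-----------|
| GGUF | Q4_K_M | 4.85 | 10.2 GB | 2.9% | 98 tok/s |
| GGUF | Q5_K_M | 5.68 | 11.8 GB | 1.4% | 86 tok/s |
| AWQ | INT4 | 4 | 9.2 GB | 3.8% | 128 tok/s |
| EXL2 | 4.65bpw | 4.65 | 9.8 GB | 2.1% | 138 tok/s |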
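
To make the relationship between the columns concrete, here is a minimal Python sketch that works from the table rows above. It rests on three assumptions that are not stated on this page: the model has roughly 14.7 billion parameters, "accuracy retained" is simply 100% minus the PPL loss (which matches the 98.6% shown for Q5_K_M), and the VRAM column covers weights plus runtime overhead such as the KV cache. The helper names (`weight_gb`, `accuracy_retained`, `best_under_budget`) are hypothetical and only illustrate the arithmetic.

```python
# Rows copied from the table above:
# (format, level, bits per weight, VRAM in GB, PPL loss in %, tokens per second)
QUANTS = [
    ("GGUF", "Q4_K_M", 4.85, 10.2, 2.9, 98),
    ("GGUF", "Q5_K_M", 5.68, 11.8, 1.4, 86),
    ("AWQ", "INT4", 4.0, 9.2, 3.8, 128),
    ("EXL2", "4.65bpw", 4.65, 9.8, 2.1, 138),
]

# Assumption: approximate parameter count, not taken from this page.
ASSUMED_PARAMS = 14.7e9


def weight_gb(bpw, params=ASSUMED_PARAMS):
    """Estimate the size of the quantized weights alone, in decimal GB."""
    return params * bpw / 8 / 1e9


def accuracy_retained(ppl_loss_pct):
    """Assumed definition: 100% minus the reported PPL loss."""
    return 100.0 - ppl_loss_pct


def best_under_budget(budget_gb):
    """Return the row with the lowest PPL loss whose listed VRAM fits the budget."""
    fits = [q for q in QUANTS if q[3] <= budget_gb]
    return min(fits, key=lambda q: q[4]) if fits else None


if __name__ == "__main__":
    for fmt, level, bpw, vram, loss, speed in QUANTS:
        est = weight_gb(bpw)
        print(f"{fmt} {level}: weights ~{est:.1f} GB, "
              f"listed VRAM {vram} GB, retained {accuracy_retained(loss):.1f}%")
    print("Best under 10 GB:", best_under_budget(10.0))
```

Under these assumptions the estimated weight size comes out below the listed VRAM for every row, and the gap is what remains for context and runtime buffers. Picking by lowest PPL loss is one possible policy; a throughput-first policy would sort on the speed column instead.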