Qwen2.5 14B Instruct

14B

Alibaba Qwen2.5

A sweet spot between performance and resource usage: the Q4 quantization fits on a 16 GB GPU.

93.3K HF downloads · 57 likes · Qwen/Qwen2.5-14B-Instruct-GGUF · stats from 6/24/2026

Consumer GPU · Mac / Apple Silicon

Max Context: 131K

Quant Variants

Best Quality: GGUF Q5_K_M

Accuracy Retained: 98.6%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	10.2 GB	2.9%	98 tok/s
GGUF	Q5_K_M	5.68	11.8 GB	1.4%	86 tok/s
AWQ	INT4	4	9.2 GB	3.8%	128 tok/s
EXL2	4.65bpw	4.65	9.8 GB	2.1%	138 tok/s
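
The BPW column can be related to the VRAM column with a rough calculation: quantized weights take about parameters × BPW / 8 bytes. The sketch below is only an illustration. It assumes a nominal 14e9 parameters (read off the "14B" label, not an exact count), and the names QUANTS, weight_gb, and fits are hypothetical helpers, not part of any library. The gap between the estimated weight size and the listed VRAM is presumably KV cache and runtime overhead, but the page does not say how its VRAM figures were measured.

```python
# Rows copied from the table above:
# (format, level, bits per weight, listed VRAM in GB, PPL loss in %)
QUANTS = [
    ("GGUF", "Q4_K_M", 4.85, 10.2, 2.9),
    ("GGUF", "Q5_K_M", 5.68, 11.8, 1.4),
    ("AWQ", "INT4", 4.0, 9.2, 3.8),
    ("EXL2", "4.65bpw", 4.65, 9.8, 2.1),
]

# Assumption: nominal parameter count taken from the "14B" label.
NOMINAL_PARAMS = 14e9


def weight_gb(bpw, params=NOMINAL_PARAMS):
    """Approximate size of the quantized weights in decimal GB."""
    return params * bpw / 8 / 1e9


def fits(budget_gb):
    """Return rows whose listed VRAM is within budget, lowest PPL loss first."""
    rows = [q for q in QUANTS if q[3] <= budget_gb]
    return sorted(rows, key=lambda q: q[4])


if __name__ == "__main__":
    for fmt, level, bpw, vram, loss in QUANTS:
        w = weight_gb(bpw)
        print(f"{fmt} {level}: weights ~{w:.1f} GB, listed {vram} GB, "
              f"remainder ~{vram - w:.1f} GB")

    best = fits(16.0)[0]
    print("Lowest PPL loss within 16 GB:", best[0], best[1])
```

Calling fits with a smaller budget, for example fits(10.0), narrows the list to the variants whose listed VRAM is at or below 10 GB. The ranking uses only the PPL loss column, so a reader who cares more about throughput would sort on the Speed column instead.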