Qwen2.5 0.5B Instruct

0.5B

Alibaba Qwen2.5

The smallest Qwen2.5 model. Well suited to Raspberry Pi, phones, and ultra-low-latency demos.

Qwen/Qwen2.5-0.5B-Instruct-GGUF · 205.8K HF downloads · 105 likes · stats from 6/24/2026

Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 33K

Quant Variants

Best Quality: GGUF Q8_0

Accuracy Retained: 99.5%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on an RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	0.6 GB	5.2%	620 tok/s
GGUF	Q8_0	8.5	0.9 GB	0.5%	540 tok/s
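
The BPW column can be related to the VRAM column with some rough arithmetic. The sketch below is illustrative only. It assumes a round 0.5e9 parameters (taken from the "0.5B" label; the exact count may differ) and treats whatever VRAM is left after the weights as runtime overhead such as the KV cache and buffers, which this page does not break down. It also computes 100 minus the PPL loss, which happens to match the 99.5% "Accuracy Retained" figure for Q8_0, though the page does not say that is how the figure was derived. The names `weight_gb` and `summarize` are hypothetical helpers, not part of any library.

```python
# Rough weight-memory estimate from bits per weight (BPW).
# Assumption: ~0.5e9 parameters, read off the "0.5B" label; the true count may differ.
PARAMS = 0.5e9

# (level, BPW, listed VRAM in GB, listed PPL loss in %) copied from the table above.
QUANTS = [
    ("Q4_K_M", 4.85, 0.6, 5.2),
    ("Q8_0", 8.5, 0.9, 0.5),
]


def weight_gb(params: float, bpw: float) -> float:
    """Size of the quantized weights alone, in decimal gigabytes."""
    return params * bpw / 8 / 1e9


def summarize(quants=QUANTS, params=PARAMS):
    """Split each listed VRAM figure into estimated weights and everything else."""
    rows = []
    for level, bpw, vram_gb, ppl_loss in quants:
        weights = weight_gb(params, bpw)
        rows.append({
            "level": level,
            "weights_gb": round(weights, 3),
            # Assumption: the remainder is runtime overhead (KV cache, buffers).
            "other_gb": round(vram_gb - weights, 3),
            # Assumption: "accuracy retained" is read as 100 minus PPL loss.
            "retained_pct": round(100 - ppl_loss, 1),
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

Under these assumptions the weights account for roughly half of each listed VRAM figure, so for a model this small the fixed runtime overhead matters about as much as the choice of quantization level.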