Qwen2.5 0.5B Instruct

0.5B

Alibaba Qwen2.5

The smallest model in the Qwen2.5 family, suited to Raspberry Pi, phones, and ultra-low-latency demos.

205.8K HF downloads · 105 likes · Qwen/Qwen2.5-0.5B-Instruct-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 33K
Quant Variants: 2
Best Quality: GGUF Q8_0
Accuracy Retained: 99.5%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM    PPL Loss  Speed
GGUF    Q4_K_M  4.85  0.6 GB  5.2%      620 tok/s
GGUF    Q8_0    8.5   0.9 GB  0.5%      540 tok/s
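
The BPW (bits per weight) column can be related to the VRAM column with some quick arithmetic. The sketch below is illustrative only: it assumes a nominal parameter count of 0.5 billion (the page gives only the "0.5B" label), estimates weight storage as parameters × BPW / 8, and treats the difference from the listed VRAM as runtime overhead such as KV cache and buffers. It also assumes that "Accuracy Retained" is simply 100% minus PPL loss. The page does not state this, but the 99.5% headline figure matches Q8_0 under that reading. The names `PARAMS`, `weight_size_gb`, and `accuracy_retained_pct` are hypothetical helpers, not part of any library.

```python
# Figures copied from the quantization table; PARAMS is an assumed nominal value.
PARAMS = 0.5e9  # assumption: "0.5B" taken literally as 500 million parameters

QUANTS = {
    "Q4_K_M": {"bpw": 4.85, "vram_gb": 0.6, "ppl_loss_pct": 5.2},
    "Q8_0": {"bpw": 8.5, "vram_gb": 0.9, "ppl_loss_pct": 0.5},
}


def weight_size_gb(params: float, bpw: float) -> float:
    """Approximate weight storage in decimal GB: params * bits per weight / 8."""
    return params * bpw / 8 / 1e9


def accuracy_retained_pct(ppl_loss_pct: float) -> float:
    """Assumed relation: accuracy retained = 100% minus perplexity loss."""
    return 100.0 - ppl_loss_pct


for name, q in QUANTS.items():
    weights = weight_size_gb(PARAMS, q["bpw"])
    overhead = q["vram_gb"] - weights  # remainder attributed to runtime overhead
    print(
        f"{name}: weights ~{weights:.2f} GB, "
        f"overhead ~{overhead:.2f} GB, "
        f"accuracy retained ~{accuracy_retained_pct(q['ppl_loss_pct']):.1f}%"
    )
```

Under these assumptions the weights alone come to roughly 0.30 GB for Q4_K_M and 0.53 GB for Q8_0, so a sizeable share of the listed VRAM would be overhead rather than weights. The real split depends on context length and runtime settings, which the page does not specify.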