
InternLM2 20B Chat

20B

Shanghai AI Lab

Mid-size InternLM2 model with excellent Chinese comprehension. Fits on a 24 GB GPU at Q4 (13.8 GB at Q4_K_M).

1.7K HF downloads · 8 likes · bartowski/internlm2_5-20b-chat-GGUF · stats from 6/24/2026
Consumer GPU · Pro GPU

Max Context: 33K
Quant Variants: 2
Best Quality: GGUF Q5_K_M
Accuracy Retained: 98.6%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM     PPL Loss  Speed
GGUF    Q4_K_M  4.85  13.8 GB  2.9%      78 tok/s
GGUF    Q5_K_M  5.68  15.8 GB  1.4%      68 tok/s
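
As a rough sanity check on the VRAM column, the weight footprint can be estimated from bits per weight (BPW) as parameters × BPW / 8. The sketch below assumes a nominal 20 billion parameters (taken from the "20B" label, not an exact count) and decimal gigabytes. The page does not say how its VRAM figures were derived, so the gap between this estimate and the listed value is presumably KV cache and runtime buffers; that reading is an assumption. The helper name `weight_gb` is hypothetical.

```python
# Rough estimate of quantized weight size from bits per weight (BPW).
# Assumption: 20e9 parameters (nominal "20B" label) and 1 GB = 1e9 bytes.

PARAMS_BILLION = 20.0   # nominal, not the exact parameter count
GPU_VRAM_GB = 24.0      # e.g. an RTX 4090


def weight_gb(params_billion: float, bpw: float) -> float:
    """Approximate weight size in GB: parameters * bits-per-weight / 8 bits per byte."""
    return params_billion * bpw / 8.0


# (level, BPW, VRAM listed on the page in GB)
rows = [("Q4_K_M", 4.85, 13.8), ("Q5_K_M", 5.68, 15.8)]

for level, bpw, listed_gb in rows:
    est = weight_gb(PARAMS_BILLION, bpw)
    remainder = listed_gb - est            # not explained by weights alone
    headroom = GPU_VRAM_GB - listed_gb     # VRAM left on a 24 GB card
    print(f"{level}: weights ~{est:.1f} GB, listed {listed_gb} GB, "
          f"remainder ~{remainder:.1f} GB, headroom ~{headroom:.1f} GB")
```

Under these assumptions the weights alone come to roughly 12.1 GB at Q4_K_M and 14.2 GB at Q5_K_M, so both variants leave several gigabytes free on a 24 GB card.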