Yi 1.5 34B Chat

34B

01.AI Yi

01.AI's bilingual (English/Chinese) chat model, positioned as competitive with Qwen 32B.

Consumer GPU

Pro GPU

Max Context: 4K

Quant Variants: 2

Best Quality: GGUF Q4_K_M

Accuracy Retained: 97.2%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM     PPL Loss  Speed
GGUF    Q4_K_M  4.85  22.5 GB  2.8%      40 tok/s
AWQ     INT4    4     19.8 GB  4.0%      52 tok/s
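
As a rough sanity check on the VRAM column, the weights-only footprint can be estimated from the bits-per-weight (BPW) figure. The sketch below is illustrative only: it assumes a nominal 34e9 parameters (read off the "34B" label, not an exact count) and decimal gigabytes, and the helper name weight_footprint_gb is hypothetical. Whatever is left over after subtracting the weights is presumably KV cache, activations, and runtime overhead, though the page does not break this down.

```python
# Nominal parameter count taken from the "34B" label.
# Assumption: the exact parameter count may differ slightly.
NOMINAL_PARAMS = 34e9


def weight_footprint_gb(params: float, bits_per_weight: float) -> float:
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# (format, level, BPW, reported VRAM in GB), copied from the table
variants = [
    ("GGUF", "Q4_K_M", 4.85, 22.5),
    ("AWQ", "INT4", 4.0, 19.8),
]

for fmt, level, bpw, reported_gb in variants:
    weights_gb = weight_footprint_gb(NOMINAL_PARAMS, bpw)
    remainder_gb = reported_gb - weights_gb  # not itemized by the source
    print(f"{fmt} {level}: weights ~{weights_gb:.1f} GB, "
          f"remainder ~{remainder_gb:.1f} GB")
```

Under these assumptions the weights alone come to roughly 20.6 GB for Q4_K_M and 17.0 GB for AWQ INT4, which is in line with the reported totals leaving a few gigabytes of headroom on a 24 GB card.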