Phi-3 Medium 14B Instruct

14B

Microsoft Phi

Microsoft's mid-size Phi-3 model. Excellent quality per GB on 16 GB cards.

Consumer GPU, Mac / Apple Silicon

Max Context: 131K tokens

Quant Variants: 3

Best Quality: GGUF Q6_K

Accuracy Retained: 99.2%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM     PPL Loss  Speed
GGUF    Q4_K_M  4.85  9.8 GB   3.0%      102 tok/s
GGUF    Q6_K    6.56  12.8 GB  0.8%      88 tok/s
AWQ     INT4    4     8.8 GB   4.2%      135 tok/s
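
As a rough cross-check of the table above, the weight-only footprint of each variant can be estimated as parameters × BPW / 8. The sketch below assumes a nominal 14e9 parameters (taken from the "14B" label, not an exact count) and decimal gigabytes. The helper names `weight_gb` and `fits` and the 1 GB headroom are illustrative assumptions, not part of any tool. The gap between the estimate and the listed VRAM is presumably KV cache, activations, and runtime overhead, though the page does not break that down.

```python
# Nominal parameter count from the "14B" label; the exact count is an assumption.
PARAMS = 14e9

# (format, level, bits per weight, listed VRAM in GB), copied from the table.
VARIANTS = [
    ("GGUF", "Q4_K_M", 4.85, 9.8),
    ("GGUF", "Q6_K", 6.56, 12.8),
    ("AWQ", "INT4", 4.0, 8.8),
]


def weight_gb(params: float, bpw: float) -> float:
    """Weight-only footprint in decimal GB: params * bits / 8 bits per byte / 1e9."""
    return params * bpw / 8 / 1e9


def fits(vram_gb: float, card_gb: float, headroom_gb: float = 1.0) -> bool:
    """True if the listed VRAM plus an assumed safety headroom fits on the card."""
    return vram_gb + headroom_gb <= card_gb


if __name__ == "__main__":
    for fmt, level, bpw, listed in VARIANTS:
        est = weight_gb(PARAMS, bpw)
        print(
            f"{fmt} {level}: weights ~{est:.1f} GB, listed {listed} GB, "
            f"fits a 16 GB card: {fits(listed, 16.0)}"
        )
```

With these assumptions, all three listed variants fit on a 16 GB card, with Q6_K leaving the least room for longer contexts.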