Beginner · Mac / Apple · 6 min read
Mac M3 Pro: Realistic Model Limits
What actually fits in 18GB or 36GB unified memory with Ollama and llama.cpp.
Mac · M3 Pro · Ollama · Unified Memory
18GB M3 Pro
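Stick to 7–8B models at Q4 quantization. Avoid 14B and larger models unless you can accept a very short context window, since the KV cache grows with context length and shares the same unified memory as the weights.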
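To make the context trade-off concrete, here is a rough sketch of how the KV cache scales with context length. It assumes an fp16 cache (2 bytes per element) and uses layer, KV-head, and head-dimension values as I understand them from the models' published configs; treat those numbers as assumptions and check each model's config.json. The function name `kv_cache_mib` is hypothetical, and the estimate ignores weights and compute buffers.

```bash
#!/usr/bin/env bash
# Rough KV-cache size in MiB for a given context length.
# Formula: 2 (K and V) * layers * kv_heads * head_dim * bytes_per_elem * ctx
# Assumes an fp16 cache (2 bytes per element); a quantized cache would be smaller.
kv_cache_mib() {
  local layers=$1 kv_heads=$2 head_dim=$3 ctx=$4
  local bytes_per_elem=2
  echo $(( 2 * layers * kv_heads * head_dim * bytes_per_elem * ctx / 1024 / 1024 ))
}

# Architecture values below are assumptions; verify against each model's config.json.
echo "Llama 3.1 8B @ 8K ctx: $(kv_cache_mib 32 8 128 8192) MiB"
echo "Qwen2.5 14B  @ 8K ctx: $(kv_cache_mib 48 8 128 8192) MiB"
echo "Qwen2.5 14B  @ 2K ctx: $(kv_cache_mib 48 8 128 2048) MiB"
```

Under these assumptions the cache costs about 1 GiB for the 8B model at 8K and about 1.5 GiB for the 14B model at 8K, on top of the weights. Dropping the 14B model to a 2K context cuts that to 384 MiB, which is the "very short context" trade-off.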
text
✓ Llama 3.1 8B Q4_K_M (ctx 8K)
✓ Qwen2.5 7B Q4_K_M
✗ Qwen2.5 14B Q4_K_M (needs 36GB+)
36GB M3 Pro
14B models at Q4_K_M run well with an 8K context. 32B models need Q3 quantization or a heavily reduced context.
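The 8K figure only applies if the context length is actually set to 8192; depending on the Ollama version, the default may be lower. One way to pin it is a Modelfile with `PARAMETER num_ctx`, sketched below. The variant name `qwen2.5-14b-8k` and the file name are hypothetical, and the sketch assumes the base model has already been pulled with the commands in this section.

```bash
#!/usr/bin/env bash
# Write a Modelfile that derives an 8K-context variant from the base model.
# The variant name and file name are hypothetical, chosen for illustration.
modelfile="Modelfile.qwen14b-8k"
cat > "$modelfile" <<'EOF'
FROM qwen2.5:14b
PARAMETER num_ctx 8192
EOF

# Build the variant only if the Ollama CLI is available on this machine.
if command -v ollama >/dev/null 2>&1; then
  ollama create qwen2.5-14b-8k -f "$modelfile"
fi
```

After that, `ollama run qwen2.5-14b-8k` starts the variant, and `ollama ps` shows the loaded size and whether the model is running fully on the GPU.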
bash
ollama pull qwen2.5:14b
ollama run qwen2.5:14b
Deployment guides are educational. Each model is subject to its own license; read the official Hugging Face model card before downloading or deploying.