Back to Cookbook
IntermediateServer / VPS 8 min read
TabbyAPI: ExLlamaV2 with a Web UI
Wrap ExLlamaV2 in TabbyAPI for a polished OpenAI-compatible server with streaming and model hot-swap.
TabbyAPIExLlamaV2APIEXL2
Install TabbyAPI
TabbyAPI is the most popular ExLlamaV2 server wrapper with a built-in UI.
bash
git clone https://github.com/theroyallab/tabbyAPI
cd tabbyAPI
pip install -r requirements.txtConfigure and run
Place EXL2 models in the models/ directory and start the API server.
bash
python main.py --port 5000
# API: http://localhost:5000/v1/chat/completionsDeployment guides are educational. Each model is subject to its own license — read the official Hugging Face model card before downloading or deploying.