10.1 Self-Hosting LLMs
Covers Ollama, vLLM, and TGI; hardware sizing; GPU vs. CPU inference; and quantization levels (Q4/Q8) for self-hosted LLMs.
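As a quick orientation, the sketch below queries a model served locally by Ollama through its HTTP API (POST /api/generate on port 11434, Ollama's default endpoint). It uses only the Python standard library. The model tag llama3.1:8b-instruct-q4_K_M is an assumption chosen to illustrate a Q4-quantized variant; substitute any model you have already pulled with `ollama pull`.

    import json
    import urllib.request

    # Ollama serves a local HTTP API on port 11434 by default.
    OLLAMA_URL = "http://localhost:11434/api/generate"

    payload = {
        # Assumed tag for illustration: a Q4-quantized 8B instruct model.
        "model": "llama3.1:8b-instruct-q4_K_M",
        "prompt": "Summarize the tradeoff between Q4 and Q8 quantization.",
        # With stream=False, Ollama returns a single JSON object
        # instead of a stream of token chunks.
        "stream": False,
    }

    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)

    # The generated text is returned in the "response" field.
    print(body["response"])

The same pattern applies to vLLM and TGI, which expose their own HTTP endpoints (vLLM offers an OpenAI-compatible API); only the URL and request schema change, not the overall client structure.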