1.5 Running Your First LLM
Ollama local setup, first API call to OpenAI/Anthropic, and comparing outputs.
Ollama, vLLM, TGI, hardware sizing, quantization (Q4/Q8), and GPU vs CPU inference for self-hosted LLMs.
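The two first-call workflows above — Ollama running locally and a hosted chat-completions API — differ mainly in the shape of the request you send. A minimal sketch of both payloads (model names like "llama3" and "gpt-4o-mini" are placeholders; substitute whatever you have pulled locally or have API access to):

```python
import json

PROMPT = "Explain what a token is, in one sentence."

# Ollama's local server exposes /api/generate (default port 11434);
# the body needs only a model name and a prompt.
ollama_payload = {
    "model": "llama3",        # placeholder: any model you've pulled
    "prompt": PROMPT,
    "stream": False,          # return one JSON object, not a stream
}

# OpenAI-style /v1/chat/completions instead takes a list of messages.
openai_payload = {
    "model": "gpt-4o-mini",   # placeholder model name
    "messages": [{"role": "user", "content": PROMPT}],
}

if __name__ == "__main__":
    print(json.dumps(ollama_payload, indent=2))
    print(json.dumps(openai_payload, indent=2))
```

POSTing each payload to its endpoint (with an `Authorization: Bearer …` header for the hosted API) and diffing the responses is the simplest way to compare a local and a hosted model on the same prompt.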