1.5 Running Your First LLM
Key Concepts: Ollama local setup · API call to OpenAI/Anthropic · Comparing outputs
Official Docs: Ollama · OpenAI Quickstart · Anthropic Quickstart
Option A — Run a Model Locally with Ollama (Free)
Ollama lets you run open-weight models locally. It exposes an OpenAI-compatible API on localhost:11434.
1. Install Ollama
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows — download from https://ollama.com/download
2. Pull and Run a Model
# Pull a small model (~2 GB)
ollama pull llama3.2
# Interactive chat in terminal
ollama run llama3.2
3. Call Ollama from Python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required field, not validated locally
)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain gradient descent in 3 sentences."}],
    temperature=0.5,
)

print(response.choices[0].message.content)
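Because a local model costs nothing to query, it is a good place to experiment with sampling settings. The helper below is an illustrative sketch, not part of any SDK: it sends the same prompt at several temperatures so you can compare the outputs side by side (pass in the client configured above).

```python
def compare_temperatures(client, model, prompt, temperatures=(0.0, 0.5, 1.0)):
    """Send the same prompt at each temperature and collect the replies."""
    replies = {}
    for t in temperatures:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=t,
        )
        replies[t] = response.choices[0].message.content
    return replies
```

With the Ollama server running, `compare_temperatures(client, "llama3.2", "Define overfitting in one sentence.")` returns a dict keyed by temperature; low temperatures give near-deterministic answers, while higher ones produce more varied phrasing.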
Option B — OpenAI API
1. Get an API Key
- Sign up at platform.openai.com
- Go to API Keys → Create new secret key
- Set environment variable:
export OPENAI_API_KEY="sk-..."
2. Install and Call
pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the transformer architecture?"},
    ],
)

print(response.choices[0].message.content)
print(f"Tokens used: {response.usage.total_tokens}")
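Because `response.usage` reports token counts, you can keep a running estimate of spend. A minimal sketch of the arithmetic (the per-million-token prices below are illustrative placeholders, not current rates — always check the provider's pricing page):

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  price_in_per_million, price_out_per_million):
    """Estimate the dollar cost of one call from its token counts.

    Prices are dollars per million tokens; pass the current rates for
    your model -- the values used below are placeholders, not real rates.
    """
    return (prompt_tokens * price_in_per_million
            + completion_tokens * price_out_per_million) / 1_000_000

# Example: 1,200 prompt tokens and 300 completion tokens at
# hypothetical rates of $0.15 in / $0.60 out per million tokens.
cost = estimate_cost(1200, 300, 0.15, 0.60)
print(f"~${cost:.6f}")  # prints ~$0.000360
```

Input and output tokens are usually priced differently, which is why the helper takes two rates rather than a single per-token price.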
Option C — Anthropic Claude API
pip install anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from environment

message = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=256,
    messages=[
        {"role": "user", "content": "What is the transformer architecture?"}
    ],
)

print(message.content[0].text)
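The two SDKs return differently shaped response objects (`choices[0].message.content` for OpenAI vs. `message.content[0].text` for Anthropic). To compare outputs from both providers in one loop, a small adapter helps; this `extract_text` helper is an illustrative sketch, not part of either SDK:

```python
def extract_text(response):
    """Return the assistant's text from an OpenAI- or Anthropic-style response.

    Duck-types on the attribute each SDK uses rather than importing either one.
    """
    choices = getattr(response, "choices", None)
    if choices:  # OpenAI chat-completion shape
        return choices[0].message.content
    content = getattr(response, "content", None)
    if content:  # Anthropic message shape
        return content[0].text
    raise TypeError("Unrecognized response object")
```

With this adapter you can send the same prompt to both providers and print `extract_text(...)` for each, making the outputs directly comparable.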
Key Takeaways
- Ollama is the fastest way to run models locally for free
- The OpenAI and Anthropic SDKs have a nearly identical structure
- Always check usage.total_tokens to monitor costs
- Start with smaller, cheaper models (gpt-4o-mini, claude-3-5-haiku) before moving up
Next Chapter → Chapter 2: Prompt Engineering