1.5 Running Your First LLM

AI-Generated Content

AI-generated content may contain errors. Always verify against official sources.


Key Concepts: Ollama local setup · API call to OpenAI/Anthropic · Comparing outputs

Official Docs: Ollama · OpenAI Quickstart · Anthropic Quickstart

Option A — Run a Model Locally with Ollama (Free)

Ollama lets you run open-weight models locally. It exposes an OpenAI-compatible API on localhost:11434.

1. Install Ollama

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows — download from https://ollama.com/download

2. Pull and Run a Model

# Pull a small model (~2 GB)
ollama pull llama3.2

# Interactive chat in terminal
ollama run llama3.2
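Before calling the model from code, it can help to confirm that the Ollama server is running and that the pull succeeded. The sketch below is one way to do that with only the standard library, assuming the server is on the default address and using Ollama's `/api/tags` endpoint, which lists locally available models. The helper names `parse_model_names` and `list_local_models` are illustrative, not part of any SDK.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address mentioned above


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [entry["name"] for entry in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Ask the local Ollama server which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

Calling `list_local_models()` after the pull should return a list that includes an entry for llama3.2. A connection error usually means the Ollama server is not running.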

3. Call Ollama from Python

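# Requires the OpenAI Python SDK (pip install openai) and a running Ollama server.
from openai import OpenAI

# Point the OpenAI client at the local Ollama server instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",   # the SDK requires a value; Ollama does not validate it
)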

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain gradient descent in 3 sentences."}],
    temperature=0.5,
)
print(response.choices[0].message.content)
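To make "OpenAI-compatible" concrete, here is a rough sketch of the HTTP request that the SDK call above corresponds to, built with the standard library only. The function names `build_chat_request` and `send_chat_request` are made up for illustration. The endpoint path and JSON fields follow the OpenAI chat completions format, and the `Authorization` value is the same placeholder key used above.

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "llama3.2",
                       base_url: str = "http://localhost:11434/v1") -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions endpoint."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.5,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer ollama",  # placeholder key, as in the SDK example
        },
        method="POST",
    )


def send_chat_request(req: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("Explain gradient descent in 3 sentences."))` should return text much like the SDK version. In practice the SDK is the more convenient choice; this sketch only shows what travels over the wire.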

Option B — OpenAI API

1. Get an API Key

Sign up at platform.openai.com.
Go to API Keys → Create new secret key, and copy the key when it is shown.
Set it as an environment variable (macOS/Linux shell): export OPENAI_API_KEY="sk-..."
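A missing or empty environment variable is a common cause of authentication errors on a first run. The following is a small sketch of a pre-flight check. `require_api_key` is a hypothetical helper, not part of the OpenAI SDK, and the `env` parameter exists only to make the function easy to test.

```python
import os
from collections.abc import Mapping


def require_api_key(name: str = "OPENAI_API_KEY",
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the API key stored in `name`, or raise a readable error if it is unset."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running this script."
        )
    return key
```

Calling `require_api_key()` at the top of a script fails early with a clear message instead of a later authentication error from the API.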

2. Install and Call

pip install openai

from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "What is the transformer architecture?"}
    ],
)
print(response.choices[0].message.content)
print(f"Tokens used: {response.usage.total_tokens}")
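Token counts turn into a cost estimate once you know the per-token prices for your model. The sketch below shows the arithmetic. The two prices are made-up placeholders, not real rates, so look up current pricing for the model you use. `estimate_cost_usd` is an illustrative helper name.

```python
def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      input_price_per_million: float,
                      output_price_per_million: float) -> float:
    """Estimate the cost of one request from its token counts and per-million prices."""
    return (prompt_tokens * input_price_per_million
            + completion_tokens * output_price_per_million) / 1_000_000


# Placeholder prices for illustration only (NOT real rates).
EXAMPLE_INPUT_PRICE = 1.00   # USD per 1M input tokens, made-up value
EXAMPLE_OUTPUT_PRICE = 2.00  # USD per 1M output tokens, made-up value

cost = estimate_cost_usd(1_200, 300, EXAMPLE_INPUT_PRICE, EXAMPLE_OUTPUT_PRICE)
print(f"Estimated cost: ${cost:.6f}")
```

With a real response, the inputs would come from `response.usage.prompt_tokens` and `response.usage.completion_tokens`. Input and output tokens are typically priced differently, which is why the sketch keeps them separate rather than using `total_tokens`.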

Option C — Anthropic Claude API

pip install anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=256,   # required by the Messages API: upper limit on generated tokens
    messages=[
        {"role": "user", "content": "What is the transformer architecture?"}
    ],
)
print(message.content[0].text)
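With all three options working, comparing outputs comes down to sending the same prompt to each backend and reading the answers side by side. The following is a minimal sketch of such a harness, assuming each backend is wrapped in a function that takes a prompt and returns text. `compare_outputs` and `print_comparison` are illustrative names, and the two demo backends are stand-ins so the block runs without API keys or a local server.

```python
from typing import Callable, Dict


def compare_outputs(prompt: str,
                    backends: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send one prompt to every backend and collect the replies by name."""
    results = {}
    for name, ask in backends.items():
        try:
            results[name] = ask(prompt)
        except Exception as exc:  # keep going if one provider fails
            results[name] = f"[error: {exc}]"
    return results


def print_comparison(results: Dict[str, str]) -> None:
    """Print each backend's reply under a labelled header."""
    for name, text in results.items():
        print(f"=== {name} ===")
        print(text)
        print()


# Stand-in backends (assumptions for the demo, not real model calls).
demo = compare_outputs(
    "What is the transformer architecture?",
    {
        "echo": lambda p: f"You asked: {p}",
        "upper": lambda p: p.upper(),
    },
)
print_comparison(demo)
```

To compare real models, each callable would wrap one of the calls shown in the options above. For Ollama and OpenAI it would return `response.choices[0].message.content`; for Anthropic, `message.content[0].text`.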

Key Takeaways

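Ollama is a quick, free way to run open-weight models locally
The OpenAI and Anthropic SDKs follow the same pattern (create a client, send a list of messages) but differ in details: Anthropic requires max_tokens, takes the system prompt as a top-level system parameter, and returns text in message.content[0].text rather than response.choices[0].message.content
Check the usage field on each response to monitor costs: usage.total_tokens for OpenAI, usage.input_tokens and usage.output_tokens for Anthropic
Start with smaller, cheaper models (gpt-4o-mini, claude-3-5-haiku) before moving up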
