2.3 Structured Output
AI-Generated Content
AI-generated content may contain errors. Always verify against official sources.
Key Concepts: JSON mode · XML tags · Schema enforcement · Pydantic validation
Official Docs: OpenAI Structured Outputs · Anthropic Tool Use · Pydantic Docs
Why Structured Output?
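When your code needs to parse the model's response (to extract a score, populate a database, or call a downstream API), free-form text is fragile: a small change in wording can break the parser. Structured output constrains the response to a machine-readable format, though the strength of that guarantee varies by method.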
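To make the fragility concrete, here is a small sketch contrasting a regex over free-form text with parsing a JSON reply. The reply strings and the `score` field are invented for illustration; they are not real model output.

```python
import json
import re
from typing import Optional

# Two hypothetical free-form replies that express the same score differently.
free_form_replies = [
    "I'd rate this review 4 out of 5.",
    "Score: four stars. The customer seems happy.",
]

def score_from_text(reply: str) -> Optional[int]:
    """Fragile approach: hope the score shows up as 'N out of 5'."""
    match = re.search(r"(\d) out of 5", reply)
    return int(match.group(1)) if match else None

def score_from_json(reply: str) -> int:
    """Structured approach: the reply is a JSON object with a known key."""
    return json.loads(reply)["score"]

text_scores = [score_from_text(r) for r in free_form_replies]
json_score = score_from_json('{"sentiment": "positive", "score": 4}')

print(text_scores)  # [4, None]: the second phrasing slips past the regex
print(json_score)   # 4
```

The regex only works for phrasings it anticipated, while the JSON path depends only on the agreed key being present.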
Method 1 — OpenAI Structured Outputs
OpenAI's response_format with a Pydantic model guarantees the output matches your schema via constrained decoding.
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class SentimentResult(BaseModel):
    sentiment: str          # "positive" | "negative" | "neutral"
    confidence: float       # 0.0 to 1.0
    key_phrases: list[str]

response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Analyse the sentiment of the review."},
        {"role": "user", "content": "Absolutely loved it — fast shipping, great quality!"},
    ],
    response_format=SentimentResult,
)

# `parsed` is None if the model refused, so check before using it
result = response.choices[0].message.parsed
if result is not None:
    print(result.sentiment)   # e.g. positive
    print(result.confidence)  # e.g. 0.97
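The constrained decoding mentioned above can be pictured with a toy example: at each step, any candidate token that would take the output outside the allowed set is masked out before the highest-scoring one is picked. The sketch below only illustrates that idea. The token scores, the `ALLOWED` set, and the `constrained_pick` helper are all hypothetical; a production decoder works on a real tokenizer and a grammar derived from the schema.

```python
# Toy illustration of constrained decoding (not a real decoder).
ALLOWED = {"positive", "negative", "neutral"}  # hypothetical set of valid outputs

def is_valid_prefix(text: str) -> bool:
    """True if `text` could still grow into one of the allowed strings."""
    return any(option.startswith(text) for option in ALLOWED)

def constrained_pick(prefix: str, scores: dict[str, float]) -> str:
    """Pick the highest-scoring token that keeps the output a valid prefix."""
    legal = {tok: s for tok, s in scores.items() if is_valid_prefix(prefix + tok)}
    return max(legal, key=legal.get)

# Made-up per-step token scores; the unconstrained favourite is often illegal.
steps = [
    {"gre": 0.6, "pos": 0.3, "neg": 0.1},
    {"at": 0.7, "iti": 0.2, "ati": 0.1},
    {"ve": 0.9, "on": 0.1},
]

output = ""
for scores in steps:
    output += constrained_pick(output, scores)

print(output)  # positive
```

Without the mask, the first step would have chosen "gre" and the result could never have matched the schema.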
Method 2 — JSON Mode (Wider Compatibility)
from openai import OpenAI
from pydantic import BaseModel, Field, ValidationError

client = OpenAI()

class Review(BaseModel):
    sentiment: str
    score: int = Field(ge=1, le=5)  # enforce the 1-5 range stated in the prompt

# JSON mode requires the word "JSON" to appear somewhere in the messages
raw = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": 'Respond ONLY with JSON: {"sentiment": str, "score": int 1-5}'},
        {"role": "user", "content": "The product broke on day one."},
    ],
    response_format={"type": "json_object"},
).choices[0].message.content

try:
    review = Review.model_validate_json(raw)
    print(review)
except ValidationError as e:
    print("Validation error:", e)
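JSON mode only ensures the reply is syntactically valid JSON; it does not enforce your schema, which is why the validation step matters. One common follow-up, sketched below, is to retry with the validation error fed back to the model. Everything named here is hypothetical: `ask_model` stands in for the API call and `parse` for something like `Review.model_validate_json`. The sketch relies on both `json.JSONDecodeError` and Pydantic v2's `ValidationError` being subclasses of `ValueError`. To stay dependency-free, the demo uses a hand-written parser and a fake model.

```python
import json
from typing import Callable, TypeVar

T = TypeVar("T")

def ask_with_retries(
    ask_model: Callable[[str], str],   # hypothetical: prompt in, raw reply out
    parse: Callable[[str], T],         # raises ValueError on a bad reply
    prompt: str,
    max_attempts: int = 3,
) -> T:
    """Call the model until `parse` accepts the reply or attempts run out."""
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        raw = ask_model(current_prompt)
        try:
            return parse(raw)
        except ValueError as err:
            if attempt == max_attempts:
                raise
            # Feed the error back so the next attempt can correct itself.
            current_prompt = f"{prompt}\n\nYour last reply was invalid: {err}"
    raise ValueError("max_attempts must be at least 1")

def parse_review(raw: str) -> dict:
    """Stdlib stand-in for Review.model_validate_json."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not isinstance(data.get("score"), int):
        raise ValueError("score must be an integer")
    if not 1 <= data["score"] <= 5:
        raise ValueError("score must be between 1 and 5")
    return data

# Fake model: the first reply breaks the schema, the second one is valid.
replies = iter(['{"sentiment": "negative", "score": "low"}',
                '{"sentiment": "negative", "score": 1}'])
review = ask_with_retries(lambda _prompt: next(replies), parse_review, "Rate the review.")
print(review)  # {'sentiment': 'negative', 'score': 1}
```

Capping the number of attempts keeps a persistently malformed reply from turning into an unbounded loop of API calls.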
Method 3 — Anthropic XML Tags
import re

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt = """
Analyse the sentiment and respond using EXACTLY this format:
<sentiment>positive|negative|neutral</sentiment>
<confidence>0.0 to 1.0</confidence>
Review: 'Fantastic product, exceeded expectations!'
"""

text = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=128,
    messages=[{"role": "user", "content": prompt}],
).content[0].text

# re.search returns None when a tag is missing, so guard before calling .group()
sentiment = re.search(r"<sentiment>(.*?)</sentiment>", text, re.DOTALL)
confidence = re.search(r"<confidence>(.*?)</confidence>", text, re.DOTALL)
if sentiment and confidence:
    print(sentiment.group(1).strip(), confidence.group(1).strip())
else:
    print("Model reply did not follow the expected format:", text)
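Because the XML-tag approach depends on the model following the requested format, it can help to wrap extraction in a helper that fails loudly and converts types. The sketch below is one way to do that. `extract_tag` and `parse_sentiment_reply` are hypothetical helper names, and the sample reply is hand-written, not real Claude output.

```python
import re

VALID_SENTIMENTS = {"positive", "negative", "neutral"}

def extract_tag(text: str, tag: str) -> str:
    """Return the stripped contents of the first <tag>...</tag> pair."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError(f"missing <{tag}> tag in model reply")
    return match.group(1).strip()

def parse_sentiment_reply(text: str) -> tuple[str, float]:
    """Extract and validate the sentiment label and confidence score."""
    sentiment = extract_tag(text, "sentiment").lower()
    if sentiment not in VALID_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")
    confidence = float(extract_tag(text, "confidence"))  # ValueError if not numeric
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return sentiment, confidence

# Hand-written sample reply with extra chatter and whitespace around the tags.
sample = "Sure!\n<sentiment> Positive </sentiment>\n<confidence>0.95</confidence>"
print(parse_sentiment_reply(sample))  # ('positive', 0.95)
```

Raising `ValueError` for every failure mode lets the caller handle a missing tag, an unknown label, and a bad number in one place.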
Key Takeaways
- Use OpenAI Structured Outputs (client.beta.chat.completions.parse) for strictest guarantees
- Use JSON mode + Pydantic for broader model compatibility
- Use XML tags with Anthropic Claude
- Always validate with Pydantic before using LLM output in production
Further Reading
- 📘 OpenAI — Structured Outputs Guide
- 📘 Anthropic — Tool Use for Structured Data
- 📘 Pydantic v2 Documentation