2.3 Structured Output

AI-Generated Content

AI-generated content may contain errors. Always verify against official sources.

Key Concepts: JSON mode · XML tags · Schema enforcement · Pydantic validation

Official Docs: OpenAI Structured Outputs · Anthropic Tool Use · Pydantic Docs

Why Structured Output?

When your code needs to parse the model's response (to extract a score, populate a database, or call a downstream API), free-form text is fragile. Structured output constrains the response to a machine-readable format, although how strictly that format is enforced depends on the method used.
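
To make the fragility concrete, here is a small sketch that pulls a score out of free-form text and out of JSON. The reply strings are invented for illustration and are not real model output; the helper names are hypothetical.

```python
import json
import re
from typing import Optional

# Hypothetical replies to the same request, invented for illustration.
free_form_a = "I'd rate this review 4 out of 5."
free_form_b = "Out of 5 stars, this one earns a 4."
structured = '{"score": 4, "max_score": 5}'


def score_from_text(reply: str) -> Optional[int]:
    """Naive extraction: take the first integer that appears in the reply."""
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else None


def score_from_json(reply: str) -> int:
    """Structured extraction: the field name says which number is the score."""
    return json.loads(reply)["score"]


print(score_from_text(free_form_a))  # 4
print(score_from_text(free_form_b))  # 5 (wrong: this is the scale, not the score)
print(score_from_json(structured))   # 4
```

A small change in phrasing breaks the text parser, while the JSON reader keeps working as long as the field names stay the same.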

Method 1 — OpenAI Structured Outputs

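OpenAI's response_format parameter accepts a Pydantic model and uses constrained decoding so that the output matches your schema. The returned message can still carry a refusal instead of parsed data, so check for that before using the result.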

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class SentimentResult(BaseModel):
    sentiment: str          # "positive" | "negative" | "neutral"
    confidence: float       # 0.0 – 1.0
    key_phrases: list[str]

response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Analyse the sentiment of the review."},
        {"role": "user",   "content": "Absolutely loved it — fast shipping, great quality!"}
    ],
    response_format=SentimentResult,
)

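message = response.choices[0].message
if message.refusal:
    # The model declined to answer, so message.parsed is None
    print("Refused:", message.refusal)
else:
    result: SentimentResult = message.parsed
    print(result.sentiment)     # e.g. positive
    print(result.confidence)    # e.g. 0.97 (varies between runs)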
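
The constrained decoding mentioned above can be pictured with a toy example. This is not OpenAI's implementation, only a minimal sketch of the general idea: at each step, tokens that would violate the schema are masked out before the next token is chosen. The vocabulary, logits, and allowed set are all made up.

```python
import math

# Toy vocabulary and made-up logits; a real model scores a far larger vocabulary.
vocab = ["positive", "negative", "neutral", "banana", "maybe"]
logits = [1.2, 0.3, 0.8, 2.5, 1.9]

# Values the (hypothetical) schema allows at this position.
allowed = {"positive", "negative", "neutral"}


def constrained_argmax(vocab, logits, allowed):
    """Mask every token the schema forbids, then pick the best remaining one."""
    masked = [
        logit if token in allowed else -math.inf
        for token, logit in zip(vocab, logits)
    ]
    best_index = max(range(len(vocab)), key=masked.__getitem__)
    return vocab[best_index]


# Without the mask, the highest-scoring token wins even if it is invalid.
unconstrained = vocab[max(range(len(vocab)), key=logits.__getitem__)]
print(unconstrained)                               # banana
print(constrained_argmax(vocab, logits, allowed))  # positive
```

Because invalid tokens can never be selected, the output conforms to the schema by construction instead of by prompt-following alone.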

Method 2 — JSON Mode (Wider Compatibility)

JSON mode ensures the reply is syntactically valid JSON, but it does not enforce a schema, so the field names and types still need to be validated.

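from openai import OpenAI
from pydantic import BaseModel, Field, ValidationError

client = OpenAI()

class Review(BaseModel):
    sentiment: str
    score: int = Field(ge=1, le=5)   # enforce the 1-5 range requested in the prompt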

raw = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": 'Respond ONLY with JSON: {"sentiment": str, "score": int 1-5}'},
        {"role": "user",   "content": "The product broke on day one."}
    ],
    response_format={"type": "json_object"},
).choices[0].message.content

try:
    review = Review.model_validate_json(raw)
    print(review)
except ValidationError as e:
    print("Validation error:", e)
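
Since JSON mode does not enforce a schema, a validation failure is a normal outcome and one common response is to ask again. The sketch below shows one way to do that. The names parse_with_retry and validate_review are hypothetical, the model call is replaced by a fake that returns canned replies, and a hand-written stdlib validator stands in for Pydantic so the example runs without an API key.

```python
import json
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def parse_with_retry(
    call_model: Callable[[], str],
    validate: Callable[[str], T],
    max_attempts: int = 3,
) -> T:
    """Call the model until `validate` accepts the reply or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_model()
        try:
            return validate(raw)
        except ValueError as error:  # json.JSONDecodeError is a ValueError
            last_error = error
    raise RuntimeError(f"No valid reply after {max_attempts} attempts") from last_error


def validate_review(raw: str) -> dict:
    """Stand-in validator: checks the two fields the prompt asks for."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not isinstance(data.get("sentiment"), str):
        raise ValueError("missing or non-string 'sentiment'")
    score = data.get("score")
    if not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError("'score' must be an integer from 1 to 5")
    return data


# Fake model for demonstration: the first reply has an out-of-range score.
replies = iter(['{"sentiment": "negative", "score": 9}',
                '{"sentiment": "negative", "score": 1}'])
review = parse_with_retry(lambda: next(replies), validate_review)
print(review)  # {'sentiment': 'negative', 'score': 1}
```

To use this with a Pydantic model, pass its model_validate_json method as validate and make sure the except clause catches Pydantic's ValidationError.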

Method 3 — Anthropic XML Tags

Here the format is only requested in the prompt, not enforced by the API, so the parsing code has to tolerate missing or malformed tags.

import re

import anthropic

client = anthropic.Anthropic()

prompt = """
Analyse the sentiment and respond using EXACTLY this format:
<sentiment>positive|negative|neutral</sentiment>
<confidence>0.0 to 1.0</confidence>

Review: 'Fantastic product, exceeded expectations!'
"""

text = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=128,
    messages=[{"role": "user", "content": prompt}],
).content[0].text

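# re.search returns None when a tag is missing, so check before calling .group()
s_match = re.search(r"<sentiment>(.*?)</sentiment>", text, re.DOTALL)
c_match = re.search(r"<confidence>(.*?)</confidence>", text, re.DOTALL)
if s_match is None or c_match is None:
    raise ValueError(f"Expected tags not found in response: {text!r}")

sentiment  = s_match.group(1).strip()
confidence = float(c_match.group(1))   # raises ValueError if not numeric
print(sentiment, confidence)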
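
Tag extraction like this can also be wrapped in a small reusable parser that checks the values as well as the tags. The following is a minimal sketch using only the standard library. TaggedSentiment, extract_tag, and parse_tagged_sentiment are hypothetical names, and the sample reply is invented, not real Claude output.

```python
import re
from dataclasses import dataclass

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


@dataclass
class TaggedSentiment:
    sentiment: str
    confidence: float


def extract_tag(text: str, tag: str) -> str:
    """Return the stripped content of <tag>...</tag>, or raise if it is absent."""
    name = re.escape(tag)
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    if match is None:
        raise ValueError(f"<{tag}> not found in response")
    return match.group(1).strip()


def parse_tagged_sentiment(text: str) -> TaggedSentiment:
    """Extract both tags and check their values against the prompt's format."""
    sentiment = extract_tag(text, "sentiment").lower()
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")
    confidence = float(extract_tag(text, "confidence"))  # ValueError if not numeric
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return TaggedSentiment(sentiment, confidence)


# Invented reply with extra chatter and whitespace around the tags.
sample = "Sure!\n<sentiment> Positive </sentiment>\n<confidence>0.95</confidence>"
print(parse_tagged_sentiment(sample))
# TaggedSentiment(sentiment='positive', confidence=0.95)
```

Raising ValueError for every failure mode keeps the caller's error handling to a single except clause.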

Key Takeaways

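Use OpenAI Structured Outputs (client.beta.chat.completions.parse) for the strictest guarantees
Use JSON mode + Pydantic when you need broader model compatibility; JSON mode ensures valid JSON, not a matching schema
Use XML tags with Anthropic Claude when a prompt-only format is enough, and parse defensively because the tags are not enforced
Always validate with Pydantic before using LLM output in production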
