
2.4 Domain-Specific Prompting

AI-Generated Content

AI-generated content may contain errors. Always verify against official sources.

Key Concepts: Injecting domain knowledge · Glossary seeding · Persona prompting

Official Docs: OpenAI Prompt Engineering Guide · Anthropic Claude Prompting Guide


The Problem with Generic Prompts

A general-purpose LLM has broad knowledge but shallow expertise. A generic prompt like "Summarise this contract" produces generic output. Domain-specific prompting closes the gap by injecting:

  1. Vocabulary — technical terms the model must use correctly
  2. Rules — domain constraints (legal, medical, financial)
  3. Persona — a domain expert role that activates relevant patterns
  4. Output format — domain-standard structures (SOAP notes, legal clauses)
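The four ingredients above can be kept as separate pieces of data and assembled into one system prompt. The following is a minimal sketch, not a prescribed implementation: the helper name `build_system_prompt` and the section headings it emits are assumptions chosen for illustration.

```python
from typing import Mapping, Sequence


def build_system_prompt(
    persona: str,
    glossary: Mapping[str, str],
    rules: Sequence[str],
    output_format: str,
) -> str:
    """Assemble a system prompt from persona, vocabulary, rules, and format."""
    sections = [persona.strip()]
    if glossary:
        terms = "\n".join(f"- {term}: {meaning}" for term, meaning in glossary.items())
        sections.append("Key terminology to use precisely:\n" + terms)
    if rules:
        numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
        sections.append("Mandatory rules:\n" + numbered)
    if output_format:
        sections.append("Output format: " + output_format.strip())
    # Blank lines between sections keep the prompt easy to read and to diff.
    return "\n\n".join(sections)


example_prompt = build_system_prompt(
    persona="You are a clinical documentation assistant.",
    glossary={"HPI": "History of Present Illness", "ADE": "Adverse Drug Event"},
    rules=["Never invent medical abbreviations."],
    output_format="SOAP note (Subjective, Objective, Assessment, Plan)",
)
print(example_prompt)
```

Keeping the parts separate makes it easier to drop a section that a task does not need, which also helps keep the prompt short.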

Technique 1 — Persona Prompting

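Assigning a domain expert role often improves output quality for specialised tasks, especially when the role is paired with explicit instructions about what to check and how to report it.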

system_prompt = """
You are a senior software security engineer with 10 years of experience in
OWASP Top 10 vulnerabilities, secure code review, and penetration testing.

When reviewing code:
- Lead with security severity (Critical / High / Medium / Low)
- Reference the OWASP category by name (e.g., A03:2021 Injection)
- Suggest the minimal, safest fix
- Do NOT suggest third-party libraries unless absolutely necessary
"""

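Why it works: A persona steers the model toward patterns in its training data that are associated with that role. Text written by or about a "senior security engineer" tends to co-occur with OWASP references, CVE discussions, and secure coding practices, so naming the role makes that vocabulary and reasoning style more likely to appear in the output.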


Technique 2 — Glossary Seeding

For highly specialised domains (medical, legal, finance), inject a mini-glossary into the system prompt.

system_prompt = """
You are a clinical documentation assistant.

Key terminology to use precisely:
- SOC: Standard of Care
- ADE: Adverse Drug Event
- ICD-10: International Classification of Diseases, 10th revision
- HPI: History of Present Illness
- SOAP note format: Subjective, Objective, Assessment, Plan

Always use these terms correctly. Never invent medical abbreviations.
"""
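An instruction such as "Never invent medical abbreviations" is easier to trust when the output is also checked in code. The sketch below flags abbreviation-like tokens in a reply that are not in the seeded glossary. It is a crude heuristic under stated assumptions: the function name `unknown_abbreviations` and the regular expression are illustrative, and any all-caps word will be treated as an abbreviation.

```python
import re

GLOSSARY = {
    "SOC": "Standard of Care",
    "ADE": "Adverse Drug Event",
    "ICD-10": "International Classification of Diseases, 10th revision",
    "HPI": "History of Present Illness",
    "SOAP": "Subjective, Objective, Assessment, Plan",
}

# Two or more capital letters, optionally followed by "-<digits>" (e.g. ICD-10).
ABBREVIATION = re.compile(r"\b[A-Z]{2,}(?:-\d+)?\b")


def unknown_abbreviations(text: str, glossary: dict[str, str]) -> list[str]:
    """Return abbreviation-like tokens in `text` that the glossary does not define."""
    found = ABBREVIATION.findall(text)
    return sorted({abbr for abbr in found if abbr not in glossary})


reply = "HPI: nausea after dose; possible ADE. Coded per ICD-10. Refer to PCP."
print(unknown_abbreviations(reply, GLOSSARY))
```

Here "PCP" is reported because it was never seeded. Whether that is a real problem is a judgment call for a reviewer; the check only surfaces candidates.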

Technique 3 — Rule Injection

Embed domain rules directly in the prompt to constrain outputs.

system_prompt = """
You are a financial reporting assistant for a UAE-regulated firm.

Mandatory rules:
1. All monetary values must include currency (AED, USD, EUR)
2. Never provide specific investment advice — redirect to a licensed advisor
3. Comply with IFRS reporting standards
4. If a figure is unavailable, state "Not disclosed" — never estimate
"""
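Rules in a prompt constrain the model but do not guarantee compliance, so it can help to verify the mechanical ones after generation. The sketch below checks rule 1 (every monetary value carries a currency code). It is an assumption-laden heuristic: the name `amounts_missing_currency` and the pattern are illustrative, and the pattern treats every number as a monetary value, so years and percentages would also be flagged.

```python
import re

ALLOWED_CURRENCIES = {"AED", "USD", "EUR"}

# A number (1200000, 1,200,000 or 350.75) with an optional three-letter code
# immediately before or after it, separated by a single space.
AMOUNT = re.compile(
    r"(?:\b(?P<before>[A-Z]{3}) )?"
    r"(?P<number>\d+(?:,\d{3})*(?:\.\d+)?)"
    r"(?: (?P<after>[A-Z]{3})\b)?"
)


def amounts_missing_currency(text: str) -> list[str]:
    """Return every number that has no allowed currency code next to it."""
    missing = []
    for match in AMOUNT.finditer(text):
        code = match.group("before") or match.group("after")
        if code not in ALLOWED_CURRENCIES:
            missing.append(match.group("number"))
    return missing


reply = "Revenue was AED 1,200,000 and operating costs were 350,000."
print(amounts_missing_currency(reply))
```

A non-empty result can be used to reject the reply or to ask the model to regenerate it with the rule restated.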

Technique 4 — Few-Shot with Domain Examples

For a domain-specific output format, show a worked example in the prompt. The {patient_notes} placeholder is filled in before the prompt is sent:

prompt = """
Convert the following patient notes into SOAP format.

Example input:
"Patient John, 45M, complains of chest tightness for 2 days. BP 140/90. ECG normal.
Started lisinopril 5mg. Follow up in 1 week."

Example SOAP output:
S: 45-year-old male presents with 2-day history of chest tightness.
O: BP 140/90 mmHg. ECG within normal limits.
A: Chest tightness with elevated BP; normal ECG. No diagnosis recorded in the notes.
P: Start lisinopril 5 mg. Follow up in 1 week.

Now convert:
{patient_notes}
"""
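With a chat-style API, the same few-shot example can instead be supplied as a prior user/assistant exchange, so the model sees the example output as something it already "said". The sketch below only builds the message list and makes no API call. The helper name `few_shot_messages` is hypothetical, and the example SOAP text is a shortened illustration.

```python
def few_shot_messages(
    instruction: str,
    examples: list[tuple[str, str]],
    new_input: str,
) -> list[dict[str, str]]:
    """Build a chat message list with each example as a user/assistant turn."""
    messages = [{"role": "system", "content": instruction}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    # The real input goes last, in the same shape as the example inputs.
    messages.append({"role": "user", "content": new_input})
    return messages


messages = few_shot_messages(
    instruction="Convert the patient notes you are given into SOAP format.",
    examples=[
        (
            "Patient John, 45M, complains of chest tightness for 2 days. "
            "BP 140/90. ECG normal. Started lisinopril 5mg. Follow up in 1 week.",
            "S: 45-year-old male presents with 2-day history of chest tightness.\n"
            "O: BP 140/90 mmHg. ECG within normal limits.\n"
            "A: Chest tightness with elevated BP; normal ECG.\n"
            "P: Start lisinopril 5 mg. Follow up in 1 week.",
        )
    ],
    new_input="Patient Sara, 30F, sore throat for 3 days. Temp 38.2 C.",
)
for message in messages:
    print(message["role"])
```

The resulting list has the shape expected by the `messages` parameter of a chat completion request. Both approaches use the same example; which one works better for a given model is worth testing rather than assuming.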

Combining All Techniques

from openai import OpenAI

client = OpenAI()

SYSTEM = """
You are a senior UAE corporate lawyer specialising in contract law.

Key terms:
- Force majeure: unforeseeable circumstances preventing contract fulfilment
- Indemnification clause: obligation to compensate for specified losses
- Liquidated damages: pre-agreed compensation for breach

Rules:
1. Cite UAE Federal Law No. 5/1985 (Civil Transactions Law) where relevant
2. Flag high-risk clauses with ⚠️
3. Never give a final legal opinion — recommend review by a licensed UAE lawyer
4. Output format: clause name, risk level, plain-English summary, recommendation
"""

def review_clause(clause_text: str) -> str:
    """Send one contract clause to the model and return its review as text."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": f"Review this clause:\n\n{clause_text}"},
        ],
        temperature=0,  # reduce run-to-run variation in the reviews
    )
    return resp.choices[0].message.content

print(review_clause(
    "The supplier shall not be liable for any indirect or consequential loss "
    "arising from delays caused by events beyond its reasonable control."
))

Common Mistakes

  1. Overloading the system prompt: inject only what's needed. A 3,000-token system prompt padded with irrelevant rules wastes context and can dilute the instructions that matter.
  2. Not testing domain terms: verify that the model uses your injected glossary correctly, including in edge cases.
  3. Forgetting the escape hatch: tell the model explicitly to say "I don't know" rather than fabricate domain-specific facts.
  4. Using persona without rules: "You are a doctor" alone is insufficient; add explicit rules about what the model can and cannot say.
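Several of these mistakes come down to not testing the prompt. One way to make testing routine is a small set of cases, each pairing an input with strings the reply must or must not contain. The sketch below is illustrative only: `PromptCase`, `run_cases`, and `fake_ask` are hypothetical names, and `fake_ask` is a stand-in for a real model call so the example runs offline.

```python
from typing import Callable, NamedTuple


class PromptCase(NamedTuple):
    user_input: str
    must_contain: tuple[str, ...] = ()
    must_not_contain: tuple[str, ...] = ()


def run_cases(ask: Callable[[str], str], cases: list[PromptCase]) -> list[str]:
    """Run every case through `ask` and describe each violated expectation."""
    failures = []
    for case in cases:
        reply = ask(case.user_input)
        for needle in case.must_contain:
            if needle not in reply:
                failures.append(f"{case.user_input!r}: missing {needle!r}")
        for needle in case.must_not_contain:
            if needle in reply:
                failures.append(f"{case.user_input!r}: contains forbidden {needle!r}")
    return failures


def fake_ask(user_input: str) -> str:
    """Stand-in for a real model call (assumption: replace with your client)."""
    if "revenue" in user_input.lower():
        return "Revenue: Not disclosed"
    return "The stock will likely rise 20% next month."


cases = [
    PromptCase("What was Q3 revenue?", must_contain=("Not disclosed",)),
    PromptCase("Should I buy this stock?", must_not_contain=("will likely rise",)),
]
failures = run_cases(fake_ask, cases)
print(failures)
```

Substring checks only catch mechanical violations; judging tone or legal accuracy still needs a human reviewer or a more careful evaluation.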

Quick Quiz

Test Your Understanding

Q1. Why does giving the model a domain expert persona improve output quality?
A1. The persona activates patterns from training data where that expert role co-occurs with domain-specific terminology and reasoning styles.

Q2. What is glossary seeding?
A2. Injecting a short list of technical terms and their definitions into the system prompt so the model uses them precisely.

Q3. A financial bot says "The stock will likely rise 20% next month." Which rule was missing from the system prompt?
A3. A rule forbidding specific investment predictions/advice — e.g., "Never make price predictions or investment recommendations."


Student Exercise

Exercise 2.4 — Build a domain prompt
Pick a domain you know well (medicine, law, finance, software security, cricket). Write a system prompt that includes: a persona, at least 3 domain rules, a glossary of 3 terms, and a required output format. Test it with gpt-4o-mini and evaluate whether the output respects all rules.

