Advanced Prompting
Chain-of-thought, role prompts, and structured outputs.
You already have the five-part template from Module 3. This module gives you three patterns that take any model from "usually right" to "reliably right": chain-of-thought, structured outputs, and self-critique.
By the end of this module you'll have:
- A "think step by step" prompt that visibly improves answer quality on hard questions
- A working structured output call that returns JSON you can parse without tears
- A two-pass self-critique loop that catches its own mistakes
Time: about 1 hour for the basics, ~6 hours with all three notebooks.
Prerequisites: Foundations stage complete — Modules 1–5.
Pattern 1 · Chain-of-thought
For multi-step reasoning, asking Claude to show its working before answering measurably improves accuracy. The cost: more output tokens and slightly more latency.
```python
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()
client = Anthropic()

question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)

# Without chain-of-thought:
direct = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=80,
    messages=[{"role": "user", "content": question + "\nAnswer with only the number."}],
)

# With chain-of-thought:
cot = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=300,
    messages=[{
        "role": "user",
        "content": question
        + "\nThink step by step, then give the final answer on a new line prefixed with 'Answer:'.",
    }],
)

print("DIRECT:\n", direct.content[0].text)
print("\nCHAIN-OF-THOUGHT:\n", cot.content[0].text)
```
Use chain-of-thought when the problem has more than one step. Skip it for single-step tasks like classification or extraction, where it only wastes tokens.
Newer models do this implicitly. Sonnet 4.6 and Opus 4.7 often reason internally before answering. Chain-of-thought still helps for high-stakes tasks where you want the reasoning visible and auditable.
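Because the prompt asks for the final answer on a line prefixed with `Answer:`, you can pull it out of the reasoning mechanically. A minimal sketch under that assumption — `extract_final_answer` is a helper written for this module, not part of the SDK:

```python
def extract_final_answer(text: str) -> str:
    """Pull the 'Answer:' line out of a chain-of-thought reply."""
    # Scan from the bottom so earlier mentions of "answer" in the
    # reasoning are skipped.
    for line in reversed(text.splitlines()):
        if line.strip().lower().startswith("answer:"):
            return line.split(":", 1)[1].strip()
    # Fall back to the whole reply if the model ignored the instruction.
    return text.strip()
```

Run it on `cot.content[0].text` and you get just the number, while the full reasoning stays available for auditing.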
Pattern 2 · Structured outputs (JSON, reliably)
Telling the model "reply in JSON" gets you JSON most of the time. Showing a schema and giving it an example makes it nearly bulletproof.
```python
import json

schema_hint = """\
Respond with one JSON object matching this shape:
{
  "sentiment": "positive" | "neutral" | "negative",
  "topic": string (one or two words),
  "confidence": number from 0 to 1
}
No prose. No code fences. One JSON object on a single line.
"""

review = "Setup was painless and audio is crisp, but the app crashes daily on Android."

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=200,
    system=schema_hint,
    messages=[{"role": "user", "content": review}],
)

raw = response.content[0].text.strip()
data = json.loads(raw)  # parses cleanly
print(data["sentiment"], data["topic"], data["confidence"])
```
Three reliability tricks worth knowing:
- State the format twice: once in `system`, once at the end of the user message ("One JSON object, nothing else").
- Refuse code fences explicitly. Models love wrapping JSON in `` ```json `` blocks.
- Prefill the assistant turn with the opening `{` so the model has nowhere to put prose:
```python
messages=[
    {"role": "user", "content": review},
    {"role": "assistant", "content": "{"},  # forces continuation from "{"
]
```
Prepend "{" to the response text before parsing. This trick alone removes most "the model added a sentence" failures.
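The fence-stripping and prefill tricks fit in one small parser. A sketch — `parse_model_json` is a hypothetical helper for this module, not an SDK function:

```python
import json

def parse_model_json(raw: str, prefilled: bool = False) -> dict:
    """Best-effort parse of a model reply that should be one JSON object."""
    text = raw.strip()
    if prefilled:
        # The assistant turn was prefilled with "{", so the reply is the
        # continuation; restore the opening brace before parsing.
        text = "{" + text
    if text.startswith("```"):
        # Strip accidental ```json ... ``` fences despite the instructions.
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    return json.loads(text)
```

Call it as `parse_model_json(response.content[0].text, prefilled=True)` when you used the assistant-prefill trick.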
For complex schemas, tool use (Module 8) is even more reliable than structured prompting. Use that when you need guarantees, not heuristics.
Pattern 3 · Self-critique (two-pass loop)
For high-stakes answers, ask Claude to draft, then critique its own draft, then revise.
```python
draft_prompt = (
    "Write a 4-sentence customer apology email for a delayed shipment. "
    "Be warm but specific."
)

review_prompt = """\
You are a strict editor. Read the draft below. List up to 3 concrete problems
(unclear claims, vague apologies, missing specifics). Then write an improved
version. Format:

PROBLEMS:
- ...

IMPROVED:
<the rewritten email>
"""

draft = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=300,
    messages=[{"role": "user", "content": draft_prompt}],
).content[0].text

revision = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=500,
    messages=[
        {"role": "user", "content": draft_prompt},
        {"role": "assistant", "content": draft},
        {"role": "user", "content": review_prompt},
    ],
).content[0].text

print("DRAFT:\n", draft, "\n\nCRITIQUE + REVISION:\n", revision)
```
Self-critique roughly halves obvious errors. It also doubles cost and latency, so reserve it for things that actually need it.
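Because `review_prompt` pins down a `PROBLEMS:` / `IMPROVED:` format, the second pass is easy to post-process. A sketch under that assumption — `split_critique` is a hypothetical helper, not part of the SDK:

```python
def split_critique(text: str) -> tuple[list[str], str]:
    """Split a critique reply into its problem list and improved draft."""
    if "IMPROVED:" not in text:
        # Model ignored the format; treat the whole reply as the revision.
        return [], text.strip()
    head, improved = text.split("IMPROVED:", 1)
    # Problems are the "- ..." bullets before the IMPROVED: marker.
    problems = [
        line.strip().lstrip("- ").strip()
        for line in head.splitlines()
        if line.strip().startswith("- ")
    ]
    return problems, improved.strip()
```

Logging the returned problem list over many runs tells you what your draft prompt keeps getting wrong, which is often more valuable than any single revision.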
When to reach for which pattern
| Situation | Pattern |
|---|---|
| Multi-step math, logic, or planning | Chain-of-thought |
| Output needs to be machine-readable | Structured outputs (or tool use, Module 8) |
| High-stakes content shown to users | Self-critique |
| Long generation (essay, doc, plan) | Outline → fill, with self-critique on the outline |
| Many independent items | Run cheaper model in parallel, escalate the disagreements |
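The last row of the table can be sketched without any API calls. A minimal, hypothetical orchestrator: `cheap` and `strong` stand in for wrappers around a cheaper and a stronger model, which you would supply yourself:

```python
def classify_with_escalation(items, cheap, strong, runs=3):
    """Run the cheap model several times per item; escalate disagreements.

    `cheap` and `strong` are callables (item -> label) standing in for
    model calls; they are illustrative, not part of the Anthropic SDK.
    """
    results = []
    for item in items:
        votes = [cheap(item) for _ in range(runs)]
        if len(set(votes)) == 1:
            results.append(votes[0])      # cheap runs agree: keep the answer
        else:
            results.append(strong(item))  # disagreement: escalate
    return results
```

The design choice: you pay the strong model only for the items where the cheap model is demonstrably unsure, which is usually a small fraction of the batch.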
Try changing one thing
- Run the bat-and-ball question on Sonnet without chain-of-thought. Most of the time it gets it right anyway — newer models reason quietly.
- Add `"Be terse. No more than 60 tokens of working."` to the chain-of-thought prompt. Latency drops noticeably.
- Break the structured-output schema deliberately (e.g. `"confidence": "high"`). Catch the parse error and ask Claude to fix it: you've just built a self-healing loop.
- Use a cheaper model for the draft and a stronger model for the critique. Often the best ROI.
Going deeper: open the notebooks
- `notebooks/01_introduction.ipynb` — chain-of-thought, decomposition, self-critique on real tasks (~1.5–2h)
- `notebooks/02_intermediate.ipynb` — versioned prompt libraries, robustness tests (~2–3h)
- `notebooks/03_advanced.ipynb` — JSON schemas, guardrails, prompts that prepare for tool use (~1.5–2.5h)
Module checklist
- [ ] You've seen chain-of-thought change an answer from wrong to right
- [ ] You've parsed JSON from Claude with `json.loads()` and not had to clean it first
- [ ] You've run a self-critique pass and caught a problem you hadn't spotted
- [ ] You can name one situation where each of the three patterns is the right call
Next module
Module 7 · Building Applications — turn these patterns into a real app with a clean structure.