Claude API Basics
Call the Claude API from Python with confidence.
Calling messages.create once is easy. Wrapping it in code that doesn't fall over the first time the network blips, you hit a rate limit, or a user types something weird — that's this module.
By the end of this module you'll have:
- A working streaming response (text appears as Claude writes it, not after)
- Sensible error handling for the four or five things that actually go wrong in production
- A reusable client wrapper that retries safely without spamming the API
Time: about 1 hour for the basics, ~4 hours with all three notebooks.
Prerequisites: Modules 1, 2, and 3.
Stream a response (so users don't stare at a blank screen)
`messages.create` waits until the entire response is ready. For anything user-facing, that's a poor experience. Use `messages.stream` instead:
```python
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()
client = Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=600,
    messages=[{"role": "user", "content": "Write a 6-sentence pep talk for someone learning to code."}],
) as stream:
    for text_delta in stream.text_stream:
        print(text_delta, end="", flush=True)
    print()
    final = stream.get_final_message()

print(f"\n[done] {final.usage.input_tokens} in / {final.usage.output_tokens} out")
```
`text_stream` yields the text fragments as they arrive. `get_final_message()` returns the full message object once the stream closes — that's where `usage` lives.
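In practice you often want the streamed text afterwards as well — to save it, log it, or feed it back into the conversation. A small helper keeps the print-and-collect pattern in one place. This is a sketch, not part of the SDK: it accepts any iterable of strings, so in real use you'd pass it `stream.text_stream`, and in tests you can pass a plain list.

```python
def print_and_collect(text_deltas):
    """Print each text delta as it arrives and return the full text.

    Pass stream.text_stream from client.messages.stream(...) in real use;
    any iterable of strings works, which makes this trivial to test.
    """
    chunks = []
    for delta in text_deltas:
        print(delta, end="", flush=True)  # show progress immediately
        chunks.append(delta)
    print()  # finish the line once the stream ends
    return "".join(chunks)
```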
Errors you'll actually hit
The Anthropic SDK raises a small set of exceptions. Handle them on purpose:
```python
from anthropic import (
    Anthropic,
    APIConnectionError,   # network problem on your side
    APITimeoutError,      # request took too long
    RateLimitError,       # 429 — slow down
    APIStatusError,       # 4xx/5xx with a status code
    AuthenticationError,  # 401 — bad key
    BadRequestError,      # 400 — your prompt or args are wrong
)
```
A minimal but honest call site:
```python
try:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=400,
        messages=[{"role": "user", "content": user_input}],
    )
except AuthenticationError:
    raise SystemExit("Bad ANTHROPIC_API_KEY — check your .env file.")
except BadRequestError as e:
    # Fix the request shape; do not retry.
    raise SystemExit(f"Bad request: {e}")
except RateLimitError:
    # Retry with backoff (see below).
    raise
except (APIConnectionError, APITimeoutError):
    # Network-y; retry with backoff.
    raise
```
Two key rules:
- Never retry on `BadRequestError` or `AuthenticationError`. Those won't fix themselves.
- Always retry with backoff on `RateLimitError`, `APIConnectionError`, and `APITimeoutError`.
A tiny retry wrapper
```python
import time, random

from anthropic import Anthropic, RateLimitError, APIConnectionError, APITimeoutError

client = Anthropic()

TRANSIENT = (RateLimitError, APIConnectionError, APITimeoutError)

def call_with_retry(messages, *, model="claude-sonnet-4-6", max_attempts=4):
    for attempt in range(max_attempts):
        try:
            return client.messages.create(
                model=model,
                max_tokens=600,
                messages=messages,
            )
        except TRANSIENT:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter: ~1s, 2s, 4s plus 0–1s randomness.
            time.sleep((2 ** attempt) + random.random())
```
Things this does not do (and you might want later, but not yet):
- Per-user rate limiting on your side.
- Idempotency keys. Module 14 (Production Patterns) shows when those matter.
- Circuit breakers. Same — useful at higher scale, distracting now.
Do the simple thing first.
What to log (and what not to)
Log enough to debug a bad reply tomorrow. Log nothing that endangers users.
| Log this | Avoid logging this |
|---|---|
| Model id, latency, input/output token counts | Full user prompts containing PII |
| Whether you got a refusal or a normal reply | API keys, even hashed |
| The first 200 chars of the user message (truncated, hashed if sensitive) | Long prompt bodies in flat files indefinitely |
| Error type and request ID | Anything you can't justify in a privacy review |
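As a sketch of the left-hand column, here's one way to build a log record that follows those rules. The function and field names are illustrative, not from the SDK: the preview is truncated to 200 characters, and a hash of the full message lets you spot repeat prompts later without retaining the text itself.

```python
import hashlib

PREVIEW_CHARS = 200

def safe_log_record(user_message, *, model, latency_ms,
                    input_tokens, output_tokens, error_type=None):
    """Build a privacy-safe log record: ids, counts, a truncated preview.

    Never put the API key or the full prompt body in here.
    """
    return {
        "model": model,
        "latency_ms": latency_ms,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "error_type": error_type,
        # Truncated preview — enough to debug, not the whole prompt.
        "prompt_preview": user_message[:PREVIEW_CHARS],
        # Hash of the full message: correlates duplicates without storing text.
        "prompt_sha256": hashlib.sha256(user_message.encode("utf-8")).hexdigest(),
    }
```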
Try changing one thing
- Make the `system` arg part of `call_with_retry` so retried calls keep the same persona.
- Add a `timeout=30.0` arg to `client.messages.create` and trigger an `APITimeoutError` deliberately by setting a tiny timeout (e.g. `0.001`).
- Switch from `messages.stream` to `messages.create(..., stream=True)` and inspect the raw event types (`stream` is a higher-level helper around the same thing).
- Print `response.stop_reason` after a normal call. Recognise `"end_turn"`, `"max_tokens"`, and `"stop_sequence"` — that's your signal whether the response is complete.
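That last check is worth automating. Here's a minimal sketch (the helper name is mine, not the SDK's) that maps `stop_reason` to what you should do about it:

```python
COMPLETE = {"end_turn", "stop_sequence"}

def describe_stop(stop_reason):
    """Classify a response's stop_reason into an action."""
    if stop_reason in COMPLETE:
        return "complete"   # Claude finished on its own
    if stop_reason == "max_tokens":
        return "truncated"  # raise max_tokens or continue the turn
    return "unexpected"     # e.g. tool_use — covered in a later module
```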
Going deeper: open the notebooks
- `notebooks/01_introduction.ipynb` — message formats, streaming UX, what to log (~1.5–2h)
- `notebooks/02_intermediate.ipynb` — middleware, retries, idempotency, cost tracking (~2–3h)
- `notebooks/03_advanced.ipynb` — multi-tenant abuse prevention, caching, capacity planning (~1.5–2.5h)
Module checklist
- [ ] You've streamed a response token-by-token to your terminal
- [ ] You can name an exception you should retry and one you shouldn't
- [ ] You have a wrapper function that retries with backoff
- [ ] You can list three things to log and three things never to log
Next module
Module 5 · Tokens & Limits — what the numbers in `response.usage` actually cost you and how to keep them under control.