Quick Answer
Claude API pricing across the 4 tiers (Opus, Sonnet, Haiku) with cache pricing, batch API discounts, and the routing strategy that cuts AI bills 60-80% at scale. Authoritative breakdown for production teams.
Quick Answer
Claude 4 Opus is $15/M input + $75/M output. Claude 4.5 Sonnet is $3/M input + $15/M output. Claude Haiku 4 is $0.25/M input + $1.25/M output. Prompt caching cuts cached input costs by 90%; the batch API gives 50% off non-real-time work. A production stack that routes intelligently between tiers typically runs at 25–40% of the per-token list price.
Full pricing table (per 1M tokens)
| Model | Input | Output | Cache write | Cache read | Best for |
|---|---|---|---|---|---|
| Claude 4 Opus | $15.00 | $75.00 | $18.75 | $1.50 | Agentic coding, complex reasoning, 200K+ context |
| Claude 4.5 Sonnet | $3.00 | $15.00 | $3.75 | $0.30 | Balanced workhorse — most production traffic |
| Claude Haiku 4 | $0.25 | $1.25 | $0.30 | $0.025 | Classification, extraction, batch tasks |
The 3 discounts that cut Claude bills 60–80%
1. Prompt caching (up to 90% off cached input)
Mark stable parts of your prompt (system prompt, tool definitions, few-shot examples) as cacheable. Subsequent calls within 5 minutes hit cache at 1/10th the price. For any chatbot or agent with a long system prompt, prompt caching alone cuts costs 50–80%.
2. Batch API (50% off non-real-time work)
For workloads that don't need sub-second response (overnight content generation, classification jobs, embeddings backfill), submit via the Batch API. 50% discount, 24-hour SLA. Most operators run 30–50% of total tokens through Batch.
3. Model routing (Haiku for 70% of calls)
Most production traffic is classification, extraction, summarization, or simple reasoning — Haiku handles all of it at 1/60th the Opus price. Reserve Opus for hard agentic loops and complex code reasoning. Sonnet sits in the middle for balanced general work.
Concrete cost examples
- AI customer-support bot, 10K tickets/mo: ~$60/mo at Sonnet, ~$8/mo at Haiku with prompt caching. Most teams overpay 7–10x by defaulting to Sonnet.
- Daily AI newsletter generation (RSS → digest): ~$3–8/mo running through Sonnet via Batch API.
- Agentic coding assistant for a small team: $200–800/mo at Opus depending on usage. Cache aggressively.
- Bulk content classification (1M items/mo): $40–80/mo on Haiku Batch.
Claude vs OpenAI pricing (head-to-head)
At the cheap tier, Haiku ($0.25/$1.25) costs ~67% more than GPT-4o-mini ($0.15/$0.60), but produces stronger reasoning per call so cost-per-correct-output often favors Haiku. At the Opus/GPT-4 tier, prices are comparable and quality varies by task. Full Claude vs GPT-4 comparison.
How to get a Claude API key
- Sign up at console.anthropic.com with email or Google.
- Add a payment method (required even for free credits).
- Navigate to API Keys → Create Key. Copy and store securely.
- Set a monthly spend limit to prevent runaway costs.
Detailed walkthrough: how to get and use a Claude API key.
Build production AI products with Claude
Pricing intelligence is one piece — the AI SaaS Builder course covers the full 30-day path from idea to $10K MRR using Claude API + Next.js + Supabase + Stripe. Real builds, not toy demos.