GPT-4.1 at $2/1M vs Claude Sonnet 4.6 at $3/1M. Real pricing math for every model, with per-task cost examples and routing strategies that cut 50%+ off your bill.

The pricing landscape changed dramatically in 2026. OpenAI's GPT-4.1 undercuts Claude Sonnet on input tokens. Anthropic's Haiku 4.5 remains the cheapest capable model. Here's the real math.
Get AI pricing updates when models launch
Join 50+ engineering leaders. No spam.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context | Best For |
|---|---|---|---|---|
| GPT-5.4 | $2.50 | $10.00 | 128K | Flagship reasoning |
| GPT-4.1 | $2.00 | $8.00 | 128K | Code, instruction-following |
| GPT-4.1 mini | $0.40 | $1.60 | 128K | Balanced performance |
| GPT-4.1 Nano | $0.10 | $0.40 | 128K | High-volume simple tasks |
| GPT-4o | $2.50 | $10.00 | 128K | Multimodal |
Discounts: 50% prompt caching on cache hits. 50% batch API discount.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context | Best For |
|---|---|---|---|---|
| Claude Opus 4.7 | $5.00 | $25.00 | 200K | Complex analysis, research |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 200K | Balanced, coding |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Fast, cheap, capable |
Discounts: 90% prompt caching on read hits. 25% batch API discount.
| Tier | OpenAI | Anthropic | Cheaper |
|---|---|---|---|
| Flagship | GPT-5.4: $2.50/$10 | Opus 4.7: $5/$25 | OpenAI (2x cheaper) |
| Mid-tier | GPT-4.1: $2/$8 | Sonnet 4.6: $3/$15 | OpenAI (input 33% cheaper, output 47% cheaper) |
| Budget | GPT-4.1 Nano: $0.10/$0.40 | Haiku 4.5: $1/$5 | OpenAI (10x cheaper input) |
On raw pricing, OpenAI wins every tier in 2026. But pricing alone doesn't determine your real cost.
Anthropic gives 90% discount on cached prompt reads. OpenAI gives 50%.
If 70% of your tokens hit cache (common in production):
With heavy caching, Anthropic becomes cheaper on input.
Opus 4.7 introduced a new tokenizer that produces ~35% more tokens for the same text compared to older models. Your effective cost per word is higher than the rate card suggests.
Most applications generate 3-5x more output tokens than input. At the output tier:
OpenAI is nearly 2x cheaper on output. If your app is output-heavy (code generation, long-form writing), this matters more than input pricing.
Average: 400 input tokens, 200 output tokens per conversation.
| Provider | Model | Monthly Cost |
|---|---|---|
| OpenAI | GPT-4.1 Nano | $4.00 |
| OpenAI | GPT-4.1 mini | $24.00 |
| Anthropic | Haiku 4.5 | $70.00 |
| Anthropic | Sonnet 4.6 | $210.00 |
Winner: GPT-4.1 Nano — if quality is sufficient.
Average: 500 input tokens, 1,500 output tokens.
| Provider | Model | Monthly Cost |
|---|---|---|
| OpenAI | GPT-4.1 | $65.00 |
| Anthropic | Sonnet 4.6 | $120.00 |
| OpenAI | GPT-5.4 | $81.25 |
| Anthropic | Opus 4.7 | $200.00 |
Winner: GPT-4.1 — best code model per dollar.
Average: 2,000 input tokens (80% cached), 500 output tokens.
| Provider | Model | Monthly Cost (with caching) |
|---|---|---|
| Anthropic | Sonnet 4.6 (90% cache discount) | $98.00 |
| OpenAI | GPT-4.1 (50% cache discount) | $120.00 |
| Anthropic | Haiku 4.5 (90% cache discount) | $35.00 |
| OpenAI | GPT-4.1 mini (50% cache discount) | $56.00 |
Winner: Anthropic — caching discount dominates for RAG.
The best cost setup uses both. Route by task complexity:
import { CostLens } from 'costlens';
import OpenAI from 'openai';
const costlens = new CostLens({ apiKey: 'cl_...' });
const openai = costlens.wrapOpenAI(new OpenAI());
// CostLens routes simple prompts to cheaper models automatically
// Complex prompts stay on your default model
// You see exactly what each request costs in real-time
const resp openai.chat.completions.create({
model: 'gpt-4.1',
messages: [{ role: 'user', content: 'Summarize this document...' }],
});
// Terminal output: ✓ Routed to gpt-4.1-nano (saved $0.003)
Install in 30 seconds:
npm install costlens
No config needed — it tracks costs immediately and suggests cheaper routing opportunities.
Raw pricing winner: OpenAI (cheaper at every tier in 2026)
With heavy caching: Anthropic (90% cache discount beats OpenAI's 50%)
For code generation: GPT-4.1 ($2/$8) — best instruction-following per dollar
For high-volume simple tasks: GPT-4.1 Nano ($0.10/$0.40) — unbeatable
For long-context RAG: Haiku 4.5 with caching — cheapest capable option
Best overall strategy: Use both. Route intelligently. Track everything.
Prices sourced from OpenAI API pricing and Anthropic pricing. Last updated June 2026.
Liked this analysis? We publish one deep-dive per week.
AI pricing, model benchmarks, and real cost data.
See what AI is actually costing your team
Real data from a real engineering team. No sign-up required.