
Anthropic today released Claude Opus 4.8an update to its flagship model that ships at the same price as its predecessor, along with a dramatically cheaper model. "fast mode" level and a new feature that allows the model to generate hundreds of parallel subagents for work at codebase scale.
The model is immediately available on Anthropic platforms (claude.ai, Claude Code, API and Cowork) at an unchanged price: $5 per million input tokens and $25 per million output tokens. Developers can call it like claude-opus-4-8.
The main issue of efficiency is fast mode. Anthropic has reduced the price of running Opus 4.8 in fast mode, where the model produces tokens at about 2.5 times the normal speed, to $10 per million input tokens and $50 per million output tokens, compared to $30/$150 for Opus 4.7.
This represents a three-fold reduction in the price of Fast Mode on previous models and puts high-performance inference within reach of latency-sensitive production workloads.
Quick mode is available immediately in Claude Code via the /fast domain; API access is closed, with a waiting list in claude.com/fast-mode.
In normal mode, Claude Opus 4.8 is still among the most expensive leading frontier models, but it is still below the GPT-5.5 of its main rival OpenAI.
Frontier AI Model API Pricing Overview
|
Model |
Input |
Production |
Total cost |
Fountain |
|
Flash MiMo-V2.5 |
$0.10 |
$0.30 |
$0.40 |
|
|
deepseek-v4-flash |
$0.14 |
$0.28 |
$0.42 |
|
|
deepseek-v4-pro |
$0.435 |
$0.87 |
$1,305 |
|
|
Minimax M2.7 |
$0.30 |
$1.20 |
$1.50 |
|
|
Gemini 3.1 Flash-Lite |
$0.25 |
$1.50 |
$1.75 |
|
|
MiMo V2.5 |
$0.40 |
$2.00 |
$2.40 |
|
|
Kimi-K2.6 |
$0.95 |
$4.00 |
$4.95 |
|
|
GLM-5 |
$1.00 |
$3.20 |
$4.20 |
|
|
Grok 4.3 low context |
$1.25 |
$2.50 |
$3.75 |
|
|
GLM-5.1 |
$1.40 |
$4.40 |
$5.80 |
|
|
Claude Haiku 4.5 |
$1.00 |
$5.00 |
$6.00 |
|
|
Grok 4.3 high context |
$2.50 |
$5.00 |
$7.50 |
|
|
Qwen3.7-Max |
$2.50 |
$7.50 |
$10.00 |
|
|
Gemini 3.5 Flash |
$1.50 |
$9.00 |
$10.50 |
|
|
Gemini 3.1 Pro Preview ≤200K |
$2.00 |
$12.00 |
$14.00 |
|
|
GPT-5.4 |
$2.50 |
$15.00 |
$17.50 |
|
|
Gemini 3.1 Pro Preview >200K |
$4.00 |
$18.00 |
$22.00 |
|
|
Close Job 4.8 |
$5.00 |
$25.00 |
$30.00 |
|
|
GPT-5.5 |
$5.00 |
$30.00 |
$35.00 |
Modest gains over 4.7, but Mythos-class capabilities coming
As far as benchmarks go, Opus 4.8 is a step forward rather than a leap. It scores 88.6% in SWE-bench Verified (vs. 87.6% for Opus 4.7), 69.2% in the tougher SWE-bench Pro (vs. 64.3%), and 74.6% in Terminal-Bench 2.1 (vs. 66.1%). Anthropic itself characterizes the model as "a modest but tangible improvement over its predecessor."
It outperforms the regular GPT-5.5 in at least 12 benchmarks, including most knowledge work, coding (at the problem level), agent tool usage, and long context benchmarks. GPT-5.5 wins in terminal/CLI workflows and is roughly tied in web browsing and graduate sciences.
The biggest sign is on Anthropic’s internal capabilities scale: Opus 4.8 lands between Opus 4.7 and the more capable Claude Mythos Preview, which is currently restricted to a small number of organizations under Project Glasswing for cybersecurity work.
Anthropic says it hopes to bring "Mythos class models for all our clients in the coming weeks." once additional cyber safeguards have been implemented.
Several business partners cited material gains. Databricks reported that Opus 4.8 is unlocked "a radical change in agent reasoning" within your Genie data agent, in "Token cost 61% cheaper than Opus 4.7" thanks to multimodal efficiency in PDF files and diagrams.
Hebbia cited better citation accuracy and greater symbolic efficiency in dense financial presentations. Cognition, maker of Devin, said the launch "directly translates into faster capacity gains for engineers" and noted that Opus 4.8 fixed comment verbosity and tool call issues from 4.7. One computer usage provider reported 84% on Online-Mind2Web, a jump over Opus 4.7 and GPT-5.5.
Dynamic workflows: hundreds of parallel subagents
Along with the model, Anthropic released a research preview of dynamic workflows in Claude Code, a feature designed for tasks too large for a single context window. Claude plans the work, generates hundreds of parallel subagents, and then checks his own results before reporting. The Anthropic example: a codebase-scale migration "across hundreds of thousands of lines of code from inception to merge, with the existing test suite as the bar."
Dynamic workflows are available on Claude Code’s Enterprise, Team, and Max plans.
Two smaller additions complete the release:
-
Effort control in claude.ai and Claude Cowork: A new dial allows users to dial in how much Claude thinks per answer: higher effort spends more tokens to get better answers, lower effort answers faster and burns through speed limits more slowly. Available on all plans.
-
System entries within the messages array in the API: Developers can now update Claude instructions mid-task (adjusting permissions, token budgets, or environment context while an agent is running) without breaking the message cache.
Honesty and a "evaluation awareness" warning
Anthropic leads with honesty as its main trait. The company’s alignment team reports that Opus 4.8 is "about four times less likely than your predecessor to allow flaws in the code you have written to go unnoticed," and that misaligned behavior rates are now "substantially lower than Opus 4.7, and similar to our top-ranked model, Claude Mythos Preview."
In fact, a bar chart published by Anthropic shows how close Opus 4.8 is to the still selectively released Mythos in terms of its misalignment (a lower score is better), coming in at about 1.9, below Opus 4.7’s 2.5 and effectively tied with the more capable and restrained Mythos Preview. The score is based on approximately 2,600 simulated research sessions per model.
He 244-page system card published publicly by Anthropic also goes into greater detail about specific categories of misalignment: whether a model produces potentially harmful content around "military grade weapons," "harmful sexual content", "cybercrime not allowed"and "undermining liberal democracy," and again, in all of them, Opus 4.8 scores noticeably better than 4.7 or Sonnet 4.6, and comes pretty close to Mythos.
Anthropic flags a discovery that considers "the most worrying" From training: Opus 4.8 shows an increasing tendency to reason explicitly about how his results will be graded, even in environments where he was not told he was being evaluated. In other words: the model knows that you are likely to be graded and produces an answer that it believes will get you a good grade on the test, not one that you would necessarily produce if you thought you would not be graded.
Anthropic says this did not translate into worse observable behavior (Opus 4.8 shows fewer misleading claims of task success than previous models), but calls it "a worrying trend that could complicate training in the future." Preliminary interpretability work also found nonverbalized learner-related reasoning in approximately 5% of training episodes.
Anthropic ran the model through a week-long live bug bounty for a rapid injection (for the first time) and concluded that Opus 4.8 ranks between Opus 4.7 and Sonnet 4.6 in robustness, ahead of "all comparable frontier models" tested, with safeguards in place that reduce the success rates of browser usage attacks to almost zero.
What’s next?
Anthropic sparked two trajectories. In the short term: cheaper models that provide "many of the same capabilities as Opus." Longer term: Mythos-class models, which the company says represent higher intelligence than Opus but require stronger cyber safeguards before general release.
For now, Opus 4.8 is positioned as the new enterprise and development workhorse: a little smarter than 4.7, dramatically cheaper to run quickly, and noticeably more honest about what it doesn’t know.





