
DeepSeek today released preview versions of its highly anticipated V4 AI models, once again narrowing the gap with leading AI models from the world’s largest technology companies.
The Chinese startup released two open-source models: a high-performance V4 Pro model and a smaller, cheaper V4 Flash model. The company presents both as competitive with cutting-edge systems, highlighting strong coding performance, improved reasoning, and more advanced agentic capabilities.
One of the most striking updates is the jump to a 1 million token context window, which allows models to process entire code bases or extremely long documents in a single message.
But what really sets these models apart is their focus on efficiency.
The V4 models are built on a mixture-of-experts (MoE) architecture, a design that activates only a subset of the model's parameters at any given time. While the system may have trillions of parameters in total, only a fraction are used per task, keeping inference costs low.
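To see why MoE keeps inference cheap, here is a toy sketch of top-k expert routing in plain NumPy. The sizes, the router, and the top-2 choice are all illustrative assumptions, not details of DeepSeek's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: 8 experts, but each token is routed to only the top 2.
NUM_EXPERTS, TOP_K, D = 8, 2, 16
experts = [rng.normal(size=(D, D)) for _ in range(NUM_EXPERTS)]  # expert weights
router = rng.normal(size=(D, NUM_EXPERTS))                       # gating network

def moe_forward(x):
    logits = x @ router                    # score every expert for this token
    top = np.argsort(logits)[-TOP_K:]      # keep only the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts
    # Only TOP_K experts actually run, so compute scales with k,
    # not with the total parameter count.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.normal(size=D))
```

The total parameter count here is 8 expert matrices, but each forward pass touches only 2 of them; scaled up, that is how a trillion-parameter model can serve requests at a fraction of a dense model's cost.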
The new models come just over a year after DeepSeek first made headlines with its R1 reasoning model. That system rivaled advanced models from companies like OpenAI and Google, but was reportedly built at a fraction of the cost and used fewer AI chips for training. The news even sparked a trillion-dollar sell-off on Wall Street, with Nvidia alone losing almost $600 billion in market value in a single day.
In a technical document, the company says its latest models are competitive, though it acknowledges a small performance gap.
“Through the expansion of reasoning tokens, DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks,” the company said. “However, its performance is slightly below that of GPT-5.4 and Gemini-3.1-Pro, suggesting a development trajectory that lags behind the state-of-the-art models by approximately 3 to 6 months.”
Still, for many users, the cost savings can offset any slight performance shortfalls.
Datasette creator Simon Willison compared token prices across major models in his blog and found DeepSeek to be the cheapest in its class.
DeepSeek is charging $0.14 per million input tokens and $0.28 per million output tokens for its V4 Flash model. For comparison, GPT-5.4 Nano costs $0.20 per million input tokens and $1.25 per million output tokens, while Claude Haiku 4.5 is priced at $1 and $5 per million input and output tokens, respectively.
The gap becomes even starker when it comes to professional models. DeepSeek charges $1.74 per million input tokens and $3.48 per million output tokens for its V4 Pro model. In comparison, Gemini 3.1 Pro costs $2 per million input tokens and $12 per million output tokens, while GPT-5.5 is priced at $5 and $30 per million input and output tokens, respectively.
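Those per-million-token rates are easy to turn into concrete dollar figures. The snippet below uses the prices quoted above on a hypothetical job of 200,000 input tokens and 20,000 output tokens (the job size is an assumption for illustration):

```python
# (input $, output $) per million tokens, as quoted in the article
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT-5.4 Nano": (0.20, 1.25),
    "Claude Haiku 4.5": (1.00, 5.00),
}

def cost(model, in_tokens, out_tokens):
    """Dollar cost of a request at the listed per-million-token rates."""
    p_in, p_out = PRICES[model]
    return (in_tokens * p_in + out_tokens * p_out) / 1_000_000

# Hypothetical job: 200k tokens in, 20k tokens out.
for model in PRICES:
    print(f"{model}: ${cost(model, 200_000, 20_000):.4f}")
```

On that job, V4 Flash works out to roughly 3.4 cents versus 6.5 cents for GPT-5.4 Nano and 30 cents for Claude Haiku 4.5, and the output-token price drives most of the gap.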
And of course, in line with previous DeepSeek releases, V4 is MIT-licensed and open-weight, so if you have the resources to run it, it's "free" in the same way a movie on Netflix is: no one charges you at the moment you press play, but the meter is still running somewhere. In this case, it's your electricity bill.
