DeepSeek founder Liang Wenfeng has personally confirmed in internal communications that the next-generation flagship model V4 will be officially released in late April. Leaked specifications point to a total parameter count nearing 1 trillion, support for roughly one million tokens of context, and full execution on Huawei Ascend chips, a move widely seen as a key step toward breaking Chinese AI's reliance on Nvidia.
(Background: DeepSeek V4 skips Nvidia and goes with Huawei! Alibaba, ByteDance, and Tencent rush to buy Huawei Ascend 950PR chips)
(Additional context: DeepSeek has launched “Expert Mode” and “Vision Mode”; is this a final warm-up before V4’s official release?)
According to a Sina Finance report citing insiders, DeepSeek founder Liang Wenfeng has revealed that the next-generation flagship model DeepSeek V4 will debut in late April. Although no official date has been announced, early warm-up signals have already reached the developer community: a V4-Lite variant is being tested on API nodes, with inference speed up 30% over the previous generation and a 94% context recall rate at 128K tokens.
Based on leaked and still unconfirmed information, the V4 architecture continues the Mixture-of-Experts (MoE) design: a total parameter count of about 1 trillion, with only about 37 billion parameters actually activated per token, maintaining DeepSeek’s trademark “meticulous accountant” approach to compute efficiency.
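The gap between total and activated parameters is the defining trait of MoE. The sketch below is a toy top-k router, not DeepSeek's actual implementation (V4's internals are not public); the dimensions, gating scheme, and expert count are all illustrative assumptions, but the mechanism, running only the few experts the gate selects per token, is the standard MoE idea.

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, top_k=2):
    """Toy top-k MoE layer: route the input to its top_k experts and
    combine their outputs, weighted by softmax gate scores.  Because
    only top_k experts execute per token, a model with ~1T total
    parameters can activate only a small fraction (e.g. ~37B) per token.
    This is an illustrative sketch, not DeepSeek's architecture."""
    scores = x @ gate_weights                    # gate score per expert
    chosen = np.argsort(scores)[-top_k:]         # indices of top_k experts
    probs = np.exp(scores[chosen] - scores[chosen].max())
    probs /= probs.sum()                         # softmax over chosen experts
    # Only the chosen experts' weight matrices are ever touched.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, chosen))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                             # illustrative sizes
x = rng.normal(size=d)                           # one token's hidden state
experts = rng.normal(size=(n_experts, d, d))     # 16 expert weight matrices
gates = rng.normal(size=(d, n_experts))          # gating projection
y = moe_forward(x, experts, gates, top_k=2)
print(y.shape)                                   # same shape as the input
```

With `top_k=2` out of 16 experts, only 1/8 of the expert parameters participate in each forward pass, which is the same compute-efficiency logic behind the leaked 37B-of-1T activation figure.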
As for the context window, V4 is expected to support ultra-long contexts of up to 1 million tokens through a new Engram module, putting it head-to-head with today’s top-tier models. The core idea of Engram is conditional memory lookup: the model accesses stored knowledge in O(1) time rather than at a cost that grows linearly with sequence length.
In terms of capabilities, leaked benchmark results show 90% on HumanEval and above 80% on SWE-bench Verified; if accurate, those figures would put V4 close to existing mainstream flagship models. On modalities, V4 natively accepts text, image, and video inputs; input pricing is reportedly about $0.30/MTok, continuing DeepSeek’s low-cost strategy.
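To put the leaked price in perspective, a quick back-of-envelope calculation (both the rate and the context size are unconfirmed leaks, and output-token pricing is unknown):

```python
# Input-cost arithmetic at the leaked ~$0.30 per million input tokens.
# The rate is an unconfirmed leak; output pricing is not known.
PRICE_PER_MTOK = 0.30  # USD per 1,000,000 input tokens (leaked figure)

def input_cost(n_tokens: int) -> float:
    """Cost in USD of sending n_tokens as input at the leaked rate."""
    return n_tokens / 1_000_000 * PRICE_PER_MTOK

# Even a prompt filling the rumored 1M-token window would cost about $0.30.
print(f"1M-token prompt:   ${input_cost(1_000_000):.4f}")
print(f"128K-token prompt: ${input_cost(128_000):.4f}")
```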
Beyond the technical specifications, the point drawing the most attention to V4 is a complete shift in hardware strategy: DeepSeek claims the entire model will run on Huawei Ascend 950PR chips, with no reliance on Nvidia GPUs.
The impact of this decision goes far beyond DeepSeek itself. Alibaba, ByteDance, and Tencent have already begun purchasing large quantities of Huawei’s next-generation chips. If V4 verifies that Ascend can support the training and inference demands of a top flagship model, it would be the most convincing real-world evidence to date of China’s AI industry chain moving toward chip self-reliance.
In this context, U.S. export control measures on Nvidia may instead become a catalyst that accelerates the maturity of China’s homegrown ecosystem.