DeepSeek’s New Model V4? Exploring 1M‑Token Context and Updated Knowledge
DeepSeek has quietly launched its latest model. It reportedly supports a context window of up to 1 million tokens, extends its knowledge cutoff to May 2025, adopts a more enthusiastic response style, and remains a text-only system. Early tests show impressive coding and reasoning capabilities.
Key Updates in DeepSeek V4 (2026)
1. Ultra‑long context
The model can process up to 1,000,000 tokens in a single request, roughly ten times the 128K-token limit of V3.1. This makes it possible to feed in entire books (e.g., the three-volume “Three-Body” series) without chunking.
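In practice, a client still needs to check that a document fits the window before sending it. The sketch below uses a crude characters-per-token heuristic; the ratio, the `MAX_CONTEXT_TOKENS` constant, and the function names are assumptions for illustration, not published figures or API:

```javascript
// Rough pre-flight check before sending a long document to a
// 1M-token context window. ~4 chars/token is a crude heuristic
// for mixed prose; real tokenizers vary by language and content.
const MAX_CONTEXT_TOKENS = 1_000_000;
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function fitsInContext(text, reservedForAnswer = 8_192) {
  // Leave headroom for the model's reply.
  return estimateTokens(text) + reservedForAnswer <= MAX_CONTEXT_TOKENS;
}

// A ~900k-character "book" fits; a ~5M-character corpus does not.
console.log(fitsInContext("x".repeat(900_000)));   // true
console.log(fitsInContext("x".repeat(5_000_000))); // false
```

Under this estimate a full novel trilogy (a few million characters) sits near the limit, which matches the article's claim that chunking is no longer required for single books.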
2. Knowledge cutoff
Training data now includes events up to around May 2025. The model can answer questions about the first half of 2025 even though it runs without internet access.
3. Answer style
Responses are more expressive and nuanced; the model adopts a slightly more enthusiastic tone compared with earlier releases.
4. Modality
The release does not add visual perception: input remains limited to plain text and audio, and images can be handled only indirectly, by running OCR and feeding the extracted text to the model.
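The OCR-only path for images amounts to a small routing step on the client side. The sketch below illustrates that flow; `runOcr`, the input shapes, and the field names are hypothetical stand-ins, not part of any published API:

```javascript
// Pre-processing sketch for a text-only model: text passes through,
// images must go through OCR first, and other modalities are rejected.
function runOcr(image) {
  // Placeholder: a real pipeline would call an OCR engine here.
  return image.embeddedText || "";
}

function prepareInput(input) {
  switch (input.kind) {
    case "text":
      return input.content;
    case "image":
      // Only the text extracted by OCR ever reaches the model.
      return runOcr(input);
    default:
      throw new Error(`Unsupported modality: ${input.kind}`);
  }
}
```

The key point is that the model never sees pixels; anything visual that OCR cannot express as text (layout, charts, photos without captions) is lost before the request is made.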
Practical evaluation with @PaperAgent
We used the PaperAgent tool to request a single‑file HTML fireworks animation. The prompt was:
Create a breathtaking fireworks animation! Use a single HTML file, combine CSS and JavaScript, and make the screen burst into a vibrant night sky with multiple colors and explosion trajectories that loop automatically.
The model returned a complete index.html containing HTML, CSS, and JavaScript. The generated animation displayed smoother particle motion and richer color gradients than the V3.1 baseline, and the response was delivered noticeably faster.
<!DOCTYPE html>
<html>
<head>
<style>
body{margin:0;background:#000;overflow:hidden}
.spark{position:absolute;width:2px;height:2px;background:#fff;
animation:burst 1.5s ease-out forwards}
@keyframes burst{
from{transform:scale(0);opacity:1}
to{transform:scale(1.5);opacity:0}
}
</style>
</head>
<body>
<script>
// Palette of firework colours; one is picked at random per spark.
const colors=['#ff4b5c','#ffb84d','#4bffb5','#4b9eff'];
function launch(){
  // Create a spark at a random screen position and let the CSS
  // "burst" animation scale it up and fade it out.
  const s=document.createElement('div');
  s.className='spark';
  s.style.left=Math.random()*window.innerWidth+'px';
  s.style.top=Math.random()*window.innerHeight+'px';
  s.style.background=colors[Math.floor(Math.random()*colors.length)];
  document.body.appendChild(s);
  // Remove the element once the 1.5 s animation has finished.
  setTimeout(()=>s.remove(),1500);
}
// Fire a new spark every 100 ms so the display loops indefinitely.
setInterval(launch,100);
</script>
</body>
</html>

A separate logical-puzzle test showed that the model failed in the default “fast” mode but succeeded once the deeper reasoning mode was enabled.
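The puzzle result suggests a practical pattern: try the cheap fast mode first, validate the answer, and escalate to the deeper reasoning mode only on failure. A minimal sketch, with `solveFast`, `solveDeep`, and `validate` as hypothetical stand-ins for real API calls:

```javascript
// Fallback pattern: fast mode first, deep reasoning only when the
// fast answer fails validation.
function solveWithFallback(puzzle, { solveFast, solveDeep, validate }) {
  const fast = solveFast(puzzle);
  if (validate(puzzle, fast)) {
    return { answer: fast, mode: "fast" };
  }
  return { answer: solveDeep(puzzle), mode: "deep" };
}

// Toy usage: the "fast" solver guesses wrong, the "deep" solver is correct.
const result = solveWithFallback(
  { question: "2 + 2", expected: 4 },
  {
    solveFast: () => 5,
    solveDeep: (p) => p.expected,
    validate: (p, a) => a === p.expected,
  }
);
// result.mode === "deep", result.answer === 4
```

This only works when the answer is cheap to validate (puzzles, code that can be run, schema-checked output); for open-ended prose there is no reliable validator to trigger the escalation.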
Recent research releases (potential future integration)
mHC (Manifold‑Constrained Hyper‑Connection) – paper released during the New Year period; proposes a network architecture that constrains connections to a manifold to improve parameter efficiency.
Engram – paper and accompanying code released on 12 January 2026; introduces a “conditional memory” mechanism that lets the model store and retrieve context‑dependent embeddings.
OCR‑2 – open‑sourced on 27 January 2026; a visual‑compression model that achieves higher‑fidelity OCR and image‑to‑text conversion than the original OCR‑1.
It is not yet confirmed whether these techniques have been incorporated into the current V4 test build, but their timing suggests they may appear in upcoming updates.
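To make the Engram idea above concrete, a “conditional memory” can be loosely sketched as a store whose entries are keyed by context and retrieved by embedding similarity within that context only. This is an illustration of the concept as described here, not the paper's actual mechanism or code:

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Memory whose reads are conditioned on a context key: a query only
// matches entries written under the same context.
class ConditionalMemory {
  constructor() {
    this.store = new Map(); // context -> [{ embedding, value }]
  }
  write(context, embedding, value) {
    if (!this.store.has(context)) this.store.set(context, []);
    this.store.get(context).push({ embedding, value });
  }
  read(context, query) {
    const entries = this.store.get(context) || [];
    let best = null, bestSim = -Infinity;
    for (const e of entries) {
      const sim = cosine(query, e.embedding);
      if (sim > bestSim) { bestSim = sim; best = e.value; }
    }
    return best; // null when the context holds no entries
  }
}
```

Conditioning retrieval on context is what distinguishes this from a flat vector store: the same query vector can return different values depending on which context it is issued under.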
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
