Data Party THU
Author

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.

316 articles · 0 likes · 15 views · 0 comments
Recent Articles

Jan 21, 2026 · Artificial Intelligence

What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM

Based on recent commits to DeepSeek's FlashMLA repository, the article infers that the mysterious Model1 likely corresponds to DeepSeek‑V4 and details the architectural shifts involved: a 512‑dimensional head, full support for NVIDIA Blackwell GPUs, token‑level sparse MLA, and new mechanisms such as Value Vector Position Awareness and Engram.

DeepSeek, DeepSeek V4, GPU Optimization
0 likes · 6 min read
Jan 19, 2026 · Artificial Intelligence

How VersatileFFN Cuts Memory Use While Boosting LLM Performance

The article introduces Huawei's VersatileFFN, an adaptive wide‑and‑deep feed‑forward design for large language models that reuses parameters to slash memory consumption while delivering stronger inference, detailing its dual‑system inspiration, technical mechanisms, experimental gains, and implications for efficient LLM deployment.

Adaptive Computation, LLM, Transformer
0 likes · 8 min read
Jan 18, 2026 · Artificial Intelligence

Unlocking 3D Scene Synthesis: A Deep Dive into Neural Radiance Fields (NeRF)

This article explains the core principles of Neural Radiance Fields, detailing how a fully‑connected network maps 5‑D coordinates to color and density, the role of positional encoding and hierarchical sampling, and provides a complete PyTorch implementation with training and rendering examples.

3D Scene Representation, Hierarchical Sampling, NeRF
0 likes · 18 min read
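
The positional encoding mentioned in the summary can be sketched as follows. This is a minimal illustration in plain Python rather than the article's PyTorch code, and it assumes the frequency-band form from the NeRF paper, where each coordinate p is expanded into sin(2^k π p) and cos(2^k π p) for k = 0 … L−1. The function name `positional_encoding` and the parameter `num_freqs` are illustrative choices, and the output ordering may differ from any given implementation.

```python
import math


def positional_encoding(coords, num_freqs):
    """Expand each coordinate into sin/cos features at doubling frequencies.

    coords: iterable of floats (e.g. x, y, z), ideally normalized to [-1, 1].
    num_freqs: number of frequency bands L (the NeRF paper uses 10 for
               positions and 4 for viewing directions).
    Returns a flat list with 2 * num_freqs features per input coordinate.
    """
    encoded = []
    for p in coords:
        for k in range(num_freqs):
            angle = (2.0 ** k) * math.pi * p  # frequency doubles with each band
            encoded.append(math.sin(angle))
            encoded.append(math.cos(angle))
    return encoded
```

With three spatial coordinates and ten bands, the network input grows from 3 to 60 values, which is what lets a plain fully-connected network fit high-frequency detail.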
Jan 18, 2026 · Artificial Intelligence

OptScale: Probabilistic Optimal Stopping for Inference‑Time Scaling

OptScale introduces a probabilistic framework that determines the optimal number of inference samples needed to meet a target accuracy with a confidence guarantee, dramatically reducing token usage while maintaining or improving performance across various large‑language‑model benchmarks.

Optimal Stopping, Probabilistic Modeling, Token Efficiency
0 likes · 12 min read
Jan 17, 2026 · Industry Insights

Is AI Redefining Software Engineering? A 9‑Magnitude Earthquake Explained

The article analyzes how Andrej Karpathy's viral tweet sparked a seismic shift in software engineering, detailing the rapid rise of AI‑generated code, the emergence of AI agents as new programming abstractions, and practical steps developers and managers must take to stay relevant.

AI agents, AI programming, AI tools
0 likes · 13 min read
Jan 13, 2026 · Artificial Intelligence

How Engram’s ‘Lookup‑Compute Separation’ Boosts LLM Performance

DeepSeek’s newly open‑sourced Engram module introduces a scalable lookup‑based memory that separates knowledge retrieval from computation, enabling O(1) deterministic access and significantly improving large language model performance on knowledge‑heavy, reasoning, code, and math tasks without extra FLOPs.

LLM, Lookup, MoE
0 likes · 10 min read
Jan 8, 2026 · Industry Insights

How Ben Tossell Built AI‑Powered Projects with 3 B Tokens Using Only the CLI

Ben Tossell, a non‑programmer turned AI‑driven creator, spent four months consuming 3 billion tokens to launch multiple CLI‑based projects, illustrating a new programming paradigm where prompt engineering and system orchestration replace traditional coding, and sharing practical lessons, tools, and insights from his experiments.

AI, CLI, no-code
0 likes · 15 min read
Jan 8, 2026 · Fundamentals

Master Python Context Managers: Write Safer, Cleaner Code

Learn how Python’s context manager protocol works, explore both class‑based and generator‑based implementations, and see practical examples—from file handling and database transactions to async operations—so you can prevent resource leaks, ensure exception safety, and write more maintainable code.

Async, Exception Handling, best-practices
0 likes · 11 min read
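
As a quick illustration of the two implementation styles the summary names, the sketch below writes the same resource wrapper once as a class and once as a generator. The names `ManagedResource`, `managed_resource`, and `run_demo` are hypothetical and not taken from the article; the "resource" is just a list that records lifecycle events.

```python
from contextlib import contextmanager


class ManagedResource:
    """Class-based context manager that records lifecycle events in `log`."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("acquire")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called whether or not the body raised; returning False
        # lets any exception propagate to the caller.
        self.log.append("release")
        return False


@contextmanager
def managed_resource(log):
    """Generator-based equivalent built on contextlib.contextmanager."""
    log.append("acquire")
    try:
        yield log
    finally:
        # The finally clause guarantees cleanup even if the body raises.
        log.append("release")


def run_demo():
    """Exercise both styles; the class-based body raises on purpose."""
    class_log, gen_log = [], []
    try:
        with ManagedResource(class_log):
            raise ValueError("boom")
    except ValueError:
        pass
    with managed_resource(gen_log):
        gen_log.append("work")
    return class_log, gen_log
```

In both versions the release step runs even when the body fails, which is the exception-safety property the summary refers to.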
Jan 7, 2026 · Artificial Intelligence

Why the Common KL Penalty in LLM RL Training Is Biased—and How to Fix It

A recent study reveals that the KL regularization widely used in LLM reinforcement learning with verifiable rewards (RLVR) is mathematically biased, leading to unstable training and poorer generalization, and shows that moving the KL term back into the reward with a simple K1 estimator can boost out‑of‑domain performance by up to 20%.

AI research, KL regularization, LLM training
0 likes · 10 min read
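
To make "KL in the reward with a K1 estimator" concrete, here is a minimal sketch. It assumes the common definition of the k1 estimator, the per-sample log-ratio log π(y) − log π_ref(y) for a response y sampled from the current policy π, and a shaped reward of the form r − β · k1. The function names and the coefficient `beta` are hypothetical, and the study's exact formulation may differ.

```python
def k1_estimate(logp_policy, logp_ref):
    """Per-sample k1 estimate of KL(policy || ref).

    Both arguments are log-probabilities of the same sampled response,
    and the sample is assumed to come from the policy.
    """
    return logp_policy - logp_ref


def kl_shaped_reward(task_reward, logp_policy, logp_ref, beta):
    """Fold the KL penalty into the reward: r' = r - beta * k1.

    `beta` is a hypothetical penalty coefficient chosen by the user.
    """
    return task_reward - beta * k1_estimate(logp_policy, logp_ref)


def mean_k1(logps_policy, logps_ref):
    """Average k1 over a batch; unbiased when samples are drawn from the policy."""
    pairs = list(zip(logps_policy, logps_ref))
    return sum(k1_estimate(p, r) for p, r in pairs) / len(pairs)
```

Individual k1 values can be negative even though the true KL divergence is not; only the average over policy samples estimates the divergence.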