Data Party THU
Author

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.

316 articles · 0 likes · 14 views · 0 comments
Recent Articles

Data Party THU
Feb 7, 2026 · Artificial Intelligence

How AlphaGenome Decodes 98% of the Genome’s Dark Matter

Google DeepMind’s AlphaGenome, featured on Nature’s cover, reads up to one million DNA bases at once, predicts the functional impact of any mutation across gene expression, splicing, chromatin and protein binding, and outperforms prior models by more than double on key benchmarks.

AI · AlphaGenome · DeepMind
0 likes · 9 min read
Data Party THU
Feb 4, 2026 · Artificial Intelligence

How Sakana AI Redefines Long-Context Transformers: DroPE, REPO, and FwPKM Explained

This article analyzes Sakana AI's three recent papers that challenge traditional Transformer long‑sequence handling by removing positional embeddings, reconstructing position awareness, and adding a fast‑weight external memory, showing how each approach improves ultra‑long text understanding.

Memory Mechanism · Positional Embedding · Transformer
0 likes · 12 min read
Data Party THU
Feb 2, 2026 · Fundamentals

Why Standardize Data to Mean 0 and Variance 1?

The article explains that setting the mean to zero recenters the data around the origin, which helps optimization algorithms converge faster, while scaling the variance to one puts all features on a comparable scale so that no single feature dominates. Examples and visualizations show how standardization improves machine‑learning models.

data preprocessing · feature scaling · machine learning
0 likes · 5 min read
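The standardization described in the summary above can be sketched in a few lines. This is a minimal illustration, not code from the article: the `standardize` helper and the two sample feature columns are hypothetical, and the population standard deviation is assumed as the scaling factor.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Return z-scores: subtract the mean, then divide by the population std."""
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature has no spread to rescale; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Hypothetical features on very different scales.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
incomes = [20_000.0, 35_000.0, 50_000.0, 65_000.0, 80_000.0]

z_heights = standardize(heights_cm)
z_incomes = standardize(incomes)
```

After the transform, both columns have mean 0 and variance 1, so the income feature no longer outweighs height simply because its raw values are larger.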
Data Party THU
Feb 1, 2026 · Artificial Intelligence

How Tiny Perturbations Can Fool 95% Accurate Image Classifiers

Although popular models such as ResNet, VGG, and EfficientNet achieve over 95% accuracy on ImageNet, they can be easily misled by adversarial examples carefully crafted with FGSM, revealing deep learning’s inherent vulnerability and underscoring the need for robust defense strategies.

FGSM · PyTorch · adversarial examples
0 likes · 11 min read
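The FGSM step mentioned in the summary above perturbs an input by a small amount in the direction of the sign of the loss gradient: x_adv = x + eps * sign(dLoss/dx). The sketch below applies it to a toy logistic-regression classifier in plain Python rather than to the image models and PyTorch code the article covers. The weights, input, and `eps` value are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(dLoss/dx), cross-entropy loss."""
    p = predict(w, b, x)
    # For logistic regression with cross-entropy, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (true label y = 1).
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.2, -0.1, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.2)
```

In this toy setup, the clean input is classified as class 1 (probability above 0.5), while the perturbed input falls below 0.5 even though no coordinate moved by more than `eps`.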
Data Party THU
Feb 1, 2026 · Artificial Intelligence

How AutoLink Turns Schema Linking into an Interactive Database Exploration

AutoLink introduces an autonomous, iterative schema‑linking approach for Text‑to‑SQL that treats schema discovery as a progressive, agent‑driven exploration. It dramatically improves recall while cutting token costs and outperforms existing database‑level and element‑level methods on large benchmarks such as Spider 2.0‑Lite and BIRD.

Agent · AutoLink · Database Exploration
0 likes · 19 min read
Data Party THU
Jan 31, 2026 · Artificial Intelligence

Can LLMs Learn While Being Tested? Inside the TTT-Discover Breakthrough

The article examines the Test‑Time Training to Discover (TTT‑Discover) approach, which applies reinforcement learning during inference to let large language models continuously improve on single test problems, and reports strong results across mathematics, GPU kernel optimization, algorithm design, and biology.

AI research · LLM · Reinforcement learning
0 likes · 9 min read
Data Party THU
Jan 29, 2026 · Big Data

How a Tsinghua Big Data Program Turned a Chemistry PhD into an AI‑Powered Process Engineer

This article recounts a Tsinghua University PhD student's journey through a multidisciplinary big‑data training program, detailing the acquisition of AI and data‑science skills, the creation of novel algorithms like MicroFlowSAM and ImageRAG, and their successful application to chemical engineering research and industry projects.

Chemical Engineering · Industrial Application · Process Systems Engineering
0 likes · 8 min read
Data Party THU
Jan 26, 2026 · Artificial Intelligence

How PropMolFlow Boosts Property‑Guided Molecule Generation by Tenfold

PropMolFlow, a new flow‑matching model introduced by researchers from the University of Florida and NYU, generates property‑guided molecules up to ten times faster than prior SOTA methods while preserving chemical validity and achieving superior performance on benchmarks such as QM9.

AI drug discovery · PropMolFlow · computational chemistry
0 likes · 7 min read
Data Party THU
Jan 25, 2026 · Big Data

How Tsinghua’s Big Data Initiative Boosted Refinery Energy Forecasts with GRU

The Tsinghua University Big Data Capability Project applied GRU‑based deep learning, pulse‑event encoding, and advanced feature engineering to transform discrete refinery energy data into continuous sequences, achieving prediction accuracies of 84.2%, 82.7%, and 81.6% for fuel gas, medium‑pressure steam, and low‑pressure steam, respectively.

GRU · energy prediction · feature engineering
0 likes · 9 min read