Tagged articles
12 articles
Machine Heart
May 11, 2026 · Artificial Intelligence

UniVidX Sets New SOTA on Multiple Video Tasks – A Unified Multimodal Framework Presented at SIGGRAPH 2026

UniVidX, a unified multimodal framework for video generation and understanding accepted at SIGGRAPH 2026, reformulates diverse video graphics tasks as conditional generation, achieving or surpassing state‑of‑the‑art performance while demonstrating strong data efficiency and cross‑domain generalization.

Diffusion Models · SIGGRAPH 2026 · UniVidX
10 min read
Machine Heart
Apr 14, 2026 · Artificial Intelligence

Why Action‑Centric World Models Outperform Generalist: The GigaWorld‑Policy Breakthrough

The article critiques the goal‑driven focus of Generalist's world models, introduces the action‑centric GigaWorld‑Policy architecture that makes video generation optional, explains its three‑stage training pipeline, and presents experimental results showing ten‑fold training efficiency, 360 ms inference per step, and an 83% success rate on real‑robot tasks.

Action‑Centric Architecture · GigaWorld‑Policy · Transfer Scaling Law
11 min read
AI Explorer
Mar 5, 2026 · Artificial Intelligence

Can a Thousand Hours of Data Spark True AI Emergence?

An AI startup claims that training with only a thousand hours of data produced emergent intelligence and outperformed industry leaders in benchmark tests, prompting a debate over whether this represents a paradigm shift in efficient learning or an overhyped breakthrough requiring further validation.

AI · Model architecture · benchmark
5 min read
Wu Shixiong's Large Model Academy
Dec 12, 2025 · Artificial Intelligence

Why Fixing Bad Cases Beats Adding More Data in RLHF

In industrial RLHF, repairing bad cases—structural error samples—provides explicit alignment signals that improve model capability far more efficiently than simply increasing data volume, because it teaches the model how to correct mistakes rather than just exposing it to more examples.

Capability Improvement · Model Alignment · RLHF
9 min read
Data Party THU
Oct 24, 2025 · Artificial Intelligence

How 78 Samples Outperform 10,000: The LIMI Breakthrough in Agent AI

The paper introduces the LIMI framework, which achieves state‑of‑the‑art agent performance on AgencyBench using only 78 carefully crafted samples—outperforming baseline models trained on thousands of examples—by focusing on high‑quality, strategic data construction and demonstrating superior generalization across code, research, and tool‑use tasks.

AgencyBench · Agent AI · Benchmarking
11 min read
Amap Tech
Oct 6, 2025 · Artificial Intelligence

Breaking VLA Training Limits: World-Env’s Virtual Sandbox for Safe, Data‑Efficient Robotics

World-Env introduces a virtual training sandbox that eliminates physical interaction, dramatically improves data efficiency with just five expert demos per task, and employs a vision‑language model as a semantic judge to dynamically terminate actions, enabling safe, high‑performing VLA post‑training across diverse robotic benchmarks.

data efficiency · virtual environment · vision-language-action
9 min read
AIWalker
Jan 18, 2025 · Artificial Intelligence

How InternLM 3.0 Achieves High Performance with Just 4 TB of Training Data

Shanghai AI Laboratory’s InternLM 3.0 upgrade demonstrates that a refined dataset of 4T (4 trillion) tokens can lift a large language model’s performance beyond that of open‑source peers trained on 18T tokens, cutting training cost by over 75% while merging regular dialogue with deep reasoning capabilities.

AI Evaluation · InternLM · data efficiency
9 min read
AIWalker
Jan 17, 2025 · Artificial Intelligence

InternLM 3.0: Boosting Model Performance with Only 4 TB of Training Data

Shanghai AI Laboratory’s InternLM 3.0 upgrade demonstrates that refining data quality, measured as intelligence‑per‑token, can replace massive datasets, achieving higher reasoning and dialogue capabilities with just 4T (4 trillion) tokens and cutting training cost by over 75% while approaching GPT‑4‑level performance.

AI research · InternLM · Model Evaluation
9 min read
AIWalker
Jan 16, 2025 · Artificial Intelligence

How InternLM 3.0 Achieves High Performance with Just 4 TB of Training Data

InternLM 3.0 (InternLM‑3) upgrades the Shusheng‑PuYu model by refining data to boost "thinking density", using only 4T (4 trillion) tokens to surpass peer open‑source models and cutting training cost by over 75% while merging ordinary dialogue with deep reasoning capabilities.

InternLM · Model Evaluation · data efficiency
9 min read
Data Thinking Notes
Dec 26, 2024 · Artificial Intelligence

How AI Large Models are Transforming Industries: Trends, Challenges, and Opportunities

This report, jointly authored by Qianzhan Industry Research Institute, Shougang Fund CANPLUS, and Huawei Cloud, examines AI large‑model applications across four dimensions—overview, current status with case studies, pain points and solutions, and future trends—highlighting how these models boost production efficiency, balance cost, privacy, and performance, and elevate data’s role in industry.

AI · Artificial Intelligence · data efficiency
4 min read
Alibaba Cloud Developer
Apr 7, 2022 · Big Data

How Alibaba’s Big Data Model Governance Boosted Efficiency and Cut Costs

This article details Alibaba's large‑scale data model governance initiative, analyzing current data issues, presenting a comprehensive solution—including model digitization, public model sinking, productization, daily governance, and search‑enhancement—and outlining achieved results and future plans to further improve data quality, reuse, and operational efficiency.

Data Governance · DataWorks · Model Scoring
12 min read