Tagged articles

Scaling Law

41 articles · Page 1 of 1

Jun 19, 2026 · Artificial Intelligence

Beyond SONIC: Humanoid Robot Cerebellum Hits GPT‑Level Performance with 2 B Motion‑Capture Frames

Galaxy General unveils AstraBrain‑WBC 0.5, a transformer‑based humanoid robot control model that scales from 200 K to 2 billion motion‑capture frames, achieving up to 92.58% tracking success, 0.39 ms latency, and a five‑fold speedup over TWIST, thereby confirming a scaling law for robot motion control.

AstraBrain-WBC · DAgger Distillation · Humanoid Robot

0 likes · 16 min read

Machine Heart

May 22, 2026 · Artificial Intelligence

How Data and Algorithms Enable Embodied Intelligence Scaling – GigaAI’s Dual‑Pyramid Physical AGI

GigaAI unveiled a dual‑pyramid framework that couples a five‑layer data hierarchy with a three‑layer algorithm hierarchy, demonstrated top‑ranked benchmark results, announced a hundred‑robot home deployment and a 12‑month roadmap toward a physical AGI "GPT‑3 moment".

Dual-Pyramid Architecture · Embodied Intelligence · GigaAI

0 likes · 13 min read

Machine Heart

Apr 29, 2026 · Artificial Intelligence

Beyond VLA and World Models: Galaxy General Unveils LDA‑1B to Scale Embodied Data

LDA‑1B unifies world modeling and VLA in a latent dynamics action model, ingesting over 30 000 hours of heterogeneous embodied data via a five‑layer AstraData pipeline, employing a unified end‑effector space and quality‑based data allocation, and achieving state‑of‑the‑art success rates on RoboCasa‑GR1 while being fully open‑sourced.

Embodied AI · Scaling Law · data ingestion

0 likes · 13 min read

DataFunTalk

Apr 28, 2026 · Artificial Intelligence

Manifold AI’s WorldScape 0.2 Tops WorldArena: How MoE Drives Superior Physics and 3D Understanding

Manifold AI’s WorldScape 0.2 achieved the highest overall score on the embodied world‑model benchmark WorldArena, outperforming giants like Google and Nvidia by excelling in comprehensive perception, physics compliance, and 3D accuracy while using only about 10 % of the parameters of competing models, thanks to a newly introduced MoE architecture.

Embodied AI · MoE · Scaling Law

0 likes · 9 min read

AI Explorer

Apr 27, 2026 · Artificial Intelligence

Reinforcement Learning Scaling Law Shows How RL Fine‑Tuning Boosts Large Model Reasoning

A new study by USTC and Shanghai AI Lab uncovers a power‑law scaling relationship between RL fine‑tuning compute and large‑model reasoning performance, offering a quantitative way to predict and control AI capability growth.

AI research · Scaling Law · large language models

0 likes · 7 min read

Machine Heart

Apr 27, 2026 · Artificial Intelligence

ACL 2026: Unveiling a Predictive Scaling Law for Reinforcement Learning Fine‑Tuning of Large Models

The paper presents a systematic empirical study that derives a power‑law scaling formula for reinforcement‑learning post‑training of large language models, demonstrating accurate inter‑ and intra‑model performance prediction, learning‑efficiency saturation, data‑reuse benefits, and cross‑architecture validity.

Data Reuse · Llama 3 · Model Efficiency

0 likes · 11 min read

Machine Heart

Apr 27, 2026 · Artificial Intelligence

Domestic World Model Claims Dual Crown, Surpassing Google and Nvidia via MoE Scaling

Manifold AI's WorldScape 0.2 topped the WorldArena benchmark by excelling in visual quality, physics compliance, and 3D accuracy, while using only 10% of the parameters of competing models, thanks to a newly introduced MoE architecture that drives a new scaling law for world models.

Embodied AI · Manifold AI · Mixture of Experts

0 likes · 8 min read

Machine Learning Algorithms & Natural Language Processing

Apr 17, 2026 · Artificial Intelligence

Can Table Modeling Scale? Rethinking Tree Models in the Age of Massive Compute

The article examines how the dramatic increase in GPU compute power—illustrated by a single H100 GPU equaling about 200 Hadoop instances—challenges the dominance of tree‑based models for structured data, presents scaling‑law experiments with KMLP and FOUND, and argues that pre‑training can redefine the balance between compute, data, and algorithms.

FOUND · GPU · KMLP

0 likes · 10 min read

Machine Heart

Apr 17, 2026 · Artificial Intelligence

Can Table Modeling Scale? Rethinking the Tree Model Era Amid Compute Shifts

The article examines how a single NVIDIA H100 GPU delivers roughly 200‑fold more FP16 compute than a 96‑core CPU Hadoop node, explores the "Bitter Lesson" of scaling‑driven AI breakthroughs, and presents large‑scale pretraining experiments that show table and sequence models now exhibit clear scaling laws, challenging the dominance of traditional tree‑based approaches.

FOUND · KMLP · Scaling Law

0 likes · 10 min read

DeWu Technology

Apr 15, 2026 · Industry Insights

How Generative AI is Transforming Recommendation: A Deep Dive into DeWu’s Recall System

This article analyzes DeWu's generative recall system, detailing its background, technical design of the Generative and Rerank models, inference workflow, experimental gains in core consumption and diversity metrics, and future engineering directions such as framework migration, LLM integration, and multimodal generation.

Generative AI · Industry insight · Scaling Law

0 likes · 12 min read

Tencent Advertising Technology

Mar 23, 2026 · Industry Insights

Why Tencent’s $885K KDD Cup Challenge Could Redefine Recommendation Systems

The 2026 KDD Cup, powered by Tencent’s Advertising Algorithm Competition with an $885,000 prize pool, challenges participants to unify sequence modeling and feature interaction in large‑scale recommendation systems, offering academic publication paths, real‑world deployment opportunities, and strict latency constraints that push both research and engineering innovation.

AI · KDD Cup · Recommendation Systems

0 likes · 16 min read

Machine Learning Algorithms & Natural Language Processing

Mar 15, 2026 · Artificial Intelligence

Is RL Dead in LLM Post-Training? MIT’s RandOpt Challenges Traditional Methods

The MIT‑CSAIL paper introduces RandOpt, a single‑step, gradient‑free, fully parallel post‑training algorithm that adds Gaussian noise to pretrained LLM weights and ensembles the results, achieving or surpassing PPO/GRPO performance by exploiting dense "neural thickets" that emerge as model scale grows.

Ensemble · LLM · RandOpt

0 likes · 12 min read

Baidu Intelligent Cloud Tech Hub

Nov 7, 2025 · Artificial Intelligence

From Big Data to 30,000‑GPU Clusters: The Evolution of China’s AI Infrastructure

In an in‑depth interview, Baidu AI Computing chief scientist Wang Yanpeng and host Koji trace China's internet infrastructure from the early big‑data era through cloud computing to today's AI boom, highlighting the pivotal role of compute power, GPU acceleration, data scaling, and Baidu's Baige platform in shaping the AI arms race.

AI Infrastructure · Baidu Baige · Cloud Computing

0 likes · 26 min read

JD Tech

Nov 6, 2025 · Artificial Intelligence

LLMs Revolutionize Recommendation Systems: From Generative Models to Production

This article surveys the evolution of generative recommendation systems powered by large language models, detailing their technical foundations, engineering challenges, recent breakthroughs, and future research directions, while highlighting why the paradigm shift is occurring now.

AI Engineering · LLM · Recommendation Systems

0 likes · 30 min read

Baobao Algorithm Notes

Oct 31, 2025 · Artificial Intelligence

Unlocking LLM RL Scaling: The Best Practices from Meta’s New Study

Meta’s recent paper reveals a sigmoid‑shaped scaling law for LLM reinforcement learning, presents extensive 40‑k GPU‑hour experiments, compares various RL designs such as PPO‑off‑policy‑k and Pipeline‑RL‑k, and distills the findings into a practical “ScaleRL” recipe that improves performance and efficiency.

LLM · RL Optimization · Scaling Law

0 likes · 10 min read

AntTech

Oct 29, 2025 · Artificial Intelligence

Inside Ant’s Baoling: Balancing Efficiency and Reasoning in a 1‑Trillion‑Parameter Model

At the Ant Star Innovation Journey event, the Baoling team unveiled their roadmap for trillion‑parameter models, detailing the development of Ling‑1T, Ring‑1T and multimodal Ming series, the scaling‑law‑guided architecture, training innovations, evaluation methods, and open‑source releases that aim to advance efficient, high‑performance AI.

Efficient Inference · Large Language Model · Scaling Law

0 likes · 24 min read

DataFunSummit

Sep 11, 2025 · Artificial Intelligence

How Meituan’s MTGR is Redefining Generative Recommendation at Scale

This article explains why Meituan introduced a generative recommendation model, describes the MTGR architecture, data organization, training and inference engines built on TorchRec and TensorRT, reports performance gains and cost reductions, and outlines future directions such as simplifying the recommendation funnel and cross‑business heterogeneous modeling.

Inference Optimization · MTGR · Scaling Law

0 likes · 15 min read

DataFunSummit

Sep 9, 2025 · Artificial Intelligence

How Baidu’s GRAB Model Uses Scaling Laws to Transform Ad Ranking

This article explains Baidu's generative ranking model GRAB, detailing how scaling laws from large language models inspire a new recommendation paradigm, the model's architecture, custom attention mechanisms, training strategies, deployment optimizations, and the resulting business gains in CTR and revenue.

Baidu · CTR Prediction · Generative AI

0 likes · 22 min read

Data Party THU

Jul 29, 2025 · Artificial Intelligence

Can 2‑Simplicial Attention Outperform Standard Transformers? A Deep Dive

This article reviews Meta's rotation‑invariant 2‑simplicial attention, explains its trilinear formulation and windowed implementation, analyzes its impact on scaling laws compared with standard dot‑product attention, and presents experimental results showing when the new mechanism offers advantages.

2-simplicial attention · Meta · Neural architecture

0 likes · 12 min read

AI Frontier Lectures

Jul 10, 2025 · Artificial Intelligence

Can 2‑Simplicial Attention Redefine Transformer Scaling Laws?

A recent Meta paper introduces a rotation‑invariant 2‑simplicial attention mechanism, demonstrates its superior scaling‑law coefficients over standard dot‑product attention, and provides experimental evidence of improved token efficiency and model performance under constrained token budgets.

2-simplicial · Meta · Scaling Law

0 likes · 11 min read

JD Tech

Jun 16, 2025 · Artificial Intelligence

How JD Engineers Leverage LLMs and Sparse Models to Boost Search and Ads

This article showcases three JD tech case studies—using large language models for e‑commerce query expansion, applying sparse large models with scaling‑law experiments to improve ad prediction, and building proactive risk‑prevention systems—to illustrate practical AI engineering that drives higher recall, conversion, and system robustness.

Advertising · Large Language Model · Query Expansion

0 likes · 8 min read

Meituan Technology Team

May 15, 2025 · Artificial Intelligence

How Meituan’s MTGR Framework Achieved 65× Faster Inference with Scaling Laws

Meituan’s recommendation team introduced the MTGR framework, aligning traditional DLRM features with a unified HSTU‑based Transformer to explore scaling laws, delivering a 65‑fold FLOPs boost, 12% lower inference cost, and record gains in online CTR and order volume across its food‑delivery platform.

Inference Optimization · Large‑Scale Training · MTGR

0 likes · 26 min read

Network Intelligence Research Center (NIRC)

Apr 9, 2025 · Artificial Intelligence

Why Scaling Laws Fail for Video MLLMs: Uncovering the Temporal Hacking Problem

The article analyzes the anti‑scaling phenomenon in video large‑language models, identifies a “temporal hacking” shortcut where models focus on a few key frames, formalizes it via reward‑hacking theory, introduces the Temporal Perplexity (TPL) metric, and proposes an Unhackable Temporal Rewarding (UTR) framework to mitigate the issue.

Scaling Law · Temporal Perplexity · UTR

0 likes · 14 min read

Architects' Tech Alliance

Mar 28, 2025 · Artificial Intelligence

How DeepSeek Leverages Huawei Ascend to Boost AI Inference Efficiency

The report analyzes DeepSeek's latest V3 and R1 models, highlights their scaling‑law‑driven cost reductions, explains how Huawei Ascend optimizes inference by cutting KV‑Cache storage and improving compute efficiency, and surveys the model’s deployments across finance, government, manufacturing, and healthcare sectors.

AI efficiency · AI inference · DeepSeek

0 likes · 4 min read

Alimama Tech

Mar 14, 2025 · Artificial Intelligence

Advances in Search Advertising Models with Large Language Models (2024)

In its 2024 review, Alimama outlines how large‑language models transform search advertising through a three‑line scaling roadmap—explicit inductive‑bias design, implicit compute growth, and auxiliary CV/NLP advances—implemented via a pre‑train/post‑train/CTR paradigm and the LUM user‑behavior model, promising gains in relevance, recall, and real‑time serving while highlighting inference efficiency challenges.

CTR Prediction · Scaling Law · large language models

0 likes · 25 min read

JD Retail Technology

Mar 6, 2025 · Artificial Intelligence

Dynamic Margin Selection for Efficient Deep Learning and Low-Resource Large Model Training

Jia Xing’s research introduces Dynamic Margin Selection, a technique that repeatedly refreshes a core set of boundary‑close samples to train large language models efficiently on limited resources, achieving comparable loss to full‑data training, enabling six‑fold model compression, faster inference, and a proposed exponential scaling law for data‑efficient AI.

ICLR · Low-Resource Training · Scaling Law

0 likes · 10 min read

Architect

Feb 19, 2025 · Artificial Intelligence

Does Scaling Law Still Hold for Grok 3? A Deep Dive into LLM Training Economics

The article critically examines whether the pre‑training Scaling Law still applies to Grok 3, compares its compute usage and model size with DeepSeek and OpenAI models, evaluates the cost‑effectiveness of pre‑training, RL and test‑time scaling, and explores how these insights shape future large‑language‑model development strategies.

Grok 3 · Model Efficiency · Pre‑training

0 likes · 11 min read

Architect

Feb 12, 2025 · Artificial Intelligence

Can S‑Curve Theory Explain the Limits of Large‑Model Scaling Laws?

The article analyses how S‑shaped growth curves can model the apparent scaling laws of large language models, discusses the three phases of model development, proposes an ability‑density hypothesis, and explores future scenarios where scaling laws may plateau or shift.

AI growth · Ability Density · Model Training

0 likes · 16 min read

Alibaba Cloud Developer

Feb 10, 2025 · Artificial Intelligence

Understanding the AI Wave: A Deep Dive into Large Models and Their Impact

This article offers a comprehensive overview of large models, covering their historical evolution, technical foundations, the current "hundred‑model" competition, practical use cases across industries, and future challenges such as safety, controllability, and efficient deployment.

Scaling Law · large models · retrieval‑augmented generation

0 likes · 33 min read

DaTaobao Tech

Jan 22, 2025 · Artificial Intelligence

AI Trends 2025: Paths to AGI, Scaling Law Evolution, and Industry Impact

The article surveys the AI revolution driven by foundation models and an evolving Scaling Law, outlining four AGI pathways—large models, intelligent robots, brain‑computer interfaces, and digital life—while highlighting transformer‑based convergence, generative‑first‑principle breakthroughs like DeepSeek‑V3, and transformative industry impacts ranging from consumer robots to Medical 2.0, personalized education, and digital‑simulation platforms such as NVIDIA’s Omniverse.

AGI · AI · AI industry

0 likes · 23 min read

NewBeeNLP

Dec 3, 2024 · Artificial Intelligence

Can LLMs Self‑Correct Their Answers? Exploring Reward Models, Loss Functions, and Training Dynamics

The article reflects on open‑source LLMs like Qwen2 and Llama 3.1, questioning whether models should self‑review answers, how hidden states might signal uncertainty, the role of loss‑function design, scaling laws, and the trade‑offs between PPO and DPO in alignment.

Reward Model · Scaling Law · large language models

0 likes · 9 min read

DataFunSummit

Nov 20, 2024 · Artificial Intelligence

Integrating Large Language Models into Health E‑commerce Recommendation Systems: Development, Challenges, and Practice

This article reviews the evolution of large‑model recommendation techniques, analyzes the specific challenges of health‑oriented e‑commerce recommendation, and details practical deployments such as LLM‑enhanced cold‑start recall, DeepI2I expansion, and scaling‑law‑driven CTR models within JD Health.

CTR · Scaling Law · e-commerce

0 likes · 18 min read

Baobao Algorithm Notes

Oct 13, 2024 · Artificial Intelligence

Can Hierarchical LLMs Transform Sequential Recommendation? A Deep Dive

This article provides a comprehensive analysis of the HLLM paper, detailing its hierarchical LLM architecture for item and user modeling, the training objectives, fusion strategies, extensive offline and online experiments, scaling behavior, ablation studies, and practical deployment insights in large‑scale recommendation systems.

Industrial Deployment · LLM · Scaling Law

0 likes · 12 min read

DataFunTalk

Sep 16, 2024 · Artificial Intelligence

Integrating Large Language Models into Health E‑commerce Recommendation Systems: Development, Challenges, and Practical Deployments

This article reviews the evolution of large‑model recommendation techniques, analyzes the specific demands and obstacles of health‑focused e‑commerce, and details JD Health's practical implementations—including LLM‑enhanced recall, deep item‑to‑item models, and scaling‑law‑driven CTR improvements—while discussing open research questions and future directions.

CTR · Healthcare · LLM-enhancement

0 likes · 17 min read

Tencent Advertising Technology

Jul 24, 2024 · Artificial Intelligence

Multi-Embedding Paradigm for Scaling Recommendation Models: Mitigating Embedding Dimensional Collapse

This paper investigates the embedding dimensional collapse problem that hinders scaling of recommendation models and proposes a Multi-Embedding paradigm that learns multiple embeddings per feature with independent expert networks, demonstrating consistent performance gains across major CTR benchmarks and real‑world ad systems.

CTR Prediction · Scaling Law · artificial-intelligence

0 likes · 10 min read

Kuaishou Tech

Jul 17, 2024 · Artificial Intelligence

Key Technical Innovations in Kuaishou’s “Kuaiyi” Large Model and Its Real-World Applications

The article details Kuaishou’s development of the 175B “Kuaiyi” multimodal large model, presenting eight novel technical innovations—from Temporal Scaling Law and MiLe Loss to MoE‑enhanced reward modeling—and describes how these advances enable high‑performance AI services such as the AI Xiao Kuai chatbot across diverse real‑world scenarios.

AI Applications · Large Language Model · Model Optimization

0 likes · 12 min read

NewBeeNLP

Jul 5, 2024 · Artificial Intelligence

Unveiling Meta’s Wukong: How Scaling Laws Boost Large‑Scale Recommendation Performance

Meta’s new paper introduces the Wukong model, demonstrating that expanding dense‑layer parameters and computational FLOPs in large‑scale recommendation systems follows a clear scaling law, yielding consistent performance gains across massive internal datasets, with detailed analysis of feature modules, parameter impacts, and experimental results.

CTR models · Meta · Recommendation Systems

0 likes · 10 min read

Baobao Algorithm Notes

Apr 21, 2024 · Artificial Intelligence

Why Llama 3’s Open‑Source Release Could Redefine Large‑Model Scaling and Synthetic Data

The article analyzes Llama 3’s architecture, training data expansion, model variants, Meta’s open‑source strategy, the evolving gap between open and closed models, and how future breakthroughs in synthetic data will shape scaling laws and large‑model progress through 2025 and beyond.

AI trends · Llama3 · Scaling Law

0 likes · 12 min read

NewBeeNLP

Apr 10, 2024 · Artificial Intelligence

What Scaling Laws Reveal About LLM Fine‑Tuning and RLHF Performance

This article reviews recent scaling‑law research on large‑language‑model fine‑tuning and RLHF, explaining how data quantity, model size, PET parameters, reward‑model size and KL‑penalty affect downstream performance and offering practical insights for efficient training.

LLM · RLHF · Scaling Law

0 likes · 11 min read

NewBeeNLP

Mar 28, 2024 · Industry Insights

How Meta’s HSTU Architecture Scales Recommendation Systems Beyond Decades of Deep Models

Meta introduces a generative recommendation framework (GR) built on the Hierarchical Sequential Transduction Unit (HSTU) that unifies heterogeneous features, treats user behavior as a new modality, and leverages novel encoder and inference optimizations to achieve order‑of‑magnitude scaling in model size, training compute, and online latency while delivering 12‑18% online gains over traditional deep recommendation models.

HSTU · Meta · Performance Optimization

0 likes · 36 min read

DataFunSummit

Nov 5, 2023 · Artificial Intelligence

Enhancing Recommendation Models with Scaling Law via HCNet and MemoNet: A Memory‑Based Feature‑Combination Approach

This article presents a memory‑driven architecture (HCNet and MemoNet) that equips recommendation models with scaling‑law characteristics by storing and retrieving arbitrary feature‑combination embeddings, evaluates multi‑hash codebooks, memory‑restoring strategies, key‑feature selection, and demonstrates significant offline and online performance gains.

Scaling Law · feature interaction · large language models

0 likes · 15 min read
