Tagged articles
39 articles
Machine Heart
Apr 29, 2026 · Artificial Intelligence

Beyond VLA and World Models: Galaxy General Unveils LDA‑1B to Scale Embodied Data

LDA‑1B unifies world modeling and VLA in a latent dynamics action model, ingesting over 30 000 hours of heterogeneous embodied data via a five‑layer AstraData pipeline, employing a unified end‑effector space and quality‑based data allocation, and achieving state‑of‑the‑art success rates on RoboCasa‑GR1 while being fully open‑sourced.

Embodied AI · Robotics · data ingestion
0 likes · 13 min read
DataFunTalk
Apr 28, 2026 · Artificial Intelligence

Manifold AI’s WorldScape 0.2 Tops WorldArena: How MoE Drives Superior Physics and 3D Understanding

Manifold AI’s WorldScape 0.2 achieved the highest overall score on the embodied world‑model benchmark WorldArena, outperforming giants like Google and Nvidia by excelling in comprehensive perception, physics compliance, and 3D accuracy while using only about 10 % of the parameters of competing models, thanks to a newly introduced MoE architecture.

Benchmark · Embodied AI · MoE
0 likes · 9 min read
Machine Heart
Apr 27, 2026 · Artificial Intelligence

ACL 2026: Unveiling a Predictive Scaling Law for Reinforcement Learning Fine‑Tuning of Large Models

The paper presents a systematic empirical study that derives a power‑law scaling formula for reinforcement‑learning post‑training of large language models, demonstrating accurate inter‑ and intra‑model performance prediction, learning‑efficiency saturation, data‑reuse benefits, and cross‑architecture validity.

Data Reuse · Llama 3 · Qwen2.5
0 likes · 11 min read
Machine Learning Algorithms & Natural Language Processing
Apr 17, 2026 · Artificial Intelligence

Can Table Modeling Scale? Rethinking Tree Models in the Age of Massive Compute

The article examines how the dramatic increase in GPU compute power—illustrated by a single H100 GPU equaling about 200 Hadoop instances—challenges the dominance of tree‑based models for structured data, presents scaling‑law experiments with KMLP and FOUND, and argues that pre‑training can redefine the balance between compute, data, and algorithms.

FOUND · GPU · KMLP
0 likes · 10 min read
Machine Heart
Apr 17, 2026 · Artificial Intelligence

Can Table Modeling Scale? Rethinking the Tree Model Era Amid Compute Shifts

The article examines how a single NVIDIA H100 GPU delivers roughly 200‑fold more FP16 compute than a 96‑core CPU Hadoop node, explores the "Bitter Lesson" of scaling‑driven AI breakthroughs, and presents large‑scale pretraining experiments that show table and sequence models now exhibit clear scaling laws, challenging the dominance of traditional tree‑based approaches.

FOUND · KMLP · Structured Data
0 likes · 10 min read
DeWu Technology
Apr 15, 2026 · Industry Insights

How Generative AI is Transforming Recommendation: A Deep Dive into DeWu’s Recall System

This article analyzes DeWu's generative recall system, detailing its background, technical design of the Generative and Rerank models, inference workflow, experimental gains in core consumption and diversity metrics, and future engineering directions such as framework migration, LLM integration, and multimodal generation.

Deep Learning · generative AI · industry insight
0 likes · 12 min read
Tencent Advertising Technology
Mar 23, 2026 · Industry Insights

Why Tencent’s $885K KDD Cup Challenge Could Redefine Recommendation Systems

The 2026 KDD Cup, powered by Tencent’s Advertising Algorithm Competition with an $885,000 prize pool, challenges participants to unify sequence modeling and feature interaction in large‑scale recommendation systems, offering academic publication paths, real‑world deployment opportunities, and strict latency constraints that push both research and engineering innovation.

AI · KDD Cup · Recommendation Systems
0 likes · 16 min read
Machine Learning Algorithms & Natural Language Processing
Mar 15, 2026 · Artificial Intelligence

Is RL Dead in LLM Post-Training? MIT’s RandOpt Challenges Traditional Methods

The MIT‑CSAIL paper introduces RandOpt, a single‑step, gradient‑free, fully parallel post‑training algorithm that adds Gaussian noise to pretrained LLM weights and ensembles the results, achieving or surpassing PPO/GRPO performance by exploiting dense "neural thickets" that emerge as model scale grows.

LLM · RandOpt · ensemble
0 likes · 12 min read
Baidu Intelligent Cloud Tech Hub
Nov 7, 2025 · Artificial Intelligence

From Big Data to 30,000‑GPU Clusters: The Evolution of China’s AI Infrastructure

In a deep interview, Baidu AI Computing chief scientist Wang Yanpeng and host Koji trace China's internet infrastructure from the early big‑data era through cloud computing to today's AI boom, highlighting the pivotal role of compute power, GPU acceleration, data scaling, and Baidu's Baige platform in shaping the AI arms race.

AI Infrastructure · Baidu Baige · GPU computing
0 likes · 26 min read
JD Tech
Nov 6, 2025 · Artificial Intelligence

LLMs Revolutionize Recommendation Systems: From Generative Models to Production

This article surveys the evolution of generative recommendation systems powered by large language models, detailing their technical foundations, engineering challenges, recent breakthroughs, and future research directions, while highlighting why the paradigm shift is occurring now.

AI Engineering · Generative Recommendation · LLM
0 likes · 30 min read
Baobao Algorithm Notes
Oct 31, 2025 · Artificial Intelligence

Unlocking LLM RL Scaling: The Best Practices from Meta’s New Study

Meta’s recent paper reveals a sigmoid‑shaped scaling law for LLM reinforcement learning, presents extensive 40‑k GPU‑hour experiments, compares various RL designs such as PPO‑off‑policy‑k and Pipeline‑RL‑k, and distills the findings into a practical “ScaleRL” recipe that improves performance and efficiency.

LLM · RL Optimization · reinforcement learning
0 likes · 10 min read
AntTech
Oct 29, 2025 · Artificial Intelligence

Inside Ant’s Baoling: Balancing Efficiency and Reasoning in a 1‑Trillion‑Parameter Model

At the Ant Star Innovation Journey event, the Baoling team unveiled their roadmap for trillion‑parameter models, detailing the development of Ling‑1T, Ring‑1T and multimodal Ming series, the scaling‑law‑guided architecture, training innovations, evaluation methods, and open‑source releases that aim to advance efficient, high‑performance AI.

efficient inference · large language model · scaling law
0 likes · 24 min read
DataFunSummit
Sep 11, 2025 · Artificial Intelligence

How Meituan’s MTGR is Redefining Generative Recommendation at Scale

This article explains why Meituan introduced a generative recommendation model, describes the MTGR architecture, data organization, training and inference engines built on TorchRec and TensorRT, reports performance gains and cost reductions, and outlines future directions such as simplifying the recommendation funnel and cross‑business heterogeneous modeling.

Generative Recommendation · Inference Optimization · MTGR
0 likes · 15 min read
DataFunSummit
Sep 9, 2025 · Artificial Intelligence

How Baidu’s GRAB Model Uses Scaling Laws to Transform Ad Ranking

This article explains Baidu's generative ranking model GRAB, detailing how scaling laws from large language models inspire a new recommendation paradigm, the model's architecture, custom attention mechanisms, training strategies, deployment optimizations, and the resulting business gains in CTR and revenue.

Baidu · CTR prediction · Recommendation Systems
0 likes · 22 min read
Data Party THU
Jul 29, 2025 · Artificial Intelligence

Can 2‑Simplicial Attention Outperform Standard Transformers? A Deep Dive

This article reviews Meta's rotation‑invariant 2‑simplicial attention, explains its trilinear formulation and windowed implementation, analyzes its impact on scaling laws compared with standard dot‑product attention, and presents experimental results showing when the new mechanism offers advantages.

2-simplicial attention · Meta · Neural architecture
0 likes · 12 min read
AI Frontier Lectures
Jul 10, 2025 · Artificial Intelligence

Can 2‑Simplicial Attention Redefine Transformer Scaling Laws?

A recent Meta paper introduces a rotation‑invariant 2‑simplicial attention mechanism, demonstrates its superior scaling‑law coefficients over standard dot‑product attention, and provides experimental evidence of improved token efficiency and model performance under constrained token budgets.

2-simplicial · Meta · Transformer
0 likes · 11 min read
JD Tech
Jun 16, 2025 · Artificial Intelligence

How JD Engineers Leverage LLMs and Sparse Models to Boost Search and Ads

This article showcases three JD tech case studies—using large language models for e‑commerce query expansion, applying sparse large models with scaling‑law experiments to improve ad prediction, and building proactive risk‑prevention systems—to illustrate practical AI engineering that drives higher recall, conversion, and system robustness.

Advertising · e‑commerce · large language model
0 likes · 8 min read
Meituan Technology Team
May 15, 2025 · Artificial Intelligence

How Meituan’s MTGR Framework Achieved 65× Faster Inference with Scaling Laws

Meituan’s recommendation team introduced the MTGR framework, aligning traditional DLRM features with a unified HSTU‑based Transformer to explore scaling laws, delivering a 65‑fold FLOPs boost, 12% lower inference cost, and record gains in online CTR and order volume across its food‑delivery platform.

Inference Optimization · Large-Scale Training · MTGR
0 likes · 26 min read
Network Intelligence Research Center (NIRC)
Apr 9, 2025 · Artificial Intelligence

Why Scaling Laws Fail for Video MLLMs: Uncovering the Temporal Hacking Problem

The article analyzes the anti‑scaling phenomenon in video large‑language models, identifies a “temporal hacking” shortcut where models focus on a few key frames, formalizes it via reward‑hacking theory, introduces the Temporal Perplexity (TPL) metric, and proposes an Unhackable Temporal Rewarding (UTR) framework to mitigate the issue.

Temporal Perplexity · UTR · reinforcement learning
0 likes · 14 min read
Architects' Tech Alliance
Mar 28, 2025 · Artificial Intelligence

How DeepSeek Leverages Huawei Ascend to Boost AI Inference Efficiency

The report analyzes DeepSeek's latest V3 and R1 models, highlights their scaling‑law‑driven cost reductions, explains how Huawei Ascend optimizes inference by cutting KV‑Cache storage and improving compute efficiency, and surveys the model’s deployments across finance, government, manufacturing, and healthcare sectors.

AI efficiency · AI inference · DeepSeek
0 likes · 4 min read
Alimama Tech
Mar 14, 2025 · Artificial Intelligence

Advances in Search Advertising Models with Large Language Models (2024)

In 2024, Alimama outlines how large language models transform search advertising through a three‑line scaling roadmap (explicit inductive‑bias design, implicit compute growth, and auxiliary CV/NLP advances), implemented via a pre‑train/post‑train/CTR paradigm and the LUM user‑behavior model, promising gains in relevance, recall, and real‑time serving while highlighting inference efficiency challenges.

CTR prediction · large language models · multimodal embedding
0 likes · 25 min read
JD Retail Technology
Mar 6, 2025 · Artificial Intelligence

Dynamic Margin Selection for Efficient Deep Learning and Low-Resource Large Model Training

Jia Xing’s research introduces Dynamic Margin Selection, a technique that repeatedly refreshes a core set of boundary‑close samples to train large language models efficiently on limited resources, achieving comparable loss to full‑data training, enabling six‑fold model compression, faster inference, and a proposed exponential scaling law for data‑efficient AI.

ICLR · dynamic data selection · large language models
0 likes · 10 min read
Architect
Feb 19, 2025 · Artificial Intelligence

Does Scaling Law Still Hold for Grok 3? A Deep Dive into LLM Training Economics

The article critically examines whether the pre‑training Scaling Law still applies to Grok 3, compares its compute usage and model size with DeepSeek and OpenAI models, evaluates the cost‑effectiveness of pre‑training, RL and test‑time scaling, and explores how these insights shape future large‑language‑model development strategies.

Grok-3 · Pre‑training · RL scaling
0 likes · 11 min read
Architect
Feb 12, 2025 · Artificial Intelligence

Can S‑Curve Theory Explain the Limits of Large‑Model Scaling Laws?

The article analyses how S‑shaped growth curves can model the apparent scaling laws of large language models, discusses the three phases of model development, proposes an ability‑density hypothesis, and explores future scenarios where scaling laws may plateau or shift.

AI growth · Ability Density · Model Training
0 likes · 16 min read
Alibaba Cloud Developer
Feb 10, 2025 · Artificial Intelligence

Understanding the AI Wave: A Deep Dive into Large Models and Their Impact

This article offers a comprehensive overview of large models, covering their historical evolution, technical foundations, the current "hundred‑model" competition, practical use cases across industries, and future challenges such as safety, controllability, and efficient deployment.

large models · retrieval‑augmented generation · scaling law
0 likes · 33 min read
DaTaobao Tech
Jan 22, 2025 · Artificial Intelligence

AI Trends 2025: Paths to AGI, Scaling Law Evolution, and Industry Impact

The article surveys the AI revolution driven by foundation models and an evolving Scaling Law, outlining four AGI pathways—large models, intelligent robots, brain‑computer interfaces, and digital life—while highlighting transformer‑based convergence, generative‑first‑principle breakthroughs like DeepSeek‑V3, and transformative industry impacts ranging from consumer robots to Medical 2.0, personalized education, and digital‑simulation platforms such as NVIDIA’s Omniverse.

AGI · AI · AI industry
0 likes · 23 min read
DataFunSummit
Nov 20, 2024 · Artificial Intelligence

Integrating Large Language Models into Health E‑commerce Recommendation Systems: Development, Challenges, and Practice

This article reviews the evolution of large‑model recommendation techniques, analyzes the specific challenges of health‑oriented e‑commerce recommendation, and details practical deployments such as LLM‑enhanced cold‑start recall, DeepI2I expansion, and scaling‑law‑driven CTR models within JD Health.

CTR · e‑commerce · health tech
0 likes · 18 min read
Baobao Algorithm Notes
Oct 13, 2024 · Artificial Intelligence

Can Hierarchical LLMs Transform Sequential Recommendation? A Deep Dive

This article provides a comprehensive analysis of the HLLM paper, detailing its hierarchical LLM architecture for item and user modeling, the training objectives, fusion strategies, extensive offline and online experiments, scaling behavior, ablation studies, and practical deployment insights in large‑scale recommendation systems.

Industrial Deployment · LLM · Sequential Modeling
0 likes · 12 min read
DataFunTalk
Sep 16, 2024 · Artificial Intelligence

Integrating Large Language Models into Health E‑commerce Recommendation Systems: Development, Challenges, and Practical Deployments

This article reviews the evolution of large‑model recommendation techniques, analyzes the specific demands and obstacles of health‑focused e‑commerce, and details JD Health's practical implementations—including LLM‑enhanced recall, deep item‑to‑item models, and scaling‑law‑driven CTR improvements—while discussing open research questions and future directions.

CTR · Healthcare · LLM-enhancement
0 likes · 17 min read
Tencent Advertising Technology
Jul 24, 2024 · Artificial Intelligence

Multi-Embedding Paradigm for Scaling Recommendation Models: Mitigating Embedding Dimensional Collapse

This paper investigates the embedding dimensional collapse problem that hinders scaling of recommendation models and proposes a Multi-Embedding paradigm that learns multiple embeddings per feature with independent expert networks, demonstrating consistent performance gains across major CTR benchmarks and real‑world ad systems.

CTR prediction · Deep Learning · artificial intelligence
0 likes · 10 min read
Kuaishou Tech
Jul 17, 2024 · Artificial Intelligence

Key Technical Innovations in Kuaishou’s “Kuaiyi” Large Model and Its Real-World Applications

The article details Kuaishou’s development of the 175B “Kuaiyi” multimodal large model, presenting eight novel technical innovations—from Temporal Scaling Law and MiLe Loss to MoE‑enhanced reward modeling—and describes how these advances enable high‑performance AI services such as the AI Xiao Kuai chatbot across diverse real‑world scenarios.

AI applications · Model Optimization · Multimodal AI
0 likes · 12 min read
NewBeeNLP
Jul 5, 2024 · Artificial Intelligence

Unveiling Meta’s Wukong: How Scaling Laws Boost Large‑Scale Recommendation Performance

Meta’s new paper introduces the Wukong model, demonstrating that expanding dense‑layer parameters and computational FLOPs in large‑scale recommendation systems follows a clear scaling law, yielding consistent performance gains across massive internal datasets, with detailed analysis of feature modules, parameter impacts, and experimental results.

CTR models · Deep Learning · Meta
0 likes · 10 min read
Baobao Algorithm Notes
Apr 21, 2024 · Artificial Intelligence

Why Llama 3’s Open‑Source Release Could Redefine Large‑Model Scaling and Synthetic Data

The article analyzes Llama 3’s architecture, training data expansion, model variants, Meta’s open‑source strategy, the evolving gap between open and closed models, and how future breakthroughs in synthetic data will shape scaling laws and large‑model progress through 2025 and beyond.

AI trends · Llama3 · large language models
0 likes · 12 min read
NewBeeNLP
Apr 10, 2024 · Artificial Intelligence

What Scaling Laws Reveal About LLM Fine‑Tuning and RLHF Performance

This article reviews recent scaling‑law research on large‑language‑model fine‑tuning and RLHF, explaining how data quantity, model size, PET parameters, reward‑model size and KL‑penalty affect downstream performance and offering practical insights for efficient training.

LLM · RLHF · artificial intelligence
0 likes · 11 min read
NewBeeNLP
Mar 28, 2024 · Industry Insights

How Meta’s HSTU Architecture Scales Recommendation Systems Beyond Decades of Deep Models

Meta introduces a generative recommendation framework (GR) built on the Hierarchical Sequential Transduction Unit (HSTU) that unifies heterogeneous features, treats user behavior as a new modality, and leverages novel encoder and inference optimizations to achieve order‑of‑magnitude scaling in model size, training compute, and online latency while delivering 12‑18% online gains over traditional deep recommendation models.

Generative Models · HSTU · Meta
0 likes · 36 min read
DataFunSummit
Nov 5, 2023 · Artificial Intelligence

Enhancing Recommendation Models with Scaling Law via HCNet and MemoNet: A Memory‑Based Feature‑Combination Approach

This article presents a memory‑driven architecture (HCNet and MemoNet) that equips recommendation models with scaling‑law characteristics by storing and retrieving arbitrary feature‑combination embeddings, evaluates multi‑hash codebooks, memory‑restoring strategies, key‑feature selection, and demonstrates significant offline and online performance gains.

feature interaction · large language models · memory networks
0 likes · 15 min read