Tagged articles
23 articles
Machine Heart
May 16, 2026 · Artificial Intelligence

GIPO: Overcoming Utilization Collapse for Efficient Large‑Model Reinforcement Learning

GIPO (Gaussian Importance Sampling Policy Optimization) replaces PPO’s hard clipping with a smooth Gaussian‑weighted trust region, achieving log‑space symmetry and bias‑variance balance that mitigates policy lag and utilization collapse, and demonstrates superior stability and sample efficiency on GridWorld, LIBERO, MetaWorld, and 7‑billion‑parameter VLA experiments.

Bias-Variance Tradeoff · GIPO · Large-Scale Training
0 likes · 17 min read
Machine Heart
May 5, 2026 · Artificial Intelligence

Agent-World: Scaling Real-World Environments for Co‑Evolving Agents and Their Worlds

Agent-World introduces a universal training arena that automatically mines real‑world data from the internet to build over 1,900 diverse environments and 19,800 tools, then generates long‑horizon tasks through graph‑based and programmatic synthesis. This creates a self‑evolving loop in which agents are evaluated and diagnosed and the environments are refined, and the resulting agents achieve state‑of‑the‑art results on 23 benchmarks.

AI agents · Agent-World · Large-Scale Training
0 likes · 14 min read
JD Tech Talk
Jan 30, 2026 · Artificial Intelligence

How JD’s 9N‑LLM Engine Powers Scalable Generative Recommendation at Billion‑Scale

This article details JD Retail’s 9N‑LLM unified training engine, explaining the background of generative recommendation, the challenges of massive sparse and dense parameters, and the multi‑framework, multi‑hardware solutions—including efficient sample processing, large‑scale sparse embedding, dense scaling, UniAttention acceleration, and reinforcement‑learning integration—that enable industrial‑scale deployment.

AI Infrastructure · Generative Recommendation · Large-Scale Training
0 likes · 26 min read
Architect
Jan 1, 2026 · Artificial Intelligence

How Manifold-Constrained Hyper-Connections Boost Large Model Training Efficiency

DeepSeek’s new paper introduces mHC, a manifold‑constrained version of Hyper‑Connections that stabilizes gradient flow, adds only 6.7% training overhead, and enables reliable training of 27‑billion‑parameter models while improving benchmark performance by about 2%.

AI Architecture · Deep Learning · Large-Scale Training
0 likes · 7 min read
Architects' Tech Alliance
Dec 28, 2025 · Artificial Intelligence

Google’s TPU v7: How 1.5 & 2.6 Optical Modules per Chip Power AI Supercomputers

The article explains how Google’s TPU v7 supercomputer uses a simple yet powerful networking scheme—1.5 optical modules per TPU for intra‑rack communication and an additional 2.6 modules per TPU for inter‑rack high‑speed links—enabling massive AI model training with balanced cost and performance.

AI supercomputing · Google · Large-Scale Training
0 likes · 13 min read
Baidu Intelligent Cloud Tech Hub
Nov 7, 2025 · Artificial Intelligence

From Big Data to 30,000‑GPU Clusters: The Evolution of China’s AI Infrastructure

In an in‑depth interview, Baidu AI Computing chief scientist Wang Yanpeng and host Koji trace China's internet infrastructure from the early big‑data era through cloud computing to today's AI boom, highlighting the pivotal role of compute power, GPU acceleration, data scaling, and Baidu's Baige platform in shaping the AI arms race.

AI Infrastructure · Baidu Baige · GPU computing
0 likes · 26 min read
Architects' Tech Alliance
Sep 2, 2025 · Artificial Intelligence

Designing High‑Performance Networks for Massive AI Model Training

This article examines how AI large‑model training demands massive GPU clusters and low‑latency, high‑throughput networks, compares Clos/Fat‑Tree, Spine‑Leaf, Dragonfly, Group‑wise Dragonfly+ and Torus topologies, and discusses design choices for scaling to hundreds of thousands of GPUs while noting related data‑center resources.

AI · Large-Scale Training · data center
0 likes · 8 min read
AI Algorithm Path
Aug 16, 2025 · Artificial Intelligence

Meta Unveils DINOv3: A Universal Self‑Supervised Visual AI for All Image Tasks

Meta's DINOv3 is a 7‑billion‑parameter self‑supervised visual foundation model trained on 1.7 billion Instagram images without any labels, introducing dense feature extraction, Gram‑Anchoring to prevent feature collapse, high‑resolution adaptation, and multi‑student distillation that together enable out‑of‑the‑box performance on segmentation, depth estimation, 3D matching, and tracking while surpassing prior models such as DINOv2, CLIP, and SAM.

Computer Vision · DINOv3 · Gram Anchoring
0 likes · 8 min read
Meituan Technology Team
May 15, 2025 · Artificial Intelligence

How Meituan’s MTGR Framework Achieved 65× Faster Inference with Scaling Laws

Meituan’s recommendation team introduced the MTGR framework, which keeps the features of traditional DLRM models while adopting a unified HSTU‑based Transformer to explore scaling laws. The resulting model uses 65 times the per‑sample forward FLOPs of the DLRM baseline while cutting inference cost by 12%, and delivers record gains in online CTR and order volume across its food‑delivery platform.

Inference Optimization · Large-Scale Training · MTGR
0 likes · 26 min read
AI Algorithm Path
May 11, 2025 · Artificial Intelligence

How to Parallelize Ultra‑Large Model Training with PyTorch

The article explains the core concepts and trade‑offs of five parallelism techniques—data, tensor, context, pipeline, and expert parallelism—plus the ZeRO optimizer, showing when each method is appropriate for training ultra‑large PyTorch models and providing concrete code snippets and performance considerations.

Context Parallelism · Data Parallelism · Expert Parallelism
0 likes · 21 min read
DataFunSummit
Mar 20, 2025 · Artificial Intelligence

Evolution of AI Training Stability and Baidu Baige’s Full-Stack Solutions for Large-Scale Model Training

The article traces the evolution of AI training stability from early manual operations on small GPU clusters to sophisticated, fault‑tolerant infrastructures for thousand‑card and ten‑thousand‑card training clusters, detailing Baidu Baige’s metrics, monitoring, eBPF‑based diagnostics, and checkpoint strategies that reduce invalid training time and accelerate fault recovery.

Distributed Systems · Large-Scale Training · checkpointing
0 likes · 22 min read
AIWalker
Feb 12, 2025 · Artificial Intelligence

Goku: How HKU and ByteDance’s New Model Sets New Benchmarks in Commercial Image and Video Generation

The paper presents Goku, a rectified‑flow transformer that jointly generates high‑quality images and videos at commercial scale, detailing its novel architecture, massive high‑quality data pipeline, efficient large‑scale training tricks, and state‑of‑the‑art results on GenEval, DPG‑Bench, VBench and UCF‑101.

Image Generation · Large-Scale Training · Multimodal AI
0 likes · 29 min read
DataFunTalk
Apr 3, 2023 · Artificial Intelligence

Large‑Scale Recommendation System Training with TorchRec and Dynamic Embedding

This article explains how Tencent’s AI team leverages the PyTorch‑based TorchRec library and a custom dynamic embedding solution to train billion‑scale recommendation models efficiently, detailing the benefits of TorchRec, GPU embedding, optimized kernels, embedding partition strategies, experimental results, and practical deployment guidance.

GPU Embedding · Large-Scale Training · PyTorch
0 likes · 15 min read
DataFunSummit
Apr 2, 2023 · Artificial Intelligence

Efficient Training of Large Models with the Open‑Source Distributed Framework Easy Parallel Library (EPL)

This article introduces the challenges of scaling deep‑learning model training, explains the design and components of the open‑source Easy Parallel Library (EPL) that unifies data, pipeline, and operator‑split parallelism, and demonstrates its best‑practice results on large‑scale classification, BERT‑large, and massive multimodal models.

Distributed Training · EPL · Large-Scale Training
0 likes · 15 min read
Tencent Advertising Technology
Mar 10, 2023 · Artificial Intelligence

Optimizing Large-Scale Model Training with Tencent's AngelPTM and ZeRO-Cache

This article presents Tencent's latest advancements in large‑scale model training, detailing the AngelPTM framework and its ZeRO‑Cache optimization techniques that reduce memory and storage costs, improve hardware utilization, and achieve high‑performance training for trillion‑parameter AI models across various applications.

AI models · AngelPTM · Large-Scale Training
0 likes · 14 min read
Baidu Geek Talk
Feb 17, 2023 · Artificial Intelligence

How PGLBox Achieves 27× Faster GPU‑Powered Large‑Scale Graph Learning

PGLBox, Baidu’s GPU‑based large‑scale graph training framework, delivers up to 27× speedup over CPU clusters by fully GPU‑accelerating storage, sampling, and training, supporting billions of nodes, advanced GNN algorithms, multi‑level storage, and seamless integration of massive pretrained models.

GPU · Large-Scale Training · PGLBox
0 likes · 7 min read
Meituan Technology Team
Nov 24, 2022 · Artificial Intelligence

Large-Scale Graph Retrieval for Meituan In-Store Advertising: Design, Optimization, and Deployment

The article details Meituan's deployment of large-scale heterogeneous graph recall for in‑store recommendation ads, covering full‑scene graph construction, graph pruning, dynamic negative sampling, spatiotemporal sub‑graph fusion, and performance optimizations that together raise offline hit‑rate by over 5% and online revenue per search by 10‑15%.

Large-Scale Training · Meituan · graph neural networks
0 likes · 25 min read
DataFunSummit
Sep 9, 2022 · Artificial Intelligence

Wuliang: Tencent's Deep Learning Framework for Real‑Time Large‑Scale Recommendation

The presentation by Tencent expert Yuan Yi details the Wuliang deep learning system for recommendation, covering its background, technical challenges such as massive data and real‑time requirements, the parameter‑server based solutions for training and inference, model compression techniques, and continuous online deployment strategies.

Deep Learning · Large-Scale Training · Parameter Server
0 likes · 14 min read
Alimama Tech
Aug 17, 2022 · Artificial Intelligence

How Multimodal AI Transforms Advertising Copy: From Image Text to Video Scripts

Alibaba’s advertising AI team presents a comprehensive study of four new multimodal copywriting tasks—image overlay text generation, video narration, text style transfer, and detail-page extraction—detailing model architectures, training on billions of images, experimental results, and practical deployment in the “Xiyu” product.

Large-Scale Training · Multimodal AI · Style Transfer
0 likes · 17 min read
DataFunSummit
Feb 10, 2022 · Artificial Intelligence

Baidu's PGL2.2: A Graph Neural Network Framework, Techniques, and Real‑World Applications

This article introduces Baidu's PGL2.2 graph learning platform, explains graph modeling and message‑passing GNN techniques, details training strategies for small, medium and large graphs, showcases node classification and link‑prediction methods, and describes how the framework is applied in search, recommendation, risk control, and knowledge‑graph competitions.

Knowledge Graphs · Large-Scale Training · PGL2.2
0 likes · 15 min read
Ctrip Technology
Apr 9, 2021 · Artificial Intelligence

Algorithm Optimization for Hotel Recommendation and Large‑Scale Discrete DNN Training at Ctrip

This article describes how Ctrip improved hotel recommendation by iterating from logistic regression to GBDT and deep neural networks, designing continuous and discrete features, adopting multi‑task learning with click and conversion signals, and building a large‑scale distributed DNN training and unified feature‑processing framework to boost model accuracy and engineering efficiency.

Ctrip · DNN · Large-Scale Training
0 likes · 15 min read
Alibaba Cloud Developer
Apr 16, 2018 · Artificial Intelligence

How Alibaba’s Deep Learning Transformed CTR Prediction: From MLR to Multi‑Interest Networks

This article recounts Alimama researcher Jing Shi’s presentation on the evolution of deep learning for click‑through‑rate (CTR) estimation, covering the shift from handcrafted features and linear models to piecewise linear MLR, end‑to‑end neural networks, multi‑interest user modeling, and large‑scale distributed training challenges.

Advertising · CTR prediction · Deep Learning
0 likes · 16 min read