Tagged articles

large-scale models

14 articles · Page 1 of 1

Jul 2, 2026 · Artificial Intelligence

Quantifying Robot Data Value: ATHENA Scales Influence Functions to Billion‑Parameter VLA with 313× Speedup

ATHENA introduces a data‑curation framework for billion‑parameter multi‑task Vision‑Language‑Action models that extends influence functions via Kronecker gradient compression and a multitask influence interaction scheme, achieving a 313× reduction in compute (from 8054.6 to 25.7 GPU‑hours) and improving task success rates while using fewer, higher‑value demonstrations.

data curation · influence functions · large-scale models

0 likes · 9 min read

AIWalker

May 20, 2026 · Artificial Intelligence

AnyFlow: Generate High‑Quality Video in 4 Steps and Keep Improving with More Sampling

AnyFlow introduces a flow‑map distillation framework that enables video diffusion models to produce high‑quality results in just four sampling steps while still gaining quality as the number of steps increases, supporting both causal and bidirectional architectures and scaling up to 14 B parameters.

On‑Policy Distillation · bidirectional video · causal video

0 likes · 14 min read

AIWalker

Apr 10, 2026 · Artificial Intelligence

How RealRestorer Bridges the Gap in Real‑World Image Restoration

RealRestorer leverages large‑scale image‑editing models, a hybrid synthetic‑and‑real degradation pipeline, and a two‑stage training strategy to deliver state‑of‑the‑art open‑source restoration that generalizes across nine real‑world degradation types while preserving content consistency.

benchmark · computer vision · deep learning

0 likes · 13 min read

PaperAgent

Jan 1, 2026 · Artificial Intelligence

How Manifold-Constrained Hyper-Connections Boost Large-Scale Model Training Efficiency

The article introduces mHC, a Manifold‑Constrained Hyper‑Connections technique that replaces standard residual links with multiple learned pathways and uses doubly stochastic matrices to constrain gradients. It achieves stable training of 27‑billion‑parameter models with only 6.7% extra compute and superior performance across eight downstream benchmarks.

AI Architecture · Efficient Implementation · Manifold-Constrained

0 likes · 6 min read

DataFunSummit

May 8, 2024 · Artificial Intelligence

Kuaishou’s Practices for Large‑Scale Model Data Processing and Storage

This article shares Kuaishou’s real‑time, massive‑scale model data processing pipeline, covering model scenarios, recommendation workflow complexity, large‑scale data storage, streaming joins, feature computation, NVM‑based storage solutions, strong consistency mechanisms, and future outlook for AI recommendation systems.

AI · Kuaishou · NVM storage

0 likes · 16 min read

DataFunSummit

Apr 24, 2024 · Artificial Intelligence

Multimodal Content Understanding in Baidu Commercial Systems: The ViCAN Model and Its Applications

This article presents Baidu's exploration of multimodal content understanding for commercial advertising, detailing the ViCAN pre‑training model, its contrastive and masked‑language learning tasks, integration across recall, ranking and risk‑control pipelines, quantization with MMDict, and future AIGC‑driven generation, all backed by extensive experiments and a Q&A.

AI · AIGC · Advertising

0 likes · 27 min read

Tencent Tech

Mar 26, 2024 · Artificial Intelligence

How Tencent Angel’s AI Platform Won the 2023 CIE Science & Tech Award

Tencent’s Angel machine‑learning platform, recognized with the 2023 China Institute of Electronics Science & Technology Award, showcases breakthrough distributed training, high‑efficiency caching, adaptive sampling, multimodal fusion, and graph‑model search technologies that dramatically improve large‑scale AI model performance and cost.

AI · Tencent · distributed training

0 likes · 8 min read

iQIYI Technical Product Team

Mar 1, 2024 · Artificial Intelligence

Advertising Data Characteristics and Sparse Large‑Model Practices at iQIYI

iQIYI’s ad ranking system replaces static, hash‑based embeddings with TFRA dynamic embeddings to handle massive sparse ID features efficiently. The change eliminates hash collisions and I/O bottlenecks and isolates memory during hot model swaps, enabling billion‑parameter models that boost revenue by 4.3%. Adaptive embedding sizes are planned as a future improvement.

AI recommendation · Advertising · Sparse Embedding

0 likes · 10 min read

DataFunTalk

Feb 7, 2024 · Big Data

Kuaishou's Practices for Large‑Scale Model Data Processing, Real‑Time Feature Handling, and Storage

This article presents Kuaishou's end‑to‑end engineering solutions for handling massive, real‑time recommendation model data, covering scenario description, complex business pipelines, trillion‑parameter model storage, high‑throughput processing with Flink and NVM, and future directions for cloud‑native scalability.

Kuaishou · NVM storage · Recommendation Systems

0 likes · 15 min read

DataFunTalk

Apr 24, 2023 · Artificial Intelligence

Evolution of Large‑Scale Recommendation Models at Weibo: Technical Roadmap and Recent Advances

This article reviews the evolution of Weibo's large‑scale recommendation technology, covering the system's business scenarios, technical roadmap, recent large model iterations, multi‑task and multi‑scenario modeling, feature engineering, consistency between recall and ranking, and emerging techniques such as causal inference and graph methods.

Multi-Task Learning · Recommendation Systems · causal inference

0 likes · 18 min read

DataFunSummit

Apr 17, 2023 · Artificial Intelligence

Large‑Scale Table Pretraining Model SPACE‑T: Background, Architecture, and Applications

The article presents Alibaba DAMO Academy's large‑scale table pretraining model SPACE‑T, explaining the background and trends of TableQA and Text‑to‑SQL, detailing the model’s design and training data, showcasing its deployment on ModelScope and Alibaba Cloud, and outlining future directions and practical impact.

AI · ModelScope · SPACE-T

0 likes · 11 min read

Tencent Advertising Technology

Nov 17, 2022 · Artificial Intelligence

Scaling Huge Embedding Model Training with Cache-Enabled Distributed Framework (HET): VLDB 2022 Best Paper and Its Industrial Deployment

The award‑winning VLDB 2022 paper introduces HET, a cache‑enabled distributed framework that dramatically reduces communication overhead for sparse trillion‑parameter embedding models. Tencent Ads has industrialized the technology to train 10 TB‑scale models with round‑the‑clock (7×24) online deep learning.

Cache · Embedding · Parameter Server

0 likes · 9 min read

Baidu Geek Talk

Oct 31, 2022 · Artificial Intelligence

PaddleBox: A GPU‑Based Ultra‑Large‑Scale Sparse DNN Training Framework

PaddleBox is Baidu’s GPU‑based ultra‑large‑scale sparse DNN training framework that combines a three‑tier hierarchical parameter server (SSD, DRAM, HBM) with pipelined scheduling and multi‑machine multi‑GPU communication, delivering 5–40× cost‑performance gains over traditional CPU solutions and powering Baidu’s advertising services.

GPU · PaddleBox · Sparse Parameters

0 likes · 15 min read

Sohu Tech Products

Sep 16, 2020 · Artificial Intelligence

Open-Domain Dialogue Systems: Current State, Challenges, and Future Directions

This article reviews the latest advances in open-domain dialogue systems, covering system classification, end‑to‑end generation challenges, knowledge‑controlled generation, automated evaluation, and large‑scale latent‑space models such as PLATO. It also outlines future research directions for building more coherent and controllable conversational AI.

Dialogue Systems · Evaluation · knowledge grounding

0 likes · 14 min read
