Tag

real-time inference


iQIYI Technical Product Team
Oct 10, 2024 · Artificial Intelligence

Online Deep Learning (ODL) for Real‑Time Advertising Effectiveness: Challenges and Solutions

iQIYI’s minute‑level online deep‑learning framework addresses six challenges: stability, timeliness, compatibility, delayed feedback, catastrophic forgetting, and i.i.d. constraints. It does so through high‑availability pipelines, TensorFlow Example serialization, rapid P2P model distribution, flexible scheduling, disaster‑recovery rollbacks, PU‑loss adjustment, and knowledge distillation, delivering a 6.2% revenue boost.

CTR prediction · advertising · deep learning
9 min read
DataFunSummit
Sep 9, 2022 · Artificial Intelligence

Wuliang: Tencent's Deep Learning Framework for Real‑Time Large‑Scale Recommendation

In this presentation, Tencent expert Yuan Yi details the Wuliang deep learning system for recommendation, covering its background, technical challenges such as massive data and real‑time requirements, parameter‑server‑based solutions for training and inference, model compression techniques, and continuous online deployment strategies.

Large-Scale Training · Recommendation systems · deep learning
14 min read
Youku Technology
Jun 7, 2022 · Artificial Intelligence

Mobile Real-Time Portrait Segmentation for Youku Bullet Comment Passthrough

To enable real‑time bullet‑comment passthrough on Youku’s mobile app, the team built a million‑scale portrait dataset and designed the AirSegNet series—CPU, GPU, and server variants—using VGG‑style nets, edge‑aware losses, and hybrid CPU‑GPU inference, achieving 0.98 IoU and sub‑15 ms latency on most devices.

Edge Computing · MNN Framework · Mobile Deep Learning
13 min read
Baidu App Technology
Nov 25, 2021 · Game Development

Building an AI-Powered Object Hunt Game with Paddle.js and PaddleClas

The article details how to build the AI‑driven “Object Hunt Battle” game: processing data, designing and training a PP‑LCNet model with PaddleClas, converting it for Paddle.js, and integrating real‑time WebGL inference on mobile devices with sub‑50 ms latency. It closes by encouraging developers to explore further.

AI game development · Paddle.js · PaddleClas
9 min read
DataFunTalk
Sep 28, 2021 · Artificial Intelligence

Graph Modeling and GCN Exploration at 极验: Evolution, Offline and Real‑time Solutions

The talk presents an overview of graph neural network development, explains 极验's graph modeling research and evolution, and details offline and real‑time GCN solutions, including self‑supervised training, large‑scale handling, and performance comparisons, highlighting practical applications in fraud detection and risk control.

Anomaly Detection · GCN · Graph Neural Networks
26 min read
DataFunSummit
Mar 9, 2021 · Artificial Intelligence

Weibo Multimodal Content Understanding Service Architecture and GPU Heterogeneous Cluster Solutions

This article details Weibo's multimodal content understanding platform, covering its massive data challenges, heterogeneous model support, standardized pipelines, platformization, workflow architecture, GPU heterogeneous cluster management, resource scheduling, performance optimization, and full‑stack monitoring to achieve stable, low‑latency AI services at scale.

GPU Cluster · Weibo · distributed training
18 min read
DataFunTalk
Aug 27, 2020 · Artificial Intelligence

Model Serving in Real-Time: Insights from Alibaba’s User Interest Center

This article explains Alibaba’s User Interest Center approach to real‑time model serving, detailing how it separates offline sequence modeling from lightweight online inference, uses an online interest‑embedding store, and dramatically reduces latency for recommendation models such as DIEN and MIMN.

Alibaba · Recommendation systems · embedding
8 min read
DataFunTalk
Aug 18, 2020 · Artificial Intelligence

COLD: A Next‑Generation Pre‑Ranking System for Online Advertising

The article introduces COLD, a computing‑power‑aware online and lightweight deep pre‑ranking system for Alibaba's targeted ads. It details the evolution from static CTR models to vector‑inner‑product models, COLD's flexible network architecture with feature selection via SE blocks, and engineering optimizations such as parallelism, column‑wise computation, Float16, and MPS. Extensive experiments demonstrate superior offline and online performance.

COLD · feature selection · machine learning
11 min read
iQIYI Technical Product Team
Jun 12, 2020 · Artificial Intelligence

Deepthought: An End‑to‑End Machine Learning Platform at iQIYI

Deepthought is iQIYI’s end‑to‑end machine‑learning platform that unifies distributed frameworks, decouples pipeline stages, integrates with Tongtian Tower, and offers visual drag‑and‑drop configuration, evolving from a fraud‑detection prototype to a generic system with real‑time inference, automated hyper‑parameter optimization, and support for large‑scale data across anti‑fraud, recommendation, and analytics workloads.

AI Platform · AutoML · Data Engineering
13 min read
DataFunTalk
May 8, 2020 · Artificial Intelligence

Distributed Machine Learning Framework GDBT for High‑Dimensional Real‑Time Recommendation Systems

The article explains how 4Paradigm's distributed machine learning framework GDBT tackles the massive data, high‑dimensional features, and real‑time requirements of modern recommendation systems by leveraging heterogeneous computing, parameter servers, RDMA networking, and optimized workloads.

GDBT · RDMA · Recommendation systems
18 min read
Tencent Cloud Developer
Mar 6, 2020 · Artificial Intelligence

WeChat "Scan" Object Detection: Mobile AI Model Design, Optimization, and Deployment

The paper presents a lightweight, anchor‑free CenterNet‑based object‑ness detector for WeChat’s Scan feature, built on a ShuffleNetV2 backbone with enlarged 5×5 depth‑wise convolutions, a streamlined detection head, and a Pyramid Interpolation Module, then quantized, ONNX‑converted and NCNN‑deployed to achieve a 436 KB model running in ~15 ms per frame on an iPhone 8 CPU.

CenterNet · ShuffleNetV2 · anchor-free
12 min read