Tagged articles
23 articles
Machine Heart
May 14, 2026 · Artificial Intelligence

Introducing TTFA: Hong Kong University’s Open‑Source FASTER Gives VLA Models Instant Reaction

The paper identifies real‑time latency as the main obstacle to deploying VLA models on robots, proposes the TTFA metric and the FASTER framework with a Horizon‑Aware Schedule, mixed scheduling, and streaming inference, and demonstrates through extensive GPU and task experiments that TTFA and reaction time can be cut by up to a factor of three without sacrificing motion quality.

Embodied AI · FASTER · Real-time inference
14 min read
AI Engineering
May 8, 2026 · Artificial Intelligence

How GPT‑Realtime‑2 Leverages GPT‑5‑Level Reasoning to Redefine Voice AI Architecture

OpenAI’s GPT‑Realtime‑2 embeds GPT‑5‑class reasoning into a continuous‑audio loop. It reaches 96.6% accuracy on Big Bench Audio, offers adjustable reasoning effort with latency ranging from 1.12 s to 2.33 s, provides a 128K context window, and shows gains in real‑world call success rates, while prompting industry debate over pricing and competitive impact.

GPT-5 · GPT-Realtime-2 · Latency
5 min read
Machine Heart
Apr 11, 2026 · Artificial Intelligence

How PiLoT Enables Monocular Drones to Navigate 10 km Drift‑Free and Lock onto Targets

PiLoT, a CVPR 2026 Highlight paper, introduces a neural pixel‑to‑3D registration framework that lets a single‑camera UAV achieve drift‑free 6‑DoF pose and real‑time target locking over 10 km without GNSS, running at 25‑30 FPS on an NVIDIA Jetson Orin and outperforming existing hybrid and absolute‑pose methods.

GNSS-denied navigation · PiLoT · Real-time inference
12 min read
Machine Heart
Apr 2, 2026 · Artificial Intelligence

From Tokens to Revenue: Kuaishou’s GR4AD Pioneers Full‑Stack Generative Recommendation for Ads

GR4AD, Kuaishou’s generative recommendation system, redesigns the entire ad pipeline, from tokenizing multimodal ad material to value‑aware learning, lazy decoding, and dynamic beam search, delivering a revenue lift of over 4%, higher eCPM, and sub‑100 ms latency for more than 400 million users.

Advertising · Generative Recommendation · Online Learning
17 min read
AIWalker
Mar 9, 2026 · Artificial Intelligence

How EFSI‑DETR Achieves 188 FPS and Boosts Small‑Object Detection Accuracy by 5.8%

The article dissects EFSI‑DETR, a UAV small‑object detector that combines simulated frequency processing with dynamic semantic enhancement to overcome pixel scarcity, static fusion, and ignored frequency cues, delivering 188 FPS and a 5.8% APₛ gain on VisDrone while remaining lightweight.

DETR · Real-time inference · UAV vision
16 min read
DataFunSummit
Feb 7, 2026 · Big Data

How Flink Enables Real‑Time AI Inference and Agent Construction

This article explains Apache Flink’s stream processing fundamentals, introduces the open‑source Flink Agents framework for building event‑driven AI agents, details Alibaba Cloud’s Flink AI Function for real‑time LLM inference, and showcases demos, architecture, integration patterns, and practical use cases such as VOC analysis, live‑stream analytics, and intelligent operations.

Apache Flink · Big Data · Real-time inference
24 min read
Old Zhang's AI Learning
Jan 30, 2026 · Artificial Intelligence

Qwen3-ASR: Open‑Source Speech Recognition Supporting 52 Languages and Dialects, Outperforming Whisper

The Qwen3‑ASR series, now open‑sourced by Alibaba, offers three models (1.7B, 0.6B, and a 0.6B forced aligner) that cover 52 languages and 22 Chinese dialects, support streaming and offline inference, achieve an RTF of 0.064 with 2000× realtime throughput, handle singing with background music, and provide detailed deployment guides, benchmarks, and comparisons with other ASR solutions.

Qwen3-ASR · Real-time inference · forced aligner
15 min read
HyperAI Super Neural
Jan 3, 2026 · Artificial Intelligence

Clone a Voice in 5 Seconds with One‑Step Generation: Inside Chatterbox‑Turbo’s High‑Fidelity TTS

Resemble AI’s open‑source Chatterbox‑Turbo reduces TTS generation from ten steps to one, enabling high‑sample‑rate, lossless voice cloning from a 5‑10 second reference while supporting emotional control, paralinguistic tags, and embedded watermarking for real‑time applications across chatbots, games, podcasts, and education.

Chatterbox‑Turbo · Real-time inference · knowledge distillation
7 min read
Tencent Architect
Jul 2, 2025 · Artificial Intelligence

How Tencent’s TEG Shannon Lab Dominated the NTIRE 2025 UGC Video Enhancement Challenge

Tencent TEG Shannon Lab won the NTIRE 2025 UGC Video Enhancement competition with a progressive training framework that combines adaptive color enhancement, high‑speed denoising, and temporal stability under bitrate constraints, achieving top subjective scores, significant inference speed‑ups, and successful INT8 quantization for real‑time deployment.

AI video codec · Deep Learning · NTIRE2025
18 min read
iQIYI Technical Product Team
Oct 10, 2024 · Artificial Intelligence

Online Deep Learning (ODL) for Real‑Time Advertising Effectiveness: Challenges and Solutions

iQIYI’s minute‑level online deep‑learning framework overcomes stability, timeliness, compatibility, delayed feedback, catastrophic forgetting, and i.i.d. constraints through high‑availability pipelines, TensorFlow Example serialization, rapid P2P model distribution, flexible scheduling, disaster‑recovery rollbacks, PU‑loss adjustment, and knowledge‑distillation, delivering a 6.2% revenue boost.

Advertising · CTR prediction · Deep Learning
9 min read
Alibaba Cloud Big Data AI Platform
Jun 5, 2023 · Artificial Intelligence

How Alibaba’s DGS Enables Real‑Time GNN Inference on Massive Dynamic Graphs

The Dynamic Graph Sampling (DGS) service, built on GraphLearn, delivers sub‑20 ms latency for real‑time GNN inference on large, constantly evolving graphs by separating storage from computation, using event‑driven pre‑sampling, lazy multi‑hop concatenation, and a publish‑subscribe architecture that scales linearly across distributed workers.

Alibaba Cloud · Distributed Systems · GraphLearn
12 min read
Alipay Experience Technology
Nov 28, 2022 · Artificial Intelligence

Why Edge Intelligence Is Shaping the Future of Mobile Apps

This article explains the concept of edge intelligence, its advantages over cloud‑based AI, the technical challenges of deploying AI on mobile devices, Ant Group's development timeline, core technology stack, and future directions for edge‑cloud collaboration.

AI Optimization · Mobile AI · Real-time inference
10 min read
DataFunSummit
Sep 9, 2022 · Artificial Intelligence

Wuliang: Tencent's Deep Learning Framework for Real‑Time Large‑Scale Recommendation

The presentation by Tencent expert Yuan Yi details the Wuliang deep learning system for recommendation, covering its background, technical challenges such as massive data and real‑time requirements, the parameter‑server based solutions for training and inference, model compression techniques, and continuous online deployment strategies.

Deep Learning · Large-Scale Training · Parameter Server
14 min read
Youku Technology
Jun 7, 2022 · Artificial Intelligence

Mobile Real-Time Portrait Segmentation for Youku Bullet Comment Passthrough

To enable real‑time bullet‑comment passthrough on Youku’s mobile app, the team built a million‑scale portrait dataset and designed the AirSegNet series—CPU, GPU, and server variants—using VGG‑style nets, edge‑aware losses, and hybrid CPU‑GPU inference, achieving 0.98 IoU and sub‑15 ms latency on most devices.

Computer Vision · Edge Computing · MNN Framework
13 min read
Baidu App Technology
Nov 25, 2021 · Game Development

Building an AI-Powered Object Hunt Game with Paddle.js and PaddleClas

The article details how to create the AI‑driven “Object Hunt Battle” game by processing data, designing and training a PP‑LCNet model with PaddleClas, converting it for Paddle.js, and integrating real‑time WebGL inference on mobile devices, achieving sub‑50 ms latency and encouraging developers to explore further.

AI game development · Mobile AI · Paddle.js
9 min read
DataFunTalk
Sep 28, 2021 · Artificial Intelligence

Graph Modeling and GCN Exploration at Geetest (极验): Evolution, Offline and Real‑time Solutions

The talk presents an overview of graph neural network development, explains Geetest's graph modeling research and its evolution, and details offline and real‑time GCN solutions, including self‑supervised training, large‑scale graph handling, and performance comparisons, highlighting practical applications in fraud detection and risk control.

GCN · Graph Modeling · Real-time inference
26 min read
DataFunSummit
Mar 9, 2021 · Artificial Intelligence

Weibo Multimodal Content Understanding Service Architecture and GPU Heterogeneous Cluster Solutions

This article details Weibo's multimodal content understanding platform, covering its massive data challenges, heterogeneous model support, standardized pipelines, platformization, workflow architecture, GPU heterogeneous cluster management, resource scheduling, performance optimization, and full‑stack monitoring to achieve stable, low‑latency AI services at scale.

Distributed Training · GPU cluster · Model Serving
18 min read
DataFunTalk
Aug 27, 2020 · Artificial Intelligence

Model Serving in Real-Time: Insights from Alibaba’s User Interest Center

This article explains Alibaba’s User Interest Center approach to real‑time model serving, detailing how it separates offline sequence modeling from lightweight online inference, uses an online interest‑embedding store, and dramatically reduces latency for recommendation models such as DIEN and MIMN.

Alibaba · Embedding · Model Serving
8 min read
iQIYI Technical Product Team
Jun 12, 2020 · Artificial Intelligence

Deepthought: An End‑to‑End Machine Learning Platform at iQIYI

Deepthought is iQIYI’s end‑to‑end machine‑learning platform that unifies distributed frameworks, decouples pipeline stages, integrates with Tongtian Tower, and offers visual drag‑and‑drop configuration, evolving from a fraud‑detection prototype to a generic system with real‑time inference, automated hyper‑parameter optimization, and support for large‑scale data across anti‑fraud, recommendation, and analytics workloads.

AI Platform · AutoML · Parameter Server
13 min read
DataFunTalk
May 8, 2020 · Artificial Intelligence

Distributed Machine Learning Framework GDBT for High‑Dimensional Real‑Time Recommendation Systems

The article explains how 4Paradigm's distributed machine learning framework GDBT tackles the massive data, high‑dimensional features, and real‑time requirements of modern recommendation systems by leveraging heterogeneous computing, parameter servers, RDMA networking, and optimized workloads.

GDBT · Parameter Server · RDMA
18 min read
Tencent Cloud Developer
Mar 6, 2020 · Artificial Intelligence

WeChat "Scan" Object Detection: Mobile AI Model Design, Optimization, and Deployment

The paper presents a lightweight, anchor‑free, CenterNet‑based objectness detector for WeChat’s Scan feature, built on a ShuffleNetV2 backbone with enlarged 5×5 depth‑wise convolutions, a streamlined detection head, and a Pyramid Interpolation Module. The model is then quantized, converted to ONNX, and deployed with NCNN, yielding a 436 KB model that runs in ~15 ms per frame on an iPhone 8 CPU.

CenterNet · Mobile AI · Model Optimization
12 min read
Alibaba Cloud Developer
Dec 20, 2019 · Artificial Intelligence

How AI-Powered Hand Gesture Detection Drove a Double‑11 Celebrity Rock‑Paper‑Scissors Game

This article details how Alibaba leveraged AI-driven hand‑gesture detection and a lightweight SSD‑based object detection model to create an interactive rock‑paper‑scissors game for Double‑11, addressing challenges of undefined gestures, real‑time mobile performance, and data collection, and achieving over 16 million page views and high accuracy.

Mobile AI · Real-time inference · SSD
22 min read