Tagged articles
15 articles
Machine Learning Algorithms & Natural Language Processing
Mar 18, 2026 · Artificial Intelligence

Breaking the ‘See‑then‑Think’ Barrier: Real‑Time ‘See‑and‑Think’ for VLMs (CVPR 2026)

The paper introduces TaYS (Think‑as‑You‑See), a streaming chain‑of‑thought framework that replaces the traditional “watch‑then‑think” video inference pipeline with a parallel, real‑time “watch‑and‑think” approach, dramatically reducing latency and improving accuracy on complex video reasoning tasks.

Dual KV-Cache · Real-time Video · Streaming Inference
8 min read
Code Wrench
Mar 8, 2026 · Artificial Intelligence

How to Build Low‑Latency AI‑Powered Video Calls with Go and WebRTC

This article breaks down the latency challenges of combining AI with WebRTC, compares edge and cloud processing architectures, and provides a detailed Go‑based implementation—including RTP interception, AI model integration, real‑time translation pipelines, and performance optimizations—for ultra‑responsive video conferencing.

AI · Edge Computing · Go
7 min read
Kuaishou Tech
Sep 17, 2025 · Artificial Intelligence

How MIDAS Achieves Real‑Time Multimodal Digital‑Human Video Generation

The MIDAS framework introduced by the Kling Team combines autoregressive video generation with a lightweight diffusion denoising head to deliver real‑time, high‑quality digital‑human synthesis under multimodal control, achieving sub‑500 ms latency, 64× compression, and robust performance across multilingual dialogue, singing, and interactive world modeling tasks.

AI · Digital Human · Real-time Video
6 min read
Open Source Linux
Dec 6, 2024 · Cloud Computing

How Live Streaming Works: From Encoder to Viewer in Real Time

Live streaming is demanding because video must be transmitted in real time and requires heavy computation. The pipeline described here uses globally distributed edge servers, transcodes each stream into multiple resolutions, segments it into short clips, packages the clips into formats such as HLS, caches them on CDNs, and optionally stores them in the cloud, so viewers get seamless playback and can replay the stream later.

CDN · Real-time Video · Video Transcoding
2 min read
DaTaobao Tech
Apr 13, 2022 · Artificial Intelligence

Machine‑Learning Based Bandwidth Prediction and Adaptive Streaming for Taobao Live: Concerto, OnRL, and Loki

Alibaba’s Taobao Live team replaced rule‑based bandwidth estimators with three machine‑learning solutions—Concerto, OnRL, and Loki—trained on over a million hours of global live‑stream data, achieving up to 13% throughput gain, threefold stall reduction, and up to 44% lower 95th‑percentile stalls, now deployed commercially.

Real-time Video · adaptive bitrate · bandwidth prediction
14 min read
Kuaishou Large Model
Feb 25, 2021 · Artificial Intelligence

How Kuaishou’s AI‑Powered Beauty Engine Transforms Real‑Time Video

This article details Kuaishou Y‑tech’s Gorgeous beauty platform, covering traditional smoothing, advanced skin‑tone effects, AI‑driven blemish removal, clarity enhancement, local facial tuning, and the UNet‑based GorgeousGAN that delivers one‑click high‑definition beauty for live‑stream and short‑video applications.

AI beauty · Computer Vision · Deep Learning
13 min read
NetEase Smart Enterprise Tech+
Jan 5, 2021 · Artificial Intelligence

How AI-Powered Super-Resolution is Transforming Real-Time Video Communication

AI-driven super-resolution, once limited to academic research, is now being applied to real-time video communication. The article traces its evolution from early interpolation methods to deep learning models, discusses the open problems of model size, generalization, and real-world degradation, and argues that lightweight networks and encoding-aware techniques make practical deployment feasible.

AI · Real-time Video · image enhancement
12 min read
Programmer DD
Apr 17, 2020 · Artificial Intelligence

How to Make People Vanish in Real‑Time Using TensorFlow.js and MobileNet

Jason Mayes, a Google web engineer, open‑sourced a TensorFlow.js demo that removes people from live webcam video in real time using a lightweight MobileNet model, with only about 200 lines of code, and provides GitHub and CodePen links for experimentation.

Computer Vision · MobileNet · Real-time Video
9 min read
360 Tech Engineering
Nov 5, 2019 · Frontend Development

Developing Real‑Time Interactive AI Video Applications with WebRTC and JSMpeg

This article explains how to build a browser‑based AI video interaction system by comparing two approaches—streaming RTSP video via a JSMpeg‑powered WebSocket relay and capturing local media directly with WebRTC’s getUserMedia API—along with code samples, constraints handling, and frame‑extraction techniques.

JSMpeg · MediaDevices · Real-time Video
15 min read
Alibaba Cloud Developer
Nov 14, 2018 · Backend Development

Inside Ant's Real-Time Video Call System: Architecture & Optimizations

This article explores Ant Financial's real-time video call platform, detailing its technical choices, system architecture, signaling reliability design, network optimization strategies, and future directions for multi‑party video conferencing and interactive live streaming.

Ant Financial · Real-time Video · Signal Reliability
19 min read
Tencent Cloud Developer
Jul 27, 2018 · Mobile Development

Integrating Tencent Cloud Real-Time Audio/Video SDK for Android: A Step-by-Step Guide

To add multi‑person video calling with in‑call text chat on Android, developers register a Tencent Cloud account, obtain an APPID, and purchase a low‑cost minute package. They then import the IMSDK, AVSDK, ILiveSDK, and BeautySDK, generate a userSig, initialize the SDK and log in, create or join a room, bind an AVRootView for video rendering, and enable IM messaging. According to the article, a functional solution can be built this way within a day.

Android SDK · Instant Messaging · Real-time Video
8 min read
MaGe Linux Operations
Sep 23, 2017 · Artificial Intelligence

How to Build a Real-Time Deep Learning Object Detector with OpenCV and Python

This guide walks you through extending a deep‑learning object detection project to process live video streams using OpenCV, Python, and the VideoStream class, covering environment setup, command‑line arguments, model loading, frame‑by‑frame detection, FPS measurement, and performance‑boosting tips.

OpenCV · Python · Real-time Video
9 min read
Alibaba Cloud Developer
May 18, 2017 · Cloud Computing

Scaling Alibaba Live Streaming for Double 11: Architecture & Performance Secrets

This article analyzes how Alibaba built a highly scalable, low‑latency mobile live‑streaming platform for the 2016 Double 11 event, covering user growth, system architecture, latency reduction, bandwidth savings, interactive features, and the technical challenges and solutions behind the success.

Performance Optimization · Real-time Video · cloud architecture
16 min read
High Availability Architecture
Aug 19, 2016 · Fundamentals

Design and Implementation of a Sub‑500 ms Ultra‑HD Real‑Time Video Transmission System

This article details the architecture, encoding choices, network‑level optimizations, transmission model, measurement methods, and practical pitfalls involved in building a 1080p real‑time video streaming solution that consistently keeps end‑to‑end latency below 500 ms for interactive online education.

H.264 · Low latency · Real-time Video
27 min read