Tagged articles
8 articles
Lobster Programming
May 11, 2026 · Backend Development

Designing Effective Ad Mixing in Short‑Video Feed Streams

The article examines common pitfalls of naïve ad insertion in short‑video feeds, explains how cursor‑based pagination prevents duplicate ads, and outlines a client‑side mixing architecture that isolates services, meets strict latency requirements, and ensures accurate ad billing.

ad mixing · backend design · client-side rendering
0 likes · 4 min read
Tencent Advertising Technology
Jul 17, 2025 · Artificial Intelligence

LEADRE: Knowledge‑Enhanced LLMs Supercharge Display Ad Recommendations

The paper introduces LEADRE, a multi‑faceted knowledge‑enhanced large language model‑driven display advertisement recommender that tackles user interest modeling, knowledge alignment, and low‑latency deployment, achieving significant GMV gains in Tencent’s ad platforms through innovative prompt engineering, semantic alignment, and TensorRT‑accelerated inference.

Knowledge Alignment · LLM · Prompt Engineering
0 likes · 16 min read
Bilibili Tech
Apr 29, 2025 · Cloud Computing

Bilibili Live Streaming Technology for the Spring Festival Gala: Experience Enhancement and Interactive Features

Bilibili's R&D built a cloud-based broadcast console for the 2024 CCTV Spring Festival Gala, delivering 4K HDR streaming, AI SDR-to-HDR conversion, low latency, bandwidth‑aware transcoding, and a synchronized “send bullet screen” interactive feature using custom SEI timestamps for hundreds of millions of viewers.

HDR · SEI · cloud broadcasting
0 likes · 15 min read
DeWu Technology
Apr 14, 2023 · Backend Development

Async-fork: Mitigating Query Latency Spikes Incurred by the Fork-based Snapshot Mechanism from the OS Level

Async‑fork shifts the costly page‑table copying from Redis’s parent process to its child, allowing the parent to resume handling queries instantly and cutting snapshot‑induced latency spikes by over 98%, thereby dramatically improving tail latency during AOF rewrites, RDB backups, and master‑slave synchronizations.

Async-fork · Backend Development · Page Table
0 likes · 21 min read
OPPO Kernel Craftsman
Jul 1, 2022 · Operations

Linux Kernel Performance Profiling: A Comprehensive Guide to On-CPU and Off-CPU Analysis

This comprehensive guide explains Linux kernel performance profiling, both on‑CPU and off‑CPU, by stressing the need to target the critical 3% of code, covering throughput, latency and power metrics, scalability laws, flame‑graph visualizations, perf and eBPF tools, lock‑contention analysis, and further reading recommendations.

Linux kernel · Throughput · eBPF
0 likes · 27 min read
HaoDF Tech Team
Nov 8, 2021 · Operations

Service Risk Governance: Exploration, Mitigation, and Hands‑On Workshop

This talk recounts how the Good Doctor platform tackled severe online incidents by launching the DOA project, then a service risk governance initiative that identifies, quantifies, and mitigates latency‑related risks through metrics‑driven development, dependency analysis, middleware reliability, and a dedicated risk‑management platform.

Microservices · SRE · latency optimization
0 likes · 16 min read
vivo Internet Technology
Oct 27, 2021 · Backend Development

JVM Garbage Collection Tuning for a Video Service to Reduce P99 Latency

By replacing the default Parallel GC with a ParNew‑CMS collector, enlarging the Young generation, fixing Metaspace settings, and tuning CMS occupancy thresholds, the video service cut Young and Full GC pauses dramatically, lowered Full GC count by over 80%, and achieved more than 30% P99 latency reduction, with some APIs improving up to 80%.

CMS · Garbage Collection · JVM
0 likes · 16 min read
iQIYI Technical Product Team
Nov 27, 2020 · Artificial Intelligence

Optimizing TensorFlow Serving Model Hot‑Update to Eliminate Latency Spikes in CTR Recommendation Systems

By adding model warm‑up files, separating load/unload threads, switching to the Jemalloc allocator, and isolating TensorFlow’s parameter memory from RPC request buffers, iQIYI’s engineers reduced TensorFlow Serving hot‑update latency spikes in high‑throughput CTR recommendation services from over 120 ms to about 2 ms, eliminating jitter.

Model Hot Update · TensorFlow Serving · Warmup
0 likes · 11 min read