Tag

high QPS

Alimama Tech
May 12, 2025 · Artificial Intelligence

Universal Recommendation Model (URM): A General Large‑Model Recall System for Advertising

The article presents the Universal Recommendation Model (URM), a large‑language‑model‑based recall framework that integrates world knowledge and e‑commerce expertise through knowledge injection and prompt‑driven alignment, achieving significant offline recall gains and a 3.1% increase in ad consumption while meeting high‑QPS, low‑latency production constraints.

advertising · high QPS · large language model
17 min read
Xiaohongshu Tech REDtech
Nov 7, 2024 · Artificial Intelligence

RTAMS-GANNS: A Real-Time Adaptive Multi-Stream GPU System for Online Approximate Nearest Neighbor Search

RTAMS‑GANNS, the award‑winning real‑time adaptive multi‑stream GPU system for online approximate nearest neighbor search, eliminates costly memory allocations and serial execution by using a dynamic memory‑block insertion algorithm and separate CUDA streams, cutting latency by 40‑80% and reliably serving over 100 million daily users in production.

GPU · Multi‑Stream · Online Search
19 min read
Architect's Guide
Aug 24, 2022 · Backend Development

Optimizing Long‑Connection Services with Netty: From Millions of Connections to High QPS

This article summarizes the challenges and optimization techniques for building a high‑performance long‑connection service with Netty, covering non‑blocking I/O, Linux kernel tuning, client‑side testing, VM‑based scaling, data‑structure tweaks, CPU and GC bottlenecks, and the final results of achieving hundreds of thousands of connections and tens of thousands of QPS on a single server.

GC Tuning · Java NIO · Linux Tuning
14 min read
Xianyu Technology
Apr 13, 2022 · Big Data

Real-time Multi-system Data Aggregation for Fan Tag System

The Xianyu fan‑tag system must display full‑history purchase counts with real‑time updates while serving low‑latency, high‑throughput queries. It exports multi‑system data daily to a LevelDB‑based KV store, converts schemas, and applies real‑time compensation from transaction and follow‑change messages, merging offline and live data to produce sorted fan lists at ~10k QPS.

KV storage · Real-time Processing · data aggregation
6 min read
ByteFE
Apr 11, 2022 · Backend Development

ByteDance Wallet Asset Middle Platform Design for 2022 Spring Festival High‑Traffic Reward Distribution

This article details ByteDance's wallet asset middle platform designed for the 2022 Spring Festival, covering eight‑app reward interoperability, high‑QPS challenges, token‑based asynchronous account crediting, budget control, stability measures, and fund‑safety guarantees, and includes practical solutions for hot‑key handling, budget throttling, and multi‑stage activity isolation.

ByteDance · Fund Safety · Stability
22 min read
IT Xianyu
Sep 14, 2021 · Backend Development

Design and Implementation of a High‑Throughput 10‑Billion Red‑Envelope System Simulation

This article describes how to design, implement, and evaluate a scalable backend that can simulate 10 billion WeChat red‑envelope requests by supporting up to 1 million concurrent users and handling 30k–60k QPS per server using Go, Linux tools, and custom monitoring.

Go · backend architecture · distributed systems
18 min read
High Availability Architecture
Jul 6, 2021 · Backend Development

Tuning a Go Service to Reach 200k QPS: GC Adjustment and UDP Optimizations

The article describes how a Go‑based high‑throughput service was tuned from 80k to over 200k QPS by enlarging the GC heap, reusing UDP connections with sync.Pool, reducing system‑call overhead, and applying several lightweight logging and discovery optimizations.

GC Tuning · Go · UDP
8 min read
Ctrip Technology
Feb 25, 2021 · Backend Development

Design and Implementation of a Cache Access Component and Update Platform for High‑QPS Scenarios

This article describes a backend architecture for a high‑traffic e‑commerce project, detailing a cache access component and a cache update platform that use asynchronous messaging, hotspot‑key handling, versioned cache entries, and Redis to achieve low latency, high‑QPS support, and strong data consistency.

Message Queue · Redis · backend
18 min read