Author

Tencent Architect

We share insights on storage, computing, networking and explore leading industry technologies together.

Articles

Likes

181

Views

Comments

Latest from Tencent Architect

48 recent articles

Tencent Architect

Apr 9, 2021 · Databases

Analysis of Inconsistencies in MySQL Slave Crash Recovery and Sync Master/Relay Log Info

This article analyzes how a MySQL slave crash can leave the master info and relay log info positions inconsistent under the sync_master_info and sync_relay_log_info settings, presents case studies of duplicated events, explains the impact of the repository type (FILE vs. TABLE), and recommends configuration settings to achieve server‑crash‑safe replication.

MySQL · Relay Log · Replication

0 likes · 15 min read

Tencent Architect

Feb 23, 2021 · Artificial Intelligence

Analysis and Optimization of CephFS I/O Performance for AI Training on the Xingchen Compute Platform

This article investigates why AI training tasks on Tencent's Xingchen compute platform suffer severe I/O slowdowns when using CephFS, analyzes the underlying Ceph‑FUSE and MDS mechanisms, and proposes metadata‑caching and file‑caching optimizations that can make training three to four times faster.

AI training · Ceph-FUSE · CephFS

0 likes · 21 min read

Tencent Architect

Jul 30, 2018 · Artificial Intelligence

Four‑Minute ImageNet Training: Tencent’s AI Platform Sets a New World Record

Tencent’s intelligent machine‑learning platform set a world record by training AlexNet in 4 minutes and ResNet‑50 in 6.6 minutes on ImageNet, using large batch sizes, mixed precision, LARS optimization, hierarchical synchronization, gradient fusion, and pipelined I/O to overcome accuracy and scalability challenges.

AI acceleration · Deep Learning · ImageNet

0 likes · 24 min read

Tencent Architect

Jan 27, 2018 · Fundamentals

Advances in Image Compression: From JPEG to WebP, HEVC, WXAM, SHARP, and Guetzli Optimizations at Tencent TPS

The article reviews recent developments in image compression formats such as JPEG, WebP, HEVC, and Tencent's proprietary WXAM/SHARP, explains Guetzli's perceptual encoding, details extensive GPU‑based performance optimizations, and shows how these techniques substantially reduce bandwidth usage in Tencent's massive image storage platform.

GPU Acceleration · Guetzli · JPEG

0 likes · 13 min read

Tencent Architect

Dec 30, 2017 · Databases

An Overview of Time Series Databases and Tencent CTSDB

This article introduces the concept, characteristics, and use cases of time series databases, explains the data model and challenges of traditional solutions, and provides a detailed overview of Tencent's Cloud Time Series Database (CTSDB) along with performance comparisons against InfluxDB.

Big Data · CTSDB · Time Series Database

0 likes · 12 min read

Tencent Architect

Dec 8, 2017 · Databases

Modern Processors, Emerging Storage, and Database System Design: Challenges and Opportunities

This article reviews the evolution of modern multi‑core processors and non‑volatile memory, analyzes their impact on database system architecture, discusses cache‑friendly designs, distributed logging, and benchmark results, and highlights the opportunities and challenges for DBMS developers in the era of NVRAM.

Benchmarking · Databases · Non-volatile Memory

0 likes · 17 min read

Tencent Architect

Nov 21, 2017 · Operations

Redesign and Optimization of the WeChat Pay Transaction Record System

This article presents a comprehensive case study of how WeChat Pay rebuilt its transaction record storage to handle massive data growth, improve performance, ensure data completeness, and strengthen security through a distributed key‑value architecture, hierarchical archiving, and robust operational safeguards.

Data Security · Distributed KV · WeChat Pay

0 likes · 10 min read

Tencent Architect

Nov 13, 2017 · Artificial Intelligence

Survey of Bandwidth Optimization Techniques in AI Accelerators

This article reviews architectural strategies for alleviating bandwidth bottlenecks in FPGA/ASIC/TPU AI accelerators, including streaming processing, on‑chip memory optimization, bit‑width compression, sparsity techniques, on‑chip models with chip‑level interconnects, and emerging technologies such as binary networks, memristors, and HBM.

AI · ASIC · Accelerators

0 likes · 20 min read

Tencent Architect

Nov 9, 2017 · Artificial Intelligence

Why General‑Purpose CPUs Are Inefficient for Deep Learning: Heterogeneous Computing and AI Processor Design

The article analyzes the limitations of general‑purpose CPUs for deep‑learning workloads, explains how semiconductor scaling and memory‑bandwidth constraints drive the shift toward specialized heterogeneous processors such as GPUs, FPGAs, and ASICs, and discusses the design trade‑offs of embedded versus cloud AI accelerators.

AI · ASIC · CPU

0 likes · 13 min read

Tencent Architect

Oct 20, 2017 · Artificial Intelligence

Design and Performance of a General‑Purpose FPGA CNN Accelerator for Real‑Time AI Services

This article presents a comprehensive overview of a universal FPGA‑based CNN accelerator, detailing its motivation, flexible architecture, compiler workflow, memory and compute unit designs, and performance comparisons that demonstrate significant latency and cost advantages over CPU and GPU solutions for real‑time AI inference.

AI inference · CNN acceleration · Compiler

0 likes · 13 min read
