Tencent Architect
Author

We share insights on storage, computing, networking and explore leading industry technologies together.

47 articles · 0 likes · 97 views · 0 comments
Recent Articles

Latest from Tencent Architect

47 recent articles
Tencent Architect
Feb 23, 2021 · Artificial Intelligence

Analysis and Optimization of CephFS I/O Performance for AI Training on the Xingchen Compute Platform

This article investigates why AI training tasks on Tencent's Xingchen compute platform suffer severe I/O slowdowns when using CephFS, analyzes the underlying Ceph‑FUSE and MDS mechanisms, and proposes metadata‑caching and file‑caching optimizations that can speed up training by a factor of three to four.

AI training · Ceph-FUSE · CephFS
0 likes · 21 min read
Tencent Architect
Jul 30, 2018 · Artificial Intelligence

Four‑Minute ImageNet Training: Tencent’s AI Platform Sets a New World Record

Tencent’s intelligent machine‑learning platform set a world record by training AlexNet in 4 minutes and ResNet‑50 in 6.6 minutes on ImageNet, using large batch sizes, mixed precision, LARS optimization, hierarchical synchronization, gradient fusion, and pipelined I/O to overcome accuracy and scalability challenges.

AI acceleration · ImageNet · deep learning
0 likes · 24 min read
Tencent Architect
Jan 27, 2018 · Fundamentals

Advances in Image Compression: From JPEG to WebP, HEVC, WXAM, SHARP, and Guetzli Optimizations at Tencent TPS

The article reviews recent developments in image compression formats such as JPEG, WebP, HEVC, and Tencent's proprietary WXAM/SHARP, explains Guetzli's perceptual encoding, details extensive GPU‑based performance optimizations, and demonstrates how these techniques dramatically reduce bandwidth usage in Tencent's massive image storage platform.

GPU acceleration · Guetzli · JPEG
0 likes · 13 min read
Tencent Architect
Dec 30, 2017 · Databases

An Overview of Time Series Databases and Tencent CTSDB

This article introduces the concept, characteristics, and use cases of time series databases, explains the data model and challenges of traditional solutions, and provides a detailed overview of Tencent's Cloud Time Series Database (CTSDB) along with performance comparisons against InfluxDB.

CTSDB · Performance Benchmark · Time Series Database
0 likes · 12 min read
Tencent Architect
Dec 8, 2017 · Databases

Modern Processors, Emerging Storage, and Database System Design: Challenges and Opportunities

This article reviews the evolution of modern multi‑core processors and non‑volatile memory, analyzes their impact on database system architecture, discusses cache‑friendly designs, distributed logging, and benchmark results, and highlights the opportunities and challenges for DBMS developers in the era of NVRAM.

Non-volatile Memory · Performance optimization · benchmarking
0 likes · 17 min read
Tencent Architect
Nov 21, 2017 · Operations

Redesign and Optimization of the WeChat Pay Transaction Record System

This article presents a comprehensive case study of how WeChat Pay rebuilt its transaction record storage to handle massive data growth, improve performance, ensure data completeness, and strengthen security through a distributed key‑value architecture, hierarchical archiving, and robust operational safeguards.

Distributed KV · System Architecture · WeChat Pay
0 likes · 10 min read
Tencent Architect
Nov 13, 2017 · Artificial Intelligence

Survey of Bandwidth Optimization Techniques in AI Accelerators

This article reviews various architectural strategies—including streaming processing, on‑chip memory optimization, bit‑width compression, sparsity techniques, on‑chip models with chip‑level interconnects, and emerging technologies such as binary networks, memristors, and HBM—to alleviate bandwidth bottlenecks in FPGA/ASIC/TPU AI accelerators.

AI · ASIC · Accelerators
0 likes · 20 min read
Tencent Architect
Nov 9, 2017 · Artificial Intelligence

Why General‑Purpose CPUs Are Inefficient for Deep Learning: Heterogeneous Computing and AI Processor Design

The article analyzes the limitations of general‑purpose CPUs for deep‑learning workloads, explains how semiconductor scaling and memory‑bandwidth constraints drive the shift toward specialized heterogeneous processors such as GPUs, FPGAs, and ASICs, and discusses the design trade‑offs of embedded versus cloud AI accelerators.

AI · ASIC · CPU
0 likes · 13 min read
Tencent Architect
Oct 20, 2017 · Artificial Intelligence

Design and Performance of a General‑Purpose FPGA CNN Accelerator for Real‑Time AI Services

This article presents a comprehensive overview of a universal FPGA‑based CNN accelerator, detailing its motivation, flexible architecture, compiler workflow, memory and compute unit designs, and performance comparisons that demonstrate significant latency and cost advantages over CPU and GPU solutions for real‑time AI inference.

AI inference · CNN acceleration · FPGA
0 likes · 13 min read