Tag

GPU programming

Tencent Technical Engineering
Mar 21, 2025 · Fundamentals

Fundamentals of GPU Architecture and Programming

The article explains GPU fundamentals: the end of Dennard scaling and why GPUs excel at parallel throughput, CUDA programming basics such as the SAXPY kernel and SIMT versus SIMD execution, the evolution of the SIMT stack, modern scheduling, and a three-step core architecture design.

CUDA, GPU, GPU programming
42 min read
OPPO Kernel Craftsman
Aug 11, 2023 · Game Development

FidelityFX Super Resolution 1.0: Technical Analysis and Implementation

The article dissects AMD's FidelityFX Super Resolution 1.0 in depth. It details the EASU spatial upscaling pipeline (Lanczos2-based polynomial fitting, 12-point sampling, gradient calculations, and edge handling) and the RCAS contrast-adaptive sharpening stage, and it outlines mobile-friendly optimizations such as half-precision arithmetic and reduced texture fetches.

EASU, FSR 1.0, GPU programming
6 min read
政采云技术
Aug 10, 2021 · Frontend Development

WebGL Concepts and Fundamentals

This article introduces WebGL, covering its definition, history, basic concepts, and working principles, with practical examples of drawing shapes using both the native WebGL API and the Three.js framework.

3D graphics, 3D web development, Fragment Shader
17 min read
Tencent Music Tech Team
Apr 30, 2020 · Mobile Development

Edge Deep Learning Inference on Mobile Devices: Challenges, Hardware Diversity, and Optimization Strategies

Edge deep learning inference on mobile devices faces hardware and software fragmentation across diverse CPUs, GPUs, DSPs, and NPUs, along with limited programmability. Optimization techniques such as model selection, quantization, and architecture-specific tuning enable real-time performance. Most inference runs on CPUs, GPUs offer 5–10× speedups, and co-processor support varies across Android and iOS.

DSP, GPU programming, NPU
17 min read