Tagged articles
14 articles
Page 1 of 1
Baidu Geek Talk
Baidu Geek Talk
May 30, 2022 · Mobile Development

Advanced OpenCL Optimization Techniques for Qualcomm Adreno GPUs on Mobile Devices

The article presents advanced OpenCL optimization techniques for Qualcomm Adreno mobile GPUs, explaining the programming model, profiling methods, bottleneck identification, and kernel‑level strategies such as fast math, fp16, vectorized memory accesses, and hardware‑specific features to improve compute‑ and memory‑bound performance on Android devices.

AdrenoGPUMobile Computing
0 likes · 12 min read
Advanced OpenCL Optimization Techniques for Qualcomm Adreno GPUs on Mobile Devices
Baidu App Technology
Baidu App Technology
Jan 24, 2022 · Mobile Development

Introduction to OpenCL Programming for Mobile GPU Computing

As mobile CPUs plateau, developers increasingly use OpenCL to harness Android GPUs like Qualcomm Adreno and Huawei Mali for heterogeneous computing, leveraging its platform, execution, and memory models to write portable kernels—illustrated by a simple array‑addition example that demonstrates device initialization, kernel creation, buffer management, and parallel execution.

AndroidC programmingGPU computing
0 likes · 8 min read
Introduction to OpenCL Programming for Mobile GPU Computing
DataFunTalk
DataFunTalk
Mar 25, 2021 · Artificial Intelligence

Optimizing MNN Mobile Neural Network Inference on GPU with OpenCL: Memory Objects, Work‑Group Tuning, and Auto‑Tuning

This article explains how the MNN deep‑learning framework leverages OpenCL to achieve high‑performance inference on mobile, PC and embedded GPUs by diversifying memory objects, aligning data, using local‑memory reductions, selecting optimal work‑group sizes, applying pre‑inference auto‑tuning, caching compiled programs, and providing practical GPU‑friendly model design guidelines.

GPU OptimizationMNNOpenCL
0 likes · 20 min read
Optimizing MNN Mobile Neural Network Inference on GPU with OpenCL: Memory Objects, Work‑Group Tuning, and Auto‑Tuning
Java Architect Essentials
Java Architect Essentials
Mar 15, 2021 · Industry Insights

Can Apple M1 Macs Mine Ethereum Effectively? A Hands‑On Test

This article documents a technical experiment that runs Ethereum mining software on an M1‑based MacBook Air, detailing the required code patches, build process, performance logs, and the resulting profit of roughly one Chinese yuan per day, while comparing the M1’s capabilities to traditional GPU miners.

Apple SiliconEthereumM1
0 likes · 9 min read
Can Apple M1 Macs Mine Ethereum Effectively? A Hands‑On Test
Tencent Music Tech Team
Tencent Music Tech Team
Apr 30, 2020 · Mobile Development

Edge Deep Learning Inference on Mobile Devices: Challenges, Hardware Diversity, and Optimization Strategies

Edge deep learning inference on mobile devices faces hardware and software fragmentation, diverse CPUs, GPUs, DSPs, and NPUs, and limited programmability; optimization techniques such as model selection, quantization, and architecture‑specific tuning enable real‑time performance, with most inference on CPUs, GPUs offering 5–10× speedups, and co‑processor support varying across Android and iOS.

DSPGPU programmingNPU
0 likes · 17 min read
Edge Deep Learning Inference on Mobile Devices: Challenges, Hardware Diversity, and Optimization Strategies
Architects' Tech Alliance
Architects' Tech Alliance
Apr 21, 2019 · Fundamentals

Differences Between CPU and GPU Architectures and the Relationship Between OpenCL and CUDA

This article explains the fundamental architectural differences between CPUs and GPUs, their design goals and performance characteristics, and compares OpenCL and CUDA, highlighting OpenCL’s cross‑platform flexibility versus CUDA’s NVIDIA‑specific optimization, while illustrating how each fits various parallel computing tasks.

CPUCUDAGPU
0 likes · 7 min read
Differences Between CPU and GPU Architectures and the Relationship Between OpenCL and CUDA