Tag

auto-tuning

5 articles collected around this technical topic.

Baidu Tech Salon
Aug 20, 2024 · Artificial Intelligence

PaddlePaddle Neural Network Compiler (CINN): Architecture, Optimization Techniques, and Performance

The PaddlePaddle Neural Network Compiler (CINN) combines a PIR‑based frontend and a hardware‑specific backend to apply graph‑level optimizations, operator fusion, schedule transformations and automatic tuning, delivering up to 4× faster kernels and 30‑60% overall speed‑ups for deep‑learning and scientific workloads.

CINN · GPU optimization · Neural Network Compiler
0 likes · 19 min read
Baidu Geek Talk
Aug 19, 2024 · Artificial Intelligence

PaddlePaddle Neural Network Compiler (CINN): Architecture, Optimization Techniques, and Performance Gains

The PaddlePaddle Neural Network Compiler (CINN) combines a PIR‑based frontend that performs graph‑level optimizations such as constant folding, dead‑code elimination and operator fusion with a backend that applies schedule transformations and auto‑tuning, delivering up to 4× faster RMSNorm kernels and 30‑60% overall speed‑ups for generative AI and scientific‑computing workloads.
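The RMSNorm fusion win is easy to see in miniature. Below is a hypothetical NumPy sketch (not CINN code) contrasting the op-by-op composition with the single-pass computation an operator-fusing compiler emits; all names here are invented for illustration.

```python
import numpy as np

def rmsnorm_unfused(x, gamma, eps=1e-6):
    # Op-by-op composition: on a GPU each step would be a separate kernel
    # launch, with full-size temporaries round-tripping through memory.
    sq = x * x
    mean = sq.mean(axis=-1, keepdims=True)
    rms = np.sqrt(mean + eps)
    scaled = x / rms
    return scaled * gamma

def rmsnorm_fused(x, gamma, eps=1e-6):
    # What an operator-fusing compiler computes instead: one pass over x,
    # keeping only a per-row scalar between stages.
    inv_rms = 1.0 / np.sqrt((x * x).mean(axis=-1, keepdims=True) + eps)
    return x * inv_rms * gamma

x = np.random.default_rng(0).standard_normal((4, 8))
gamma = np.ones(8)
same = np.allclose(rmsnorm_unfused(x, gamma), rmsnorm_fused(x, gamma))
```

Both paths compute the same values; the fused form is the shape that schedule transformations can then map onto a single reduction-plus-scale kernel.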

CINN · GPU · Neural Network Compiler
0 likes · 18 min read
Efficient Ops
Jul 17, 2021 · Databases

How AutoTiKV’s Machine Learning Optimizes Beaver Search Engine Performance

This article describes how the Beaver search engine’s many performance‑related configuration parameters can be automatically tuned using machine‑learning techniques from OtterTune and AutoTiKV, detailing the background research, Gaussian Process regression model, Bayesian optimization process, implementation steps, test results, and future improvements.
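The Gaussian-Process-plus-Bayesian-optimization loop the article describes can be sketched in a few dozen lines. Everything below (the single knob, the `throughput` benchmark, the candidate grid) is hypothetical and stands in for AutoTiKV's actual parameters and measurements.

```python
import numpy as np
from math import erf

def rbf_kernel(a, b, length=1.0):
    # Squared-exponential kernel between two 1-D sample arrays.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-5):
    # Gaussian Process regression: posterior mean and std at x_test.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)  # k(x,x) = 1 for the RBF kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI acquisition function for maximization.
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (mu - best) * cdf + sigma * pdf

def throughput(knob):
    # Hypothetical benchmark: peak performance near knob = 0.6.
    return -(knob - 0.6) ** 2

candidates = np.linspace(0.0, 1.0, 101)
x_obs = np.array([0.1, 0.9])          # two initial random configurations
y_obs = throughput(x_obs)
for _ in range(10):
    mu, sigma = gp_posterior(x_obs, y_obs, candidates)
    nxt = candidates[np.argmax(expected_improvement(mu, sigma, y_obs.max()))]
    x_obs = np.append(x_obs, nxt)
    y_obs = np.append(y_obs, throughput(nxt))

best_knob = x_obs[np.argmax(y_obs)]
```

The loop fits a GP surrogate to the configurations tried so far, picks the next configuration by expected improvement, benchmarks it, and repeats; after a handful of trials the recommended knob value converges toward the true optimum.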

Bayesian Optimization · Beaver · Database Performance
0 likes · 23 min read
DataFunTalk
Mar 25, 2021 · Artificial Intelligence

Optimizing MNN Mobile Neural Network Inference on GPU with OpenCL: Memory Objects, Work‑Group Tuning, and Auto‑Tuning

This article explains how the MNN deep‑learning framework leverages OpenCL to achieve high‑performance inference on mobile, PC and embedded GPUs by diversifying memory objects, aligning data, using local‑memory reductions, selecting optimal work‑group sizes, applying pre‑inference auto‑tuning, caching compiled programs, and providing practical GPU‑friendly model design guidelines.
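The pre-inference work-group tuning step reduces to "time each legal size, keep the fastest, cache the choice." Below is a hypothetical Python sketch where `run_kernel` stands in for enqueueing and timing a real OpenCL kernel; the kernel name and timing values are invented for illustration.

```python
import json

def run_kernel(local_size):
    # Stand-in for launching the OpenCL kernel with a given work-group
    # size and reading the elapsed time from its profiling event.
    # Simulated cost (ms): too-small sizes underuse the hardware,
    # too-large ones hurt occupancy.
    simulated_ms = {8: 3.0, 16: 1.5, 32: 1.0, 64: 1.2, 128: 2.0}
    return simulated_ms[local_size]

def tune_work_group(candidates, repeats=3):
    # Pre-inference auto-tuning: benchmark each candidate work-group
    # size a few times and keep the minimum observed latency.
    timings = {ls: min(run_kernel(ls) for _ in range(repeats))
               for ls in candidates}
    best = min(timings, key=timings.get)
    return best, timings

best, timings = tune_work_group([8, 16, 32, 64, 128])

# The winning size can be cached (e.g. serialized alongside the compiled
# program binary) so later runs skip the tuning pass entirely.
cache = json.dumps({"conv2d_main": best})
```

In a real deployment the candidate list is constrained by the device's maximum work-group size, and the cache key includes the kernel and tensor shapes so each (kernel, shape) pair is tuned once.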

GPU optimization · MNN · OpenCL
0 likes · 20 min read
iQIYI Technical Product Team
Nov 27, 2020 · Artificial Intelligence

Evolution and Experience of iQIYI's Machine Learning Platform

iQIYI’s Machine Learning Platform evolved from the specialized Javis deep‑learning system into a unified, low‑barrier solution for algorithm engineers, analysts, and developers, adding visual pipeline building, multi‑framework scheduling, automatic hyper‑parameter tuning, parameter‑server training, and scalable online prediction, dramatically boosting business efficiency and detection performance.

AI · Big Data · auto-tuning
0 likes · 13 min read