Tagged articles
2 articles
DataFunSummit
Jul 4, 2023 · Artificial Intelligence

PPL: A Full‑Platform Deep Learning Deployment Framework by SenseTime

The article presents SenseTime's PPL framework, detailing its toolchain, inference engine, multi‑backend operator library, quantization tools, CUDA optimizations, performance benchmarks across CPUs, GPUs, DSPs and DSAs, and outlines future plans for broader chip support and AI for Science.

AI inference · CUDA optimization · Deep Learning Deployment
23 min read
TiPaiPai Technical Team
Jun 25, 2021 · Artificial Intelligence

Mastering TensorRT: Deploy Deep Learning Models Efficiently

This article introduces TensorRT, explains its deployment workflow from model training to engine generation, shows how to register custom operators for ONNX and create TensorRT plugins, and explores deformable convolution (DCN) implementation strategies for high‑performance AI inference.

AI inference · CUDA · Custom Operators
8 min read