DataFunSummit
Dec 14, 2020 · Artificial Intelligence
LightSeq: High‑Performance Open‑Source Inference Engine for Transformers, GPT and Other NLP Models
This article introduces LightSeq, an open‑source, GPU‑accelerated inference engine for Transformer‑based models such as BERT and GPT. LightSeq delivers speedups of up to 14× over TensorFlow, supports multiple decoding strategies, and integrates with major deep‑learning frameworks. The article also presents detailed performance benchmarks and the technical optimizations behind them.
GPU · Inference · LightSeq
