Tagged articles
1 article
DataFunTalk
Dec 17, 2022 · Artificial Intelligence

Efficient Spatiotemporal Self‑Attention Transformer (Patch Shift Transformer) for Video Action Recognition

This article introduces Patch Shift Transformer, a lightweight spatiotemporal self‑attention transformer for video action recognition. It achieves competitive performance on Kinetics‑400, Sth‑v1/v2, and Diving48 without increasing computational cost or parameter count, and the article covers its design, experiments, and speed advantages.

ECCV 2022 · Transformer · patch shift
5 min read