Meituan Technology Team
Mar 24, 2022 · Artificial Intelligence
Twins: Efficient Visual Attention Models for Vision Transformers
The Twins series, a collaboration between Meituan and the University of Adelaide, introduces conditional positional encoding and spatially separable self‑attention to improve both the efficiency and the performance of vision transformers, achieving state‑of‑the‑art results on ImageNet, ADE20K, COCO, and high‑precision map segmentation.
ADE20K · COCO · Conditional Positional Encoding
20 min read
