AIWalker
Mar 6, 2025 · Artificial Intelligence

How SCMHSA Improves Transformer Next‑Frame Prediction by Reducing Semantic Dilution

The paper introduces a Semantic‑Concentrated Multi‑Head Self‑Attention (SCMHSA) module and a new embedding‑space loss for Transformer‑based video next‑frame prediction. Together they target two problems: semantic dilution and the mismatch between the training loss and the prediction target. The authors report notable improvements in PSNR and MSE across four benchmark datasets.

Computer Vision · Embedding Loss · SCMHSA