Tag

temporal modeling


360 Tech Engineering
Aug 29, 2024 · Artificial Intelligence

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

FancyVideo is an open‑source, UNet‑based video generation model that supports arbitrary resolutions, aspect ratios, styles, and motion dynamics. It achieves this by introducing a Cross‑frame Textual Guidance Module (CTGM) with temporal injectors, refiners, and boosters. The model reports state‑of‑the‑art results on multiple benchmarks and enables applications such as video extension, backtracking, and frame interpolation.

AI research, UNet, cross-frame guidance
6 min read
AntTech
Dec 20, 2022 · Artificial Intelligence

Towards Smooth Video Composition: A New Benchmark for GAN‑Based Video Generation

Researchers from multiple institutions propose a GAN‑based video generation framework that explicitly models short‑, medium‑, and long‑range temporal relations, introduces B‑spline motion embeddings and temporal shift modules, and demonstrates substantial quality improvements across several video datasets.

B-spline, Deep Learning, GAN
7 min read
Kuaishou Audio & Video Technology
Apr 22, 2022 · Artificial Intelligence

How Temporal Residual Modeling Boosts Video Super‑Resolution Performance

This article introduces a novel video super‑resolution framework that unifies low‑ and high‑resolution temporal modeling using adjacent‑frame residual maps, achieving state‑of‑the‑art results on multiple benchmarks while maintaining high speed and flexibility.

Deep Learning, computer vision, residual maps
14 min read
NetEase Media Technology Team
Jul 24, 2020 · Artificial Intelligence

Survey of Video Action Recognition Algorithms: 3D and 2D Convolutional Networks and Pre‑training

This survey reviews video action recognition. It compares 3D convolutional networks, which model spatial and temporal cues jointly but are computationally heavy, with 2D‑based approaches such as TSM and TIN, which embed temporal shifts efficiently. It also emphasizes that large‑scale pre‑training markedly improves performance when labeled data is limited.

2D convolutional networks, 3D convolutional networks, Deep Learning
13 min read