Tagged articles

video generation

153 articles · Page 2 of 2

Jan 19, 2025 · Artificial Intelligence

Weekly AI Digest Issue 11: Recommendation Algorithms, Video Generation Advances, and AGI Research

This issue of the weekly AI digest explores Xiaohongshu’s NoteLLM recommendation system, compares Chinese text generation in video AI across major platforms, highlights Alibaba’s Tongyi Wanxiang breakthroughs, discusses Keras founder François Chollet’s new AGI‑focused lab, and reviews Google’s Veo 2 and Imagen‑3 advancements.

AGI · AI · Generative AI

0 likes · 11 min read


Alibaba Cloud Native

Jan 16, 2025 · Cloud Native

Build an AI‑Powered Audiobook Production Pipeline with Cloud Native CAP

This guide explains how to use Alibaba Cloud's Cloud Native Application Platform (CAP), Function Compute, and the Bailian model service to create an end‑to‑end automated workflow that transforms text into audio, subtitles, images, and finally a compiled video audiobook.

AI · Automation · Cloud Native

0 likes · 6 min read


AIWalker

Jan 15, 2025 · Artificial Intelligence

Magic Mirror: Zero‑Shot Identity‑Preserved High‑Quality Personalized Video Generation

Magic Mirror introduces a single‑stage, zero‑shot framework that fuses dual facial embeddings with a conditional adaptive normalization module inside a Video Diffusion Transformer, achieving superior identity consistency, natural dynamics, and high visual quality compared with existing video generation methods.

conditional adaptive normalization · diffusion transformer · identity preservation

0 likes · 16 min read


58UXD

Dec 18, 2024 · Artificial Intelligence

Transform Your Designs with AI: 5 Steps to Create Stunning Videos

Learn how designers can harness AI tools in five practical steps—from script generation and AI‑driven image creation to video synthesis, music production, and final editing—to craft compelling, high‑quality videos that boost creativity and efficiency.

AI tools · AI video · creative AI

0 likes · 4 min read


php Courses

Dec 13, 2024 · Artificial Intelligence

OpenAI Releases Sora Video Generation Model: Three Key Implications and Core Features

OpenAI's new Sora model introduces AI-powered video generation, empowering creators, expanding interaction beyond text, and marking a pivotal step toward AGI by enabling machines to understand and produce visual content, with a suite of tools such as Explore, StoryBoard, Remix, Loop, and Blend.

Artificial Intelligence · OpenAI · Sora

0 likes · 4 min read


Alibaba Cloud Big Data AI Platform

Dec 4, 2024 · Artificial Intelligence

How EasyAnimate V5 Advances AI Video Generation with Multimodal Control

EasyAnimate V5, an Alibaba Cloud AI video generation framework, expands model size to 7B/12B, introduces multimodal control, token‑length based training, and inpaint‑based image‑to‑video strategies, while providing easy deployment via PAI, DSW, and local ComfyUI integration.

AI · LoRA · MMDiT

0 likes · 11 min read


Alipay Experience Technology

Nov 27, 2024 · Artificial Intelligence

EchoMimicV2: High‑Quality Audio‑Driven Half‑Body Human Animation with Simple Inputs

EchoMimicV2 is an open‑source digital‑human framework that generates high‑quality half‑body animation videos from a single reference image, an audio clip, and a hand‑gesture sequence, addressing challenges of facial portrait limits, complex condition injection, and inference latency in audio‑driven animation.

AI research · Diffusion Models · audio-driven animation

0 likes · 18 min read


ZhongAn Tech Team

Nov 16, 2024 · Artificial Intelligence

Weekly AI Digest Issue 2: Video Generation, Large Models, AGI, and LoRA Fine‑Tuning

This weekly AI roundup discusses emerging video generation tools like PixelDance and Vidu 1.5, debates over the scaling limits of large models, AGI geopolitical considerations, and an MIT study comparing LoRA with full fine‑tuning for domain adaptation.

AGI · AI · Large Language Models

0 likes · 8 min read


Baobao Algorithm Notes

Oct 17, 2024 · Artificial Intelligence

How Meta’s Movie Gen Pushes Text‑to‑Video Generation to New Heights

Meta’s newly released 92‑page Movie Gen paper introduces a multimodal LLM that unifies text‑to‑image, text‑to‑video, personalized video, precise video editing, and audio generation, detailing its dual‑model architecture, training pipeline, temporal auto‑encoder design, scaling strategies, evaluation benchmark, and ablation studies.

Deep Learning · Evaluation · Model Scaling

0 likes · 34 min read


DataFunSummit

Oct 10, 2024 · Artificial Intelligence

AIGC‑Assisted Marketing Material Generation at Shujia Technology

This article describes Shujia Technology's use of artificial intelligence to generate marketing images and videos, outlining the background, challenges of high-volume content production, detailed solutions for image and video assets—including layout models, diffusion models, and digital human synthesis—and future research directions.

AIGC · Marketing · digital human

0 likes · 12 min read


Volcano Engine Developer Services

Sep 11, 2024 · Artificial Intelligence

How Large Language Models are Transforming Computer Vision: From Image Understanding to Video Generation

This article reviews recent advances in applying large language models to computer vision, covering background challenges, unified multimodal modeling, the PixelLM architecture for pixel‑level understanding and generation, and new approaches to image and video creation such as StoryDiffusion, while outlining future research directions.

PixelLM · StoryDiffusion · computer vision

0 likes · 22 min read


360 Tech Engineering

Aug 29, 2024 · Artificial Intelligence

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

FancyVideo is an open‑source UNet‑based video generation model that supports arbitrary resolutions, aspect ratios, styles, and motion dynamics by introducing a Cross‑frame Textual Guidance Module (CTGM) with temporal injectors, refiners, and boosters, achieving state‑of‑the‑art results on multiple benchmarks and enabling versatile applications such as video extension, backtracking, and frame interpolation.

AI research · UNet · cross-frame guidance

0 likes · 6 min read


Tencent Advertising Technology

Jul 31, 2024 · Artificial Intelligence

MimicMotion: A Controllable Video Generation Framework for High-Quality Human Motion Synthesis

MimicMotion is a controllable video generation framework that produces smooth, high-quality human motion videos by leveraging skeletal action guidance, addressing challenges in video generation such as limited length, weak controllability, and lack of dynamic detail.

AI · Diffusion Models · MimicMotion

0 likes · 13 min read


Qunar Tech Salon

Jul 25, 2024 · Artificial Intelligence

AI-Generated Video Practices for International Hotels

At the WOT2024 conference, Qunar Travel’s CTO Zheng Jimin presented a comprehensive overview of AI-generated video production for international hotels, detailing challenges, AI-driven workflow automation, practical implementation steps, multilingual translation enhancements, and performance results, offering valuable insights for scaling high‑quality hotel video content.

AI · AIGC · Automation

0 likes · 11 min read


Baidu Geek Talk

Jul 24, 2024 · Artificial Intelligence

AI-Driven Fusion of Peking Opera Characters with Ink-Wash Painting Style Using PaddleGAN

Li Yilin’s AI project blends Peking Opera characters with traditional ink‑wash painting by using PaddleHub for style transfer and PaddleGAN’s First‑Order Motion model for facial motion, then adds music and Wav2Lip lip‑sync, producing videos that modernize Chinese heritage and gauge public cultural awareness.

AI · Deep Learning · PaddleGAN

0 likes · 9 min read


Alibaba Cloud Big Data AI Platform

Jul 15, 2024 · Artificial Intelligence

How EasyAnimate v3 Generates High‑Resolution Videos with Diffusion Transformers

EasyAnimate v3, an open‑source video generation system from Alibaba Cloud AI Platform, introduces a Diffusion Transformer‑based architecture, a Hybrid Motion Module, and Slice VAE to enable image‑to‑video, text‑to‑video, and unlimited‑length video creation at up to 720p resolution and 144 frames on modest GPU memory.

AI · EasyAnimate · Generative AI

0 likes · 5 min read


Kuaishou Large Model

Jun 27, 2024 · Artificial Intelligence

How I2V-Adapter Turns Images into Videos with Minimal Training

The article introduces I2V‑Adapter, a lightweight plug‑in for Stable Diffusion‑based video diffusion models that converts a single static image into a coherent video without altering the original T2V architecture, and details its design, frame‑similarity prior, experimental results, and real‑world applications.

AI · Diffusion Models · I2V-Adapter

0 likes · 9 min read


Kuaishou Tech

Jun 26, 2024 · Artificial Intelligence

I2V-Adapter: A Lightweight Image‑to‑Video Adapter for Stable Diffusion Video Diffusion Models

The I2V-Adapter paper introduces a plug‑and‑play lightweight module that enables static images to be converted into dynamic videos using Stable Diffusion‑based text‑to‑video diffusion models without altering the original architecture or pretrained parameters, achieving competitive quality with far less training cost.

AI · Diffusion Models · I2V-Adapter

0 likes · 8 min read


Alibaba Cloud Big Data AI Platform

Jun 19, 2024 · Artificial Intelligence

Deploy and Fine‑Tune EasyAnimate for High‑Res Video Generation on Alibaba Cloud PAI

EasyAnimate is Alibaba Cloud PAI's DiT video generation framework, providing a complete HD video generation solution. This guide walks you through integrating EasyAnimate on PAI: setting up prerequisites, creating DSW instances, installing the model, performing inference via code or WebUI, fine‑tuning LoRA, and using the API.

Alibaba Cloud PAI · DSW · EasyAnimate

0 likes · 14 min read


Alibaba Cloud Big Data AI Platform

Jun 4, 2024 · Artificial Intelligence

EasyAnimate: High‑Resolution Video Generation via Diffusion Transformers

EasyAnimate, an open‑source DiT‑based video generation framework from Alibaba Cloud AI Platform PAI, offers a complete pipeline—including data preprocessing, VAE and DiT training, LoRA fine‑tuning, motion‑module integration, and scalable inference up to 768×768 resolution and 144 frames—leveraging Diffusion Transformers to produce longer, higher‑quality videos.

AI video · LoRA · VAE

0 likes · 14 min read


JD Cloud Developers

May 14, 2024 · Artificial Intelligence

Create Digital Avatars and Face Swaps with EasyPhoto on JD Cloud

Learn how to install and use the EasyPhoto plugin on JD Cloud’s Stable Diffusion WebUI to generate digital avatars, perform multi‑person face swaps, and create AI‑generated videos, with step‑by‑step instructions, screenshots, and tips for optimal settings and coupon usage.

AI Avatar · cloud-computing · easyphoto

0 likes · 6 min read


MoonWebTeam

May 14, 2024 · Frontend Development

Top 9 Front-End & AI Trends Shaping 2024: From Apple’s MM1 to Micro‑Frontends

This monthly roundup highlights nine cutting‑edge topics—from Apple’s multimodal MM1 model and the Signals standardization proposal to Stable Video Diffusion, digital humans, micro‑frontend frameworks, Monkey testing automation, Tango low‑code sandbox, and cross‑platform app frameworks—offering deep insights and practical takeaways for modern developers.

Artificial Intelligence · Micro Frontends · frontend development

0 likes · 17 min read


DataFunTalk

May 3, 2024 · Artificial Intelligence

Advances, Challenges, and Industrial Practices in Text‑to‑Video Generation – From Diffusion Models to Sora

This article reviews the rapid progress of text‑to‑video generation, explains diffusion‑based video synthesis, outlines key technical challenges such as motion modeling, semantic alignment and quality, and presents Tencent’s solutions and real‑world applications, while also discussing future directions and the impact of OpenAI’s Sora model.

AI · Diffusion Models · Sora

0 likes · 23 min read


Architect

Apr 16, 2024 · Artificial Intelligence

Unraveling Sora: How OpenAI Might Build a 60‑Second Video Generator

This article dissects the possible architecture of OpenAI's Sora video model, tracing its visual encoder‑decoder, Spacetime Latent Patch, transformer‑based diffusion backbone, long‑time consistency strategies, and training pipeline, while comparing alternatives such as MAGVIT‑v2, TECO, NaViT, and FDM to reveal why each design choice may have been made.

AI Architecture · Sora · Transformer

0 likes · 51 min read


Alimama Tech

Apr 10, 2024 · Artificial Intelligence

SizeCube: AI‑Driven Arbitrary‑Size Image and Video Outpainting for Advertising

SizeCube leverages Stable Diffusion‑based diffusion models and a sophisticated pipeline—including quality filtering, feature mining, latent‑space UNet denoising, super‑resolution, and temporal 3D‑U‑Net video processing—to automatically outpaint images and videos to any size, boosting Alibaba advertisers’ creative flexibility, click‑through rates, and asset adaptability across diverse ad placements.

AI · Advertising · Image Outpainting

0 likes · 14 min read


Architects' Tech Alliance

Apr 7, 2024 · Artificial Intelligence

How Sora Is Redefining Text‑to‑Video Generation: Inside the New AI Model

Sora, the newly announced text‑to‑video large model, can generate one‑minute high‑fidelity videos from textual prompts or static images, handling complex scenes, expressive characters, and sophisticated camera motions while also supporting video extension and frame‑filling, positioning it at the forefront of multimodal AI research.

AI model · Multimodal · Sora

0 likes · 6 min read


Architect

Mar 28, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Architecture, Workflow, and Core Technologies

This article explains OpenAI's Sora video generation model, detailing its latent diffusion foundation, video compression network, spacetime patch representation, Diffusion Transformer processing, and decoding pipeline, while also reviewing related Stable Diffusion and Transformer concepts that enable high‑quality text‑to‑video synthesis.

AI · Deep Learning · Sora

0 likes · 17 min read


DevOps

Mar 26, 2024 · Artificial Intelligence

OpenAI’s Sora: A One‑Minute Text‑to‑Video Diffusion Transformer Model

OpenAI’s newly released Sora model demonstrates one‑minute text‑to‑video generation using a diffusion‑based transformer architecture that operates on spatiotemporal patches, compresses visual data into latent codes, and builds on a wide range of prior video generation research, while the article also advertises a DevOps certification program.

AI · OpenAI · Sora

0 likes · 8 min read


DaTaobao Tech

Mar 25, 2024 · Artificial Intelligence

Survey of AIGC Video Generation Algorithms

Since 2023, AI‑generated video research has expanded across six algorithmic categories—text‑to‑video, image‑to‑video, editing, style transfer, human motion, and long‑video generation—highlighting works such as CogVideo, Imagen Video, MagicVideo, ControlVideo, DCTNet, NUWA‑XL and OpenAI’s Sora, while analysis shows short‑clip diffusion models excel, editing remains costly, style transfer is efficient, and truly long, temporally consistent videos remain an open challenge.

AI · AIGC · Diffusion Models

0 likes · 13 min read


NewBeeNLP

Mar 22, 2024 · Artificial Intelligence

Unraveling Sora: How OpenAI Might Build Its Text‑to‑Video Engine

This article provides a step‑by‑step technical analysis of OpenAI’s Sora model, examining its possible overall architecture, video encoder‑decoder design, Spacetime Latent Patch mechanism, transformer‑based diffusion process, training strategies, and long‑term consistency techniques, while grounding each speculation in publicly available reports and related research.

AI analysis · Sora · Transformer

0 likes · 50 min read


NewBeeNLP

Mar 20, 2024 · Artificial Intelligence

How Open‑Sora 1.0 Replicates Sora: Architecture, Training Pipeline & Performance Insights

This article provides a comprehensive technical walkthrough of Open‑Sora 1.0, covering its Diffusion‑Transformer architecture, three‑stage training strategy, data‑preprocessing scripts, generation quality, and the Colossal‑AI acceleration that together make Sora‑level video synthesis openly reproducible.

AI video · Open-Sora · diffusion transformer

0 likes · 12 min read


21CTO

Mar 17, 2024 · Artificial Intelligence

What Data Powers OpenAI’s Upcoming Video Model Sora?

OpenAI CTO Mira Murati provided vague answers about Sora’s training data, confirming the use of publicly available, licensed, and Shutterstock content while acknowledging uncertainty about social‑media sources, amid ongoing legal disputes over AI model data usage.

AI training data · OpenAI · Sora

0 likes · 4 min read


Alimama Tech

Mar 14, 2024 · Artificial Intelligence

High-Fidelity Image-to-Video Generation for E-commerce with AtomoVideo and Noise Rectification

Alibaba’s AI team introduced AtomoVideo, a diffusion‑based image‑to‑video generator enhanced by a training‑free Noise Rectification module that adds and corrects controlled noise to eliminate first‑frame errors, enabling merchants to automatically create high‑fidelity 4‑second 720p product videos with strong temporal consistency for e‑commerce advertising.

AI · AIGC · diffusion model

0 likes · 10 min read


DeWu Technology

Mar 11, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Diffusion, Transformers, and Latent Space

OpenAI's Sora video generation model builds on latent diffusion: a video compression encoder-decoder maps videos into a latent space, the latents are tokenized into spatio-temporal patches and processed by a diffusion‑trained Transformer conditioned on DALL·E‑style text annotations, and the result is decoded into high‑resolution videos up to a minute long.

AI · Sora · Transformer

0 likes · 18 min read


Sohu Tech Products

Mar 6, 2024 · Artificial Intelligence

Analysis of OpenAI Sora: Data Engineering, Network Architecture, and World Model Implications

OpenAI’s Sora video model unifies image and video data into latent spacetime patches via a VAE, trains on original resolutions with GPT‑4‑expanded captions, employs a Diffusion Transformer backbone for patch‑wise denoising, and demonstrates 3D‑consistent, long‑term world‑model capabilities that hint at a unified computer‑vision paradigm and steps toward AGI.

AI research · OpenAI Sora · Transformer

0 likes · 9 min read


Architects' Tech Alliance

Feb 25, 2024 · Artificial Intelligence

How Sora Redefined Video Generation: Breakthroughs and Industry Impact

The article provides an in‑depth technical analysis of OpenAI's Sora, highlighting its 60‑second 1080p video generation capability, the novel patches‑vectorization and transformer training pipeline that leverages GPT‑generated prompts for multimodal alignment, and its potential to become a universal video‑generation base model that could reshape the AI industry.

AGI · Multimodal AI · Sora

0 likes · 6 min read


CSS Magic

Feb 20, 2024 · Artificial Intelligence

OpenAI’s Sora Video Model Is Hyped—But Here Are the Flaws OpenAI Itself Acknowledges

The article walks through OpenAI’s own admission of Sora’s shortcomings (unrealistic physics, misplaced spatial details, and erratic object behavior) by showcasing concrete demo failures, additional observations, and technical notes about its diffusion‑based transformer architecture and metadata embedding.

AI limitations · OpenAI · Sora

0 likes · 7 min read


Rare Earth Juejin Tech Community

Feb 19, 2024 · Artificial Intelligence

Technical Review of OpenAI's Sora Video Generation Model

This article reviews OpenAI's Sora video generation model, summarizing its technical report, key innovations such as patch-based visual tokens, compression networks, scaling transformers, language understanding, and discussing its capabilities, highlights, and current limitations in AI video synthesis.

AI · Diffusion Models · OpenAI

0 likes · 9 min read


Architects' Tech Alliance

Feb 18, 2024 · Artificial Intelligence

How OpenAI’s Sora Redefines Video Generation with 3‑D Consistency and World Simulation

OpenAI’s Sora model introduces a diffusion‑transformer approach that generates high‑fidelity, 60‑second videos with consistent 3‑D camera motion, long‑term object persistence, and the ability to simulate interactive digital worlds, backed by a detailed technical report and research paper.

Artificial Intelligence · OpenAI · Sora

0 likes · 9 min read


21CTO

Feb 17, 2024 · Artificial Intelligence

How OpenAI’s Sora Is Pushing Video Generation to New Frontiers

OpenAI’s Sora model demonstrates large‑scale text‑conditional video generation using a diffusion transformer that operates on spatiotemporal patches, supporting variable durations, resolutions, and aspect ratios while showcasing emergent simulation abilities, flexible sampling, and multimodal editing capabilities, though it still has notable limitations.

AI research · Diffusion Models · Multimodal

0 likes · 19 min read


NewBeeNLP

Feb 17, 2024 · Artificial Intelligence

How Sora Highlights the Next Leap Toward AGI and Shifts AI Competition

The article analyzes OpenAI's Sora video model, arguing that its integration of large‑language‑model reasoning with diffusion techniques marks a major step toward true world understanding, reshapes creative workflows, widens the AI talent gap, and accelerates the path to artificial general intelligence.

AGI · AI trends · Large Language Models

0 likes · 7 min read


Architect

Feb 16, 2024 · Artificial Intelligence

Can OpenAI’s Sora Redefine Text‑to‑Video Generation? An In‑Depth Technical Review

OpenAI’s newly unveiled Sora model transforms short text prompts into up‑to‑one‑minute high‑definition videos, showcasing advanced diffusion‑Transformer architecture, improved occlusion handling, and detailed visual fidelity, while the article examines its technical breakthroughs, compares it to earlier models, and discusses emerging safety and misuse concerns.

AI safety · Diffusion Models · Generative AI

0 likes · 12 min read


AntTech

Dec 20, 2022 · Artificial Intelligence

Towards Smooth Video Composition: A New Benchmark for GAN‑Based Video Generation

Researchers from multiple institutions propose a GAN‑based video generation framework that explicitly models short‑, medium‑, and long‑range temporal relations, introduces B‑spline motion embeddings and temporal shift modules, and demonstrates substantial quality improvements across several video datasets.

B-spline · GAN · StyleGAN-V

0 likes · 7 min read


Alimama Tech

Oct 26, 2022 · Artificial Intelligence

GPU Utilization Analysis and Optimization for Alibaba's Intelligent Creative Video Service

The paper analyzes why Alimama’s intelligent creative video service suffers from low GPU utilization (Python GIL blocking, lack of kernel fusion, and serialized CUDA streams) and details service‑level changes (separate CPU/GPU processes, shared‑memory queues, priority scheduling) and operator‑level kernel‑fusion techniques (channels‑last layouts, custom pooling, TensorRT conversion) that raise utilization from about 30% to nearly 100% and boost throughput by 75%.

Deep Learning · GPU Optimization · Python

0 likes · 20 min read


MaGe Linux Operations

Jul 3, 2022 · Backend Development

How to Automate 10,000 Video‑Channel Posts with Python and OCR for Massive Traffic

This guide shows how to use Python to scrape high‑quality chat screenshots, apply OCR, generate silent chat videos, batch‑download matching audio from short‑video platforms, and combine them into thousands of unique WeChat Video Channel clips, leveraging volume to outsmart recommendation algorithms and boost traffic.

Automation · OCR · Python

0 likes · 11 min read


MaGe Linux Operations

Feb 16, 2022 · Artificial Intelligence

Recreate Wuhan University’s Cherry Blossom Bloom with Python and OpenCV

This tutorial shows how to use Python, OpenCV, and Pillow to capture, process, and animate Wuhan University’s cherry blossom scenes, turning pixel data into a time‑lapse video with custom text overlays and frame‑by‑frame control.

Image processing · Python · computer vision

0 likes · 5 min read


Python Programming Learning Circle

Jan 17, 2022 · Fundamentals

Creating a Cherry Blossom Animation with Python, OpenCV, and Pillow

This article demonstrates how to use Python, OpenCV, and Pillow to capture, annotate, and assemble cherry‑blossom images into a video, explaining pixel color representation, frame saving, canvas creation, text rendering, and video encoding steps with complete code examples.

image-processing · tutorial · video generation

0 likes · 5 min read


Tencent Advertising Technology

Nov 2, 2021 · Artificial Intelligence

Tencent Advertising Multimedia AI Platform: Intelligent Creation, Fine‑grained Understanding, Similar‑Ad Retrieval, and Smart Review

This article presents Tencent's advertising multimedia AI platform, detailing its intelligent video creation engine, fine‑grained ad content understanding, large‑scale similar‑ad retrieval system, and automated ad review pipeline, while also introducing the team and current recruitment opportunities.

ad understanding · advertising AI · content moderation

0 likes · 22 min read


Python Crawling & Data Mining

Jul 9, 2021 · Fundamentals

Build a Free Taobao Main Image Video Generator with Python, Tkinter & FFmpeg

This guide walks you through building a free Python Tkinter desktop application that merges multiple PNG or JPG images with background audio into a video using FFmpeg, covering environment setup, GUI design, file handling, log capture, video generation, and preview steps.

Automation · FFmpeg · GUI

0 likes · 11 min read


Alibaba Terminal Technology

Jun 2, 2021 · Operations

Turn Your Git History into a Stunning Video with Gource and Avconv

This guide shows how to install Gource and Avconv, configure Chinese font support, and use a series of command‑line options to transform any Git repository’s commit history into a high‑resolution video, optionally adding background music for a polished visual celebration of your project.

avconv · command-line · git visualization

0 likes · 4 min read


Yanxuan Tech Team

Apr 19, 2021 · Artificial Intelligence

How AI Powers Personalized Ad Creatives: From Templates to Automated Video

This article explains how algorithmic "smart creative" technology automates personalized advertising by using data‑driven templates, image and video synthesis, and aesthetic scoring to generate high‑click‑through ad content while reducing manual production costs.

AI-generated creatives · image composition · personalized advertising

0 likes · 7 min read


DataFunTalk

Nov 22, 2020 · Artificial Intelligence

Short Video Analysis in Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI-driven short video analysis workflow, covering industry trends, multi‑label video classification, intelligent cover selection, and video generation techniques, while discussing challenges, model building, label expansion, continuous data iteration, and future outlook for video AI in local services.

AI · Meituan · computer vision

0 likes · 16 min read


DataFunSummit

Nov 5, 2020 · Artificial Intelligence

Short Video Analysis for Local Life Scenarios: Techniques and Practices at Meituan

This article presents Meituan's AI‑driven short‑video analysis pipeline for local‑life scenarios, covering industry trends, multi‑label classification, intelligent cover selection, and video generation, and discusses model construction, label‑system expansion, continuous data iteration, and practical applications in restaurant and hotel domains.

AI · Meituan · intelligent cover

0 likes · 16 min read
