Tagged articles
44 articles
AntTech
Apr 14, 2026 · Artificial Intelligence

AT-ADD Challenge: Pushing All‑Type Audio Deepfake Detection Forward

The AT‑ADD competition, organized for ACM MM 2026, invites researchers to develop robust audio deepfake detection models across speech, environmental sounds, singing, and music, providing diverse real‑world datasets, baseline code, clear evaluation metrics, and a two‑stage submission process to advance AI security.

AT-ADD · Audio Deepfake · Challenge
0 likes · 10 min read
OPPO Kernel Craftsman
Dec 13, 2024 · Fundamentals

Overview of H.266/VVC Video Coding Standard and Its Key Technologies

H.266/VVC, the next‑generation video coding standard finalized in 2020, delivers roughly 50% bitrate savings over H.265/HEVC with modestly higher decoding complexity, introduces advanced intra‑ and inter‑prediction, transform, quantization, entropy and loop‑filtering tools, and faces patent‑pool and adoption challenges before widespread smartphone integration around 2026.

H.266 · Multimedia · VVC
0 likes · 20 min read
Kuaishou Tech
Jul 31, 2024 · Artificial Intelligence

Kuaishou Showcases AI‑Driven Multimedia Innovations at China Multimedia 2024

At the China Multimedia 2024 conference in Yinchuan, Kuaishou presented its latest AI‑driven large‑model technologies—including text‑to‑image, text‑to‑video, and audio models—alongside advances in intelligent video coding, a new research‑fund initiative, and recent industry awards.

AI · Kuaishou · Multimedia
0 likes · 5 min read
Bilibili Tech
Jun 11, 2024 · Artificial Intelligence

Intelligent Restoration System for Legacy Video Quality

Bilibili’s Multimedia Lab created an end‑to‑end intelligent restoration system that assesses video resolution, frame‑rate and quality, then automatically selects and applies image‑level enhancement, frame‑rate up‑sampling, background and face restoration, and optical‑flow interpolation to transform blurry, jittery, artifact‑laden legacy videos into clear, smooth, high‑quality streams, now deployed for on‑demand content and slated for live‑stream expansion.

AI · Multimedia · frame interpolation
0 likes · 12 min read
360 Smart Cloud
Apr 3, 2024 · Backend Development

Understanding FFmpeg Hardware Acceleration Architecture and Implementation

FFmpeg provides a comprehensive, cross‑platform hardware acceleration framework that abstracts diverse GPU and dedicated video codec interfaces, defines HWContext types, device and frame contexts, and various codec configuration methods, enabling efficient video encoding, decoding, and filtering while addressing performance, compatibility, and pipeline complexity challenges.

GPU · Hardware acceleration · Multimedia
0 likes · 10 min read
DaTaobao Tech
Jan 31, 2024 · Artificial Intelligence

Highlights of Recent AI Research Papers from Top Conferences (2023)

The article curates standout AI papers from 2023 CCF‑A conferences—including CVPR, ICLR, ACM MM, and INFORMS—showcasing advances such as Swin‑Transformer video quality assessment, cross‑modal e‑commerce product search, transformer‑based vehicle routing heuristics, diffusion‑driven dance generation, and reinforcement‑learning inventory replenishment.

AI · Computer Vision · Multimedia
0 likes · 23 min read
HelloTech
Jan 25, 2024 · Backend Development

Design and Implementation of a Custom Multimedia Framework Using FFmpeg

The Haro Street Cat mobile team created a custom multimedia framework that wraps FFmpeg 4.2.2 in a C++ core library with Android/iOS compatibility layers and Java wrappers for transcoding, live streaming, and composition, delivering hardware‑accelerated decoding, flexible filter pipelines, and reliable transcoding that boosted coverage to over 99%, cut storage by more than 30%, accelerated video start‑up, and improved streaming and watermarking performance.

C++ · Filter Graph · Multimedia
0 likes · 27 min read
Test Development Learning Exchange
Aug 20, 2023 · Fundamentals

Python Multimedia Service Modules: audioop, aifc, sunau, wave, chunk, colorsys, imghdr, sndhdr, ossaudiodev

This article introduces Python's multimedia service modules, explaining how to process raw audio data, read and write various audio file formats, detect image and sound file types, convert color systems, and access OSS‑compatible audio devices, all illustrated with practical code examples.

Multimedia · audio · file-handling
0 likes · 7 min read
DataFunTalk
May 13, 2023 · Artificial Intelligence

Multimedia Content Understanding at Weibo: Video Summarization, Quality Assessment, OCR, Embedding, and CV‑CUDA Optimization

This article presents Weibo's comprehensive multimedia content understanding pipeline, covering video summarization techniques, quality assessment models, OCR advancements, video embedding strategies, and the performance benefits of CV‑CUDA acceleration, while highlighting real‑world applications and engineering trade‑offs.

CV-CUDA · Computer Vision · Deep Learning
0 likes · 32 min read
DaTaobao Tech
Apr 26, 2023 · Artificial Intelligence

MD-VQA: Multi-Dimensional No-Reference Video Quality Assessment for CVPR NTIRE 2023

Alibaba’s Taobao VQA team won the CVPR NTIRE 2023 Video Enhancement Challenge by introducing MD‑VQA, a multi‑dimensional no‑reference video quality model that combines a Swin‑Transformer‑V2 spatial backbone, a pre‑trained SlowFast motion encoder, and a convolutional fusion module, pre‑trained on LSVQ, fine‑tuned on NTIRE data, and augmented spatio‑temporally, achieving state‑of‑the‑art SROCC and PLCC scores and now powering quality monitoring on Alibaba’s live‑streaming and short‑video services.

Multimedia · No-Reference · Swin Transformer
0 likes · 15 min read
OPPO Kernel Craftsman
Apr 14, 2023 · Fundamentals

Pipeline Domain Design in Multimedia Frameworks: Concepts, Comparative Analysis, and Implementation

The article defines pipeline domain design concepts, compares major multimedia frameworks such as FFmpeg, GStreamer, MediaPipe and AVPipeline, and demonstrates a configurable, extensible node‑based architecture that enables fast plugin integration and adaptable audio‑video pipelines across diverse business scenarios and platforms.

AI · Framework · Multimedia
0 likes · 39 min read
IT Services Circle
Mar 3, 2023 · Backend Development

FFmpeg 6.0 “Von Neumann” Released with New Encoders, Decoders, Filters, and ABI Versioning

FFmpeg 6.0 “Von Neumann” has been officially released, introducing numerous new encoders, decoders, and filters, adding ABI versioning to major releases, deprecating old APIs, and enhancing CLI performance with threading, statistics options, and file‑based filter options, while outlining upcoming features for version 6.1.

CLI · Decoders · Encoders
0 likes · 6 min read
Programmer DD
Mar 3, 2023 · Backend Development

FFmpeg 6.0 Highlights: New Codecs, Filters, and Performance Boosts

FFmpeg 6.0 "Von Neumann" introduces a host of new codecs, decoders, filters, CLI enhancements, ABI versioning, and a more frequent release cadence, offering developers expanded multimedia processing capabilities and improved performance across platforms.

Backend Development · Multimedia · Software Release
0 likes · 6 min read
DataFunSummit
Feb 10, 2023 · Information Security

Digital Watermarking Technology: Concepts, Models, Algorithms, and Applications

The article provides a comprehensive overview of digital watermarking, covering its fundamental concepts, security features, embedding/detection/extraction processes, major algorithm families, practical applications such as copyright protection and anti‑counterfeiting, and future research directions in multimedia information security.

Multimedia · Signal Processing · cryptography
0 likes · 20 min read
Tencent Advertising Technology
Dec 15, 2022 · Artificial Intelligence

AI‑Driven Element Selection for Advertising Video Creative Generation

This article explains how Tencent's advertising system leverages multimedia AI techniques—including multi‑armed bandit, pairwise learning, and DeepFM models—to automatically select optimal templates, music, and stickers for image and video assets, thereby reducing production cost, improving creative quality, and boosting ad performance.

MAB · Multimedia · advertising AI
0 likes · 17 min read
Rare Earth Juejin Tech Community
Nov 30, 2022 · Frontend Development

Building an Interactive 3D Phone Showcase with Three.js Multimedia Elements (Text, Image, Audio, Video)

This article explains how to use Three.js to create a realistic 3D phone product page by loading and applying multimedia assets such as custom fonts, textures, audio sources, and video textures, and demonstrates interactive features like ray‑casting for material switching and first‑person controls.

3D · JavaScript · Multimedia
0 likes · 19 min read
NetEase Smart Enterprise Tech+
Nov 22, 2022 · Fundamentals

Unlocking Multimedia Power: OpenMAX Architecture for Cross‑Platform Plug‑in Algorithms

This article explains the OpenMAX (OMX) multimedia acceleration standard, its three‑layer architecture (Application, Integration, Development), and how it is used in frameworks such as GStreamer, Stagefright, and the AVProcessEngine plugin library to modularize audio, video, and image algorithms across Android, iOS, macOS, and Windows platforms.

AVProcessEngine · Multimedia · OpenMAX
0 likes · 16 min read
Kuaishou Audio & Video Technology
Sep 29, 2022 · Artificial Intelligence

How DeViT Revolutionizes Video Inpainting with Deformed Vision Transformers

The article introduces DeViT, a novel Deformed Vision Transformer framework for video inpainting that leverages a deformable patch homography estimator, mask‑pruned attention, and spatio‑temporal weight adaptation, achieving state‑of‑the‑art results on benchmark datasets and highlighting its potential for advanced video editing tools.

DeViT · Multimedia · Transformer
0 likes · 10 min read
Kuaishou Audio & Video Technology
Aug 16, 2022 · Artificial Intelligence

Deep Learning Turns SDR Video into HDR: ACM Multimedia 2022 Breakthrough

Researchers from Kuaishou and Xidian University presented a novel deep‑learning‑based SDR‑to‑HDR video conversion method at ACM Multimedia 2022, introducing hierarchical dynamic context feature mapping, a layered dynamic feature modulation module, and a patch‑discriminator GAN that together achieve superior objective and subjective HDR quality.

HDR video · Multimedia · video conversion
0 likes · 6 min read
Youku Technology
Aug 10, 2022 · Artificial Intelligence

Youku Moku Lab's QoEVMA Workshop at ACM MM 2022: Advancing Quality of Experience in Visual Multimedia Applications

At ACM MM 2022, Youku Moku Lab chaired the pioneering QoEVMA workshop, receiving 14 international submissions and accepting eight papers that advanced Quality of Experience assessment for diverse visual multimedia formats, while showcasing the lab’s free‑viewpoint video technology deployed in major events such as the 2022 Beijing Winter Olympics.

ACM MM · Free‑viewpoint video · KPI
0 likes · 5 min read
Baidu Geek Talk
Aug 2, 2022 · Fundamentals

Understanding ffplay: Playback Workflow and Core Components

The article walks through ffplay’s end‑to‑end playback pipeline—starting with protocol and container demuxing, initializing FFmpeg and SDL, spawning read and decoder threads, handling video/audio decoding, synchronizing streams, and finally rendering frames—offering design insights for constructing a basic media player.

Audio-Video Sync · Multimedia · SDL
0 likes · 18 min read
Tencent Advertising Technology
Nov 2, 2021 · Artificial Intelligence

Tencent Advertising Multimedia AI Platform: Intelligent Creation, Fine‑grained Understanding, Similar‑Ad Retrieval, and Smart Review

This article presents Tencent's advertising multimedia AI platform, detailing its intelligent video creation engine, fine‑grained ad content understanding, large‑scale similar‑ad retrieval system, and automated ad review pipeline, while also introducing the team and current recruitment opportunities.

Multimedia · Video Generation · ad understanding
0 likes · 22 min read
Alibaba Terminal Technology
Jun 25, 2021 · Frontend Development

Mastering Web Multimedia Front‑End: A Complete Beginner’s Guide

This comprehensive guide introduces multimedia front‑end development, explains W3C media standards and HTML elements, explores media APIs, outlines playback scenarios and solutions, and details both consumer‑facing live video systems and production‑side tools such as streaming and video‑editing, while sharing Alibaba’s roadmap for the field.

Multimedia · Streaming · media APIs
0 likes · 25 min read
Programmer DD
Dec 30, 2020 · Fundamentals

FFmpeg’s 20‑Year Legacy: Powering Video Players and the ‘Shame Pillar’

Celebrating two decades of the open‑source FFmpeg project, this article explains how its multimedia decoding libraries underpin popular video players and major platforms, recounts the 2011 split that birthed Libav, discusses licensing obligations under LGPL/GPL, and reveals the infamous “shame pillar” list of software that ignored those rules.

Multimedia · ffmpeg · open source
0 likes · 6 min read
Taobao Frontend Technology
Sep 4, 2020 · Frontend Development

How a Frontend Engineer Turned Career Confusion into Multimedia Innovation

In this candid talk, a former Alibaba multimedia front‑end engineer shares his career journey—from early patents and B2B animation work to leading live‑streaming projects, detailing the challenges, technical breakthroughs, and personal reflections that helped him overcome professional uncertainty.

Career Development · Multimedia · Software Architecture
0 likes · 13 min read
Taobao Frontend Technology
Jul 2, 2020 · Frontend Development

How Same‑Layer Rendering Powers Alibaba’s H5 Video Player for 618 Live Events

This article explains how Alibaba’s multimedia front‑end team built a same‑layer rendering video player for H5, detailing the architecture, performance advantages, degradation strategies, issues encountered during the 618 promotion, and future plans to improve playback experience across live and on‑demand scenarios.

Alibaba · H5 · Multimedia
0 likes · 11 min read
Tencent Cloud Developer
Jun 23, 2020 · Cloud Computing

AV1 Video Codec: Development History, Technical Architecture, and Cloud Encoding Applications

AV1, the royalty‑free video codec created by the Alliance for Open Media and supported by Tencent’s Multimedia Lab, combines advanced block partitioning, intra‑ and inter‑prediction, versatile transforms, sophisticated entropy coding, and loop‑filtering to cut bitrates by roughly 20% for streaming, cloud transcoding, 8K, VR, and medical imaging, while paving the way toward future AV2 and VVC standards.

AOMedia · AV1 · Multimedia
0 likes · 17 min read
Taobao Frontend Technology
Jun 9, 2020 · Frontend Development

Unlocking Taobao Live: Front‑End Multimedia Tech Behind the Hype

This article explores the front‑end multimedia technologies that power Taobao Live, covering video and audio fundamentals, container and codec formats, streaming protocols, player architecture, web media APIs, and popular open‑source frameworks for building robust live‑streaming experiences.

Multimedia · audio · live streaming
0 likes · 16 min read
iQIYI Technical Product Team
May 8, 2020 · Backend Development

iQIYI Deploys AV1 Video Codec and Introduces QAV1 Encoder for Bandwidth Savings

iQIYI has become the first Chinese video platform to deploy the open‑source AV1 codec on its PC web and Android apps, cutting video file size by about 20%. Its in‑house QAV1 encoder further reduces bitrate by more than 40% and runs roughly five times faster than competing encoders, enabling smoother 4K/8K streaming while saving bandwidth.

AV1 · Multimedia · QAV1
0 likes · 5 min read
Qunar Tech Salon
Dec 17, 2019 · Operations

Evolution of Call Center Technology: From Hotlines to Multimedia

This article traces the evolution of call center technology across four generations—from early hotlines using PSTN and PBX, through IVR and CTI innovations, to modern multimedia channels—highlighting key concepts, features, and their impact on operational efficiency and customer service.

CTI · IVR · Multimedia
0 likes · 10 min read
Hulu Beijing
Nov 15, 2019 · Artificial Intelligence

How Content-Based Video Relevance Prediction Advances Personalized Streaming

The CBVRP (Content-Based Video Relevance Prediction) challenge, co‑hosted by Hulu and ACM MM 2019, showcased the shift from user‑based collaborative filtering to content‑driven recommendation, highlighted winning teams and their papers, and underscored the ongoing research importance of cold‑start video recommendation for streaming platforms.

Multimedia · Streaming · cold start
0 likes · 15 min read
Tencent Cloud Developer
Oct 22, 2019 · Game Development

Key Technologies of Immersive Media in the 5G Era: 3D, Point Cloud, and Compression

With 5G accelerating video traffic, immersive media relies on advanced 3D representation, efficient point‑cloud and video‑based compression, standardized containers, and selective streaming to enable realistic VR/AR experiences across gaming, autonomous driving, and broadcasting, while Tencent’s lab drives standards and future XR applications.

3D · 5G · Multimedia
0 likes · 12 min read
ITPUB
Sep 30, 2019 · Fundamentals

Top 10 Essential Free Linux Applications You Should Install Today

Discover a curated list of ten indispensable, free, open‑source Linux applications—from package managers and media players to office suites and security tools—each with key features, usage tips, and direct download links to enhance your Linux desktop experience.

Linux · Multimedia · software
0 likes · 11 min read
58 Tech
Apr 16, 2019 · Mobile Development

Design and Architecture of the 58 Short Video SDK for Mobile Applications

The article outlines the technical challenges of short‑video apps and presents the modular, extensible architecture of the 58 Short Video SDK, detailing its layered design, design principles, advantages, and future evolution to support advanced features such as AR, hardware decoding, and H.265 encoding.

Multimedia · Video processing · short video
0 likes · 12 min read
iQIYI Technical Product Team
Sep 28, 2018 · Artificial Intelligence

CCF Multimedia Committee Visit to iQIYI – AI, Knowledge Graph, and Multimedia Technology Presentations

During the CCF Multimedia Committee’s visit to iQIYI, senior researchers and professors presented cutting‑edge AI, knowledge‑graph‑driven content distribution, image‑text sentiment matching, and intelligent multimedia transmission technologies, while interactive tours of studios and labs deepened academia‑industry collaboration and highlighted iQIYI’s innovative multimedia ecosystem.

AI · CCF · Knowledge Graph
0 likes · 8 min read
360 Quality & Efficiency
Apr 25, 2018 · Fundamentals

Introduction to FFmpeg: Libraries, Tools, and Basic Command Usage

This article introduces FFmpeg, outlines its eight core libraries, describes the main command‑line tools (ffmpeg, ffplay, ffprobe), and provides a step‑by‑step example of converting an MP4 video to HEVC with MP3 audio on Windows, including useful help commands and additional features.

Multimedia · Video processing · ffmpeg
0 likes · 5 min read
Alibaba Cloud Developer
Oct 31, 2017 · Artificial Intelligence

How Alibaba’s ‘City Brain’ Powered Cutting‑Edge AI Research at ACM MM 2017

Three Alibaba iDST papers on the AI‑driven City Brain were selected for ACM Multimedia 2017, showcasing novel video anomaly detection, deep siamese re‑identification, and stylized image generation methods that improve urban traffic management and demonstrate the broader potential of Alibaba’s ET Brain platform.

Multimedia · Video Anomaly Detection
0 likes · 8 min read