Tag

image-text fusion

1 view collected around this technical thread.

DataFunSummit
Sep 16, 2024 · Artificial Intelligence

Multimodal Content Understanding and Cold-Start Practices in NetEase Cloud Music Community Recommendation System

This article details how NetEase Cloud Music uses multimodal content understanding (audio models such as MusicCLIP and Audio MAE, plus image-text fusion via FLAVA) to improve recommendation performance for new content and new users. It covers the system architecture, cold-start solutions, and future AI-driven directions.

AI models · Cold Start · audio representation
15 min read