
Cold-Start Short Video Potential Prediction Using Siamese Networks

The paper proposes PredictionNet, a Siamese network that combines EfficientNet-B3 image features and VGGish audio features with user metrics to predict a HotValue score for newly uploaded short videos. It is trained with a margin loss over video pairs selected by view count, and its scores drive tiered cold-start exposure that improves overall platform efficiency.

Tencent Cloud Developer

On traffic-driven platforms, high-quality short videos are crucial for user engagement. This article describes how the popularity potential of newly uploaded short videos is predicted on the Weishi platform to improve cold-start performance.

Goal: allocate an initial amount of exposure to each new video, quickly locate its target users via UserCF or Lookalike, and increase the cost-effectiveness of cold-start traffic.

Model: a Siamese network with a late-merge architecture (PredictionNet) is employed. Video features are extracted by EfficientNet-B3 (image) and VGGish (audio), aggregated by NeXtVLAD, and then combined with user features (total fans, active fans, new fans in 7 days, potential fans). The prediction sub-network outputs a HotValuePred score.
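The late-merge idea can be illustrated with a minimal sketch. It is not the article's implementation: the linear scoring head, the `log1p` scaling of fan counts, the embedding size, and the names `make_prediction_net` and `hot_value_pred` are all assumptions made for illustration, and the video embedding stands in for the NeXtVLAD output.

```python
import math
import random


def make_prediction_net(video_dim, user_dim, seed=0):
    """Build a scoring function with a single shared set of weights.

    Hypothetical stand-in for the prediction sub-network: a linear head
    over the concatenated video embedding and user features.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(video_dim + user_dim)]
    bias = 0.0

    def hot_value_pred(video_emb, user_feats):
        # Late merge: video and user features are joined only at this point.
        # log1p compresses heavy-tailed fan counts (an assumption, not from the article).
        merged = list(video_emb) + [math.log1p(x) for x in user_feats]
        return sum(w * x for w, x in zip(weights, merged)) + bias

    return hot_value_pred


net = make_prediction_net(video_dim=4, user_dim=4)

# Each item: (video embedding, [total fans, active fans, new fans in 7 days, potential fans])
video_a = ([0.2, 0.1, 0.7, 0.3], [12000, 3400, 150, 800])
video_b = ([0.1, 0.4, 0.2, 0.9], [300, 40, 5, 20])

# Both branches of the Siamese pair call the same function, so they share weights.
score_a = net(*video_a)
score_b = net(*video_b)
```

Because both videos in a pair pass through the same function, any update to the weights affects both branches equally, which is the defining property of a Siamese setup.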

Loss: a margin loss is used. Three strategies for defining positive/negative pairs were tested: (1) a threshold on video views (VV), (2) pairs whose VV ratio exceeds 10, and (3) pairs whose VV ratio exceeds 10, with the VV gap also incorporated into the loss. The third variant achieved the best results.
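A rough sketch of the third strategy follows. The article does not say how the VV gap enters the loss, so the form used here (a margin that grows with the base-10 logarithm of the VV ratio) is an assumption, as are the function names and the `base_margin` parameter.

```python
import math


def select_pairs(videos, min_ratio=10.0):
    """Return (hot, cold) pairs whose VV ratio exceeds min_ratio."""
    pairs = []
    for i, a in enumerate(videos):
        for b in videos[i + 1:]:
            hot, cold = (a, b) if a["vv"] >= b["vv"] else (b, a)
            # max(..., 1) avoids division by zero for videos with no views.
            if hot["vv"] / max(cold["vv"], 1) > min_ratio:
                pairs.append((hot, cold))
    return pairs


def gap_aware_margin_loss(score_hot, score_cold, vv_hot, vv_cold, base_margin=1.0):
    """Hinge-style margin loss whose margin scales with the VV gap.

    Assumed form: margin = base_margin * log10(vv_hot / vv_cold), so a pair
    with a larger VV gap must be separated by a larger score difference.
    """
    margin = base_margin * math.log10(vv_hot / max(vv_cold, 1))
    return max(0.0, margin - (score_hot - score_cold))


videos = [
    {"id": "a", "vv": 50000},
    {"id": "b", "vv": 8000},
    {"id": "c", "vv": 200},
]
pairs = select_pairs(videos)  # (a, c) and (b, c); a/b has ratio 6.25 and is skipped

# A pair with a 100x VV gap needs a score difference of at least 2.0 here.
loss_ok = gap_aware_margin_loss(2.0, 0.0, vv_hot=1000, vv_cold=10)
loss_bad = gap_aware_margin_loss(0.5, 0.0, vv_hot=1000, vv_cold=10)
```

With a fixed margin (strategy 2), a pair with a 10x gap and a pair with a 1000x gap would be penalized identically; tying the margin to the gap is one way to make the loss sensitive to how different the two videos actually are.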

Experiments: Models trained with the three loss variants were evaluated on a test set. The third loss yielded the highest proportion of videos whose HotValuePred falls in the top 20% of VV. Based on the prediction, videos are divided into three tiers: tier‑0 (bottom 40%), tier‑1 (middle 40%), and tier‑2 (top 20%).
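The 40/40/20 split can be sketched as a rank-based assignment. This is only an illustration: the function name `assign_tiers` is hypothetical, and ranking within a single batch is an assumption, since a production system would more likely use fixed score thresholds derived from a reference distribution.

```python
def assign_tiers(scores):
    """Map each score to tier 0 (bottom 40%), 1 (middle 40%), or 2 (top 20%) by rank."""
    n = len(scores)
    # Indices of the scores, ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    tiers = [0] * n
    for rank, idx in enumerate(order):
        # Integer arithmetic avoids floating-point edge cases at the boundaries:
        # rank / n >= 0.8  <=>  5 * rank >= 4 * n, and rank / n >= 0.4  <=>  5 * rank >= 2 * n.
        if 5 * rank >= 4 * n:
            tiers[idx] = 2
        elif 5 * rank >= 2 * n:
            tiers[idx] = 1
    return tiers


hot_value_preds = [5, 1, 9, 3, 7, 2, 8, 4, 10, 6]
tiers = assign_tiers(hot_value_preds)
```

For these ten scores, the four lowest land in tier-0, the next four in tier-1, and the two highest in tier-2.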

Applications:

Quality content discovery – tier‑2 videos receive higher cold‑start exposure, while tier‑0/1 receive reduced exposure, improving overall VV efficiency.

Category balance – tier thresholds are adjusted per content category using a global threshold T and per‑category thresholds t_i with a balancing parameter λ, preventing dominant categories from monopolizing exposure.
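The article does not give the formula that relates T, t_i, and λ. One plausible reading, shown here purely as an assumption, is a convex blend of the global and per-category thresholds; the function name, category names, and numeric values below are hypothetical.

```python
def balanced_threshold(global_t, category_t, lam):
    """Blend the global threshold T with a category threshold t_i.

    Assumed form (not from the article): lam * T + (1 - lam) * t_i,
    so lam = 1 uses T only and lam = 0 uses t_i only.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * global_t + (1.0 - lam) * category_t


GLOBAL_T = 0.8                               # hypothetical global tier-2 cutoff
category_t = {"dance": 0.9, "pets": 0.6}     # hypothetical per-category cutoffs

thresholds = {
    name: balanced_threshold(GLOBAL_T, t_i, lam=0.5)
    for name, t_i in category_t.items()
}


def is_tier2(score, category):
    # A video reaches tier-2 if its score clears its category's blended cutoff.
    return score >= thresholds[category]
```

Under this reading, a category whose videos score high across the board gets a stricter cutoff than the global one, while a low-scoring category gets a more lenient cutoff, which limits how much exposure any single category can absorb.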

Assisted manual review – during high‑traffic periods (e.g., Chinese New Year), tier‑2 videos are prioritized for faster human review, reducing system load.

Future work includes enriching input features with textual and historical user activity data, and exploring better cold‑start strategies and support for high‑quality accounts.


Tags: machine learning, short video, cold start, video recommendation, popularity prediction, Siamese network