Designing a Content Hotness Scoring Algorithm for Community Platforms

This article describes how a community’s big‑data team designed a content hotness algorithm by defining time, interaction, content, and user dimensions, assigning business meanings, applying weighted formulas and a Newton‑cooling decay function, and integrating user interest vectors to compute dynamic scores.

Top Architect

During a community redesign, the big-data team designed a basic content hotness scoring algorithm, drawing on business requirements, research, discussion, and trial and error.

Reference: A few foreign companies (Hacker News, Reddit, Stack Overflow, StumbleUpon) have publicly described their hotness algorithms; links are provided for further reading.

Data dimensions considered include:

Time dimensions: post_time, last_reply_time, last_op_time.

Interaction dimensions: view_num, reply_num, favor_num, like_num, reply_like_num, share_num.

Content dimensions: content_length, reply_avg_length, picture_num.

User dimensions: user interest, activity, reputation, etc.
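
As a rough illustration, the per-post dimensions listed above could be gathered into a single record. This is only a sketch: the class name PostStats and all field types are assumptions (timestamps as Unix seconds, counts as integers), and the user dimensions are left out because the article does not enumerate them fully.

```python
from dataclasses import dataclass


@dataclass
class PostStats:
    """Per-post inputs, named after the dimensions in the article.

    Types are assumptions: timestamps are Unix seconds, counts are integers.
    """
    # Time dimensions
    post_time: float
    last_reply_time: float
    last_op_time: float
    # Interaction dimensions
    view_num: int = 0
    reply_num: int = 0
    favor_num: int = 0
    like_num: int = 0
    reply_like_num: int = 0
    share_num: int = 0
    # Content dimensions
    content_length: int = 0
    reply_avg_length: float = 0.0
    picture_num: int = 0


# Hypothetical post: created at some instant, last replied to one hour later.
post = PostStats(post_time=1_700_000_000, last_reply_time=1_700_003_600,
                 last_op_time=1_700_003_600, view_num=120, reply_num=4)
```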

Each dimension carries a business meaning: time controls decay, interaction metrics are weighted to reflect their impact, content metrics serve as auxiliary quality signals, and user metrics enable personalized recommendation.

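The algorithm uses last_reply_time as the time dimension and models decay with Newton's law of cooling. In the formula below, t is the current time and t_last is last_reply_time, both in seconds, so dividing by 86400 expresses the elapsed time in days; γ is the cooling coefficient: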

H(t) = H_a * exp[-γ * (t - t_last) / 86400]
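
A minimal sketch of the decay term in Python may help make the role of γ concrete. The function name and the value GAMMA = 0.5 are assumptions chosen for illustration; the article does not state which coefficient the team used.

```python
import math

SECONDS_PER_DAY = 86400


def decayed_hotness(h_a: float, now: float, t_last: float, gamma: float) -> float:
    """H(t) = H_a * exp(-gamma * (t - t_last) / 86400), with times in seconds."""
    elapsed_days = (now - t_last) / SECONDS_PER_DAY
    return h_a * math.exp(-gamma * elapsed_days)


GAMMA = 0.5  # hypothetical cooling coefficient; the article gives no value
half_life_days = math.log(2) / GAMMA  # days until the score halves

t_last = 1_700_000_000
# No time elapsed: the score equals the raw hotness.
fresh = decayed_hotness(100.0, t_last, t_last, GAMMA)
# One half-life elapsed: the score is half the raw hotness.
aged = decayed_hotness(100.0, t_last + half_life_days * SECONDS_PER_DAY, t_last, GAMMA)
```

Because the decay is exponential, the score halves every ln(2)/γ days since the last reply, which gives a practical way to pick γ from a desired half-life.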

The raw hotness value H_a is calculated from interaction metrics:

H_a = ln(1 + N_view) + 1.0 * N_reply + 1.75 * N_like + 3.2 * N_favor

The view count is passed through a natural logarithm so that each additional view contributes less than the one before, and content length contributes via lg(N_length), the base-10 logarithm of the length.
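
The raw hotness formula for H_a can be sketched as a small function to show how the logarithm dampens views relative to the linearly weighted interactions. The function and parameter names are illustrative; the weights are the ones given in the formula.

```python
import math


def raw_hotness(n_view: int, n_reply: int, n_like: int, n_favor: int) -> float:
    """H_a = ln(1 + N_view) + 1.0*N_reply + 1.75*N_like + 3.2*N_favor."""
    return math.log1p(n_view) + 1.0 * n_reply + 1.75 * n_like + 3.2 * n_favor


# Views alone: going from 10 to 10,000 views adds fewer than 7 points.
few_views = raw_hotness(10, 0, 0, 0)       # ln(11)
many_views = raw_hotness(10_000, 0, 0, 0)  # ln(10001)

# A single favorite (3.2) outweighs the first 20 views (ln(21)).
one_favor = raw_hotness(0, 0, 0, 1)
twenty_views = raw_hotness(20, 0, 0, 0)
```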

The final hotness formula combines all dimensions:

H = [lg(N_length) + ln(1+N_view) + 1.0*N_reply + 1.75*N_like + 3.2*N_favor] * exp[-γ*(t - t_last)/86400]
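
To see the combined formula at work, the sketch below scores two hypothetical posts and sorts them. The Post type, the sample numbers, and gamma = 0.5 are assumptions for illustration, and the max(length, 1) guard against lg(0) is an addition that the article does not mention.

```python
import math
from typing import NamedTuple


class Post(NamedTuple):
    title: str
    length: int
    views: int
    replies: int
    likes: int
    favors: int
    last_reply_time: float  # Unix seconds


def hotness(p: Post, now: float, gamma: float) -> float:
    # max(..., 1) avoids lg(0) for empty posts; this guard is an assumption.
    base = (math.log10(max(p.length, 1)) + math.log1p(p.views)
            + 1.0 * p.replies + 1.75 * p.likes + 3.2 * p.favors)
    return base * math.exp(-gamma * (now - p.last_reply_time) / 86400)


now = 1_700_000_000.0
posts = [
    # Heavy engagement, but the last reply was ten days ago.
    Post("old but popular", 2000, 5000, 30, 40, 10, now - 10 * 86400),
    # Light engagement, last reply one hour ago.
    Post("fresh and modest", 500, 200, 5, 6, 1, now - 3600),
]
ranked = sorted(posts, key=lambda p: hotness(p, now, gamma=0.5), reverse=True)
```

With these sample values the recently active post ranks first, because ten days of decay at gamma = 0.5 shrinks the older post's much larger base score by a factor of exp(-5).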

User interest vectors are built from recent behavior, normalized to [0.1, 1.0], and combined with content tag vectors to give an interest coefficient C_interest; the total score may be further multiplied by a random factor to diversify ordering:

rand(0.75,1.25) * C_interest * H
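
The article does not spell out how the interest vector is normalized or how it is combined with a post's tags, so the following is only one plausible sketch. It assumes min-max scaling of per-tag behavior counts into [0.1, 1.0] and takes C_interest as the average weight over the post's tags. All function names, the sample tags, and the averaging rule are hypothetical.

```python
import random


def normalize_interest(raw: dict[str, float], lo: float = 0.1, hi: float = 1.0) -> dict[str, float]:
    """Min-max scale raw per-tag behavior counts into [lo, hi]."""
    if not raw:
        return {}
    mn, mx = min(raw.values()), max(raw.values())
    if mx == mn:
        return {tag: hi for tag in raw}
    return {tag: lo + (hi - lo) * (v - mn) / (mx - mn) for tag, v in raw.items()}


def interest_coefficient(interest: dict[str, float], tags: list[str], default: float = 0.1) -> float:
    """Average the user's weight over the post's tags; unseen tags get the floor."""
    if not tags:
        return default
    return sum(interest.get(t, default) for t in tags) / len(tags)


def personalized_score(h: float, c_interest: float, rng: random.Random) -> float:
    """rand(0.75, 1.25) * C_interest * H."""
    return rng.uniform(0.75, 1.25) * c_interest * h


# Hypothetical recent-behavior counts per tag for one user.
interest = normalize_interest({"python": 30, "databases": 12, "travel": 3})
c = interest_coefficient(interest, ["python", "travel"])
score = personalized_score(26.0, c, random.Random(42))
```

The floor of 0.1 keeps posts outside a user's known interests from being zeroed out entirely, and the random factor lets posts with similar scores trade places between requests.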

Conclusion: The algorithm combines time decay, weighted interactions, content characteristics, and user interest to produce a dynamic hotness score that reflects both content quality and personalized relevance.

Tags: big data, recommendation system, time decay, content ranking, engagement metrics, hotness algorithm
