Designing a Content Hotness Scoring Algorithm for Community Platforms

This article describes how a community’s big‑data team designed a content hotness algorithm by defining time, interaction, content, and user dimensions, assigning business meanings, applying weighted formulas and a Newton‑cooling decay function, and integrating user interest vectors to compute dynamic scores.

Top Architect

Jan 19, 2021

During a community redesign, the big‑data team combined business details, research, discussion, and trial and error to design a basic content hotness scoring algorithm.

Reference: Several overseas platforms (Hacker News, Reddit, Stack Overflow, StumbleUpon) have publicly described their hotness algorithms, and the original article links to them for further reading.

Data dimensions considered include:

Time dimensions: post_time, last_reply_time, last_op_time.

Interaction dimensions: view_num, reply_num, favor_num, like_num, reply_like_num, share_num.

Content dimensions: content_length, reply_avg_length, picture_num.

User dimensions: user interest, activity, reputation, etc.
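As a rough illustration, the time, interaction, and content dimensions listed above could be gathered into one per-post record. This is only a sketch: the class name PostStats, the field types, the default values, and the choice of Unix timestamps in seconds are assumptions, not part of the original design.

```python
from dataclasses import dataclass


@dataclass
class PostStats:
    """Per-post signals, named after the dimensions listed above."""

    # Time dimensions (assumed to be Unix timestamps in seconds)
    post_time: float
    last_reply_time: float
    last_op_time: float
    # Interaction dimensions
    view_num: int = 0
    reply_num: int = 0
    favor_num: int = 0
    like_num: int = 0
    reply_like_num: int = 0
    share_num: int = 0
    # Content dimensions
    content_length: int = 0
    reply_avg_length: float = 0.0
    picture_num: int = 0


# Example record with made-up values
post = PostStats(
    post_time=1_610_000_000,
    last_reply_time=1_610_003_600,
    last_op_time=1_610_003_600,
    view_num=120,
    reply_num=4,
)
```

User dimensions are left out here because they describe the reader rather than the post.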

Each dimension has a business meaning: time controls decay, interaction metrics are weighted to reflect their impact, content metrics serve as auxiliary quality signals, and user metrics enable personalized recommendation.

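The algorithm uses last_reply_time as the time dimension and models decay with Newton’s law of cooling:

H(t) = H_a * exp[-γ * (t - t_last) / 86400]

Here γ is the decay coefficient. Dividing by 86400, the number of seconds in a day, expresses the time since the last reply in days, assuming t and t_last are timestamps in seconds. The raw hotness value H_a is calculated from interaction metrics: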

H_a = ln(1 + N_view) + 1.0 * N_reply + 1.75 * N_like + 3.2 * N_favor
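The raw hotness formula translates directly into a small function. The sketch below uses the weights given above; the function and parameter names are illustrative only.

```python
import math


def raw_hotness(views: int, replies: int, likes: int, favors: int) -> float:
    """H_a = ln(1 + N_view) + 1.0*N_reply + 1.75*N_like + 3.2*N_favor."""
    # log1p(x) computes ln(1 + x), so a post with zero views contributes 0
    return math.log1p(views) + 1.0 * replies + 1.75 * likes + 3.2 * favors
```

For example, a post with 1000 views, 10 replies, 4 likes, and 2 favorites scores about 30.3. The 1000 views contribute only about 6.9 of that, less than the 4 likes do (7.0), which shows how strongly the logarithm dampens view counts.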

The view count is passed through a natural logarithm so that its contribution diminishes as views accumulate, and content length contributes via lg(N_length), the base‑10 logarithm of the length.

The final hotness formula combines all dimensions:

H = [lg(N_length) + ln(1+N_view) + 1.0*N_reply + 1.75*N_like + 3.2*N_favor] * exp[-γ*(t - t_last)/86400]
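A minimal sketch of the combined formula follows. Several details are assumptions rather than part of the article: the value γ = 0.3 is hypothetical (the article does not state γ), lg is read as log base 10, the length is floored at 1 to avoid taking the logarithm of zero, and negative elapsed time is clamped to zero.

```python
import math

SECONDS_PER_DAY = 86400
GAMMA = 0.3  # hypothetical decay coefficient; the article does not give a value


def hotness(length, views, replies, likes, favors, now, last_reply_time, gamma=GAMMA):
    """Decayed hotness H; `now` and `last_reply_time` are timestamps in seconds."""
    # Raw score: lg(N_length) plus the weighted interaction terms.
    # max(length, 1) is an added guard so that empty content does not raise.
    raw = (
        math.log10(max(length, 1))
        + math.log1p(views)
        + 1.0 * replies
        + 1.75 * likes
        + 3.2 * favors
    )
    # Elapsed time since the last reply, in days (clamped at zero as a guard)
    elapsed_days = max(now - last_reply_time, 0) / SECONDS_PER_DAY
    return raw * math.exp(-gamma * elapsed_days)
```

With exponential decay the score halves every ln(2)/γ days, so under the assumed γ = 0.3 a post that receives no new replies loses half its hotness roughly every 2.3 days. Because the clock is tied to last_reply_time, each new reply resets the decay.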

User interest vectors are built from recent behavior, normalized to [0.1, 1.0], and combined with content tag vectors to produce an interest coefficient C_interest. The total score may be further adjusted by a random factor to diversify ordering:

rand(0.75, 1.25) * C_interest * H

Conclusion: The algorithm balances time decay, weighted interaction, content characteristics, and user interest to produce a dynamic hotness score that reflects both content quality and personalized relevance.
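The personalization step, rand(0.75, 1.25) * C_interest * H, could be sketched as follows. The article does not say how interest vectors are normalized or how they are combined with content tags, so the min-max scaling, the "strongest matching tag" rule, and all function names here are assumptions made for illustration.

```python
import random


def normalize_interest(raw_weights: dict) -> dict:
    """Min-max scale raw per-tag interest weights into [0.1, 1.0]."""
    if not raw_weights:
        return {}
    lo, hi = min(raw_weights.values()), max(raw_weights.values())
    if hi == lo:
        # All tags equally weighted: treat them all as full interest
        return {tag: 1.0 for tag in raw_weights}
    return {tag: 0.1 + 0.9 * (w - lo) / (hi - lo) for tag, w in raw_weights.items()}


def interest_coefficient(user_interest: dict, content_tags: list) -> float:
    """C_interest: strongest matching tag; unseen tags get the floor of 0.1."""
    return max((user_interest.get(tag, 0.1) for tag in content_tags), default=0.1)


def personalized_score(h: float, c_interest: float, rng=random) -> float:
    """rand(0.75, 1.25) * C_interest * H; pass a seeded random.Random to reproduce."""
    return rng.uniform(0.75, 1.25) * c_interest * h


# Example with made-up behavior counts per tag
interest = normalize_interest({"go": 1, "python": 10, "rust": 5.5})
c = interest_coefficient(interest, ["rust", "java"])
score = personalized_score(10.0, c, rng=random.Random(42))
```

Keeping the floor at 0.1 rather than 0 means content outside a user's known interests is demoted but never removed, and the random factor lets such items occasionally surface.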
