How Quora Leverages Machine Learning for Ranking, Personalization, and Moderation
Quora applies machine learning across ranking, feed personalization, duplicate-question detection, user expertise inference, and content moderation, with the goal of improving both user experience and content quality. The models involved include logistic regression, gradient-boosted trees, and neural networks.
Quora has used machine learning for some time. The team continuously refines its methods, validating each change offline before confirming the gain with online A/B tests.
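As a rough illustration of the online confirmation step, the sketch below compares two variants of an A/B test with a two-proportion z-test. The source does not say which statistical test Quora uses; the function names and the sample counts are assumptions made for this example.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for the difference in success rate between variant B and variant A."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts: 100/1000 clicks in control, 130/1000 in treatment.
z = two_proportion_z(100, 1000, 130, 1000)
p = two_sided_p_value(z)
print(f"z={z:.2f}, p={p:.3f}")
```

A small p-value suggests the difference is unlikely to be noise, which is the kind of evidence an online test adds on top of offline validation.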
Ranking
Ranking is one of the most important ML applications at Quora, used for ordering answers, users, and other entities. Features include answer quality, user expertise, interaction signals (up‑votes, down‑votes, expansions), and content relevance.
Search Algorithm
Search combines text matching with a ranking stage that optimizes click‑through probability using both textual features and user‑behavior signals.
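The two stages described above can be sketched as a text-matching filter followed by a click-probability ranker. This is a minimal sketch, not Quora's implementation: the term-overlap matcher, the logistic weights, and the `historical_ctr` behavior signal are all assumptions chosen for illustration.

```python
import math

def tokenize(text):
    return set(text.lower().split())

def text_match(query, doc_text):
    """Fraction of query terms that appear in the document text."""
    q = tokenize(query)
    if not q:
        return 0.0
    return len(q & tokenize(doc_text)) / len(q)

def click_probability(match, historical_ctr, weights=(-2.0, 3.0, 4.0)):
    """Logistic model over one textual and one behavioral feature (weights are made up)."""
    bias, w_match, w_ctr = weights
    z = bias + w_match * match + w_ctr * historical_ctr
    return 1.0 / (1.0 + math.exp(-z))

def search(query, docs):
    """docs: list of (title, historical_ctr). Returns matching titles, best first."""
    # Stage 1: keep only documents that share at least one term with the query.
    candidates = [(title, ctr, text_match(query, title)) for title, ctr in docs]
    candidates = [c for c in candidates if c[2] > 0]
    # Stage 2: order the survivors by predicted click probability.
    candidates.sort(key=lambda c: click_probability(c[2], c[1]), reverse=True)
    return [title for title, _, _ in candidates]

docs = [
    ("how to learn python", 0.1),
    ("learn python fast", 0.5),
    ("best pizza in rome", 0.9),
]
print(search("learn python", docs))
```

Both Python titles match the query fully, so the behavioral signal decides their order, while the unrelated title is dropped in the first stage despite its high click rate.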
Personalized Ranking
Personalized ranking tailors the Quora Feed to each user, considering answer quality, topics of interest, followed users, trending events, and timeliness. The system uses a multi‑stage pipeline to pre‑select candidates before final ranking.
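A multi-stage pipeline of this kind might look like the sketch below: a cheap candidate filter based on followed topics and users, then a more detailed score that combines quality with timeliness. The `Story` fields, the exponential freshness decay, and the 1.2 boost for followed authors are illustrative assumptions, not details from the source.

```python
from dataclasses import dataclass

@dataclass
class Story:
    story_id: str
    topic: str
    author: str
    quality: float    # assumed to lie in [0, 1]
    age_hours: float

def select_candidates(stories, followed_topics, followed_users, limit=100):
    """Stage 1: cheap filter that keeps stories tied to the user's interests."""
    matched = [s for s in stories
               if s.topic in followed_topics or s.author in followed_users]
    matched.sort(key=lambda s: s.quality, reverse=True)
    return matched[:limit]

def final_score(story, followed_users, half_life_hours=24.0):
    """Stage 2: quality discounted by age, with a boost for followed authors."""
    freshness = 0.5 ** (story.age_hours / half_life_hours)
    social_boost = 1.2 if story.author in followed_users else 1.0
    return story.quality * freshness * social_boost

def build_feed(stories, followed_topics, followed_users, size=10):
    candidates = select_candidates(stories, followed_topics, followed_users)
    candidates.sort(key=lambda s: final_score(s, followed_users), reverse=True)
    return candidates[:size]

stories = [
    Story("a", "ml", "u1", quality=0.9, age_hours=48),
    Story("b", "ml", "u2", quality=0.6, age_hours=0),
    Story("c", "cooking", "u3", quality=1.0, age_hours=0),
]
feed = build_feed(stories, followed_topics={"ml"}, followed_users={"u2"})
print([s.story_id for s in feed])
```

Splitting the work this way keeps the expensive scoring function off the full corpus, which is the usual motivation for pre-selecting candidates.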
Recommendation
Recommendation appears in email digests and in‑app suggestions of users or topics, driven by similar ML ranking models optimized for different objectives.
Related Questions
A separate model predicts related questions using textual similarity, co‑visit data, topic overlap, popularity, and quality signals, balancing similarity with interestingness.
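The balance between similarity and interestingness can be pictured as a weighted blend. The sketch below is an assumption-laden toy: Jaccard word overlap stands in for textual similarity, a normalized co-visit count stands in for usage data, and the `alpha` mixing weight is hypothetical.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two question strings."""
    a, b = set(a.lower().split()), set(b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def related_score(source, candidate, covisits, max_covisits, quality, alpha=0.6):
    """Blend similarity (text overlap and co-visits) with a quality signal in [0, 1]."""
    covisit_rate = covisits / max_covisits if max_covisits else 0.0
    similarity = 0.5 * jaccard(source, candidate) + 0.5 * covisit_rate
    # alpha controls how much similarity dominates over interestingness.
    return alpha * similarity + (1 - alpha) * quality

score = related_score(
    "what is machine learning",
    "what is deep learning",
    covisits=40, max_covisits=100, quality=0.8,
)
print(round(score, 3))
```

Lowering `alpha` favors questions that are engaging but only loosely related, while raising it favors near-paraphrases.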
Duplicate Questions
Duplicate‑question detection uses a binary classifier trained on duplicate/non‑duplicate labels, leveraging text vector representations and usage‑based features.
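To make the idea concrete, here is a minimal sketch of a binary duplicate classifier: each question is turned into a bag-of-words vector, the cosine similarity of a pair becomes the single feature, and a logistic regression is fitted with stochastic gradient descent. The vector representation, the training pairs, and the hyperparameters are assumptions for illustration; the source does not specify them, and the usage-based features it mentions are omitted here.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def features(q1, q2):
    # Bias term plus one text-similarity feature.
    return [1.0, cosine(bow(q1), bow(q2))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(pairs, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for (q1, q2), y in zip(pairs, labels):
            x = features(q1, q2)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def predict(w, q1, q2):
    """Estimated probability that q1 and q2 are duplicates."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(q1, q2))))

# Tiny hand-made training set (1 = duplicate, 0 = not duplicate).
pairs = [
    ("how do i learn python", "how do i learn python quickly"),
    ("what is machine learning", "what is machine learning exactly"),
    ("how do i learn python", "best pizza in rome"),
    ("what is machine learning", "how to bake bread"),
]
labels = [1, 1, 0, 0]
w = train(pairs, labels)
print(predict(w, "how can i learn python", "how do i learn python") > 0.5)
```

A production system would replace the bag-of-words vectors with learned text embeddings and add more features, but the classifier structure stays the same.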
User Credibility / Expertise Inference
Quora infers user expertise by analyzing answers written, votes received, comments, and endorsements, weighting signals from domain experts higher than those from non‑experts.
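The weighting idea can be sketched as a vote tally in which each vote counts in proportion to the voter's own expertise in the topic. The weights and the default of 1.0 for unknown voters are hypothetical values chosen for this example.

```python
def expertise_score(votes, voter_expertise, default_weight=1.0):
    """Sum of votes, each weighted by the voter's expertise in the topic.

    votes: list of (voter_id, direction) with direction +1 (up) or -1 (down).
    voter_expertise: mapping voter_id -> weight; unknown voters get default_weight.
    """
    return sum(direction * voter_expertise.get(voter, default_weight)
               for voter, direction in votes)

votes = [("expert", +1), ("novice_1", +1), ("novice_2", -1)]
weights = {"expert": 5.0}  # assumed: a recognized expert counts five times as much
print(expertise_score(votes, weights))
```

Here the two novice votes cancel out and the expert's upvote dominates the result.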
Spam Detection and Moderation
Multiple ML classifiers flag low‑quality or malicious content, routing items to moderation queues for human review.
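One simple way to picture the routing step: each classifier emits a probability, and an item goes to the review queue of the highest-scoring classifier that crosses a threshold. The queue names and the 0.8 threshold below are assumptions, not values from the source.

```python
def route(item_scores, threshold=0.8):
    """Pick a moderation queue for an item, or None if no classifier flags it.

    item_scores: mapping classifier name -> predicted probability of a violation.
    """
    flagged = {name: p for name, p in item_scores.items() if p >= threshold}
    if not flagged:
        return None
    # Send the item to the queue of the most confident classifier.
    return max(flagged, key=flagged.get)

print(route({"spam": 0.95, "harassment": 0.85}))
print(route({"spam": 0.10, "harassment": 0.30}))
```

Items that no classifier flags skip human review entirely, which keeps the moderation queues focused on likely violations.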
Content Creation Prediction
A model predicts the likelihood that a user will answer a given question, enabling automatic Ask‑to‑Answer prompts and informing ranking decisions.
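As a stand-in for the prediction model, the sketch below estimates each user's answer probability from a smoothed historical answer rate and picks the top candidates to prompt. The smoothing prior, the `stats` layout, and the function names are assumptions; the source does not describe the actual model or its features.

```python
def answer_probability(answered, shown, prior_rate=0.05, prior_strength=20):
    """Smoothed answer rate: pulls users with little history toward prior_rate."""
    return (answered + prior_rate * prior_strength) / (shown + prior_strength)

def pick_answerers(stats, k=2):
    """stats: user -> (answered, shown) counts for the question's topic."""
    ranked = sorted(stats, key=lambda u: answer_probability(*stats[u]), reverse=True)
    return ranked[:k]

stats = {
    "alice": (30, 100),  # answered 30 of 100 questions shown in this topic
    "bob": (1, 1),       # perfect rate, but almost no history
    "carol": (0, 50),
}
print(pick_answerers(stats))
```

Smoothing keeps a user with one lucky answer from outranking someone with a long, reliable track record.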
Models
Logistic regression
Elastic net
Gradient‑boosted decision trees
Random forest
Neural networks
LambdaMART
Matrix factorization
Vector models and other NLP techniques
Conclusion
Quora’s diverse machine‑learning applications have delivered significant benefits, and the team expects further gains from upcoming work in ad ranking, machine translation, and other natural‑language‑processing areas.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
