Inside X’s Open‑Source ‘For You’ Algorithm: How AI Drives Your Attention

The article dissects X’s newly open‑sourced ‘For You’ feed algorithm, detailing its Rust and Python implementation, the Home Mixer pipeline, candidate sourcing, Grok‑based scoring, and extensive filtering, showing how machine‑learning models predict user interactions and shape the content you see.

ShiZhen AI

Background: Why X Open‑sourced the Algorithm

X’s “For You” feed mixes in‑network posts with out‑of‑network content and ranks them using AI. The company released the entire production system on GitHub (https://github.com/xai-org/x-algorithm) under an Apache 2.0 license, with 62.9% of the code in Rust and 37.1% in Python, and the repository is updated roughly every four weeks. The stated goal is transparency: developers can study a large‑scale recommender and contribute code, replacing the earlier perception of a black box.

How the Algorithm Works Step‑by‑Step

The pipeline follows a Home Mixer‑controlled Candidate Pipeline framework. Each stage performs a single function and can run in parallel, improving efficiency.

Query Hydration: Load the user’s recent interactions, follow list, and preferences; the code uses a hydrator to fetch serialized data such as recent likes and replies.

Candidate Sources: Pull posts from the Thunder store (friends’ posts) and from Phoenix (global candidates).

Hydration: Enrich candidates with metadata—post content, author info, media assets—to ensure completeness.

Filtering: Remove duplicates, stale posts, and blocked content using a series of filter classes.

Scoring: The Grok transformer predicts interaction probabilities for each candidate and produces a weighted total score.

Selection: Rank candidates by the total score and select the top K for the feed.

Post‑Selection Filtering: Apply final checks to drop spam or low‑quality items.
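As a rough sketch, the staged flow above might be expressed like this in Python; every name here (`Candidate`, `run_pipeline`, and the stage callables) is illustrative and not taken from the actual repository:

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    post_id: int
    score: float = 0.0
    metadata: dict = field(default_factory=dict)

def run_pipeline(user_context, sources, hydrate, filters, score, top_k):
    # 1. Candidate sourcing: gather posts from every source.
    candidates = [c for source in sources for c in source(user_context)]
    # 2. Hydration: enrich each candidate with metadata.
    candidates = [hydrate(c) for c in candidates]
    # 3. Filtering: keep only candidates accepted by every filter.
    candidates = [c for c in candidates
                  if all(f(user_context, c) for f in filters)]
    # 4. Scoring: attach a model score to each survivor.
    for c in candidates:
        c.score = score(user_context, c)
    # 5. Selection: return the top-K by score.
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:top_k]
```

Because each stage is just a callable, stages that touch independent candidates (hydration, scoring) can be parallelized, which is the efficiency property the article attributes to the real pipeline.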

Key Components: Thunder and Phoenix

Thunder is an in‑memory store located in the thunder/ directory. It consumes Kafka streams of post creation and deletion events, partitions data by user, and serves recent friend posts in milliseconds without hitting a database.
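A toy version of such a store, with the Kafka consumer replaced by a plain `apply_event` call, could look like the following; the class name, field names, and the assumption that post IDs are time‑ordered are all illustrative, not the real Rust implementation:

```python
from collections import defaultdict, deque

class RecentPostStore:
    """Keeps only each author's most recent posts, entirely in memory."""

    def __init__(self, max_per_author=100):
        self.max_per_author = max_per_author
        self.posts_by_author = defaultdict(deque)

    def apply_event(self, event):
        # Events mimic what a Kafka consumer of create/delete topics would see.
        author, post_id = event["author_id"], event["post_id"]
        if event["kind"] == "create":
            self.posts_by_author[author].appendleft(post_id)
            while len(self.posts_by_author[author]) > self.max_per_author:
                self.posts_by_author[author].pop()  # evict the oldest
        elif event["kind"] == "delete":
            try:
                self.posts_by_author[author].remove(post_id)
            except ValueError:
                pass  # delete for a post we never saw (or already evicted)

    def recent_from(self, followed_authors, limit=50):
        # Merge each followed author's posts; assumes IDs sort by recency.
        merged = [p for a in followed_authors for p in self.posts_by_author[a]]
        return sorted(merged, reverse=True)[:limit]
```

Serving from per‑user partitions of a structure like this is what lets the real system answer in milliseconds without a database round trip.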

Phoenix resides in the phoenix/ directory and handles both retrieval and ranking. Retrieval uses a two‑tower model: a user tower encodes the user’s history and features, while a candidate tower encodes all posts; the dot‑product similarity selects the top candidates. Ranking employs a Grok‑1‑derived transformer that takes user context and candidate posts, masks attention to keep candidates independent, and outputs probabilities for actions such as like, reply, and share.
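The retrieval half can be illustrated with a toy two‑tower setup in which fixed random projections stand in for the trained towers; only the dot‑product ranking step mirrors the description above, and none of the names or dimensions come from the repository:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # shared embedding dimension for both towers

USER_PROJ = rng.normal(size=(8, DIM))  # placeholder for the trained user tower
CAND_PROJ = rng.normal(size=(8, DIM))  # placeholder for the candidate tower

def embed_user(features):
    # User tower: maps user history/features into the shared space.
    return features @ USER_PROJ

def embed_candidates(features):
    # Candidate tower: maps each post's features into the same space.
    return features @ CAND_PROJ

def retrieve_top_k(user_features, candidate_features, k):
    u = embed_user(user_features)             # shape (DIM,)
    c = embed_candidates(candidate_features)  # shape (N, DIM)
    scores = c @ u                            # dot-product similarity per post
    return np.argsort(scores)[::-1][:k]       # indices of the top-k candidates
```

The design point is that candidate embeddings can be precomputed and indexed, so serving time only pays for one user-tower pass plus a similarity search.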

Scoring and Filtering Logic

The scoring stage runs the Phoenix scorer transformer, which emits probabilities for multiple actions. Positive actions (e.g., Favorite, Reply, Repost, Click) add to the score, while negative actions (e.g., Not Interested, Block Author) subtract. A weighted scorer aggregates these using learned weights from data.

Favorite – positive – like probability

Reply – positive – reply probability

Repost – positive – repost probability

Quote – positive – quote probability

Click – positive – click probability

Profile Click – positive – author‑page click probability

Video View – positive – video‑view probability

Photo Expand – positive – photo‑expand probability

Share – positive – share probability

Dwell – positive – dwell‑time probability

Follow Author – positive – follow‑author probability

Not Interested – negative – negative signal

Block Author – negative – negative signal

Mute Author – negative – negative signal

Report – negative – negative signal
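A minimal sketch of how such a weighted scorer might combine per‑action probabilities; the action names follow the list above, but the weight values here are invented for illustration and are not the learned production weights:

```python
# Illustrative weights only: positive actions add, negative actions subtract.
ACTION_WEIGHTS = {
    "favorite": 1.0,
    "reply": 2.0,
    "repost": 1.5,
    "click": 0.2,
    "not_interested": -5.0,
    "block_author": -10.0,
}

def weighted_score(probabilities):
    """Sum weight * predicted probability over every action the model emits."""
    return sum(ACTION_WEIGHTS.get(action, 0.0) * p
               for action, p in probabilities.items())
```

Even a small predicted probability of a strongly negative action (a block or report) can outweigh several positive signals, which is how the feed demotes content users are likely to reject.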

Filtering uses several dedicated filters, each implemented as a class:

DropDuplicates – removes duplicate IDs

AgeFilter – discards old posts to keep the feed fresh

SelfpostFilter – excludes posts authored by the user

MutedKeyword – respects user‑specified muted keywords

AuthorSocialgraph – blocks authors on the user’s blacklist

Additional post‑filtering stages such as VF Filter and DedupConversation further guard against spam and thread‑collapse issues.
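The one‑class‑per‑filter structure might be sketched like this; the class names follow the article’s list, but the `keep` interface and the post fields are assumptions made for the example:

```python
class DropDuplicates:
    """Rejects any post ID that has already passed through."""
    def __init__(self):
        self.seen = set()
    def keep(self, user, post):
        if post["id"] in self.seen:
            return False
        self.seen.add(post["id"])
        return True

class AgeFilter:
    """Discards posts older than a freshness threshold."""
    def __init__(self, max_age_hours):
        self.max_age_hours = max_age_hours
    def keep(self, user, post):
        return post["age_hours"] <= self.max_age_hours

class SelfpostFilter:
    """Excludes posts authored by the viewing user."""
    def keep(self, user, post):
        return post["author_id"] != user["user_id"]

def apply_filters(user, posts, filters):
    # A post survives only if every filter in the chain accepts it.
    return [p for p in posts if all(f.keep(user, p) for f in filters)]
```

Keeping each rule in its own small class makes the chain easy to reorder, test in isolation, and extend with new filters.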

Implications of the Open Source Release

For end users, transparency makes it easier to understand why posts appear, and community scrutiny can surface and correct bias. For developers, the pipeline can be reused to build new applications, and the Rust + ML stack serves as a practical example of high‑performance recommender engineering. Long‑term, the open‑source model promotes AI transparency and positions the Grok‑based recommender as a reference implementation. Risks include potential abuse for manipulation, but the authors argue that the benefits outweigh the drawbacks, especially with a four‑week update cadence that keeps the published code current.

Conclusion

By examining the source code, the “mystery” behind X’s feed disappears: a modular pipeline, a transformer that learns user behavior, and a series of strict filters together produce the personalized experience. Readers are encouraged to explore the GitHub repository, experiment with modifications, and anticipate even smarter recommendations in the future.

machine learning, Python, recommendation system, Rust, Grok transformer, X algorithm
Written by

ShiZhen AI

Tech blogger with over 10 years of experience at leading tech firms; AI efficiency and delivery expert focusing on AI productivity. Covers tech gadgets, AI-driven efficiency, and leisure, and runs an AI leisure community. 🛰 szzdzhp001
