
Graph Embedding Algorithms and Their Application in Zhuanzhuan Recommendation System

This article introduces the fundamentals of recommendation systems, explains Zhuanzhuan's main recommendation scenarios and pipeline, and details three graph embedding methods—DeepWalk, node2vec, and EGES—along with their practical implementations in recall and coarse‑ranking stages.

Zhuanzhuan Tech

1. Introduction to Zhuanzhuan Recommendation Algorithms

1.1 What is a Recommendation System?

With the rapid development of information technology and the Internet, the amount of information transmitted online has exploded, leading to an information‑overload era. Recommendation systems play an indispensable role by efficiently connecting users with relevant information, saving time on filtering, and helping platforms distribute content effectively. For Zhuanzhuan, the recommendation system is crucial for suggesting products and content, linking users to items they may like and helping merchants expose their goods to suitable audiences. The recommendation algorithm serves as the engine of this system.

1.2 Main Scenarios and Process of Zhuanzhuan Recommendation

In the Zhuanzhuan app, recommendation algorithms are applied to home‑page recommendation, product‑detail page recommendation, and favorites‑list recommendation, among others. When a user opens the app, the home page displays a stream of items that the algorithm predicts the user may like. Clicking a product leads to a detail‑page recommendation of similar items, while adding items to the favorites triggers personalized suggestions based on the user's collection behavior.

The overall recommendation workflow is funnel‑shaped: a large item pool is first filtered in the recall stage, then passed to coarse‑ranking, and finally to fine‑ranking before being presented to the user.

Recall quickly selects a candidate set from billions of items using simple models and features to meet strict latency requirements.

Coarse‑ranking scores the recalled candidates with moderately complex models, narrowing the set for fine‑ranking.

Fine‑ranking applies sophisticated models and many features to a small set of items to achieve high accuracy.

The final stage may involve re‑ranking based on business goals.

This talk focuses on graph algorithms and their practice in Zhuanzhuan's recall and coarse‑ranking stages.

2. Graph Algorithm Principles and Zhuanzhuan Practice

Graphs are fundamental data structures that appear in many real‑world scenarios such as social networks, protein interactions, and e‑commerce relationships between users and items.

In Zhuanzhuan, constructing a graph from user‑item interactions and learning graph embeddings yields low‑dimensional dense vectors that capture intrinsic relationships between nodes.

These vectors can be used as pre‑training features for ranking models or directly to compute similarity for item‑to‑item recommendation.
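As a concrete sketch of the item-to-item use, the snippet below ranks items by cosine similarity over toy embedding vectors (the item names and vectors are invented for illustration; real vectors come from the trained graph model):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_similar(query_id, embeddings, k=2):
    """Return the k items most similar to query_id by cosine similarity."""
    query = embeddings[query_id]
    scores = {
        item: cosine_similarity(query, vec)
        for item, vec in embeddings.items() if item != query_id
    }
    # Sort item ids by descending similarity score.
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy embeddings; in practice these come out of the trained graph model.
embeddings = {
    "phone_a": np.array([0.9, 0.1, 0.0]),
    "phone_b": np.array([0.8, 0.2, 0.1]),
    "sofa":    np.array([0.0, 0.1, 0.9]),
}
print(top_k_similar("phone_a", embeddings, k=1))  # -> ['phone_b']
```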

2.1 Classic Graph Embedding: DeepWalk

DeepWalk learns node embeddings by performing random walks on the graph and feeding the generated sequences into a skip‑gram model (word2vec).

The typical DeepWalk pipeline for e‑commerce includes:

Collect raw user behavior sequences and split them according to rules (e.g., click interval > 1 hour).

From the split sequences, build a directed item‑cooccurrence graph (e.g., sequence D→A→B creates edges D→A and A→B).

Perform random walks from multiple start nodes to generate walk sequences.

Train a skip‑gram model on the walk sequences to obtain item embeddings.
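The first two steps above can be sketched as follows; the one-hour session gap matches the rule in the text, while the toy event stream is an invented example:

```python
from collections import defaultdict

SESSION_GAP = 3600  # split when the click interval exceeds one hour (seconds)

def split_sessions(events, gap=SESSION_GAP):
    """Split one user's (item, timestamp) stream into sessions by time gap."""
    sessions, current, last_ts = [], [], None
    for item, ts in events:
        if last_ts is not None and ts - last_ts > gap:
            sessions.append(current)
            current = []
        current.append(item)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions

def build_graph(sessions):
    """Directed weighted item graph from consecutive co-occurrences."""
    graph = defaultdict(lambda: defaultdict(int))
    for seq in sessions:
        for src, dst in zip(seq, seq[1:]):
            graph[src][dst] += 1  # edge weight = co-occurrence count
    return graph

events = [("D", 0), ("A", 10), ("B", 20), ("C", 8000), ("A", 8010)]
sessions = split_sessions(events)   # [['D', 'A', 'B'], ['C', 'A']]
graph = build_graph(sessions)       # edges D->A, A->B, C->A, weight 1 each
```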

The transition probability from node u to a neighbor v is P(v | u) = w_uv / Σ_{k ∈ N+(u)} w_uk: the weight of edge (u, v) divided by the sum of the weights of u's outgoing edges.
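A minimal weighted walk that samples the next node with exactly this probability; the toy graph and walk length are made up for illustration:

```python
import random

def transition_probs(graph, u):
    """P(v | u) = w_uv / sum of outgoing edge weights of u."""
    total = sum(graph[u].values())
    return {v: w / total for v, w in graph[u].items()}

def random_walk(graph, start, length, seed=0):
    """One DeepWalk-style walk; stops early at a node with no out-edges."""
    rng = random.Random(seed)
    walk = [start]
    for _ in range(length - 1):
        nbrs = graph.get(walk[-1])
        if not nbrs:
            break
        items, weights = zip(*nbrs.items())
        # Weighted sampling implements the transition probability above.
        walk.append(rng.choices(items, weights=weights, k=1)[0])
    return walk

graph = {"D": {"A": 3, "B": 1}, "A": {"B": 2}, "B": {}}
print(transition_probs(graph, "D"))  # -> {'A': 0.75, 'B': 0.25}
```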

2.2 Structural vs. Homophilic Similarity: node2vec

node2vec extends DeepWalk by introducing two hyper‑parameters p and q to bias the random walk toward breadth‑first (BFS) or depth‑first (DFS) exploration, allowing control over structural and homophilic similarity.

The unnormalized transition probability from the current node v to a candidate neighbor x is:

π_vx = α_pq(t, x) · w_vx

where t is the node visited just before v and α_pq(t, x) depends on the shortest-path distance d_tx between t and x: α = 1/p if d_tx = 0 (x is t itself), α = 1 if d_tx = 1, and α = 1/q if d_tx = 2. A small p encourages returning toward the previous node (BFS-like exploration, capturing structural similarity), while a small q encourages venturing to farther nodes (DFS-like exploration, capturing homophilic similarity).
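A small sketch of the bias term α_pq(t, x), assuming the graph is stored as adjacency dicts of edge weights; the node names and p/q values are illustrative:

```python
def node2vec_alpha(p, q, prev, candidate, graph):
    """Search bias α_pq(t, x) keyed on the distance d(t, x) in {0, 1, 2}."""
    if candidate == prev:                 # d = 0: step back to previous node
        return 1.0 / p
    if candidate in graph.get(prev, {}):  # d = 1: prev and candidate adjacent
        return 1.0
    return 1.0 / q                        # d = 2: move farther away from prev

def pi(p, q, prev, v, x, graph):
    """Unnormalized transition weight π_vx = α_pq(t, x) * w_vx."""
    return node2vec_alpha(p, q, prev, x, graph) * graph[v][x]

# Undirected toy graph, each edge stored in both directions.
graph = {
    "t":  {"v": 1, "x1": 1},
    "v":  {"t": 1, "x1": 1, "x2": 1},
    "x1": {"t": 1, "v": 1},
    "x2": {"v": 1},
}
# Walk arrived at v from t; small p, large q favors staying close to t.
print(pi(0.25, 4, "t", "v", "t", graph))   # -> 4.0  (return step)
print(pi(0.25, 4, "t", "v", "x1", graph))  # -> 1.0  (x1 adjacent to t)
print(pi(0.25, 4, "t", "v", "x2", graph))  # -> 0.25 (x2 farther from t)
```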

In recommendation, homophilic similarity corresponds to items sharing categories, attributes, or co-click/co-purchase patterns, while structural similarity reflects items occupying similar positions in the graph, such as the hub items of their respective categories. node2vec's p and q let us tailor the walk strategy to each recommendation scenario.

2.3 Incorporating Side Information: EGES

DeepWalk suffers from cold-start problems: new or rarely interacted items appear as isolated nodes or nodes with low-weight edges, so random walks seldom (or never) visit them and their embeddings are poorly trained.

EGES (Enhanced Graph Embedding with Side Information) addresses this by augmenting the skip‑gram embedding with weighted side‑information embeddings such as category, brand, or city.

After generating the item sequences as in DeepWalk, EGES jointly trains one embedding per information source (item ID plus each side-information field) and aggregates them with a softmax-weighted average:

e_i = Σ_k exp(a_{i,k}) · e_{i,k} / Σ_j exp(a_{i,j})

where e_i is the final embedding of item i, e_{i,k} is the embedding of its k-th source (k = 0 being the item ID itself), and a_{i,k} are learnable scalars; the exponentiation (a softmax over sources) guarantees that every source contributes positively.
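The aggregation can be sketched in a few lines; the two-source, two-dimensional setup is purely illustrative (in EGES the logits are learned jointly with the skip-gram objective, not set by hand):

```python
import numpy as np

def eges_aggregate(embs, a):
    """Softmax-weighted average of ID + side-information embeddings.

    embs: (K, d) matrix; row 0 the item-ID embedding, rows 1.. side info.
    a:    (K,) learnable logits; exp() keeps every weight positive.
    """
    w = np.exp(a)
    w = w / w.sum()        # softmax over the K sources
    return w @ embs        # (d,) final item embedding

embs = np.array([[1.0, 0.0],   # item-ID embedding
                 [0.0, 1.0]])  # category embedding
a = np.array([0.0, 0.0])       # equal logits -> plain average
print(eges_aggregate(embs, a))  # -> [0.5 0.5]
```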

2.4 Integrating Side Information into Graph Construction: Zhuanzhuan Recall Practice

In Zhuanzhuan we applied EGES-style side-information weighting during the graph construction phase instead, which sped up training and reduced the number of embedding parameters.

Steps:

Split user behavior sequences into co‑occurrence pairs and collect corresponding side information for each item.

Aggregate global co‑occurrence pairs and assign initial edge weights based on interaction counts.

Adjust edge weights according to predefined side‑information rules (e.g., same category or price range increases weight).

Run node2vec random walks on the weighted graph and train embeddings.

The resulting item vectors can be used for item‑to‑item similarity or for user‑to‑item‑to‑item recommendation.
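The rule-based weight adjustment in step 3 might look like the sketch below; the same-category rule and the 2x boost are invented examples, since the real rules and boost values are business-defined:

```python
def adjust_weight(base_weight, item_a, item_b, same_category_boost=2.0):
    """Boost a co-occurrence edge weight when both items share a category.

    Illustrative rule only; production rules also consider price range,
    brand, and other side information.
    """
    if item_a["category"] == item_b["category"]:
        return base_weight * same_category_boost
    return base_weight

items = {
    "i1": {"category": "phone"},
    "i2": {"category": "phone"},
    "i3": {"category": "sofa"},
}
print(adjust_weight(3, items["i1"], items["i2"]))  # -> 6.0 (same category)
print(adjust_weight(3, items["i1"], items["i3"]))  # -> 3   (unchanged)
```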

2.5 Heterogeneous Graph for Coarse‑Ranking

Coarse‑ranking requires both user and item vectors to compute inner‑product scores efficiently. By extending the graph to a bipartite user‑item graph, we can obtain embeddings for both sides.

Implementation steps:

Collect user behavior sequences and construct a user‑item bipartite graph.

From the bipartite edges, build an undirected weighted graph.

Perform random walks on the graph to generate alternating user‑item sequences (u1‑i1‑u2‑i2 …).

Train embeddings for both users and items.

During online serving, retrieve the user and item vectors and compute a simple inner product to obtain item scores.
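The serving-side scoring then reduces to one inner product per candidate; a minimal sketch with invented vectors:

```python
import numpy as np

def coarse_rank(user_vec, item_vecs, top_n=2):
    """Score candidates by inner product and return the top-n item ids."""
    scores = {iid: float(np.dot(user_vec, v)) for iid, v in item_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

user_vec = np.array([0.6, 0.8])
item_vecs = {
    "i1": np.array([0.9, 0.1]),  # score 0.62
    "i2": np.array([0.5, 0.9]),  # score 1.02
    "i3": np.array([0.1, 0.2]),  # score 0.22
}
print(coarse_rank(user_vec, item_vecs))  # -> ['i2', 'i1']
```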

3. Summary

This presentation covered Zhuanzhuan's recommendation scenarios and pipeline, and introduced three widely used graph embedding methods:

DeepWalk, the foundational random‑walk based embedding technique.

node2vec, which adds controllable BFS/DFS bias to capture structural and homophilic similarity.

EGES, which incorporates weighted side information to alleviate cold‑start issues.

We demonstrated how these algorithms are applied in the recall and coarse‑ranking stages of Zhuanzhuan's recommendation system. Other graph‑based methods such as LINE, SDNE, GAT, and GraphSAGE remain active research topics and have also been explored in our platform.

References

[1] Perozzi B, Al‑Rfou R, Skiena S. DeepWalk: Online Learning of Social Representations. KDD 2014: 701‑710.

[2] Grover A, Leskovec J. node2vec: Scalable Feature Learning for Networks. KDD 2016: 855‑864.

[3] Wang J, Huang P, Zhao H, et al. Billion‑scale Commodity Embedding for E‑commerce Recommendation in Alibaba. KDD 2018: 839‑848.

[4] Wang Zhe. Graph Embedding Methods You Must Learn for Deep Learning (in Chinese). https://zhuanlan.zhihu.com/p/64200072

Written by Zhuanzhuan Tech

A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting-edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.
