Scaling Card‑Based Social Matching with Multi‑Task AI Models and Efficient Backend
This article details the design and optimization of Jimu’s card‑based stranger‑social recommendation system, covering product background, gameplay flow, technical challenges in strategy and engineering, a multi‑task AI ranking model, vector recall improvements, and the resulting performance gains.
Background
In recent years, stranger-social apps in China have grown rapidly, and many products now present profiles of the opposite sex as cards or lists. This article uses the card-based stranger-social app "Jimu" as a case study to describe the architecture of card recommendation from both algorithmic and engineering perspectives.
Product Overview
Jimu is the first entertainment‑social platform focused on youth culture. It matches users based on interests and purposes, providing a “card” UI to quickly find like‑minded friends. The app includes card and community scenes, chat rooms, voice rooms, etc. This article mainly introduces the engineering and algorithmic optimizations for the card scene.
Card Gameplay
Step 1: The user taps the "Filter" button, sets basic criteria, and sees cards that match them.
Step 2: Swipe right to like, left to pass; a match occurs when both users swipe right on each other.
Step 3: Matched users can start chatting.
Technical Challenges
Recommendation Strategy Challenges
The conversion chain (right-swipe → match → chat) is long, which splits optimization into many sub-goals. Because the probability of a right swipe is low and the chain is long, users churn if they do not receive matches in time.
Engineering System Challenges
The core job of the card recommendation system is to deliver the appropriate cards at a scale of millions of users. Concretely, it must:
Provide usable data sources.
Support conditional filtering of the data via indexes.
Prevent duplicate cards (read-state maintenance).
The original design loaded all user data and indexes into memory, which led to high memory usage, long update times, and low QPS.
As the user count grew, concrete problems emerged:
Memory consumption approached 100 GB per machine.
Data updates required an FTP transfer and a full reload, causing tens of minutes of downtime.
The read-state Bloom filter grew too large, dragging retrieval efficiency down to roughly 5 QPS on a 32-core machine.
The key questions: how can the system filter metadata efficiently, improve hardware utilization, and increase service availability?
Overall Solution
Recommendation Framework
Overall Architecture
The recommendation pipeline follows the classic three-stage design: recall, ranking, and re-ranking. This article focuses on the ranking model.
Data Framework
Data flow and model framework are illustrated below.
Engineering Service Framework
Service Layers
Four layers:
Business layer – renders cards, secondary filtering, etc.
Strategy layer – selects recommendation strategies based on location, registration time, experiment ID.
Recall layer – fetches candidate cards from various recall sources.
Model layer – runs the algorithm models that generate the underlying data consumed by recall.
Data Flow
A typical flow for an LBS-based social product: user request → business aggregation → strategy selection → recall → candidate cards.
Ranking Model
Optimization Goals
Key metrics: like rate, match rate, chat rate, and ABA chat rate (both parties converse beyond one round). The model must improve these across the long conversion chain.
Algorithm Model
This is a multi-objective scenario with chained dependencies between goals. Two common solutions are multi-model fusion and multi-task learning; multi-task learning is chosen because each later task depends on the one before it and the training data grow increasingly sparse along the chain.
Base Model
Inspired by Alibaba’s 2018 ESMM, the model consists of:
Input feature layer – sparse one-hot features are embedded, multi-hot features are sum-pooled, and the results are concatenated with dense features.
Multi‑task tower – four towers predict right‑swipe, match after right‑swipe, chat after match, and ABA chat.
Output layer – combines predictions using conditional probability.
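A minimal sketch of the output layer's conditional-probability combination, in the spirit of ESMM (function and variable names here are illustrative, not Jimu's actual code):

```go
package main

import "fmt"

// chainProbs combines the four tower outputs into chained objectives.
// Each tower predicts a conditional probability, so multiplying along
// the chain gives the unconditional probability of reaching each funnel
// stage (the ESMM construction).
func chainProbs(pLike, pMatchGivenLike, pChatGivenMatch, pAbaGivenChat float64) (like, match, chat, aba float64) {
	like = pLike
	match = like * pMatchGivenLike // P(match) = P(like) * P(match|like)
	chat = match * pChatGivenMatch // P(chat)  = P(match) * P(chat|match)
	aba = chat * pAbaGivenChat     // P(ABA)   = P(chat) * P(ABA|chat)
	return
}

func main() {
	like, match, chat, aba := chainProbs(0.5, 0.4, 0.5, 0.5)
	fmt.Println(like, match, chat, aba) // 0.5 0.2 0.1 0.05
}
```

Training against the chained (unconditional) probabilities lets every task be supervised over the full exposure space rather than only over its sparse sub-population, which is the point of the ESMM design.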
Iterative Optimizations
Feature Layer Optimizations
Sequence Feature Processing
Replace simple sum-pooling with DIN-style attention, which weights historical behaviors by their relevance to the candidate card.
Feature Crossing Network
Adopt DCN‑V2 stacked cross network to learn explicit and implicit feature interactions.
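One DCN-V2 cross layer computes x_{l+1} = x0 ⊙ (W·x_l + b) + x_l, where ⊙ is element-wise multiplication; stacking layers yields explicit bounded-degree feature crosses while the deep tower handles implicit ones. A sketch with illustrative shapes:

```go
package main

import "fmt"

// crossLayer implements one DCN-V2 cross layer:
//   x_{l+1} = x0 ⊙ (W·x_l + b) + x_l
// x0 is the original input to the cross network, xl the output of the
// previous layer; W and b are learned. The residual "+ x_l" lets each
// layer add one more degree of crossing on top of the last.
func crossLayer(x0, xl []float64, w [][]float64, b []float64) []float64 {
	out := make([]float64, len(x0))
	for i := range out {
		var wx float64
		for j := range xl {
			wx += w[i][j] * xl[j]
		}
		out[i] = x0[i]*(wx+b[i]) + xl[i]
	}
	return out
}

func main() {
	x0 := []float64{1, 2}
	w := [][]float64{{1, 0}, {0, 1}} // identity weights for the demo
	b := []float64{0, 0}
	fmt.Println(crossLayer(x0, x0, w, b)) // first layer: x0⊙x0 + x0 → [2 6]
}
```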
Model Optimizations
Use PLE (Tencent, 2020) to resolve conflicts between tasks; the network remains built on a Wide & Deep-style architecture.
Multi‑Objective Loss Optimization
Apply trainable, uncertainty-based loss weights (Kendall et al., 2018) to balance the tasks dynamically.
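The weighting can be sketched as below, assuming the common log-variance parameterization s_i = log σ_i² (a practical variant of the formulation in the paper):

```go
package main

import (
	"fmt"
	"math"
)

// uncertaintyLoss combines per-task losses with trainable parameters
// s_i = log(sigma_i^2), following the homoscedastic-uncertainty
// weighting of Kendall et al. (2018):
//   L = sum_i exp(-s_i) * L_i + s_i
// The s_i are learned jointly with the network: a noisy (high-sigma)
// task is automatically down-weighted, while the "+ s_i" term keeps
// sigma from growing without bound.
func uncertaintyLoss(losses, logVars []float64) float64 {
	var total float64
	for i, l := range losses {
		total += math.Exp(-logVars[i])*l + logVars[i]
	}
	return total
}

func main() {
	losses := []float64{1, 0.5, 0.25, 0.25} // e.g. the four tower losses
	logVars := []float64{0, 0, 0, 0}        // equal weighting at init
	fmt.Println(uncertaintyLoss(losses, logVars)) // sums the raw losses: 2
}
```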
Model Performance
Compared with the previous MMoE model, the new PLE-based model improves AUC on all sub-tasks.
Vector Recall System
Optimization Goals
Increase throughput and availability, reduce resource cost, and support social recommendation at the scale of millions of users.
Read‑State Filtering Optimization
Switch from one global Bloom filter to a per-user RoaringBitmap, which reduces memory, improves lookup latency, and (unlike a Bloom filter) returns exact answers with no false positives.
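A simplified sketch of per-user read-state. This uses a plain uncompressed bitset to keep the example self-contained; the production system stores the same mark/seen operations in the compressed RoaringBitmap format, one bitmap per user.

```go
package main

import "fmt"

// readState is a minimal uncompressed bitset over card IDs, standing in
// for the per-user compressed RoaringBitmap used in production. Because
// each user owns a separate small structure, read-state checks no
// longer scan one huge shared filter.
type readState struct{ words []uint64 }

// mark records that this user has already been shown the card.
func (r *readState) mark(id uint32) {
	w := id / 64
	for uint32(len(r.words)) <= w {
		r.words = append(r.words, 0)
	}
	r.words[w] |= 1 << (id % 64)
}

// seen reports whether the card was shown before (exact, no false positives).
func (r *readState) seen(id uint32) bool {
	w := id / 64
	return w < uint32(len(r.words)) && r.words[w]&(1<<(id%64)) != 0
}

func main() {
	var rs readState // one instance per user, not one global structure
	rs.mark(3)
	rs.mark(130)
	fmt.Println(rs.seen(3), rs.seen(4), rs.seen(130)) // true false true
}
```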
Metadata and Index Optimization
Move from in‑memory hash indexes to PostgreSQL for flexible, real‑time updates, geolocation queries, and complex conditions.
<code>// Old in-memory design (simplified, Go-style): a user table plus one
// precomputed composite index per filter combination.
users := map[string]User{
	"10000": {
		Nickname: "nickname",
		Age:      "age",
		Location: "location",
		Gender:   "gender",
		Aim:      "aim",
		// ...
		Index: []string{"hash index keys for this uid"},
	},
	// ...
}
// Composite index: "age_gender" -> candidate uids.
index := map[string][]int{
	"18_female": {10000, 10001},
	"19_female": {10003, 10002},
	"18_male":   {10004, 10005},
	// ...
}
</code>
Replace the massive full-load data updates with incremental updates driven by a message queue.
Strategy Development Optimization
Integrate vector recall as a data source consistent with business data, enabling flexible mixing.
Optimization Results
The new architecture reduces machine resources by 82% while increasing QPS by 2,900%. It also unifies business-side and algorithm-side card recommendation, lowering the barrier to development.
Future Outlook
Continued collaboration between engineering and algorithms is needed as user scale and model complexity grow. Future work will focus on model size reduction, computational efficiency, and service stability for large models.
References
"Seeing through" multi-task learning in one article: https://cloud.tencent.com/developer/article/1824506
ESMM: https://arxiv.org/pdf/1804.07931.pdf
DIN: https://arxiv.org/pdf/1706.06978.pdf
PLE: https://dl.acm.org/doi/abs/10.1145/3383313.3412236
Wide & Deep: https://arxiv.org/abs/1606.07792
Multi-task learning with uncertainty weighting (Kendall et al., CVPR 2018): https://openaccess.thecvf.com/content_cvpr_2018/papers/Kendall_Multi-Task_Learning_Using_CVPR_2018_paper.pdf
RoaringBitmap advantages: https://www.jianshu.com/p/818ac4e90daf
PostgreSQL spatial index: https://www.alibabacloud.com/blog/spatial-search-geometry-and-gist-combination-outperforms-geohash-and-b-tree_597174
Common vector search engines: https://zhuanlan.zhihu.com/p/364923722
DCN-V2: https://arxiv.org/pdf/2008.13535.pdf
Inke Technology
Official account of Inke Technology