
Scaling Card‑Based Social Matching with Multi‑Task AI Models and Efficient Backend

This article details the design and optimization of Jimu’s card‑based stranger‑social recommendation system, covering product background, gameplay flow, technical challenges in strategy and engineering, a multi‑task AI ranking model, vector recall improvements, and the resulting performance gains.

Inke Technology

Background

In recent years, stranger-social apps in China have gained users rapidly, and many products now present opposite-sex profiles as cards or lists. This article uses the card-based stranger-social app "Jimu" as a case study to describe a card recommendation architecture from both the algorithmic and the engineering perspective.

Product Overview

Jimu is the first entertainment‑social platform focused on youth culture. It matches users based on interests and purposes, providing a “card” UI to quickly find like‑minded friends. The app includes card and community scenes, chat rooms, voice rooms, etc. This article mainly introduces the engineering and algorithmic optimizations for the card scene.

Card Gameplay

Step 1: The user clicks the "Filter" button, sets basic criteria, and sees cards that meet them.

Step 2: Swipe right to like, left to reject; a match occurs when two users both swipe right on each other.

Step 3: Matched users can start chatting.

Technical Challenges

Recommendation Strategy Challenges

The conversion chain (right-swipe → match → chat) is long, which creates many sub-goals to optimize. Because the probability of a right-swipe is low and the chain is long, users churn if they do not receive matches in time.

Engineering System Challenges

The core job of the card recommendation system is to deliver the right cards to each user. Concretely, the system must:

Provide usable data sources.

Support condition filtering on the data via indexes.

Prevent duplicate cards (read-state maintenance).

It must also scale to millions of users. The original design loaded all user data and indexes into memory on every machine; as the user count grew, problems emerged:

Memory consumption near 100 GB per machine.

Data updates required transferring full files over FTP and reloading them, causing tens of minutes of downtime.

Read‑state Bloom filter was too large, resulting in low retrieval efficiency and QPS around 5 on a 32‑core machine.

Key questions: how to provide efficient metadata filtering, improve hardware utilization, and increase service availability?

Overall Solution

Recommendation Framework

Overall Architecture

The recommendation pipeline follows the classic three-stage design of recall, ranking, and re-ranking. This article focuses on the ranking model.

Data Framework

Data flow and model framework are illustrated below.

Engineering Service Framework

Service Layers

Four layers:

Business layer – renders cards, secondary filtering, etc.

Strategy layer – selects recommendation strategies based on location, registration time, experiment ID.

Recall layer – fetches candidate cards from various recall sources.

Model layer – runs algorithm models to produce basic data for recall.

Data Flow

Typical LBS‑based social product flow: user request → business aggregation → strategy selection → recall → candidate cards.

Ranking Model

Optimization Goals

Key metrics: like rate, match rate, chat rate, and ABA chat rate (both parties converse beyond one round). The model must improve these across the long conversion chain.

Algorithm Model

This is a multi-objective scenario in which the objectives depend on one another along a chain. Two common solutions are multi-model fusion and multi-task learning. Multi-task learning was chosen because the later tasks depend on the earlier ones and their training data become increasingly sparse, so sharing representations across tasks helps.

Base Model

Inspired by Alibaba’s 2018 ESMM, the model consists of:

Input feature layer – sparse one-hot features are embedded, multi-hot features are embedded and sum-pooled, and the results are concatenated with dense features.

Multi‑task tower – four towers predict right‑swipe, match after right‑swipe, chat after match, and ABA chat.

Output layer – combines predictions using conditional probability.
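Under the ESMM formulation, each downstream probability is the product of the conditional probabilities along the chain, so every task is predicted over the full impression space. A minimal sketch of the combination step (the tower outputs here are stand-in values, not the production network):

```go
package main

import "fmt"

// TowerOutputs holds the conditional probabilities predicted by the
// four towers along the conversion chain.
type TowerOutputs struct {
	PLike         float64 // P(right-swipe | impression)
	PMatchGivenL  float64 // P(match | right-swipe)
	PChatGivenM   float64 // P(chat | match)
	PAbaGivenChat float64 // P(ABA chat | chat)
}

// Chain multiplies the conditional probabilities so every sub-goal is
// expressed over the entire impression space (the ESMM idea).
func Chain(t TowerOutputs) (pMatch, pChat, pAba float64) {
	pMatch = t.PLike * t.PMatchGivenL
	pChat = pMatch * t.PChatGivenM
	pAba = pChat * t.PAbaGivenChat
	return
}

func main() {
	pMatch, pChat, pAba := Chain(TowerOutputs{0.2, 0.3, 0.5, 0.4})
	fmt.Println(pMatch, pChat, pAba)
}
```

Training on these chained products, rather than on each conditional task alone, avoids the sample-selection bias of fitting later tasks only on the ever-smaller populations that survived the earlier steps.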

Iterative Optimizations

Feature Layer Optimizations

Sequence Feature Processing

Replace simple sum-pooling of the behavior sequence with DIN-style attention, which gives higher weight to the historical interests most related to the candidate card.
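As a toy illustration of the idea (not Jimu's implementation; DIN itself learns the relevance score with a small activation MLP), a softmax-weighted pooling over behavior embeddings might look like:

```go
package main

import (
	"fmt"
	"math"
)

// attentionPool weights each historical-behavior embedding by its
// relevance to the candidate card before pooling, instead of summing
// all behaviors equally. A dot-product score stands in for DIN's
// learned activation unit.
func attentionPool(history [][]float64, candidate []float64) []float64 {
	scores := make([]float64, len(history))
	var sum float64
	for i, h := range history {
		var dot float64
		for j := range h {
			dot += h[j] * candidate[j]
		}
		scores[i] = math.Exp(dot) // softmax numerator
		sum += scores[i]
	}
	pooled := make([]float64, len(candidate))
	for i, h := range history {
		w := scores[i] / sum // normalized attention weight
		for j := range h {
			pooled[j] += w * h[j]
		}
	}
	return pooled
}

func main() {
	history := [][]float64{{1, 0}, {0, 1}}
	candidate := []float64{1, 0} // similar to the first behavior
	fmt.Println(attentionPool(history, candidate))
}
```

The behavior that resembles the candidate dominates the pooled vector, which is exactly the property sum-pooling lacks.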

Feature Crossing Network

Adopt DCN‑V2 stacked cross network to learn explicit and implicit feature interactions.
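Each DCN-V2 cross layer computes x_{l+1} = x0 ⊙ (W·xl + b) + xl, so stacking layers raises the order of explicit feature interactions by one per layer. A minimal sketch with toy weights (W and b are learned in the real model):

```go
package main

import "fmt"

// crossLayer implements one DCN-V2 cross layer:
//   x_{l+1} = x0 ⊙ (W·xl + b) + xl
// x0 is the original input, xl the previous layer's output.
func crossLayer(x0, xl []float64, w [][]float64, b []float64) []float64 {
	n := len(x0)
	out := make([]float64, n)
	for i := 0; i < n; i++ {
		var wx float64
		for j := 0; j < n; j++ {
			wx += w[i][j] * xl[j]
		}
		out[i] = x0[i]*(wx+b[i]) + xl[i] // element-wise cross + residual
	}
	return out
}

func main() {
	x0 := []float64{1, 2}
	w := [][]float64{{1, 0}, {0, 1}} // identity weights for illustration
	b := []float64{0, 0}
	x1 := crossLayer(x0, x0, w, b) // first layer: xl = x0
	fmt.Println(x1)                // with identity W: x0⊙x0 + x0
}
```

The residual term (+xl) keeps the identity mapping available, which is what lets the stacked layers learn bounded-degree polynomial interactions stably.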

Model Optimizations

Adopt PLE (Tencent, 2020) to resolve conflicts (negative transfer) between tasks; the overall network remains built on the Wide & Deep architecture.

Multi‑Objective Loss Optimization

Apply trainable loss weights (Kendall et al., 2018) so that the four tasks' losses are balanced dynamically during training rather than with hand-tuned constants.
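The idea can be sketched with the common exp(−s) parameterization of Kendall et al.'s uncertainty weighting, where s_i = log σ_i² is a trainable scalar per task; the loss values below are stand-ins:

```go
package main

import (
	"fmt"
	"math"
)

// uncertaintyWeightedLoss combines per-task losses using trainable
// log-variance parameters (Kendall et al., 2018):
//   L = Σ_i exp(-s_i)·L_i + s_i
// A task with high learned uncertainty (large s_i) is down-weighted,
// while the +s_i regularizer stops s_i from growing without bound.
func uncertaintyWeightedLoss(losses, logVars []float64) float64 {
	var total float64
	for i, l := range losses {
		total += math.Exp(-logVars[i])*l + logVars[i]
	}
	return total
}

func main() {
	losses := []float64{0.8, 0.4, 0.2, 0.1}  // four tower losses (stand-ins)
	logVars := []float64{0.0, 0.0, 0.0, 0.0} // trainable, init at 0
	fmt.Println(uncertaintyWeightedLoss(losses, logVars))
}
```

With all s_i = 0 the result is just the unweighted sum; during training the optimizer moves each s_i to rebalance the tasks.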

Model Performance

Compared with the previous MMoE-based model, the new PLE-based model improves AUC on all sub-tasks.

Vector Recall System

Optimization Goals

Increase throughput and availability, reduce resource cost, and support recommendation at a scale of millions of users.

Read‑State Filtering Optimization

Switch from a global Bloom filter to a per-user RoaringBitmap, which reduces memory use, eliminates false positives, and improves retrieval latency.
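The read-state interface can be sketched as follows; a plain Go set stands in for the bitmap (the production version would use a compressed-bitmap library such as the Go RoaringBitmap implementation, which supports fast AND-NOT between a candidate set and the seen set):

```go
package main

import "fmt"

// ReadState tracks which card uids a user has already been shown.
// Keeping it per-user, rather than in one global structure, bounds
// each lookup to that user's own history.
type ReadState map[uint32]struct{}

// MarkSeen records that the user has been shown this card.
func (r ReadState) MarkSeen(uid uint32) { r[uid] = struct{}{} }

// FilterUnseen drops candidates the user has already seen. Unlike a
// Bloom filter, an exact set never wrongly discards an unseen card.
func (r ReadState) FilterUnseen(candidates []uint32) []uint32 {
	out := make([]uint32, 0, len(candidates))
	for _, uid := range candidates {
		if _, seen := r[uid]; !seen {
			out = append(out, uid)
		}
	}
	return out
}

func main() {
	rs := ReadState{}
	rs.MarkSeen(10001)
	fmt.Println(rs.FilterUnseen([]uint32{10000, 10001, 10002}))
}
```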

Metadata and Index Optimization

Move metadata from in-memory hash indexes into PostgreSQL, which supports flexible real-time updates, geolocation queries, and complex filter conditions. The legacy in-memory layout looked roughly like this:

<code>// Legacy design: all user profiles plus hand-built inverted indexes
// were held in memory on every serving machine.
type User struct {
    Nickname string
    Age      int
    Location string
    Gender   string
    Aim      string
    // ...
    Indexes []string // hash-index keys this uid belongs to
}

// Primary store: uid → profile.
users := map[string]User{
    "10000": {Nickname: "nickname", Age: 18, Gender: "female" /* ... */},
    // ...
}

// Inverted index: "<age>_<gender>" → candidate uids.
index := map[string][]string{
    "18_female": {"10000", "10001"},
    "19_female": {"10003", "10002"},
    "18_male":   {"10004", "10005"},
    // ...
}
</code>

Replace the massive full-data reloads with incremental updates driven by a message queue.
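A minimal sketch of the consumer side, assuming a simple upsert/delete message shape (the article does not describe the real message schema):

```go
package main

import "fmt"

// UpdateMsg is one incremental change consumed from the message queue,
// replacing the old full FTP reload. The fields are illustrative.
type UpdateMsg struct {
	Op  string // "upsert" or "delete"
	UID string
	Age int
}

// apply replays each message against the serving store as it arrives,
// so the data stays fresh without any reload downtime.
func apply(store map[string]int, msgs []UpdateMsg) {
	for _, m := range msgs {
		switch m.Op {
		case "upsert":
			store[m.UID] = m.Age
		case "delete":
			delete(store, m.UID)
		}
	}
}

func main() {
	store := map[string]int{"10000": 18}
	apply(store, []UpdateMsg{
		{Op: "upsert", UID: "10001", Age: 19},
		{Op: "delete", UID: "10000"},
	})
	fmt.Println(store)
}
```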

Strategy Development Optimization

Integrate vector recall as a data source consistent with business data, enabling flexible mixing.

Optimization Results

The new architecture reduces machine resources by 82 % while QPS increases by 2,900 %. It also unifies business-rule and algorithm-driven card recommendation, lowering the barrier to developing new strategies.

Future Outlook

Continued collaboration between engineering and algorithms is needed as user scale and model complexity grow. Future work will focus on model size reduction, computational efficiency, and service stability for large models.

References

Multi-task learning overview (in Chinese): https://cloud.tencent.com/developer/article/1824506

ESMM: https://arxiv.org/pdf/1804.07931.pdf

DIN: https://arxiv.org/pdf/1706.06978.pdf

PLE: https://dl.acm.org/doi/abs/10.1145/3383313.3412236

Wide & Deep: https://arxiv.org/abs/1606.07792

Multi-task uncertainty weighting (Kendall et al.): https://openaccess.thecvf.com/content_cvpr_2018/papers/Kendall_Multi-Task_Learning_Using_CVPR_2018_paper.pdf

RoaringBitmap advantages: https://www.jianshu.com/p/818ac4e90daf

PostgreSQL spatial index: https://www.alibabacloud.com/blog/spatial-search-geometry-and-gist-combination-outperforms-geohash-and-b-tree_597174

Common vector search engines: https://zhuanlan.zhihu.com/p/364923722

DCN-V2: https://arxiv.org/pdf/2008.13535.pdf
