Supply Standardization for Script‑Murder Business Using a Knowledge Graph
Meituan’s To‑Store Integrated data team built an end‑to‑end supply‑standardization pipeline for the rapidly growing script‑murder market. The team extended the GENE knowledge graph to mine merchant supply, built a unified script library through rule‑based, semantic, and multimodal clustering, and linked products and user‑generated content to the standardized scripts. The result supports a dedicated category, personalized recommendations, filter tags, and improved ranking.
The script‑murder (剧本杀) business has experienced explosive growth, but existing category structures, merchant onboarding, and supply‑demand matching are insufficient. Standardizing the supply can create value for users, merchants, and the platform.
This article describes how Meituan’s “To‑Store Integrated” data team built a supply‑standardization pipeline for script‑murder from scratch, leveraging the General Needs net (GENE) knowledge graph. The pipeline consists of three core stages: script‑murder supply mining, construction of a standard script library, and linking supply items to standard scripts.
1. Background
Three main pain points are identified:
Missing dedicated category for script‑murder, causing chaotic user decision paths.
Low decision efficiency because scripts lack a unified library and are not linked to supply.
Labor‑intensive merchant product entry, leading to low upload rates.
Standardizing supply addresses these issues by creating a new category, a central script library, and explicit relationships among merchants, products, and content.
2. Solution Overview
The team extended the existing GENE knowledge graph to the script‑murder domain, constructing entities (script names, attributes, merchants, products, content) and their multi‑type relationships. The solution is implemented in three major steps.
3. Implementation Methods
3.1 Script‑Murder Supply Mining
Supply mining is treated as a classification problem over multiple text sources: deciding whether a merchant offers script‑murder based on its merchant name, product names, product details, and merchant UGC. Because labeled data is limited, a hybrid approach combines unsupervised matching with supervised fitting:
Unsupervised matching: Build a keyword dictionary and perform exact matches, then use a BERT‑based semantic‑drift model to filter out matches whose meaning has drifted away from script‑murder, and compute a matching score per source.
Supervised fitting: Manually label a small set of merchants, train a linear regression model to weight source scores, producing a final merchant confidence score.
This yields high‑precision merchant extraction, which feeds downstream product mining and category creation.
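The two steps above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the keyword list, the `source_score` definition (fraction of texts containing a keyword), the toy training data, and the gradient-descent fit are all assumptions, and the BERT-based semantic-drift filter is omitted entirely.

```python
from typing import Dict, List, Tuple

SOURCES = ["merchant_name", "product_name", "product_detail", "ugc"]
KEYWORDS = {"剧本杀", "推理馆"}  # hypothetical keyword dictionary


def source_score(texts: List[str]) -> float:
    """Assumed score: fraction of texts in one source that contain a keyword."""
    if not texts:
        return 0.0
    hits = sum(any(k in t for k in KEYWORDS) for t in texts)
    return hits / len(texts)


def merchant_features(merchant: Dict[str, List[str]]) -> List[float]:
    """One matching score per source, in the fixed order of SOURCES."""
    return [source_score(merchant.get(s, [])) for s in SOURCES]


def fit_linear(X: List[List[float]], y: List[float],
               lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Least-squares linear regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def confidence(features: List[float], w: List[float], b: float) -> float:
    """Final merchant confidence: weighted sum of the per-source scores."""
    return sum(wj * xj for wj, xj in zip(w, features)) + b


# Toy labeled set (made up): per-source scores and a 0/1 label per merchant.
X_train = [[1.0, 1.0, 1.0, 1.0], [1.0, 0.5, 0.5, 0.0],
           [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.5]]
y_train = [1.0, 1.0, 0.0, 0.0]
weights, bias = fit_linear(X_train, y_train)
```

A new merchant would then be scored with `confidence(merchant_features(merchant), weights, bias)` and accepted above a threshold chosen on held-out labels.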
3.2 Standard Script Library Construction
The library consists of standard script names and their attributes (genre, specification, difficulty, background, etc.). Three aggregation methods are applied:
Rule aggregation: Clean product titles, filter non‑script tokens, and use longest common subsequence (LCS) similarity to cluster variants such as “舍离”, “舍离壹”, “舍离1”.
Semantic aggregation: Use a dual‑tower Sentence‑BERT model to embed script titles and compute cosine similarity. Training data are generated from rule‑aggregated clusters (positive pairs) and cross‑cluster pairs (negative pairs), with active learning and data augmentation (typos, synonyms).
Multimodal aggregation: Incorporate product images using a pre‑trained EfficientNet encoder. Text and image embeddings are concatenated and matched via cosine similarity, improving recall for scripts that differ textually but share visual cues.
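As an illustration of the rule aggregation step, the sketch below clusters cleaned titles by LCS similarity. The cleaning regex, the normalization `2 * LCS / (len(a) + len(b))`, the 0.75 threshold, and the greedy compare-to-first-member strategy are assumptions made for this example; the source does not specify them.

```python
import re
from typing import List

# Hypothetical cleaning rule: drop bracketed tags, player counts, whitespace.
NOISE = re.compile(r"【[^】]*】|\d+人(?:本|场)?|\s+")


def clean_title(title: str) -> str:
    """Strip tokens that are not part of the script name."""
    return NOISE.sub("", title)


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ca == cb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Assumed normalization: 2 * LCS / (len(a) + len(b)), in [0, 1]."""
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def cluster_titles(titles: List[str], threshold: float = 0.75) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters: List[List[str]] = []
    for title in titles:
        for cluster in clusters:
            if lcs_similarity(title, cluster[0]) >= threshold:
                cluster.append(title)
                break
        else:
            clusters.append([title])
    return clusters


titles = [clean_title(t) for t in ["舍离", "【拼场】舍离壹 6人", "年轮", "舍离1"]]
print(cluster_titles(titles))  # [['舍离', '舍离壹', '舍离1'], ['年轮']]
```

Comparing only against the first member keeps the sketch short but makes the result depend on input order, which is one reason a semantic or multimodal stage is useful on top of the rules.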
Attribute extraction is performed by voting over all products belonging to a cluster; the most frequent attribute value is selected and later verified by human reviewers.
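The voting step can be sketched as a per-attribute majority vote. The attribute names, the product dictionaries, and the agreement ratio returned alongside each winner (which could help prioritize human review) are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def vote_attributes(
    products: List[Dict[str, Optional[str]]]
) -> Dict[str, Tuple[str, float]]:
    """For each attribute, pick the most frequent non-empty value in the cluster.

    Returns {attribute: (winning_value, share_of_votes)}.
    """
    votes: Dict[str, Counter] = {}
    for product in products:
        for attr, value in product.items():
            if value:  # skip missing or empty values
                votes.setdefault(attr, Counter())[value] += 1

    result: Dict[str, Tuple[str, float]] = {}
    for attr, counter in votes.items():
        value, count = counter.most_common(1)[0]
        result[attr] = (value, count / sum(counter.values()))
    return result


# Hypothetical products that were all clustered under one standard script.
cluster = [
    {"genre": "emotional", "difficulty": "medium"},
    {"genre": "emotional", "difficulty": "hard"},
    {"genre": "horror", "difficulty": "medium"},
    {"genre": "emotional", "difficulty": None},
]
script_attributes = vote_attributes(cluster)
```

A low share of votes for the winning value is a natural signal that the cluster mixes different scripts or that merchants disagree, so such attributes are good candidates for the human verification mentioned above.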
3.3 Linking Supply to Standard Scripts
After the library is built, the following association tasks are performed:
Product‑script linking: Match new products to existing standard scripts using the multimodal model; unmatched products trigger a new name/attribute extraction workflow.
Content‑script linking: Align user‑generated content (UGC) with scripts at the clause level. A two‑stage pipeline first recalls clauses containing script names or aliases, then ranks them with an aspect‑based BERT classifier (sentence‑pair input with [SEP]).
Both linking steps start from limited labeled data (a few hundred examples), which is augmented through active learning; regularized dropout (R‑Drop) is used during training to reach production‑grade accuracy.
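The recall stage of content‑script linking can be sketched as below. The punctuation-based clause splitter, the alias table, and the `to_pair_input` text layout are assumptions for illustration; the ranking stage (the aspect-based BERT classifier) is not implemented here, and the real tokenizer would add its own special tokens.

```python
import re
from typing import Dict, List, Tuple

# Assumed clause boundaries: common Chinese and English punctuation, newlines.
CLAUSE_SPLIT = re.compile(r"[，。！？；,.!?;\n]+")


def split_clauses(text: str) -> List[str]:
    """Split a review into non-empty, trimmed clauses."""
    return [c.strip() for c in CLAUSE_SPLIT.split(text) if c.strip()]


def recall_clauses(review: str,
                   alias_table: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Stage 1: keep (script, clause) pairs where the clause mentions the
    standard script name or one of its aliases."""
    pairs: List[Tuple[str, str]] = []
    for clause in split_clauses(review):
        for script, aliases in alias_table.items():
            if any(name in clause for name in (script, *aliases)):
                pairs.append((script, clause))
    return pairs


def to_pair_input(script: str, clause: str) -> str:
    """Stage 2 input sketch: script name and clause joined as a sentence pair."""
    return f"{script} [SEP] {clause}"


alias_table = {"舍离": ["舍离壹", "舍离1"]}  # hypothetical alias table
review = "We played 舍离 today, the story was moving. The room was small."
candidates = recall_clauses(review, alias_table)
print([to_pair_input(s, c) for s, c in candidates])
# ['舍离 [SEP] We played 舍离 today']
```

Recall by string containment is deliberately loose; the classifier in the second stage is what decides whether a recalled clause is really about the script.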
4. Application Practice
The knowledge graph now powers several Meituan scenarios:
Category construction: New script‑murder category and list pages are live, providing a centralized traffic entry.
Personalized recommendation: Standard script nodes feed hot‑script recommendation, cross‑scene suggestions, and “playable store” recommendations. A Deep Interest Network (DIN) model incorporates script‑level sequences derived from the graph.
Information exposure & filtering: Script attributes and supply links are displayed as filter tags on list pages, reducing user decision cost.
Scoring & ranking: Script‑UGC associations contribute to script scores, enabling classic‑must‑play and trending‑script leaderboards.
5. Summary & Outlook
The team demonstrated a rapid, end‑to‑end pipeline that builds a domain‑specific knowledge graph, standardizes supply, and drives measurable product improvements. Future work includes continuous enrichment of the script library, deeper exploration of scenario‑based user needs, and extending graph data to search and other downstream services.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
