How LORE Transforms E‑Commerce Search Relevance with Generative AI
The article details the development and deployment of LORE, a large generative model that reshapes e‑commerce search relevance by combining knowledge injection, chain‑of‑thought reasoning, and multimodal alignment, achieving simultaneous improvements in user experience and revenue metrics.
Overview
Search relevance in e‑commerce must simultaneously satisfy user experience and revenue goals. Traditional models lacked deep semantic understanding, causing a zero‑sum trade‑off between experience and RPM. The LORE system (Large Generative Model for Search Relevance) was introduced to break this deadlock. Since 2024, LORE has delivered a 27% lift in good‑rate and a 2% increase in RPM by reconstructing the relevance pipeline.
Core Insight: Deconstructing Relevance
Relevance judgment can be split into three orthogonal dimensions:
Product Understanding: Accurate perception of all product attributes (textual and visual) to handle diverse query expressions.
Query Understanding: Extraction of user intent from noisy, unstructured inputs, including colloquial terms, synonyms, and implicit meanings.
Path Modeling: Construction of a matching path that connects the interpreted intent to a concrete product; difficulty grows with query complexity and attribute specificity.
Effective relevance therefore requires:
Broad e‑commerce commonsense knowledge combined with deep domain expertise.
Multi‑step logical reasoning to resolve conflicts between intent and product data.
Multimodal perception to align visual cues (e.g., style, material) with textual demands.
Strict alignment with detailed business relevance standards.
Technical Evolution of LORE
Phase 1 – Knowledge (Foundation)
Challenge: Missing domain “common sense” caused entity‑recognition errors and poor attribute comprehension.
Solution: Built a large, high‑quality e‑commerce knowledge graph and injected it into the model via In‑Context Learning (ICL). This created a solid knowledge base that the model can query at inference time.
Result: Enabled a high‑accuracy offline copilot for data labeling and bad‑case monitoring, reducing annotation cost and improving supervision quality.
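The knowledge-injection idea can be sketched as follows: retrieve knowledge-graph facts for entities mentioned in the query and prepend them to the relevance-judgment prompt as in-context evidence. The function names, the toy graph, and the prompt wording below are illustrative assumptions, not the actual LORE implementation.

```python
# Sketch of knowledge injection via In-Context Learning (ICL).
# The knowledge graph, retrieval heuristic, and prompt format are
# hypothetical stand-ins for the production system.

KNOWLEDGE_GRAPH = {
    "moissanite": ["moissanite is-a gemstone", "moissanite simulates diamond"],
    "gore-tex": ["gore-tex is-a waterproof fabric", "gore-tex used-in outdoor jackets"],
}

def retrieve_facts(query: str, max_facts: int = 4) -> list[str]:
    """Look up KG triples whose head entity appears in the query."""
    facts = []
    for entity, triples in KNOWLEDGE_GRAPH.items():
        if entity in query.lower():
            facts.extend(triples)
    return facts[:max_facts]

def build_icl_prompt(query: str, product_title: str) -> str:
    """Prepend retrieved domain facts to the relevance-judgment prompt."""
    facts = retrieve_facts(query)
    knowledge = "\n".join(f"- {f}" for f in facts) or "- (no facts found)"
    return (
        "Domain knowledge:\n"
        f"{knowledge}\n\n"
        f"Query: {query}\n"
        f"Product: {product_title}\n"
        "Is the product relevant to the query? Answer Relevant/Irrelevant."
    )

prompt = build_icl_prompt("gore-tex hiking jacket", "Men's waterproof shell, Gore-Tex")
```

The key point is that domain facts arrive at inference time through the context window, so the base model needs no retraining when the knowledge graph is updated.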
Phase 2 – Reasoning (Chain‑of‑Thought)
Challenge: Complex, ambiguous queries required multi‑step logical inference and path construction.
Solution: Integrated Chain‑of‑Thought prompting that forces the model to emit a full reasoning chain: entity identification → intent analysis → attribute constraint → path comparison. A data‑synthesis engine generated hard cases to enrich training.
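A prompt template mirroring that four-step chain might look like the sketch below. The exact wording is an assumption; the article does not publish LORE's actual prompt.

```python
# Illustrative Chain-of-Thought template for relevance judgment,
# following the chain: entity identification -> intent analysis ->
# attribute constraint -> path comparison. Wording is hypothetical.

COT_TEMPLATE = """Judge relevance step by step.
Query: {query}
Product: {product}

Step 1 (entity identification): list the entities in the query.
Step 2 (intent analysis): state what the user is really looking for.
Step 3 (attribute constraint): check each required attribute against the product.
Step 4 (path comparison): decide whether a valid matching path exists.

Final answer (Relevant/Irrelevant):"""

def cot_prompt(query: str, product: str) -> str:
    """Fill the CoT template with a concrete query-product pair."""
    return COT_TEMPLATE.format(query=query, product=product)

p = cot_prompt("red leather sofa", "3-seat fabric couch, grey")
```

Forcing the model to emit each step makes failures auditable: an annotator can see whether a bad verdict came from a missed entity, a wrong intent, or a violated attribute constraint.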
Result: Developed a multidimensional knowledge‑distillation framework that transfers the deep reasoning ability of the large model to lightweight online models, markedly improving online experience while keeping latency low.
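The core of such distillation can be sketched with the standard temperature-scaled formulation: the student is trained to match the teacher's softened output distribution via KL divergence. This is the generic technique, not LORE's specific "multidimensional" framework; it is written in pure Python for clarity, where a real trainer would use a tensor library.

```python
# Minimal logit-distillation sketch: KL(teacher || student) on
# temperature-softened distributions. Pure-Python stand-in for the
# loss term a production trainer would backpropagate through.
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution, optionally softened."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's to the teacher's soft distribution."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that reproduces the teacher's logits incurs zero loss;
# a student that disagrees is penalized.
loss_same = distill_loss([2.0, -1.0, 0.5], [2.0, -1.0, 0.5])
loss_diff = distill_loss([2.0, -1.0, 0.5], [-1.0, 2.0, 0.5])
```

Raising the temperature exposes the teacher's "dark knowledge" (relative probabilities of wrong classes), which is what lets a small online model inherit reasoning behavior rather than just hard labels.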
Phase 3 – Alignment (Multimodal & Rules)
Challenge: Visual attribute understanding and strict adherence to extensive business rules.
Solution: Upgraded the backbone to Qwen‑VL, a multimodal model capable of interpreting product images. Business rules were explicitly encoded into the CoT path and reinforced with reinforcement learning (RL) to improve rule‑guided inference.
Result: Overcame compute bottlenecks by combining high‑frequency caching for head queries with real‑time inference for complex tail cases, bringing large‑model judgment online at scale.
System Refactor: Harnessing Model Benefits
Beyond improving the model itself, the ad‑serving pipeline was re‑architected so the model's judgments could be applied at scale.
Control Front‑Loading: Relevance filtering was moved upstream to the candidate pre‑selection (“sea‑selection”) stage, increasing relevant ad supply by 32% and improving auction density.
Compute Tiering : Three compute tiers were defined to balance latency, QPS, and cost:
Head‑Q (high‑frequency): Offline pre‑compute + cache (cache‑aside) for frequent query‑product pairs, delivering near‑zero latency.
Hard‑Query (tail): Real‑time inference with the full LORE model for low‑frequency, ambiguous, or newly emerging queries.
Mid‑Tier (regular): Distilled lightweight models derived from LORE via knowledge distillation, handling the bulk of traffic with high QPS and low RT.
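The three tiers above compose into a simple routing policy: try the pre-computed cache first, escalate hard queries to the full model, and serve everything else with the distilled model, populating the cache for hot pairs (cache-aside). The thresholds, model stubs, and hard-query flag below are illustrative assumptions.

```python
# Sketch of the three-tier serving scheme with a cache-aside policy.
# Model functions are stubs; thresholds are hypothetical.

CACHE = {("iphone 15 case", "sku123"): "Relevant"}  # offline pre-computed head pairs

def distilled_model(query, sku):
    """Lightweight distilled model serving the bulk of traffic (stub)."""
    return "Relevant"

def full_lore_model(query, sku):
    """Full large model for real-time inference on hard queries (stub)."""
    return "Relevant"

def judge_relevance(query, sku, query_freq, hard_query=False):
    """Route a query-product pair to the cheapest adequate tier."""
    key = (query, sku)
    if key in CACHE:                       # Head-Q: near-zero-latency cache hit
        return CACHE[key], "cache"
    if hard_query:                         # Hard-Query: escalate to the full model
        return full_lore_model(query, sku), "full"
    verdict = distilled_model(query, sku)  # Mid-Tier: distilled model
    if query_freq > 1000:                  # cache-aside: store hot pairs on miss
        CACHE[key] = verdict
    return verdict, "distilled"
```

The design choice is cost-shaped: the expensive model only runs where cheap tiers are insufficient, which is what makes large-model judgment affordable at full traffic volume.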
Results and Impact
The redesign turned relevance from a pure blocker into a supply optimizer, delivering:
+27% improvement in good‑rate (user‑experience metric).
+2% increase in RPM (revenue per mille).
+32% growth in relevant ad supply, leading to higher auction density and better monetization.
Future Directions
Planned extensions include:
Using LORE’s generative capabilities for proactive ad‑creative generation and automatic product‑attribute completion.
Leveraging model confidence scores to refine relevance standards and identify ambiguous rule regions.
Further optimizing compute allocation to achieve the best trade‑off between latency, cost, and accuracy.
Technical report: LORE: A Large Generative Model for Search Relevance (https://arxiv.org/abs/2512.03025)
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.