Design and Evolution of the DaJia App Search System
This article explains the motivations, requirements, and technical design of the DaJia app's search system, compares relational databases with Lucene‑based solutions, describes the inverted index mechanism, outlines common search workflows, and details the system's three iterative development phases and future improvement plans.
Before reading on, consider how you would design a search system for the DaJia app, then compare your ideas with the solution presented here.
Why Implement Search
Search lowers the effort users need to find a service and shortens the path to placing an order, which improves conversion rates.
Typical Search Implementation
Search requirements include handling large data volumes, supporting text and multi‑field queries, and delivering fast responses.
Relational databases such as MySQL can store large volumes of data, but they handle unstructured text search and multi‑field indexing poorly, and their query performance falls short of the fast responses that search requires.
Lucene‑based engines (Elasticsearch, Solr) excel because they support unstructured storage, rich text matching, and high‑performance multi‑field queries using an inverted index.
Inverted Index
An inverted index maps each term to the list of document IDs containing that term.
docid   age   gender
1       22    male
2       22    female
3       18    male
Separate inverted indexes are built for each field (e.g., age, gender), each consisting of a dictionary and posting list.
age     docid list
22      1,2
18      3
gender  docid list
male    1,3
female  2
Common Search Process
Typical steps include selecting the top‑N relevant documents, applying coarse ranking (e.g., by rating), performing fine‑grained re‑ranking for business‑specific rules, and finally returning the results.
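These stages can be sketched as a small pipeline. The Service class, its fields (relevance, rating, promoted), and the rule that promoted services come first are all hypothetical, chosen only to make each stage visible. The article does not specify the actual scoring or business rules.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    relevance: float        # hypothetical text-match score from the recall stage
    rating: float           # hypothetical user rating used for coarse ranking
    promoted: bool = False  # hypothetical business flag used in fine re-ranking


def recall_top_n(candidates, n):
    """Step 1: keep the N most relevant documents."""
    return sorted(candidates, key=lambda s: s.relevance, reverse=True)[:n]


def coarse_rank(services):
    """Step 2: coarse ranking, here by rating."""
    return sorted(services, key=lambda s: s.rating, reverse=True)


def fine_rerank(services):
    """Step 3: business rule (assumed): promoted services first.

    sorted() is stable, so the coarse order is kept within each group.
    """
    return sorted(services, key=lambda s: not s.promoted)


def search_pipeline(candidates, n=3):
    """Step 4: return the final ordered result names."""
    return [s.name for s in fine_rerank(coarse_rank(recall_top_n(candidates, n)))]


candidates = [
    Service("deep cleaning", relevance=0.9, rating=4.2),
    Service("window cleaning", relevance=0.8, rating=4.8),
    Service("sofa cleaning", relevance=0.7, rating=4.5, promoted=True),
    Service("moving", relevance=0.1, rating=5.0),
]
print(search_pipeline(candidates))
```

Here "moving" is dropped at recall despite its high rating, because ranking only operates on the documents that survive the top-N cut.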
DaJia App Search Iterations
Phase 1 – List Era
Search displays a list of services matching the query, with basic content matching, multi‑criteria sorting, and special handling for promoted or top‑ranked services.
Phase 2 – Self‑Operated Direct Booking Era
Prioritizes self‑operated and direct‑booking services, provides query suggestions, and recommends related services when no exact match is found, using multiple term dictionaries.
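A minimal sketch of dictionary-driven suggestions and fallback recommendations follows. The two dictionaries and their contents (SERVICE_TERMS, RELATED_TERMS) are invented for illustration. The article says only that multiple term dictionaries are used, not what they contain or how they are structured.

```python
# Hypothetical dictionaries; the real ones are not listed in the article.
SERVICE_TERMS = {"deep cleaning", "window cleaning", "nanny"}
RELATED_TERMS = {
    "babysitter": ["nanny"],
    "housekeeping": ["deep cleaning"],
}


def suggest(prefix, limit=5):
    """Query suggestions: known service terms that start with the typed prefix."""
    p = prefix.strip().lower()
    return sorted(t for t in SERVICE_TERMS if t.startswith(p))[:limit]


def lookup(query):
    """Return ("exact", terms) on a direct hit, else ("related", terms).

    The related list comes from the related-term dictionary and may be empty.
    """
    q = query.strip().lower()
    if q in SERVICE_TERMS:
        return "exact", [q]
    return "related", RELATED_TERMS.get(q, [])


print(suggest("w"))
print(lookup("Nanny"))
print(lookup("babysitter"))
```

With this shape, a query with no exact match still returns something useful whenever the related-term dictionary covers it, so result quality depends directly on how well the dictionaries are maintained.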
Phase 3 – Semi‑Automatic Era
Automates term‑dictionary maintenance by mining user search logs, reducing manual effort and the chance of errors, while also decreasing the rate of zero‑result queries.
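One way to mine logs for dictionary candidates is to count zero-result queries and surface the frequent ones for review. The sketch below assumes a log of (query, result_count) pairs and a min_count threshold; both are assumptions made for illustration, since the article does not describe the log format or the mining rules.

```python
from collections import Counter


def mine_candidate_terms(search_log, known_terms, min_count=2):
    """Return frequent zero-result queries that are not yet in the dictionary.

    search_log is an iterable of (query, result_count) pairs (assumed format).
    Candidates are ordered from most to least frequent.
    """
    counts = Counter(
        query.strip().lower()
        for query, result_count in search_log
        if result_count == 0 and query.strip()
    )
    return [
        term
        for term, count in counts.most_common()
        if count >= min_count and term not in known_terms
    ]


log = [
    ("babysitter", 0),
    ("Babysitter", 0),
    ("nanny", 12),
    ("sofa wash", 0),
    ("babysitter ", 0),
    ("sofa wash", 0),
    ("xyz", 0),
]
print(mine_candidate_terms(log, known_terms={"nanny"}))
```

In a semi-automatic setup, a list like this would still go to a person who decides which candidates to map to existing services, which keeps noise such as one-off typos out of the dictionary.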
Future Development
Plans include continuously improving accuracy by expanding the term dictionary, enlarging the product catalog (e.g., adding household‑helper data), and enhancing query parsing with part‑of‑speech analysis to avoid irrelevant results.
Swan Home Tech Team
Official account of Swan Home's Technology Center, covering FE, Native, Java, QA, BI, Ops and more. We regularly share technical articles, events, and updates. Swan Home centers on home scenarios, using doorstep services as a gateway, and leverages an innovative “Internet + life services” model to deliver one‑stop, standardized, professional home services.