
Search Engineering Architecture: Lessons from Zhihu and 58 Group

The article summarizes the evolution and redesign of Zhihu's search engine, details 58 Group's high‑performance uesearch architecture, real‑time indexing mechanisms, cloud‑native deployment with Kubernetes, and highlights key technical insights and future directions for large‑scale search systems.

58 Tech

Background

On January 21, 2019, the 58 Group Technical Salon (Session 8 – Search Engineering Architecture) was held at the Beijing headquarters, featuring speakers from Zhihu's search team and 58 Group's TEG search team who shared their practical experiences.

1. Zhihu Search Architecture Evolution

1.1 First‑generation Search

Zhihu built its own search in 2016, moving from Sogou to an Elasticsearch‑based system to meet growing demands for freshness, ranking quality, and content diversity.

1.2 Current State

After a year‑long refactor in 2018, the system was rebuilt with clearer module boundaries, which improved maintainability. A Rust‑based search engine compatible with Lucene replaced Elasticsearch, and the monolithic service was split into focused micro‑services, improving stability and performance.

1.3 Ongoing Improvements

Future work focuses on further improving ranking quality and recall.

2. 58 Search Architecture

2.1 System Overview

The self‑developed uesearch system serves a range of vertical search scenarios with strict timeliness and consistency requirements. Its architecture has three layers: a stateless proxy layer, a merger layer that merges and ranks results, and a searcher layer that stores indexes and serves queries.

Horizontal sharding and replication let the system scale out: adding shards increases the data volume it can hold, and adding replicas increases the query concurrency it can serve.
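The three layers and the sharding scheme can be sketched as a small scatter‑gather example. This is only an illustration, not uesearch's implementation: the names `Searcher`, `Merger`, and `proxy_search`, the modulo‑based shard routing, and the term‑frequency score are all assumptions made here, and replication is left out for brevity.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Searcher:
    """Searcher layer: holds one shard's inverted index (term -> {doc_id: score})."""
    postings: dict = field(default_factory=dict)

    def add(self, doc_id, text):
        for term in text.lower().split():
            bucket = self.postings.setdefault(term, {})
            bucket[doc_id] = bucket.get(doc_id, 0) + 1  # toy score: term frequency

    def search(self, term, k):
        hits = self.postings.get(term.lower(), {})
        # Each shard returns only its local top-k as (score, doc_id) pairs.
        return heapq.nlargest(k, ((score, doc_id) for doc_id, score in hits.items()))


class Merger:
    """Merger layer: fans a query out to every shard and merges the partial results."""

    def __init__(self, num_shards):
        self.shards = [Searcher() for _ in range(num_shards)]

    def index(self, doc_id, text):
        # Assumed routing rule: shard chosen by doc_id modulo shard count.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, term, k):
        partial = [hit for shard in self.shards for hit in shard.search(term, k)]
        return heapq.nlargest(k, partial)  # global top-k across shards


def proxy_search(merger, query, k=10):
    """Proxy layer: keeps no state, normalizes the query and forwards it."""
    return [doc_id for _score, doc_id in merger.search(query.strip(), k)]
```

Because each searcher returns at most `k` hits, the merger only ever ranks `k × shards` candidates, which is why adding shards grows capacity without growing the merge cost per shard.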

2.2 Real‑time Index Update Design

Real‑time indexing is achieved by building inverted indexes in memory inside the search process. Updates are applied every few seconds as small index segments, which are then merged progressively into larger ones (3 seconds → 15 minutes → 1 hour → permanent). New documents become searchable quickly, while the number of segments each query has to scan stays small.
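The tiered merge described above can be sketched as follows. This is a minimal model under stated assumptions, not the uesearch code: the names `Segment`, `RealtimeIndex`, `FLUSH_INTERVAL`, and `PROMOTE_AFTER` are hypothetical, the clock is passed in explicitly so the example is deterministic, deletes and document updates are ignored, and the 24‑hour threshold for folding hourly segments into the permanent index is a guess, since the talk summary does not state it.

```python
from dataclasses import dataclass, field

FLUSH_INTERVAL = 3  # seconds before buffered updates become a searchable segment
# Age (seconds) at which a tier is merged into the next one: 15 min, 1 h,
# and an ASSUMED 24 h before hourly segments join the permanent index.
PROMOTE_AFTER = [15 * 60, 60 * 60, 24 * 60 * 60]


@dataclass
class Segment:
    start: float                                   # time of the oldest update inside
    postings: dict = field(default_factory=dict)   # term -> set of doc ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(doc_id)

    def absorb(self, other):
        for term, ids in other.postings.items():
            self.postings.setdefault(term, set()).update(ids)


class RealtimeIndex:
    def __init__(self):
        self.buffer = None            # pending updates, not yet searchable
        self.tiers = [[], [], []]     # 3-second, 15-minute, and 1-hour segments
        self.permanent = Segment(start=0.0)

    def update(self, doc_id, text, now):
        if self.buffer is None:
            self.buffer = Segment(start=now)
        self.buffer.add(doc_id, text)

    def tick(self, now):
        # Seal the buffer into a small segment once it is old enough.
        if self.buffer is not None and now - self.buffer.start >= FLUSH_INTERVAL:
            self.tiers[0].append(self.buffer)
            self.buffer = None
        # Merge each tier into the next one when its oldest segment ages out.
        for level, limit in enumerate(PROMOTE_AFTER):
            segments = self.tiers[level]
            if segments and now - segments[0].start >= limit:
                merged = Segment(start=segments[0].start)
                for seg in segments:
                    merged.absorb(seg)
                self.tiers[level] = []
                if level + 1 < len(self.tiers):
                    self.tiers[level + 1].append(merged)
                else:
                    self.permanent.absorb(merged)

    def search(self, term):
        term = term.lower()
        result = set(self.permanent.postings.get(term, ()))
        for segments in self.tiers:
            for seg in segments:
                result |= seg.postings.get(term, set())
        return result

    def segment_counts(self):
        return [len(tier) for tier in self.tiers]
```

A query has to visit every live segment, so the periodic merges trade a little background work for fewer segments on the query path.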

2.3 Cloud Search (云搜)

Built on uesearch and Kubernetes, Cloud Search is a private‑cloud search service in which users define schemas and ingest documents. Kubernetes manages resources, schedules pods, and automatically recovers failed components.
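To make "users define schemas and ingest documents" concrete, here is a hedged sketch of what schema‑checked ingestion could look like. The actual Cloud Search API is not described in the talk summary, so `Field`, `Schema`, `validate`, and the example field names are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One field of a user-defined schema (hypothetical shape)."""
    name: str
    type: type
    indexed: bool = True   # whether the field would go into the inverted index


class Schema:
    def __init__(self, name, fields):
        self.name = name
        self.fields = {f.name: f for f in fields}

    def validate(self, doc):
        """Return a list of problems; an empty list means the document is accepted."""
        problems = []
        for name, spec in self.fields.items():
            if name not in doc:
                problems.append(f"missing field: {name}")
            elif not isinstance(doc[name], spec.type):
                problems.append(
                    f"wrong type for {name}: expected {spec.type.__name__}"
                )
        for name in doc:
            if name not in self.fields:
                problems.append(f"unknown field: {name}")
        return problems


# Example: a listings schema with one stored-only field.
listings = Schema("listings", [
    Field("title", str),
    Field("price", int),
    Field("city", str, indexed=False),
])
```

Rejecting malformed documents at ingestion time keeps every searcher's index consistent with the declared schema, which matters when many tenants share one platform.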

3. Summary

Salon participants discussed search index organization, distributed index synchronization, query rewriting, multi‑replica consistency, relevance calculation, and search quality evaluation, and shared practical experience with Elasticsearch, Lucene, and Kubernetes. They also expressed a desire for continued collaboration to improve the stability, timeliness, and relevance of search systems.
