Design Principles of 58.com Database Architecture and Codd's Twelve Rules
This article outlines the design ideas behind 58.com’s database architecture: high availability through replication and dual‑master setups; read performance through indexing, read replicas, and caching; consistency between master and slave and between cache and database; scalability, including horizontal sharding patterns; and SQL practices for massive data. It closes with a recap of Codd’s twelve relational rules as design guidelines.
1. Availability Design
High availability is achieved through replication and redundancy, which inevitably introduces consistency challenges. Read availability is ensured by read replicas. Write availability is addressed with a dual‑master configuration, which can also be run as a master‑slave pair: one master takes the traffic and the other stands by to take over if it fails.
Problems such as master‑slave inconsistency and key conflicts in dual‑master setups are discussed, with two solution paths: (a) ensuring keys do not conflict at the database or business layer, and (b) using the dual‑master as a master‑slave pair, allowing reads and writes to go to the same primary node to avoid inconsistency.
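Solution (a) is often implemented by giving each master its own interleaved ID sequence. The sketch below is only an illustration of that idea, not 58.com's actual code; `id_sequence` and the offset and step values are hypothetical. In MySQL, for example, the `auto_increment_offset` and `auto_increment_increment` settings produce the same pattern.

```python
from itertools import count, islice


def id_sequence(offset: int, step: int):
    """Yield offset, offset + step, offset + 2 * step, ... (hypothetical helper)."""
    return count(start=offset, step=step)


# Assumption: two masters, so both use step 2 and differ only in their offset.
master_a_ids = list(islice(id_sequence(offset=1, step=2), 5))
master_b_ids = list(islice(id_sequence(offset=2, step=2), 5))

print(master_a_ids)  # [1, 3, 5, 7, 9]
print(master_b_ids)  # [2, 4, 6, 8, 10]
```

Because the two sequences never overlap, rows inserted on either master can replicate to the other without primary-key collisions.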
2. Read‑Performance Design
Three main approaches are described:
Adding indexes speeds up reads but slows writes and increases memory usage; because each replica can carry its own indexes, online and offline replicas can be indexed differently.
Adding read replicas expands read capacity, but each extra replica can worsen master‑slave lag and therefore inconsistency.
Adding a cache behind a service layer hides the database and cache details from callers; in this service‑plus‑cache‑plus‑data model there is no read‑write separation, which avoids the master‑slave inconsistency that separation introduces.
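The service‑plus‑cache‑plus‑data model can be sketched as a small class that owns both the cache and the database, so callers only see `get_name` and `set_name`. This is a minimal illustration under assumptions: `UserService`, its method names, and the in-memory dictionaries standing in for the cache and the database are all hypothetical.

```python
from typing import Dict, Optional


class UserService:
    """Hypothetical service layer: callers never touch the cache or database."""

    def __init__(self, database: Dict[int, str]) -> None:
        self._database = database          # stand-in for the primary database
        self._cache: Dict[int, str] = {}   # stand-in for an external cache
        self.db_reads = 0                  # counts reads that reached the database

    def get_name(self, user_id: int) -> Optional[str]:
        if user_id in self._cache:         # cache hit: no database access
            return self._cache[user_id]
        self.db_reads += 1
        value = self._database.get(user_id)
        if value is not None:
            self._cache[user_id] = value   # populate the cache on a miss
        return value

    def set_name(self, user_id: int, name: str) -> None:
        self._database[user_id] = name     # the write goes to the database
        self._cache.pop(user_id, None)     # evict so the next read reloads


service = UserService({1: "alice"})
first = service.get_name(1)    # miss: loads from the database
second = service.get_name(1)   # hit: served from the cache
service.set_name(1, "bob")
third = service.get_name(1)    # miss again after the eviction
```

Here reads and writes both go through one service and one database, so there is no replica to lag behind; the remaining risk is cache‑database inconsistency.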
3. Consistency Design
To address master‑slave inconsistency, two options are presented: (a) inserting a middleware that routes reads to the master for a short window after a write, and (b) routing all reads and writes to the master (the approach 58.com adopts). For cache‑database inconsistency, a “double‑eviction” method is used: on a write, the cache is evicted, the database is updated, and after a timer (based on expected master‑slave sync time) the cache is evicted again; reads follow the usual cache‑first strategy.
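The double‑eviction sequence can be sketched as follows. This is an illustrative sketch only: `DoubleEvictionStore`, the dictionaries standing in for the cache and the database, and the 0.05‑second delay are assumptions, and a real system would derive the delay from its observed master‑slave sync time.

```python
import threading
from typing import Dict, Optional


class DoubleEvictionStore:
    """Hypothetical store: evict, write, then evict again after a delay."""

    def __init__(self, sync_delay_seconds: float) -> None:
        self.database: Dict[str, str] = {}
        self.cache: Dict[str, str] = {}
        self.sync_delay_seconds = sync_delay_seconds
        self._lock = threading.Lock()

    def _evict(self, key: str) -> None:
        with self._lock:
            self.cache.pop(key, None)

    def write(self, key: str, value: str) -> threading.Timer:
        self._evict(key)                    # first eviction, before the write
        with self._lock:
            self.database[key] = value
        # Second eviction, scheduled after the expected sync time.
        timer = threading.Timer(self.sync_delay_seconds, self._evict, args=(key,))
        timer.daemon = True
        timer.start()
        return timer

    def read(self, key: str) -> Optional[str]:
        with self._lock:
            if key in self.cache:           # usual cache-first read
                return self.cache[key]
            value = self.database.get(key)
            if value is not None:
                self.cache[key] = value
            return value


store = DoubleEvictionStore(sync_delay_seconds=0.05)
timer = store.write("user:1", "new")
store.cache["user:1"] = "stale"   # simulate a stale value cached during the lag window
timer.join()                      # wait for the second eviction to run
fresh = store.read("user:1")      # cache miss, so the value is reloaded
```

The second eviction exists to clear any stale value that a concurrent read put into the cache between the first eviction and the moment the write became visible everywhere.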
4. Scalability Design
The article shows how to double the number of databases (N → 2N) within seconds: with each database running in dual‑master‑as‑master‑slave mode, the replicas are promoted to serve the new shards and the old synchronization links are then removed. Field expansion (adding columns to large tables) can be handled via log‑tailing or dual‑write techniques. The horizontal sharding scenarios listed cover single‑key, one‑to‑many, many‑to‑many, and multi‑key tables.
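Why doubling can be so fast is easier to see with a small example. The sketch below assumes modulo routing on an integer key, which the summary does not state explicitly; `shard_for` and `split_plan` are hypothetical helpers. Under that assumption, every key that lived on old shard i lands on either shard i or shard i + N after doubling, so a promoted replica of shard i already holds all the rows it needs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def shard_for(key: int, shard_count: int) -> int:
    """Assumed routing rule: plain modulo on an integer key."""
    return key % shard_count


def split_plan(keys: Iterable[int], old_count: int) -> Dict[int, List[int]]:
    """Map each old shard to the new shards its keys move to after doubling."""
    plan = defaultdict(set)
    for key in keys:
        old_shard = shard_for(key, old_count)
        new_shard = shard_for(key, 2 * old_count)
        plan[old_shard].add(new_shard)
    return {old: sorted(new) for old, new in plan.items()}


plan = split_plan(range(100), old_count=2)
print(plan)  # {0: [0, 2], 1: [1, 3]}
```

No row has to be copied between unrelated shards; each half of a former pair only needs to drop the rows that now belong to its sibling, which can happen later in the background.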
5. SQL Practices for Massive Data
The article advises avoiding complex SQL features (joins, sub‑queries, triggers, user‑defined functions, transactions) because of their performance impact at this scale. It discusses handling IN queries, non‑partition‑key queries, and cross‑shard pagination, offering optimizations such as auxiliary IDs, fuzzy queries, and a two‑stage query method that rewrites the OFFSET/LIMIT logic across shards.
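As one illustration of handling an IN query on the partition key, the ID list can be grouped by shard so that each shard receives only the IDs it owns. This is a sketch under assumptions rather than the article's exact method: it assumes modulo sharding on an integer `id`, and `group_ids_by_shard` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_ids_by_shard(ids: Iterable[int], shard_count: int) -> Dict[int, List[int]]:
    """Split an IN list into one smaller IN list per shard (assumes id % N routing)."""
    groups = defaultdict(list)
    for record_id in ids:
        groups[record_id % shard_count].append(record_id)
    return dict(groups)


groups = group_ids_by_shard([1, 2, 3, 4, 5, 8], shard_count=4)
for shard, shard_ids in sorted(groups.items()):
    print(shard, shard_ids)
# 0 [4, 8]
# 1 [1, 5]
# 2 [2]
# 3 [3]
```

Each per-shard list can then be sent as its own `IN (...)` query, and the service layer merges the results; shards that own none of the IDs are not queried at all.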
6. Codd’s Twelve Rules
The article briefly lists Codd’s twelve relational database rules: the information rule, guaranteed access, systematic treatment of nulls, a dynamic online catalog based on the relational model, a comprehensive data sub‑language (SQL is the usual example), view updating, high‑level (set‑at‑a‑time) insert, update, and delete, physical data independence, logical data independence, integrity independence, distribution independence, and the nonsubversion rule. It highlights them as guiding principles for database design.
Overall, the piece summarizes 58.com’s comprehensive approach to building a highly available, performant, consistent, and scalable database system, while grounding the design in classic relational theory.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
