Meituan Database High Availability System
Meituan's database high-availability system is built on Raft Groups deployed across multiple Regions and AZs, in response to rapid instance growth and stringent availability requirements. It combines a three-node HA Core, microservices, and MGR clusters, and supports both AZ-level and Region-level disaster recovery. Its key mechanisms are multi-channel fault detection, weighted election of the new primary, and semi-synchronous replication for data consistency. Planned directions include a decentralized architecture with built-in proxies and a fully clustered design.
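This talk on Meituan's database high-availability system shares the team's practical experience in keeping large-scale data clusters stable. It is organized around four topics: an introduction to high availability, high-availability deployment, design considerations for key modules, and thoughts on the future. This article is compiled from the talk 《美团数据库的高可用系统》 ("Meituan's Database High-Availability System") and is the first in a series on keeping ultra-large-scale database clusters stable.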
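01 Introduction to high availability: Covers the challenges faced, including rapid growth in the number of instances, rising availability requirements, and increasingly complex disaster-recovery scenarios. Traces the evolution from the MMM architecture to the MMHA architecture and then to the current high-availability system (a multi-Region, multi-AZ deployment based on Raft Groups).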
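02 High-availability deployment: Introduces the high-availability architecture (data flow and control flow). The deployment consists of the HA Core (a 3-node Raft Group), microservices (synchronization service, scheduling service, configuration center), the data layer (MGR clusters), and the control layer. The deployment strategy supports both AZ-level and Region-level disaster recovery.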
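03 Key module design: The fault detection module uses multi-channel probing (regular, heartbeat, and replica channels) together with majority decision-making to reduce both missed and false detections. The fault election module introduces election factors (version, weight, and so on) and election strategies (same data center first) to decide the new primary. The data consistency module involves three strategies, S1 (semi-synchronous replication), S2 (filling in missing data), and S3 (rollback), to ensure RPO≈0.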
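04 Future thinking: Future directions include improving disaster-recovery capability (at both the AZ level and the Region level), a decentralized architecture (a built-in Proxy process), and clusterization (removing external dependencies).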
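
The fault detection and election ideas from section 03 can be illustrated with a minimal Python sketch. This is not Meituan's implementation. The summary only states that probing is multi-channel, that decisions are made by majority, and that election uses factors such as version and weight with a same-data-center preference. Everything else here is an assumption made for illustration: the channel names, the rule that a node votes "down" only when all of its channels fail, the `Candidate` fields, and the order in which the election factors are compared.

```python
from __future__ import annotations

from dataclasses import dataclass


def node_votes_down(probe_ok: dict[str, bool]) -> bool:
    """One HA node votes 'down' only if every probe channel failed.

    Assumption: requiring all channels to fail is one way to cut false
    positives; the source does not say how channels are combined.
    """
    return bool(probe_ok) and not any(probe_ok.values())


def primary_is_down(votes: list[bool]) -> bool:
    """Declare the primary failed only on a strict majority of 'down' votes."""
    return sum(votes) > len(votes) // 2


@dataclass(frozen=True)
class Candidate:
    # Hypothetical fields; the source names only "version" and "weight".
    name: str
    az: str
    version: str
    weight: int


def elect_new_primary(
    candidates: list[Candidate], old_az: str, old_version: str
) -> Candidate:
    """Pick the best replica by comparing election factors as a tuple.

    Assumed ordering: matching version first, then same AZ as the old
    primary, then the higher configured weight.
    """
    if not candidates:
        raise ValueError("no candidates available")
    return max(
        candidates,
        key=lambda c: (c.version == old_version, c.az == old_az, c.weight),
    )


# Three HA nodes (as in a 3-node Raft Group) each probe over three channels.
votes = [
    node_votes_down({"normal": False, "heartbeat": False, "replica": False}),
    node_votes_down({"normal": False, "heartbeat": True, "replica": True}),
    node_votes_down({"normal": False, "heartbeat": False, "replica": False}),
]

replicas = [
    Candidate("replica-a", az="az1", version="8.0", weight=50),
    Candidate("replica-b", az="az2", version="8.0", weight=100),
    Candidate("replica-c", az="az1", version="5.7", weight=100),
]

if primary_is_down(votes):
    new_primary = elect_new_primary(replicas, old_az="az1", old_version="8.0")
    print(new_primary.name)
```

In this sketch a single node with a broken network path cannot trigger a failover on its own, because two of the three nodes must agree. Comparing the factors as a tuple also keeps the priority order explicit and easy to change.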
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
