Why Leading Chinese Tech Giants Prefer PostgreSQL Over MySQL: A Deep Technical Comparison
This article examines why major Chinese technology companies are building distributed, cloud‑native and HTAP databases on PostgreSQL instead of MySQL, detailing PostgreSQL's richer data types, true sequence support, extensive extensions, superior monitoring tools, and robust replication mechanisms.
PostgreSQL‑Based Distributed Database Projects in China
Tencent Cloud TDSQL PG (code‑named TBase) – repository: https://github.com/Tencent/TBase. Introduces a Global Transaction Manager (GTM) and distributed coordination to enable cross‑shard transactions.
Alibaba Cloud PolarDB for PostgreSQL – rewrites the storage layer to provide a "one‑write‑multiple‑read" shared storage architecture, allowing read replicas to be added in seconds.
Huawei Cloud GaussDB (openGauss) – partially compatible with the PostgreSQL ecosystem; adds a columnar engine, AI optimizer, and HTAP capabilities. Official site: https://opengauss.org.
Hangzhou Yijing Shutong openHalo – integrates a columnar storage engine and AI optimizer for HTAP workloads. Repository: https://github.com/HaloTech-Co-Ltd/openHalo.
Technical Advantages of PostgreSQL Over MySQL
1. Richer Data Types
PostgreSQL supports advanced data types that simplify complex modeling:
ARRAY – store multiple values in a single column.
Range Types (e.g., int4range, tsrange) – represent intervals such as time spans or price ranges.
Composite Types – define custom structures like POINT(x,y) for geometric data.
JSONB – binary JSON with indexing, efficient querying, and update capabilities.
2. Independent Sequence Objects
PostgreSQL provides true sequence objects that are decoupled from any table, ensuring globally unique identifiers even in distributed environments.
-- Create an independent sequence in PostgreSQL
CREATE SEQUENCE order_seq START WITH 1 INCREMENT BY 1;
-- Use the sequence to generate IDs
INSERT INTO orders (id, name) VALUES (nextval('order_seq'), 'test');MySQL lacks a native independent sequence. It simulates a sequence using AUTO_INCREMENT plus an offset, which ties the identifier to a specific table and cannot be shared across tables.
-- MySQL: bind AUTO_INCREMENT to a table
ALTER TABLE orders AUTO_INCREMENT = 1000;
-- Or use a session variable to simulate a sequence
SET @next_id = 3088413 + 1;Sequences cannot be shared across tables.
Ensuring global uniqueness in distributed setups often requires external services such as Redis.
3. Powerful Extension Ecosystem
PostgreSQL’s extension mechanism allows seamless integration of third‑party functionality: TimescaleDB – time‑series database with automatic partitioning and compression. pg_trgm – fuzzy string matching and similarity search. Citus – distributed database extension for sharding and parallel query execution. pg_stat_statements – detailed SQL execution statistics.
4. Built‑in Monitoring and Explain
PostgreSQL includes comprehensive statistics views such as pg_stat_activity, pg_stat_statements, and pg_locks. It also supports EXPLAIN ANALYZE to obtain actual execution time and row counts. Mature third‑party tools (PgAdmin, pg_stat_monitor, Prometheus + Grafana) integrate easily.
5. Replication and High Availability
MySQL’s default asynchronous replication can introduce latency and lacks strong consistency guarantees. Features like GTID and semi‑synchronous replication require careful configuration and high‑quality networks.
PostgreSQL offers a richer set of replication options:
Streaming Replication – supports both asynchronous and synchronous modes.
Logical Replication – table‑level or cross‑version replication.
Write‑Ahead Logging (WAL) – mature mechanism guaranteeing durability and consistency.
Synchronous Replication – primary waits for at least one replica to acknowledge before committing, eliminating data loss.
6. Open‑Source Maturity
PostgreSQL’s development is community‑driven, with rapid feature delivery. Many capabilities (e.g., window functions, advanced indexing) appeared in PostgreSQL years before MySQL 8.0 introduced them.
7. MVCC Implementation Differences
PostgreSQL’s Multi‑Version Concurrency Control (MVCC) allows readers to see a consistent snapshot of data while writers modify rows, providing true isolation without blocking reads.
MySQL’s MVCC implementation varies by storage engine and isolation level; uncommitted changes can lead to dirty reads or non‑repeatable reads.
PostgreSQL is often described as a "programmable database" that can serve as an application platform, whereas MySQL functions more like a pure execution engine.
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
