How Suning Scaled with PostgreSQL & Citus: Lessons from 3000+ Instances
This article details Suning's journey of replacing commercial databases with PostgreSQL and Citus, covering background, feature and performance comparisons, reliability tests, deployment architecture, real‑world use cases, operational tips, and a comprehensive Q&A on migration and maintenance.
Background
Suning began adopting PostgreSQL in 2017. By the 2019 Double‑11 shopping festival more than 3,000 PostgreSQL instances were running across various business lines, replacing many commercial databases to reduce licensing costs and improve performance.
Database Selection
Suning’s 2016 research project evaluated MySQL and PostgreSQL as alternatives to foreign commercial databases. PostgreSQL was chosen because it provides richer SQL features, better performance on large data volumes, and functional parity with Oracle/DB2, while MySQL is limited to simple OLTP workloads.
Feature & Performance Comparison
Sysbench high‑concurrency primary‑key query: PostgreSQL and MySQL performed similarly.
Sysbench high‑concurrency pure‑write test: PostgreSQL achieved roughly 4× the throughput of MySQL.
TPCH benchmark (MySQL 5.6 vs PostgreSQL 9.5): PostgreSQL outperformed MySQL by an order of magnitude.
Reliability Comparison
Physical replication in PostgreSQL delivered higher write TPS (≈10 w/s vs. 1 w/s for MySQL) and eliminated replication lag. PostgreSQL’s checkpoint and WAL design also enabled fast recovery after power loss.
Selection Conclusion
Suning adopted a hybrid strategy: MySQL + MyCAT for simple OLTP scenarios and PostgreSQL + Citus for workloads requiring horizontal scaling, complex queries, or OLAP capabilities.
PostgreSQL & Citus Deployment
Business Scenario
The first PostgreSQL‑based system processed real‑time large‑scale data (>50 k ops/s, 90 % updates) on a wide table, supporting high‑concurrency reporting and detail queries.
Technical Selection
Citus was selected to turn multiple single‑node PostgreSQL instances into a distributed cluster, providing horizontal scalability.
Deployment Architecture
Detail‑table updates bypass the coordinator. The coordinator first determines record locations, then workers execute batch INSERT … ON CONFLICT statements directly on the target shards.
Production Results
Replacing a single‑node commercial database with a 1 CN + 8 Worker Citus cluster yielded >10× performance improvement, meeting a ten‑fold capacity expansion goal. The cluster later scaled to 16 workers while maintaining ~10 % CPU utilization.
Representative Use Cases (Citus)
Case 1 – Billing Settlement
Logistics billing platform with hundreds of nodes and tables exceeding hundreds of billions of rows. Citus eliminated custom sharding logic and reduced operational overhead.
Case 2 – Precise Marketing
Combined Citus with the pg_roaringbitmap extension to store and query billions of tag records efficiently for real‑time marketing.
Case 3 – Geo‑Based Search & Recommendation
PostgreSQL + PostGIS stored location data and supported high‑performance queries for offline store search, scaling up to 11 nodes during peak traffic.
Operational Experience
Issue 1 – GIN Index
GIN indexes support flexible multi‑field queries and use little storage for low‑cardinality columns, outperforming B‑tree in many scenarios. Drawbacks include poor range‑query performance, higher write overhead, and lack of support for unique constraints or prefix‑LIKE optimization.
Issue 2 – Backend Memory Consumption
Each PostgreSQL backend process holds private memory for metadata (e.g., pg_statistic, pg_attribute). Large numbers of tables, partitions, or columns increase this memory. The following command can inspect a backend’s memory usage:
gdb --batch-silent -ex 'call MemoryContextStatsDetail(TopMemoryContext,100)' -p ${backend_pid}Mitigation strategies:
Reduce application pool size.
Deploy pgbouncer as a connection pooler.
Limit the number of partitions.
Isolate connections that access many partitions.
Adjust ALTER TABLE SET STATISTICS to lower statistics collection.
Issue 3 – Garbage Collection (VACUUM)
PostgreSQL uses MVCC with physical row deletion and a background autovacuum process. Tuning key parameters improves cleanup speed: autovacuum_vacuum_cost_limit – increase to ~10000 on SSDs (default 200). autovacuum_vacuum_scale_factor – lower for very large tables (e.g., from 0.2 to 0.05).
Partition large tables (≤1 billion rows or ≤10 GB) to keep autovacuum work manageable.
Q&A (Technical Highlights)
Recovery vs. Oracle: PostgreSQL supports point‑in‑time recovery and integrates with tools such as pg_rman and Barman, providing capabilities comparable to Oracle.
HA Solution: Native streaming replication with a virtual IP (VIP) for read/write traffic; read‑only traffic uses JDBC multi‑host URLs for load balancing.
Table Partitioning: Keep tables under 1 billion rows or 10 GB. Use pg_pathman on PostgreSQL 10 and earlier, or native partitioning from PostgreSQL 12+. For sharding, use Citus.
MySQL → PostgreSQL Migration: Use mysql_fdw for foreign‑data access, ensure UTF‑8 encoding, and strip illegal \0000 Unicode characters.
Citus HA: Two options – multi‑replica sharding (writes to multiple worker replicas) or native PostgreSQL streaming replication for both coordinator and workers.
max_locks_per_transaction: Default value 64 is sufficient for most workloads; increasing it unnecessarily raises memory usage.
Pacemaker‑based HA: Requires fencing devices to avoid split‑brain scenarios because metadata synchronization is not strongly consistent.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
