How Swat.io Migrated from MySQL to PostgreSQL in Two Years
This case study details Swat.io's two‑year journey from MySQL to PostgreSQL, covering the motivations, technical challenges, incremental experiments, migration strategy, performance tuning, and the lessons learned that helped the team successfully complete the transition.
Translator's note: Swat.io is a European social‑media solutions company based in Vienna.
At first glance the title may look like clickbait, but the facts hold: the company started migrating from MySQL to PostgreSQL because running MySQL was becoming increasingly complex.
When the company was founded, the team consisted of only three people with limited resources for infrastructure; for small teams, moving to PostgreSQL can be a sensible choice.
MySQL has many advantages, but over time its drawbacks outweighed them. Running Percona MySQL on Ubuntu 14.04 LTS without external tools carried a high cost in staff time for learning, maintenance, and workflow management.
Main Reasons We Were Unsatisfied with MySQL 5.6
Inability to add new columns online, forcing monthly server pauses for large tables.
Lack of a reliable way to add indexes online; even though MySQL 5.6 supports it, the team lacked confidence in its stability.
No native JSON type (added only in 5.7.8).
Missing advanced features such as CTEs and window functions.
Connection‑limit issues caused by hundreds of concurrent SELECTs.
Absence of a native boolean type, requiring TINYINT(1) workarounds.
Long backup times with mysqldump (3‑4 hours) and difficulty with parallel backups.
Importing backups took even longer (8‑9 hours).
DDL statements not being transactional.
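The last point is easiest to see in a small example. The sketch below is not from the original article. It uses Python's built-in sqlite3 module as a stand-in so it runs without a server, because SQLite, like PostgreSQL, lets DDL take part in a transaction. The table and index names are hypothetical. In MySQL 5.6, a statement such as CREATE TABLE triggers an implicit commit, so the rollback shown here would not undo it.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table with this name is visible on the connection."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def try_schema_change(conn):
    """Apply a two-step schema change, then roll it back as one unit."""
    conn.execute("BEGIN")
    conn.execute("CREATE TABLE campaigns (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("CREATE INDEX campaigns_title_idx ON campaigns (title)")
    visible_inside = table_exists(conn, "campaigns")
    conn.execute("ROLLBACK")  # undoes both DDL statements together
    return visible_inside, table_exists(conn, "campaigns")


# isolation_level=None disables the module's implicit transaction handling,
# so the explicit BEGIN/ROLLBACK above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
seen_inside, seen_after = try_schema_change(conn)
print(seen_inside, seen_after)
```

With transactional DDL, a failed multi-step schema change leaves no half-applied state behind, which is what makes migrations safe to retry.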
Launching a Lightweight Postgres? We Did It!
Following a suggestion to explore Postgres more deeply, the team initially planned only to switch drivers, then proposed a full migration in November 2014 under the title “[Database] Brave New World”. The proposal was abandoned after six months, yet the team persisted.
They decided to introduce Postgres in a new analysis‑engine project, allowing a clean start independent of the existing system. Although most data remained in MySQL, synchronizing it with Postgres was acceptable from a business perspective.
The project took several months, during which the team learned how to integrate Postgres into their development VM stack, understand role concepts, evaluate write‑intensive performance, prototype new features quickly, tune query performance, handle backup/restore, and integrate with PHP and NodeJS.
Mirror Mode? We Failed!
The team attempted to create a mirror model in their ORM that wrote to Postgres whenever a MySQL model was saved. In theory it sounded good, but in practice some areas were not updated correctly, leading to abandonment of this approach.
Continuing Small Experiments? We Succeeded!
In the summer of 2015, they migrated some non‑critical tables to Postgres and decommissioned the old MySQL tables.
Subsequent Attempts
Around August 2015 the team grew frustrated with MySQL again and tried migrating the application codebase. Performance lagged behind MySQL, prompting micro‑optimizations such as disabling SSL connections and extensive query rewrites. The main focus became moving more work to the database side (e.g., using more complex composite queries).
Embarrassingly, it took another year to find the real culprit: overhead in the CakePHP 2 Postgres driver. The investigation also surfaced more optimization opportunities than the team had the capacity to pursue at the time.
The attempt stalled and was eventually forgotten.
Challenge Everything, Never Give Up!
Later, the team faced the decision of creating new tables in MySQL or Postgres. Foreign-key handling between InnoDB tables and Postgres tables proved error-prone, since references had to cross databases, but for tables created in Postgres many MySQL pain points disappeared:
Adding nullable columns without defaults became a trivial no‑op.
Adding new indexes concurrently was straightforward.
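As a rough illustration of the two points above, the following sketch builds the corresponding PostgreSQL statements. It is not taken from the article: the helper functions and the table and column names are hypothetical, and the code only produces SQL text rather than connecting to a database. One detail that is certain: CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so it has to be issued on its own.

```python
import re

# Only plain lowercase identifiers are accepted in this sketch.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _ident(name):
    """Reject anything that is not a simple unquoted identifier."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def add_nullable_column(table, column, sql_type):
    """Nullable column with no default: a catalog-only change in PostgreSQL."""
    # sql_type is assumed to come from trusted migration code, not user input.
    return f"ALTER TABLE {_ident(table)} ADD COLUMN {_ident(column)} {sql_type};"


def add_index_concurrently(table, columns):
    """Build an index without blocking writes; run outside a transaction."""
    cols = [_ident(c) for c in columns]
    index_name = "_".join([_ident(table)] + cols + ["idx"])
    return f"CREATE INDEX CONCURRENTLY {index_name} ON {table} ({', '.join(cols)});"


column_sql = add_nullable_column("posts", "archived_at", "timestamptz")
index_sql = add_index_concurrently("posts", ["account_id", "created_at"])
print(column_sql)
print(index_sql)
```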
After expanding the codebase and introducing a new JSON‑API backend in 2016, the team accumulated extensive experience writing DML and DDL statements in tests, though they eventually stopped due to MySQL’s limitations.
Dawn of Migration
Postgres’s popularity continued to rise while MySQL’s maintenance windows grew longer for schema changes. In spring‑summer 2016, management gave the green light for a full migration.
Several attempts in 2016 yielded performance insights and bug discoveries. After months of effort, a migration was scheduled for February 2017.
Data migration succeeded, but key application components suffered performance regressions, leading to a rollback. The team persisted, eventually identifying “cold” database effects (empty caches right after the import) as the root cause and adding pg_prewarm to load data into the cache ahead of time and improve the first-time user experience.
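The article does not say which relations were prewarmed or how the calls were issued. As a minimal sketch, assuming a hand-picked list of hot tables and indexes (the names below are hypothetical), the statements could be generated like this. The pg_prewarm extension and its pg_prewarm(regclass) function are part of PostgreSQL's contrib modules; everything else here is illustrative.

```python
def prewarm_statements(relations):
    """Return SQL that loads the given tables or indexes into the buffer cache."""
    statements = ["CREATE EXTENSION IF NOT EXISTS pg_prewarm;"]
    for relation in relations:
        # Double any single quote so the name is a valid SQL string literal.
        literal = relation.replace("'", "''")
        statements.append(f"SELECT pg_prewarm('{literal}');")
    return statements


# Hypothetical list; in practice it would name the relations that
# the first user requests after a restart or import touch most.
HOT_RELATIONS = ["posts", "posts_pkey", "comments"]

prewarm_sql = prewarm_statements(HOT_RELATIONS)
for statement in prewarm_sql:
    print(statement)
```

Each pg_prewarm call returns the number of blocks it loaded, which gives a quick sanity check that the relation was actually read.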
In March 2017 the second attempt succeeded; although the data migration took longer than expected, traffic ran smoothly on Postgres.
Final Migration Statistics
Used pgloader to migrate 120 GB of data, running custom scripts in parallel across all CPU cores.
The migration itself took about four hours, accelerated by running 32 SQL scripts concurrently.
Performed 20‑30 verification runs over two months to ensure stability under concurrency.
Converted eight codebases (three of them large).
Reviewed over 250 pull‑request comments, 150+ code commits, and roughly 50 git branches, affecting about 400 files.
Added ~7 000 lines of code and removed ~4 000 lines.
Conducted three major problem‑solving attempts and two migration trials.
Only two genuine bugs were found in the codebase during the failed and successful migrations.
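The team's actual scripts are not shown in the article. The following is a minimal sketch of how a batch of SQL scripts might be run concurrently, in the spirit of the 32 parallel scripts mentioned above. The function names and file names are hypothetical, and the demo uses a stand-in runner so that it executes without a database; `run_with_psql` shows what a real runner could look like, assuming the psql client is installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_with_psql(script_path, dsn):
    """Run one SQL script via psql; raises CalledProcessError on failure."""
    subprocess.run(
        # ON_ERROR_STOP makes psql stop and exit non-zero on the first error.
        ["psql", "-d", dsn, "-v", "ON_ERROR_STOP=1", "-f", script_path],
        check=True,
    )


def run_scripts(script_paths, runner, max_workers=32):
    """Run every script through `runner` concurrently.

    Returns (succeeded, failed): a sorted list of paths, and a dict
    mapping each failed path to the exception it raised.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(runner, path): path for path in script_paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                succeeded.append(path)
            except Exception as exc:  # record the failure and keep going
                failed[path] = exc
    return sorted(succeeded), failed


def fake_runner(path):
    """Stand-in for run_with_psql so the demo needs no database."""
    if path.endswith("broken.sql"):
        raise RuntimeError("simulated failure")


scripts = [f"part_{i:02d}.sql" for i in range(4)] + ["broken.sql"]
ok, errors = run_scripts(scripts, fake_runner)
print(ok)
print(sorted(errors))
```

Against a real server the call would look like `run_scripts(paths, lambda p: run_with_psql(p, dsn))`. Threads are enough here because the heavy work happens in the psql child processes and in the database, not in Python.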
Lessons Learned and Takeaways
Running VACUUM ANALYZE after bulk imports updates statistics, enabling the planner to choose optimal execution plans.
Using timestamp with time zone returns a timezone‑aware string, but can introduce unnecessary complexity.
CakePHP 2 ORM caused issues such as extra driver calls for column metadata, leading to query latency; enabling detailed SQL logging helped surface the problem.
Data inconsistencies appeared days after migration, requiring a dedicated Postgres driver for better compatibility.
Long column names ran into the 63-character identifier limit (PostgreSQL truncates identifiers longer than 63 bytes) when used through the ORM, necessitating raw SQL rewrites.
Switching to Laravel eliminated manual boolean conversions thanks to native support.
Optimized “thread comments” by collapsing N×M queries into a single CTE‑based query, achieving good performance even with 100 k comments.
PostgreSQL’s EXPLAIN (ANALYZE, BUFFERS) output proved more useful than MySQL’s.
Ordering of result rows differs between MySQL and Postgres (Postgres sorts text according to the operating system's locale and collation; PostgreSQL 10 introduced changes here, including ICU collation support).
Postgres handles time zones, including UTC, out of the box, whereas MySQL requires its time zone tables to be loaded explicitly before named zones can be used.
Postgres full-text search with GIN indexes can replace an external Elasticsearch for many use cases.
Partial indexes with WHERE clauses and expression indexes simplified many queries.
Postgres triggers with WHEN clauses proved more powerful than MySQL’s.
Leveraging new features (partial indexes, window functions) halved average system load.
WAL archiving can generate massive files (e.g., 100 GB) from seemingly trivial updates, necessitating an expandable filesystem.
Postgres made renaming hundreds of indexes and foreign keys painless, avoiding downtime.
Postgres documentation’s clarity helped the team understand complex concepts more easily than MySQL’s.
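To make the “thread comments” lesson above more concrete, here is a minimal sketch of fetching a whole comment tree with one recursive CTE instead of one query per comment. The article does not show the real schema or query, so the table, columns, and data are hypothetical. The example runs on Python's built-in sqlite3 module as a stand-in for Postgres; `WITH RECURSIVE` works the same way in both, but the `printf` path expression is SQLite-specific (in PostgreSQL one would more typically accumulate an array of ids for ordering).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE comments (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES comments (id),
        body      TEXT NOT NULL
    );
    INSERT INTO comments (id, parent_id, body) VALUES
        (1, NULL, 'Top-level A'),
        (2, 1,    'Reply to A'),
        (3, 2,    'Reply to the reply'),
        (4, NULL, 'Top-level B'),
        (5, 1,    'Second reply to A');
    """
)

# One round trip: start from the root comments, then repeatedly join
# children onto rows already in the thread. `path` is a zero-padded id
# trail, so sorting by it yields depth-first display order.
THREAD_QUERY = """
WITH RECURSIVE thread (id, body, depth, path) AS (
    SELECT id, body, 0, printf('%08d', id)
    FROM comments
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.body, t.depth + 1, t.path || '/' || printf('%08d', c.id)
    FROM comments AS c
    JOIN thread AS t ON c.parent_id = t.id
)
SELECT id, depth, body FROM thread ORDER BY path
"""

rows = conn.execute(THREAD_QUERY).fetchall()
for comment_id, depth, body in rows:
    print("  " * depth + f"[{comment_id}] {body}")
```

The application then only has to indent by `depth` while iterating, rather than issuing a follow-up query for every comment it renders.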
Conclusion
The chaotic coordination of two databases and the painful MySQL maintenance severely impacted the team, but Postgres finally resolved these issues. The moment the migration completed brought an indescribable sense of satisfaction.
Looking ahead to 2017, the focus will be on improving performance, handling scaling demands, and increasing customer satisfaction.
