Migrating OkCupid from REST to GraphQL: Process, Lessons, and Outcomes
This article describes how OkCupid migrated from a REST API to a GraphQL API that has now been in production for a year and a half. It covers the four‑step process (page selection, schema construction, shadow requests, and A/B testing) and the lessons learned about error handling, business‑logic placement, and performance monitoring.
1. Benefits
Our GraphQL API has been in production for 1.5 years, handling 227 entities and 170 000 requests per minute. We stopped adding new features to the REST API a year ago, although parts of it are still in use.
2. How We Did It
We needed a plan to validate a new stack (Node, Apollo Server, Docker) without disrupting production. The workflow consisted of four steps:
Choose an appropriate page for conversion.
Build the GraphQL schema.
Add shadow requests that call the new API while still fetching data from the REST API.
Run A/B tests with real users.
We started in January 2019, released shadow queries on January 28, began A/B testing on March 13, and launched the full API on April 30, achieving a production‑ready GraphQL API in four months.
Choosing the Page
We selected the OkCupid conversation page because it contains core data such as user info, match status, and conversation details, providing a realistic test case for the new API.
Building the Schema
Key recommendations include researching existing schemas, avoiding REST‑style naming, keeping naming consistent, making field names specific (e.g., User.essaysWithDefaults rather than User.essays), and designing pagination that returns a simple data list rather than Relay‑specific edge/node structures.
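As a rough illustration of the last two recommendations, the sketch below pairs a small schema fragment with a pagination helper that returns a plain `data` list plus paging info. Only the field name `User.essaysWithDefaults` comes from the article; the `Essay`, `PageInfo`, and `UserPage` types, the `matches` query, and the `paginate` helper are hypothetical names chosen for this example, not OkCupid's actual schema.

```typescript
// Schema fragment written as SDL text. Everything except the field name
// User.essaysWithDefaults is a hypothetical illustration.
const typeDefs = `
  type Essay {
    title: String!
    content: String
  }

  type User {
    id: ID!
    # Specific name: tells the client that defaults are already filled in.
    essaysWithDefaults: [Essay!]!
  }

  type PageInfo {
    hasMore: Boolean!
    after: String
  }

  # A plain list under "data", with no edges/node wrappers.
  type UserPage {
    data: [User!]!
    pageInfo: PageInfo!
  }

  type Query {
    matches(after: String, limit: Int = 20): UserPage!
  }
`;

interface PageInfo {
  hasMore: boolean;
  after: string | null;
}

interface Page<T> {
  data: T[];
  pageInfo: PageInfo;
}

// Simplification: the cursor is the stringified index of the next item.
function paginate<T>(items: readonly T[], after: string | null, limit: number): Page<T> {
  const parsed = after === null ? 0 : parseInt(after, 10);
  const start = isNaN(parsed) || parsed < 0 ? 0 : parsed;
  const end = Math.min(start + limit, items.length);
  const hasMore = end < items.length;
  return {
    data: items.slice(start, end),
    pageInfo: { hasMore, after: hasMore ? String(end) : null },
  };
}
```

A client reads `page.data` directly and passes `page.pageInfo.after` back as the next cursor, which keeps the response shape flat compared with walking `edges` and `node`.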
Adding Shadow Requests
Before serving GraphQL data to users, we sent parallel requests to both the REST and GraphQL APIs on the target page. Users still saw only the REST data, which let us compare performance and fix issues before they could affect anyone.
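One way to picture a shadow request is the sketch below: both calls start together, the REST result is what the page receives, and the GraphQL result is only timed and compared. The article does not describe OkCupid's implementation, so `fetchWithShadow`, `ShadowReport`, and the injected fetcher functions are assumptions made for illustration.

```typescript
type Fetcher<T> = () => Promise<T>;

interface ShadowReport {
  restMs: number;
  graphqlMs: number | null; // null when the shadow call failed
  matches: boolean | null; // null when there was nothing to compare
  shadowError: string | null;
}

interface Timed<T> {
  value: T;
  ms: number;
}

interface ShadowOutcome<T> {
  result: Timed<T> | null;
  error: string | null;
}

async function timed<T>(fn: Fetcher<T>): Promise<Timed<T>> {
  const start = Date.now();
  const value = await fn();
  return { value, ms: Date.now() - start };
}

// Naive comparison: JSON text equality, so key order matters.
function sameJson(a: unknown, b: unknown): boolean {
  return JSON.stringify(a) === JSON.stringify(b);
}

async function fetchWithShadow<T>(
  fetchRest: Fetcher<T>,
  fetchGraphql: Fetcher<T>,
  report: (r: ShadowReport) => void,
): Promise<T> {
  // Start both requests at once so the shadow call adds no serial latency.
  const restPromise = timed(fetchRest);
  // The shadow promise never rejects: failures are captured as data.
  const shadowPromise: Promise<ShadowOutcome<T>> = timed(fetchGraphql).then(
    (result): ShadowOutcome<T> => ({ result, error: null }),
    (err: unknown): ShadowOutcome<T> => ({ result: null, error: String(err) }),
  );

  const rest = await restPromise;

  // Report in the background; the page never waits on the shadow call.
  void shadowPromise.then((shadow) => {
    report({
      restMs: rest.ms,
      graphqlMs: shadow.result === null ? null : shadow.result.ms,
      matches: shadow.result === null ? null : sameJson(rest.value, shadow.result.value),
      shadowError: shadow.error,
    });
  });

  return rest.value;
}
```

In this sketch a slow or failing GraphQL call cannot change what the user sees. It only shows up in the reports, which is the property that makes shadow traffic safe to run in production.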
Running Experiments
We performed A/B testing with real users for a month, each variant covering over 100 000 users, and measured whether any statistically significant change occurred.
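The article does not say which metric or statistical test was used. As one common way to check for a significant difference between two variants, the sketch below runs a two‑proportion z‑test on a hypothetical conversion count per variant. The `Variant` shape, the function names, and the choice of test are all assumptions.

```typescript
interface Variant {
  users: number;
  conversions: number; // hypothetical metric, e.g. users who sent a message
}

// z statistic for the difference in conversion rate (treatment minus control),
// using the pooled proportion for the standard error.
function twoProportionZ(control: Variant, treatment: Variant): number {
  const pControl = control.conversions / control.users;
  const pTreatment = treatment.conversions / treatment.users;
  const pooled =
    (control.conversions + treatment.conversions) / (control.users + treatment.users);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / control.users + 1 / treatment.users),
  );
  if (standardError === 0) return 0; // no variation at all, so nothing to detect
  return (pTreatment - pControl) / standardError;
}

// 1.96 is the two-sided critical value for a 5% significance level.
function isSignificant(z: number, criticalValue = 1.96): boolean {
  return Math.abs(z) > criticalValue;
}
```

For a migration like this one, the desired outcome is usually that `isSignificant` returns false for the key metrics: the new API should behave the same as the old one from the user's point of view.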
3. Areas for Improvement
We discovered missing error‑type definitions in the GraphQL API and recognized that business logic belongs in the backend rather than the API layer.
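The article does not show what the missing error types would look like. One widely used pattern is to model expected failures as members of a union result type, so clients can branch on them explicitly. The sketch below assumes a hypothetical `sendMessage` mutation; every type and field name in it is invented for illustration.

```typescript
// Hypothetical SDL: expected failures are part of the schema, not just
// entries in the top-level "errors" array.
const errorTypeDefs = `
  type MessageSent { messageId: ID! }
  type UserBlockedError { message: String! }
  type RateLimitedError { message: String!, retryAfterSeconds: Int! }

  union SendMessageResult = MessageSent | UserBlockedError | RateLimitedError

  type Mutation {
    sendMessage(toUserId: ID!, text: String!): SendMessageResult!
  }
`;

// Matching client-side types, discriminated by __typename.
interface MessageSent {
  __typename: "MessageSent";
  messageId: string;
}

interface UserBlockedError {
  __typename: "UserBlockedError";
  message: string;
}

interface RateLimitedError {
  __typename: "RateLimitedError";
  message: string;
  retryAfterSeconds: number;
}

type SendMessageResult = MessageSent | UserBlockedError | RateLimitedError;

function describeResult(result: SendMessageResult): string {
  switch (result.__typename) {
    case "MessageSent":
      return `sent ${result.messageId}`;
    case "UserBlockedError":
      return `blocked: ${result.message}`;
    case "RateLimitedError":
      return `retry in ${result.retryAfterSeconds}s`;
    default: {
      // Compile-time check that every union member is handled.
      const unreachable: never = result;
      return unreachable;
    }
  }
}
```

With this shape, adding a new error type to the union makes the compiler flag every `switch` that has not been updated to handle it.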
4. Summary
The described process proved effective for quickly delivering a production GraphQL API, validating technical decisions, and catching errors before they impact users. Teams undertaking similar migrations can follow this workflow.