Inside Medium's Scalable Backend Architecture: Services, Databases, and Deployment
This article details Medium's complex backend infrastructure, covering its service-oriented design, use of Node.js, Go, DynamoDB, Aurora, Redis, messaging queues, CDN strategies, monitoring tools, and continuous deployment practices.
Architecture Overview
Medium runs on AWS using a service‑oriented architecture. The primary backend is Node.js, which allows code sharing between front‑end and back‑end for article editing and publishing. To avoid blocking the event loop, Medium runs multiple Node instances per host and routes high‑load tasks to dedicated machines. Supplementary utilities are written in Go for type safety and easy deployment.
Infrastructure
Compute: EC2 instances managed with Ansible; configuration files are version‑controlled.
Load balancing and reverse proxy: Nginx and HAProxy.
CDN: CloudFlare serves most static assets; Fastly and CloudFront each serve 5% for cache‑invalidation flexibility.
Monitoring & alerting: Datadog and PagerDuty; logs are aggregated with the ELK stack (Elasticsearch, Logstash, Kibana).
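The CDN split listed above can be pictured as a weighted choice per asset. The following is a minimal sketch, not Medium's routing code: the 90/5/5 weights are inferred from "most" plus "5% each", and `provider`, `pickCDN`, and the hash-based bucketing are hypothetical names and choices made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// provider pairs a CDN name with its assumed share of traffic in percent.
type provider struct {
	name   string
	weight uint32
}

// Assumed split: the article only says "most" for CloudFlare and 5% each
// for Fastly and CloudFront, so 90/5/5 is an inference.
var providers = []provider{
	{"cloudflare", 90},
	{"fastly", 5},
	{"cloudfront", 5},
}

// pickCDN hashes the asset path into a bucket in [0, 100) and walks the
// cumulative weights, so a given path always maps to the same provider.
func pickCDN(assetPath string) string {
	h := fnv.New32a()
	h.Write([]byte(assetPath))
	bucket := h.Sum32() % 100

	var cumulative uint32
	for _, p := range providers {
		cumulative += p.weight
		if bucket < cumulative {
			return p.name
		}
	}
	return providers[0].name // not reached while the weights sum to 100
}

func main() {
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickCDN(fmt.Sprintf("/static/asset-%d.js", i))]++
	}
	fmt.Println(counts) // roughly proportional to the weights
}
```

Hashing the path rather than picking at random keeps each asset on one provider, which avoids filling three caches with the same object. That property is a design choice of this sketch, not something the article states.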
Data Stores
DynamoDB – primary key‑value store.
Redis cluster – sits in front of DynamoDB as a cache to absorb hot‑key spikes.
Amazon Aurora – relational store for flexible queries.
Neo4j – graph database holding relationships among users, articles, tags, and collections; used for recommendation traversals.
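Placing Redis in front of DynamoDB is typically done with a cache-aside read path: check the cache, fall back to the primary store on a miss, then populate the cache. The sketch below illustrates that pattern under stated assumptions. A plain map stands in for Redis, `mapStore` stands in for DynamoDB, and every type and function name is hypothetical. The article does not describe Medium's actual caching logic.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is returned when a key exists in neither cache nor store.
var ErrNotFound = errors.New("not found")

// Store is a hypothetical stand-in for the primary key-value store.
type Store interface {
	Get(key string) (string, error)
}

// mapStore is an in-memory Store that counts how often it is read.
type mapStore struct {
	data  map[string]string
	reads int
}

func (s *mapStore) Get(key string) (string, error) {
	s.reads++
	v, ok := s.data[key]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// CachedStore implements cache-aside reads. The map stands in for Redis.
type CachedStore struct {
	mu      sync.Mutex
	cache   map[string]string
	backing Store
}

func NewCachedStore(backing Store) *CachedStore {
	return &CachedStore{cache: map[string]string{}, backing: backing}
}

// Get serves from the cache when possible and fills it on a miss.
// Holding the lock across the backing read is a crude way to keep
// concurrent requests for a hot key from all reaching the store.
func (c *CachedStore) Get(key string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.cache[key]; ok {
		return v, nil
	}
	v, err := c.backing.Get(key)
	if err != nil {
		return "", err
	}
	c.cache[key] = v
	return v, nil
}

func main() {
	backing := &mapStore{data: map[string]string{"post:1": "Hello, Medium"}}
	store := NewCachedStore(backing)
	for i := 0; i < 1000; i++ {
		if _, err := store.Get("post:1"); err != nil {
			panic(err)
		}
	}
	fmt.Println(backing.reads) // prints 1: only the first read hit the store
}
```

A production version would also need expiry and invalidation on writes, which this sketch leaves out.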
Data Platform
Warehouse: Amazon Redshift receives core entities from DynamoDB and event logs from S3.
ETL: Apache Spark runs batch pipelines; internal scheduler Conduit executes assertion‑based jobs, ensuring producers and consumers remain decoupled.
Schema: Protocol Buffers define contracts across services, mobile apps, and the warehouse.
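One way to read "assertion‑based" scheduling is that each job declares which data assertions it needs and which it produces, and the scheduler runs whatever has become runnable. Conduit is internal and its design is not described here, so the following is only a hypothetical sketch of that idea. `Job`, `RunAll`, and the assertion names are all invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Job is a hypothetical ETL step. It never names another job; it only
// names the data assertions it requires and the ones it produces.
type Job struct {
	Name     string
	Requires []string
	Asserts  []string
	Run      func() error
}

// ready reports whether every requirement of j has been asserted.
func ready(j Job, asserted map[string]bool) bool {
	for _, r := range j.Requires {
		if !asserted[r] {
			return false
		}
	}
	return true
}

// RunAll keeps sweeping the job list, running each pending job whose
// requirements are met, until all jobs ran or no progress is possible.
// It returns the order in which jobs were executed.
func RunAll(jobs []Job, initial ...string) ([]string, error) {
	asserted := map[string]bool{}
	for _, a := range initial {
		asserted[a] = true
	}
	done := make([]bool, len(jobs))
	var order []string
	for len(order) < len(jobs) {
		progressed := false
		for i, j := range jobs {
			if done[i] || !ready(j, asserted) {
				continue
			}
			if j.Run != nil {
				if err := j.Run(); err != nil {
					return order, fmt.Errorf("job %s: %w", j.Name, err)
				}
			}
			for _, a := range j.Asserts {
				asserted[a] = true
			}
			done[i] = true
			order = append(order, j.Name)
			progressed = true
		}
		if !progressed {
			return order, errors.New("no runnable job: unmet requirements")
		}
	}
	return order, nil
}

func main() {
	jobs := []Job{
		{Name: "load_warehouse", Requires: []string{"events.cleaned"}},
		{Name: "clean_events", Requires: []string{"events.raw"}, Asserts: []string{"events.cleaned"}},
	}
	order, err := RunAll(jobs, "events.raw")
	fmt.Println(order, err) // prints: [clean_events load_warehouse] <nil>
}
```

Because `load_warehouse` depends on the assertion `events.cleaned` rather than on `clean_events` itself, the producer can be replaced without touching the consumer. That is the decoupling the ETL description refers to.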
Image Service
A Go service implements a waterfall processing pipeline. It uses groupcache (a memcached‑style in‑memory cache) backed by persistent S3 storage. Image operations—resize, crop, color cleaning, sharpening—are performed on‑demand when a request arrives.
Text Annotation Service
A lightweight Go server invokes PhantomJS for rendering annotated HTML; a future migration to Pango is planned for richer image handling.
Custom Domains
Users can map personal domains with full HTTPS coverage. HAProxy instances manage certificates and route traffic; automation integrates with Namecheap for DNS and certificate provisioning.
Frontend Stack
The web client is a single‑page application built on a custom framework that uses the Closure library, Closure Templates for server‑side rendering, and the Closure Compiler for minification and module splitting.
Mobile Clients
iOS: native app using NSURLSession for networking, Mantle for JSON‑model mapping, custom UICollectionView layouts, and XCTest/OCMock for testing.
Android: native app targeting the latest SDK, using Guava for utility extensions, Protocol Buffers for API definitions, and Mockito/Robolectric for unit and integration tests.
Feature Flags & A/B Testing
All clients consume server‑side feature flags (variants) to enable gradual rollouts and experiments.
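A common way to implement gradual rollouts is to hash the flag name and user ID into a stable bucket and compare it with the rollout percentage. The sketch below shows that general technique. It is not taken from Medium's variants system, and `Flag`, `RolloutPercent`, and `bucket` are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Flag is a hypothetical server-side flag definition.
type Flag struct {
	Name           string
	RolloutPercent uint32 // share of users who see the feature, 0 to 100
}

// bucket maps a (flag, user) pair to a stable value in [0, 100).
// Including the flag name keeps different flags from selecting the
// same subset of users.
func bucket(flagName, userID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(flagName))
	h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
	h.Write([]byte(userID))
	return h.Sum32() % 100
}

// Enabled reports whether the flag is on for the given user.
func (f Flag) Enabled(userID string) bool {
	return bucket(f.Name, userID) < f.RolloutPercent
}

func main() {
	flag := Flag{Name: "new-editor", RolloutPercent: 10}
	enabled := 0
	for i := 0; i < 10000; i++ {
		if flag.Enabled(fmt.Sprintf("user-%d", i)) {
			enabled++
		}
	}
	fmt.Printf("%d of 10000 users see the feature\n", enabled)
}
```

Because the bucket is fixed per user, raising `RolloutPercent` only adds users; nobody who already has the feature loses it during a rollout.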
Supporting Services
Search: Algolia.
Email: SendGrid.
Push notifications: Urban Airship.
Queueing: Amazon SQS.
Bloom filters: Bloomd.
RSS distribution: PubSubHubbub and Superfeedr.
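For readers unfamiliar with the data structure behind Bloomd, a Bloom filter answers "possibly seen" or "definitely not seen" using a fixed bit array. The sketch below is a minimal in-process illustration of the structure only. It does not use or imitate the Bloomd server's protocol, and the sizes and the post-ID example are arbitrary assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal Bloom filter with m bits and k hash positions.
type Bloom struct {
	bits []bool
	k    uint32
}

func NewBloom(m int, k uint32) *Bloom {
	return &Bloom{bits: make([]bool, m), k: k}
}

// indexes derives k positions from two FNV hashes using double hashing:
// position i is (h1 + i*h2) mod m. Unsigned overflow simply wraps.
func (b *Bloom) indexes(item string) []uint32 {
	h1 := fnv.New32a()
	h1.Write([]byte(item))
	h2 := fnv.New32()
	h2.Write([]byte(item))
	a, c := h1.Sum32(), h2.Sum32()|1 // force an odd, non-zero step

	out := make([]uint32, b.k)
	for i := uint32(0); i < b.k; i++ {
		out[i] = (a + i*c) % uint32(len(b.bits))
	}
	return out
}

// Add marks the item as seen.
func (b *Bloom) Add(item string) {
	for _, idx := range b.indexes(item) {
		b.bits[idx] = true
	}
}

// MayContain returns false only if the item was definitely never added.
// A true result can be a false positive.
func (b *Bloom) MayContain(item string) bool {
	for _, idx := range b.indexes(item) {
		if !b.bits[idx] {
			return false
		}
	}
	return true
}

func main() {
	seen := NewBloom(1024, 3)
	seen.Add("post:42")
	fmt.Println(seen.MayContain("post:42")) // prints true
	fmt.Println(seen.MayContain("post:43")) // usually false; may be a false positive
}
```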
Build, Test, and Deployment
CI/CD: Jenkins orchestrates builds; the build system migrated from Make to Pants.
Testing: combination of unit tests and HTTP‑level functional tests; distributed execution via Cluster Runner integrated with GitHub.
Deployment: blue‑green and canary strategies with DNS‑based rollbacks; typical stage deployment under 15 minutes, up to ten deployments per day.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.