Overview of Flickr Website Architecture and Scaling Strategies
The article provides a comprehensive overview of Flickr’s web architecture, detailing its massive traffic statistics, load‑balancing, PHP application servers, MySQL dual‑tree sharding, Memcached caching, search engine integration, and the evolution of its replication and scaling strategies.
Flickr’s site handles up to 4 billion queries per day, stores over 2 PB of data (including 12 TB of databases), serves more than 8.5 million registered users, and processes millions of cache operations and database queries, illustrating the massive scale of its infrastructure.
The architecture consists of load‑balancing using ServerIron, Squid caches for static HTML and photos, NetApp NAS storage, PHP application servers running on Red Hat Linux/Apache, a private Flickr File System for massive file storage, and a dual‑tree central MySQL database that combines master‑master and master‑slave designs to avoid single points of failure while optimizing read performance.
PHP servers host roughly 60 000 lines of stateless PHP code (no sessions) with each page issuing 27‑35 queries; a separate Apache web farm serves static files. The storage manager runs a proprietary Flickr File System for large‑scale file handling.
The dual‑tree central database stores user‑to‑shard mappings; actual user data and photo metadata reside in master‑master shards, each holding about 400 000 users (≈120 GB). Shard servers operate at roughly 50 % capacity to allow online maintenance, and critical data is duplicated across shards with transactional consistency.
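The central user‑to‑shard mapping described above can be sketched as a lookup consulted before every query. The following Python sketch is illustrative only (the dict stands in for Flickr's replicated Global Lookup Cluster; names and DSNs are hypothetical, not Flickr's actual schema):

```python
# Stand-in for the Global Lookup Cluster: maps a user ID to the
# master-master shard pair that owns that user's data.
SHARD_DSNS = {
    1: "mysql://shard1-master-a,shard1-master-b/flickr",
    2: "mysql://shard2-master-a,shard2-master-b/flickr",
}

# In production this mapping lives in its own replicated MySQL
# cluster; here a plain dict stands in for it.
USER_TO_SHARD = {1001: 1, 1002: 2}

def shard_for_user(user_id):
    """Return the DSN of the shard holding this user's rows."""
    return SHARD_DSNS[USER_TO_SHARD[user_id]]
```

Every data access first resolves the shard, then opens a connection to it, so users can be moved between shards by updating one mapping row.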
A Memcached cluster caches SQL query results, reducing database load, while a dedicated search engine farm indexes a copy of tag data for real‑time full‑text search, supplemented by Yahoo’s search service for other queries.
Typical hardware includes 64‑bit Intel/AMD CPUs with 16 GB RAM, 6‑disk 15K RPM RAID‑10, housed in 2U boxes. The deployment comprises 166 database servers, 244 web servers (including Squid), and 14 Memcached servers (data from the 2008 MySQL Conference).
Initially, Flickr scaled from a single server to a dual‑server setup, then faced MySQL bottlenecks. The team evaluated Scale‑Up (adding CPU/RAM) versus Scale‑Out (adding servers). Because Scale‑Out offers near‑unlimited growth, Flickr adopted master‑slave replication, directing writes to the master and reads to slaves, which could be added as read load increased.
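The read/write split described above can be sketched as a small routing layer: writes go to the master, reads are spread across the slave pool. This is a hypothetical Python sketch (connection handles are modeled as plain strings):

```python
import random

class ReplicatedPool:
    """Route writes to the master and reads to a random slave.

    A real implementation would hold live DB handles and health-check
    the slaves; strings stand in for connections here.
    """
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def connection_for(self, sql):
        # Naive routing rule: only SELECTs may be served by a slave;
        # everything else must go to the single write master.
        if sql.lstrip().upper().startswith("SELECT"):
            return random.choice(self.slaves)
        return self.master

pool = ReplicatedPool("master", ["slave1", "slave2", "slave3"])
```

Adding read capacity is then just appending another entry to the slave list, which is exactly why this topology scales reads but not writes.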
To balance reads across slaves, a load‑balancer can be placed between the web app and the slave pool. An example Apache configuration is shown below:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule proxy_http_module modules/mod_proxy_http.so

<Proxy "balancer://slaves">
    BalancerMember "http://slave1:8008/App" loadfactor=4
    BalancerMember "http://slave2:8008/App" loadfactor=3
    BalancerMember "http://slave3:8008/App" loadfactor=3
</Proxy>
ProxyPass "/App" "balancer://slaves"

A lightweight PHP function for connecting to a list of MySQL hosts demonstrates simple read‑write distribution:
function db_connect($hosts, $user, $pass) {
    shuffle($hosts);  // randomize the order to spread load across hosts
    foreach ($hosts as $host) {
        debug("Trying to connect to $host...");
        // The fourth argument (new_link = 1) forces a fresh connection
        // instead of reusing an existing one to the same host.
        $dbh = @mysql_connect($host, $user, $pass, 1);
        if ($dbh) {
            debug("Connected to $host!");
            return $dbh;
        }
        debug("Failed to connect to $host!");
    }
    debug("Failed to connect to all hosts in list - giving up!");
    return 0;
}

While master‑slave replication improves read scalability, it does not address write bottlenecks or single points of failure. Flickr therefore moved to a master‑master shard configuration, forming a “dual‑tree” structure that combines the benefits of master‑master and master‑slave designs.
In 2005, Flickr introduced sharding to achieve linear Scale‑Out. Sharding splits large tables into many smaller databases, each holding a subset of users, allowing parallel writes, faster reads, and easier backup/recovery. Shard lookup is performed via a central Global Lookup Cluster that maps user IDs to specific shards.
Application code changes involve replacing a static connection string with a lookup call, e.g.:

string connectionString = GetDatabaseFor(customerId);
OdbcConnection conn = new OdbcConnection(connectionString);
conn.Open();

Sharding introduces challenges: cross‑shard joins are costly, requiring data duplication, and consistency must be maintained via transactions, two‑phase commit, or asynchronous queues. Flickr uses a PHP‑based queue system to handle operations such as comment propagation, ensuring eventual consistency while offloading work from the critical path.
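The queue‑based approach can be sketched as follows: the write to the owning shard happens on the critical path, while the duplicate write to another shard is enqueued and applied later by a worker. This Python sketch uses a stdlib in‑process queue as a stand‑in for Flickr's PHP queue system (all names are illustrative):

```python
from queue import Queue

# In-process queue standing in for Flickr's PHP queue system.
tasks = Queue()

# Toy shard storage: shard id -> list of comments (illustrative).
shards = {1: [], 2: []}

def post_comment(owner_shard, mirror_shard, comment):
    # Write synchronously to the owning shard on the critical path...
    shards[owner_shard].append(comment)
    # ...and defer the duplicate write, accepting eventual consistency.
    tasks.put((mirror_shard, comment))

def run_worker():
    # A background worker would loop forever; here we drain the queue once.
    while not tasks.empty():
        shard_id, comment = tasks.get()
        shards[shard_id].append(comment)

post_comment(1, 2, "nice shot!")
run_worker()
```

Between `post_comment` and `run_worker` the two shards briefly disagree; that window is the price paid for keeping the user‑facing request fast.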
Rebalancing shards when load thresholds are reached remains a manual process at Flickr, involving data migration and temporary account locking.
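The manual migration steps implied above (lock the account, copy its rows, repoint the global lookup, unlock) can be sketched roughly as follows. This is a hypothetical Python sketch with dicts standing in for shards and the lookup cluster; none of the names come from Flickr's actual tooling:

```python
def move_user(user_id, dst_shard, shards, mapping, locked):
    """Migrate one user's rows to dst_shard and update the lookup."""
    src_shard = mapping[user_id]
    locked.add(user_id)                    # 1. temporarily lock the account
    rows = shards[src_shard].pop(user_id)  # 2. copy the user's rows over
    shards[dst_shard][user_id] = rows
    mapping[user_id] = dst_shard           # 3. repoint the global lookup
    locked.discard(user_id)                # 4. unlock the account

# Toy example: user 42 starts on shard 1 and is moved to shard 2.
shards = {1: {42: ["photo_meta"]}, 2: {}}
mapping = {42: 1}
locked = set()
move_user(42, 2, shards, mapping, locked)
```

The account lock is what makes this safe without distributed transactions: no writes can land on the source shard mid‑copy.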
Memcached is employed as a distributed in‑memory cache for frequently accessed objects, reducing database I/O. It is suitable for read‑heavy workloads but adds complexity for write‑heavy data, lacking built‑in redundancy or failover.
Typical read and write patterns with Memcached are illustrated below. On a read, the cache is tried first and populated on a miss:

$data = memcached_fetch($id);
if ($data) return $data;      // cache hit
$data = db_fetch($id);        // cache miss: fall back to the database
memcached_store($id, $data);  // populate the cache for next time
return $data;

On a write, the database is updated first and the cache refreshed afterwards:

db_store($id, $data);
memcached_store($id, $data);

To mitigate replication lag, Flickr employs a “Write‑Through Cache” layer that writes to a cache before updating database nodes, directing reads to synchronized nodes. This approach, while complex, helps hide latency and maintain consistency.
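The write‑through idea can be sketched as follows: every write updates the cache on the write path before persisting, so reads served from cache never observe lagging replicas. A hypothetical Python sketch with dicts standing in for Memcached and the database:

```python
class WriteThroughCache:
    """Write-through: the cache is updated as part of every write,
    then the value is persisted, so cache reads stay consistent even
    while slaves lag behind the master. Dicts are stand-ins for
    Memcached and MySQL; this is an illustrative sketch.
    """
    def __init__(self):
        self.cache = {}
        self.db = {}

    def write(self, key, value):
        self.cache[key] = value  # cache updated on the write path
        self.db[key] = value     # then persisted to the database

    def read(self, key):
        if key in self.cache:    # hit: no database round-trip
            return self.cache[key]
        value = self.db.get(key) # miss: fall back to the database
        self.cache[key] = value
        return value
```

The trade‑off versus the cache‑aside pattern above is complexity: the cache is now on the write path, so a cache failure affects writes, not just read latency.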