
How Netflix’s Write‑Ahead Log Powers a Resilient Data Platform at Massive Scale

Netflix built a distributed write-ahead log (WAL) system to guarantee data consistency, durability, and high throughput across services, tackling challenges such as data loss, cross-store entropy, multi-partition updates, replication, and reliable retries for real-time pipelines.


Introduction

Operating at Netflix's massive scale requires a data platform that keeps content and features consistent, reliable, and efficient for hundreds of millions of users. A core component of this platform is a distributed write-ahead log (WAL) that abstracts the persistence and delivery of data changes to downstream consumers.

Key Challenges

Unexpected data loss and corruption in databases.

System entropy when writing to multiple stores (e.g., Cassandra and Elasticsearch).

Updating many partitions (e.g., secondary indexes built on top of NoSQL stores).

In‑region and cross‑region data replication.

Reliable retry mechanisms for large‑scale real‑time pipelines.

Batch deletions that exhaust key‑value node memory.

These issues often caused production incidents, demanded one-off custom solutions, and accumulated technical debt. In one incident, a single mistaken ALTER TABLE statement corrupted data; extending the cache TTL and writing through Kafka allowed rapid recovery.

WAL Overview

The WAL is a distributed system that captures data changes, provides strong durability guarantees, and reliably forwards those changes to downstream consumers. It supports use cases such as secondary indexing, cross‑region replication for non‑replicated stores, and delayed‑queue semantics.

API

The public API is intentionally simple, exposing only the necessary parameters. The primary endpoint is WriteToLog:

rpc WriteToLog (WriteToLogRequest) returns (WriteToLogResponse) { ... }
/** WAL request message */
message WriteToLogRequest {
  string namespace = 1;
  Lifecycle lifecycle = 2;
  bytes payload = 3;
  Target target = 4;
}
/** WAL response message */
message WriteToLogResponse {
  Trilean durable = 1;
  string message = 2;
}
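To make the call shape concrete, here is a minimal Java client sketch. Only the WriteToLogRequest and WriteToLogResponse fields come from the proto above; the stub name, endpoint, port, namespace value, and the Trilean values are assumptions, since the article does not show them.

import com.google.protobuf.ByteString;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Hypothetical client sketch. "WalServiceGrpc" and the endpoint are assumed;
// only the WriteToLogRequest/WriteToLogResponse fields come from the proto above.
public class WalClientExample {
    public static void main(String[] args) {
        ManagedChannel channel = ManagedChannelBuilder
            .forAddress("dgwwal.foobar.cluster.us-east-1.netflix.net", 8443)
            .useTransportSecurity()
            .build();
        WalServiceGrpc.WalServiceBlockingStub stub = WalServiceGrpc.newBlockingStub(channel);

        WriteToLogRequest request = WriteToLogRequest.newBuilder()
            .setNamespace("pds_delayed_queue") // selects queue, targets, and retry config
            .setPayload(ByteString.copyFromUtf8("{\"op\":\"delete\",\"id\":\"42\"}"))
            .build();

        WriteToLogResponse response = stub.writeToLog(request);
        // "durable" is a Trilean, so a write can plausibly come back TRUE, FALSE,
        // or UNKNOWN (e.g., on timeout); treat UNKNOWN as "retry or reconcile".
        System.out.println(response.getDurable() + ": " + response.getMessage());
        channel.shutdown();
    }
}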

A namespace defines where and how data is stored, allowing logical isolation and per‑namespace queue selection (Kafka, SQS, or a combination). Namespaces also hold configuration such as back‑off multipliers and max retry attempts.
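As an illustration of how such settings might drive retries, the sketch below computes an exponential back-off schedule from a multiplier and an attempt cap. The field names and defaults are hypothetical; the article does not list the actual configuration keys.

// Hypothetical sketch: deriving retry delays from per-namespace settings.
// "baseDelayMillis", "backoffMultiplier", and "maxRetryAttempts" are assumed names.
final class RetryPolicy {
    private final long baseDelayMillis = 500;
    private final double backoffMultiplier = 2.0;
    private final int maxRetryAttempts = 5;

    // Delay before the given attempt (1-based), or -1 once attempts are exhausted.
    long delayForAttempt(int attempt) {
        if (attempt > maxRetryAttempts) {
            return -1; // exhausted: route the message to the namespace's DLQ
        }
        return (long) (baseDelayMillis * Math.pow(backoffMultiplier, attempt - 1));
    }
}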

Roles and Namespace Configurations

Role 1 – Delayed Queue

For a product‑data system that uses SQS, the namespace configuration enables delayed messages:

{
  "persistenceConfigurations": {
    "persistenceConfiguration": [
      {
        "physicalStorage": { "type": "SQS" },
        "config": {
          "wal-queue": ["dgwwal-dq-pds"],
          "wal-dlq-queue": ["dgwwal-dlq-pds"],
          "queue.poll-interval.secs": 10,
          "queue.max-messages-per-poll": 100
        }
      }
    ]
  }
}

Role 2 – General Cross‑Region Replication

For EVCache cross‑region replication the namespace uses Kafka as the underlying transport:

{
  "persistence_configurations": {
    "persistence_configuration": [
      {
        "physical_storage": { "type": "KAFKA" },
        "config": {
          "consumer_stack": "consumer",
          "context": "cross region replication for evcache_foobar",
          "target": {
            "euwest1": "dgwwal.foobar.cluster.eu-west-1.netflix.net",
            "useast1": "dgwwal.foobar.cluster.us-east-1.netflix.net",
            "useast2": "dgwwal.foobar.cluster.us-east-2.netflix.net",
            "uswest2": "dgwwal.foobar.cluster.us-west-2.netflix.net"
          },
          "wal-kafka-topics": ["evcache_foobar"],
          "wal.kafka.bootstrap.servers.prefix": "kafka-foobar"
        }
      }
    ]
  }
}

Role 3 – Multi‑Partition Updates

For a key‑value service that needs multi‑ID or multi‑table mutations, the namespace includes both Kafka and durable storage to support a two‑phase commit:

{
  "persistence_configurations": {
    "persistence_configuration": [
      {
        "physical_storage": { "type": "KAFKA" },
        "config": {
          "consumer_stack": "consumer",
          "context": "WAL to support multi-id/namespace mutations for dgwkv.foobar",
          "durable_storage": {
            "namespace": "foobar_wal_type",
            "shard": "walfoobar",
            "type": "kv"
          },
          "wal-kafka-topics": ["foobar_kv_multi_id"],
          "wal.kafka.bootstrap.servers.prefix": "kaas_kafka-dgwwal_foobar7102"
        }
      }
    ]
  }
}

Because of the underlying queue implementations, all WAL requests carry at-least-once delivery semantics; consumers must therefore tolerate duplicate messages.
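A consumer that applies side effects consequently needs to be idempotent. Below is a minimal deduplication sketch, assuming every WAL message carries a unique ID; the article does not specify the actual message schema.

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical idempotent consumer for at-least-once delivery. In practice the
// seen-ID set would be bounded or persisted; an in-memory set keeps the sketch short.
class IdempotentConsumer {
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    void onMessage(String messageId, byte[] payload) {
        if (!processed.add(messageId)) {
            return; // duplicate delivery: already applied, safe to drop
        }
        applyToTarget(payload);
    }

    private void applyToTarget(byte[] payload) {
        // hypothetical: write to the namespace's target (Cassandra, EVCache, ...)
    }
}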

Underlying Principles

The architecture consists of several cooperating components:

Producer‑Consumer Separation: Producers ingest client messages into a queue; consumers read from the queue and deliver to the target store (Cassandra, Memcached, Kafka, etc.). The control plane makes the producer/consumer model pluggable.

Default Dead‑Letter Queues: Each namespace’s queue has an associated DLQ to handle transient and hard errors.

Flexible Targeting: Messages can be routed to any target (database, cache, another queue, or upstream service) based on namespace configuration.

These principles enable a plug‑in architecture where the underlying queue technology can be swapped without code changes.
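In code, that plug-in shape might look like the interfaces below. The names are illustrative, not Netflix's actual API; the point is that the control plane can bind any queue implementation and any delivery target per namespace.

import java.util.List;

// Illustrative plug-in shape: the queue and the delivery target sit behind
// interfaces, and the control plane picks concrete implementations per namespace.
interface MessageQueue {
    void enqueue(byte[] payload);
    List<byte[]> poll(int maxMessages);
}

interface DeliveryTarget {
    void deliver(byte[] payload) throws Exception;
}

class WalConsumer {
    private final MessageQueue queue;       // e.g., Kafka- or SQS-backed
    private final DeliveryTarget target;    // e.g., Cassandra, EVCache, another queue
    private final MessageQueue deadLetter;  // the namespace's default DLQ

    WalConsumer(MessageQueue queue, DeliveryTarget target, MessageQueue deadLetter) {
        this.queue = queue;
        this.target = target;
        this.deadLetter = deadLetter;
    }

    void drainOnce() {
        for (byte[] payload : queue.poll(100)) {
            try {
                target.deliver(payload);
            } catch (Exception e) {
                deadLetter.enqueue(payload); // failures land in the DLQ for later handling
            }
        }
    }
}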

Deployment Model

WAL runs on Netflix's Data-Gateway infrastructure, inheriting mTLS, connection management, authentication, and runtime configuration. Deployments are organized into shards, where a shard is a set of physical machines. Different services (e.g., the ad-event service or the game-directory service) get their own shards to avoid "noisy-neighbor" interference.

Shards can host multiple namespaces, and operators can add more queues to a namespace if throughput becomes a bottleneck. The configuration is stored in a globally replicated relational database for consistency.

Auto‑scaling groups adjust producer and consumer instance counts based on CPU and network thresholds, using Netflix’s adaptive load‑shedding library and Envoy for request throttling. WAL can be deployed across multiple regions, each with its own instance group.

Key Use Cases

Delayed Queue

Applications that need to execute a request after a delay write to WAL, which guarantees delivery after the specified delay. Example: Netflix Live Origin offloads bulk delete requests to WAL with random jitter, smoothing the request curve and preventing Cassandra overload.
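Here is a sketch of the jitter technique, under the assumption that each delete is delayed by a uniformly random offset within a smoothing window; the window length and the helper are hypothetical.

import java.time.Duration;
import java.util.List;
import java.util.Random;

// Hypothetical jitter sketch: spreading bulk deletes over a window flattens
// the write spike that would otherwise hit Cassandra all at once.
class JitteredDeletes {
    private static final Duration WINDOW = Duration.ofMinutes(30); // assumed window
    private final Random random = new Random();

    void offload(List<byte[]> deletePayloads) {
        for (byte[] payload : deletePayloads) {
            long delayMillis = (long) (random.nextDouble() * WINDOW.toMillis());
            writeToWalWithDelay(payload, delayMillis);
        }
    }

    private void writeToWalWithDelay(byte[] payload, long delayMillis) {
        // hypothetical wrapper around WriteToLog with a Lifecycle-carried delay
    }
}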

Cross‑Region Replication

WAL enables global replication for services like EVCache. A client writes to its local WAL shard; the WAL consumer reads from Kafka and forwards the change to target regions, where writer groups apply the mutation to regional EVCache clusters.
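A sketch of the forwarding step follows, reusing the per-region "target" map from the Role 2 configuration above; the method names and RPC plumbing are assumptions.

import java.util.Map;

// Illustrative replication consumer: forward each mutation to every region in
// the namespace's target map except the one it originated from.
class CrossRegionReplicator {
    // endpoints copied from the Role 2 namespace configuration
    private final Map<String, String> targets = Map.of(
        "euwest1", "dgwwal.foobar.cluster.eu-west-1.netflix.net",
        "useast1", "dgwwal.foobar.cluster.us-east-1.netflix.net",
        "useast2", "dgwwal.foobar.cluster.us-east-2.netflix.net",
        "uswest2", "dgwwal.foobar.cluster.us-west-2.netflix.net");

    void replicate(byte[] mutation, String originRegion) {
        for (Map.Entry<String, String> region : targets.entrySet()) {
            if (region.getKey().equals(originRegion)) {
                continue; // don't echo the write back to its source region
            }
            forwardToRegion(region.getValue(), mutation);
        }
    }

    private void forwardToRegion(String endpoint, byte[] mutation) {
        // hypothetical RPC to the remote region's writer group
    }
}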

Multi‑Table Mutations

Key‑Value services use WAL to implement a MutateItems API that supports atomic multi‑table, multi‑ID changes via a two‑phase commit. The API definition:

message MutateItemsRequest {
  repeated MutationRequest mutations = 1;
  message MutationRequest {
    oneof mutation {
      PutItemsRequest put = 1;
      DeleteItemsRequest delete = 2;
    }
  }
}
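For example, a caller might combine a put against one table with a delete against another in a single request. The builder usage below follows standard protobuf-Java conventions, while the contents of PutItemsRequest and DeleteItemsRequest are elided because their definitions are not shown.

// Hypothetical request assembly: one put and one delete submitted as a single
// logical mutation against two different tables.
MutateItemsRequest request = MutateItemsRequest.newBuilder()
    .addMutations(MutateItemsRequest.MutationRequest.newBuilder()
        .setPut(PutItemsRequest.newBuilder().build()))        // items for table A
    .addMutations(MutateItemsRequest.MutationRequest.newBuilder()
        .setDelete(DeleteItemsRequest.newBuilder().build()))  // ids for table B
    .build();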

WAL guarantees eventual consistency: each operation is assigned a sequence number, persisted, and later replayed by consumers. Failed mutations are re‑queued for retry.

Conclusions

Building a generic WAL taught Netflix several lessons:

Pluggable architecture is essential; configuration‑driven targets avoid code changes.

Leverage existing control‑plane and key‑value abstractions to focus on WAL‑specific challenges.

Separate concerns (producer vs. consumer) to enable independent scaling and reduce blast radius.

Expect failures: WAL itself can experience traffic spikes, slow consumers, and non‑transient errors, requiring careful trade‑offs and back‑pressure strategies.

Future Work

Planned extensions include adding secondary indexes to the key‑value service and using WAL to fan‑out requests to multiple data stores (e.g., database plus backup or queue).

[Figures: WAL architecture diagram; delayed queue example; Kafka retry example; Kafka consumer retry example; cross-region replication diagram; multi-table mutation architecture; multi-table mutation sequence diagram]
Tags: backend architecture, data replication, write-ahead log, Netflix
Written by dbaplus Community

Enterprise-level professional community for Database, Big Data, and AIOps: daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS & DAMS conferences delivered by industry experts.
