Mastering etcd: History, Architecture, and Real‑World Use Cases

This article traces etcd’s evolution from its CoreOS origins, explains its Raft‑based distributed architecture, details its API groups, versioning and watch mechanisms, and showcases typical scenarios such as metadata storage, service discovery, leader election, and distributed coordination in cloud‑native environments.

Alibaba Cloud Native

Development Timeline

etcd originated at CoreOS to provide a highly available, strongly consistent key‑value store for distributed configuration and OS upgrade coordination. It later became a CNCF incubating project and is used by major cloud providers.

June 2013 – First commit to GitHub.

June 2014 – Adopted by Kubernetes v0.4 (etcd 0.2), accelerating community growth.

Feb 2015 – etcd 2.0 released with a redesigned Raft implementation; >1,000 writes/s.

Jan 2017 – etcd 3.1 released, building on the gRPC-based v3 API introduced in etcd 3.0, with more efficient linearizable reads and GC optimizations; >10,000 writes/s.

2018 – CNCF incubation; >400 contributors from eight companies.

2019 – etcd 3.4 co‑developed by Google and Alibaba, further performance and stability improvements.

Overall Architecture

etcd is a distributed, reliable key‑value store built on the Raft consensus algorithm. A typical production cluster consists of 3 or 5 nodes. One node is elected leader; the leader serialises writes and replicates log entries to followers. If the leader fails, a new leader is elected automatically.

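Clients may send reads and writes to any node; writes are forwarded to the leader and commit only once a quorum of floor(n/2) + 1 nodes has persisted them (this equals (n+1)/2 for the usual odd cluster sizes). Because any two majorities intersect in at least one node, every new majority contains a node that holds all committed log entries, so committed data survives leader changes and the cluster can guarantee linearizable consistency.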
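The quorum arithmetic above can be checked with a few lines of integer math. This is a minimal sketch in Python; the function names `quorum` and `fault_tolerance` are illustrative and not part of etcd.

```python
def quorum(n: int) -> int:
    """Smallest number of members that forms a majority of an n-member cluster."""
    if n < 1:
        raise ValueError("a cluster needs at least one member")
    # Integer division: equals (n + 1) / 2 when n is odd.
    return n // 2 + 1


def fault_tolerance(n: int) -> int:
    """How many members may fail while a majority can still be formed."""
    return n - quorum(n)


if __name__ == "__main__":
    for size in (1, 3, 4, 5):
        print(f"members={size} quorum={quorum(size)} tolerated_failures={fault_tolerance(size)}")
```

A 4-node cluster needs 3 nodes for a quorum and therefore tolerates only one failure, the same as a 3-node cluster, which is one reason odd sizes such as 3 or 5 are the typical choice.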

Figure: etcd cluster architecture

API Overview

etcd exposes five logical API groups:

Put & Delete – Simple key/value writes and deletions.

Range (Query) – Single‑key lookups or range queries.

Watch – Real‑time subscription to key changes; supports prefix watches.

Txn (Transactions) – Conditional atomic operations (if‑else semantics).

Lease – Time‑bound contracts that automatically expire attached keys.

Data Versioning and MVCC

Each key carries three version fields:

create_revision – Global revision at which the key was created.

mod_revision – Global revision of the most recent modification to the key.

version – Number of writes to the key since it was created; it is 1 right after creation and starts over if the key is deleted and created again.

Two global counters are maintained by the cluster:

term – Increments each time the Raft leader changes.

revision – Monotonically increasing global data version; incremented once per write transaction, so every change made by the same transaction shares one revision.

The revision counter is what enables multi-version concurrency control (MVCC) and precise watch semantics. Clients can read historical state by requesting a specific revision, and a watch can start from any past revision that has not yet been compacted, receiving every subsequent change in order.
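To make the relationship between the per-key fields and the global revision concrete, here is a small in-memory model. It is only a sketch under simplifying assumptions (single process, no compaction, one key per write); the class `ToyMVCCStore` and its methods are hypothetical and do not mirror the etcd client API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class KeyValue:
    key: str
    value: str
    create_revision: int
    mod_revision: int
    version: int


class ToyMVCCStore:
    """Hypothetical in-memory model of revisioned keys (not the etcd API)."""

    def __init__(self) -> None:
        self.revision = 0
        # Append-only history: (revision, key, new state or None for a delete).
        self.log: List[Tuple[int, str, Optional[KeyValue]]] = []

    def get(self, key: str, revision: Optional[int] = None) -> Optional[KeyValue]:
        """Return the key's state as of `revision` (default: latest)."""
        limit = self.revision if revision is None else revision
        state = None
        for rev, k, kv in self.log:
            if rev > limit:
                break
            if k == key:
                state = kv
        return state

    def put(self, key: str, value: str) -> KeyValue:
        prev = self.get(key)
        self.revision += 1
        kv = KeyValue(
            key=key,
            value=value,
            create_revision=prev.create_revision if prev else self.revision,
            mod_revision=self.revision,
            version=prev.version + 1 if prev else 1,
        )
        self.log.append((self.revision, key, kv))
        return kv

    def delete(self, key: str) -> bool:
        if self.get(key) is None:
            return False
        self.revision += 1
        self.log.append((self.revision, key, None))
        return True

    def watch(self, key: str, start_revision: int) -> List[Tuple[int, Optional[KeyValue]]]:
        """Replay every change to `key` at or after `start_revision`, in order."""
        return [(rev, kv) for rev, k, kv in self.log if k == key and rev >= start_revision]
```

Because history is kept per revision, a reader asking for an old revision and a watcher replaying from that revision see the same sequence of states, which is the property the watch mechanism relies on.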

Figure: MVCC and watch internals

Mini‑Transactions

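A transaction is an atomic if-then-else block: a list of comparisons guards two lists of operations, and exactly one of the two lists is applied. In pseudocode: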

if Value(key1) > "bar" && Version(key1) == 2 {
    Put(key2, "valueX")
    Delete(key3)
} else {
    Put(key2, "valueY")
}

The entire block sees a consistent snapshot of the store and either fully succeeds or fails, guaranteeing atomicity.
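The pseudocode above can be imitated with a toy compare-then-apply function. This is a sketch only: the store is a plain dictionary of key to (value, version), and the helpers `txn`, `put`, `delete`, `value_of`, and `version_of` are made-up names for illustration rather than etcd client calls. A real server would also serialise concurrent transactions, which this single-threaded model does not show.

```python
from typing import Callable, Dict, List, Tuple

Store = Dict[str, Tuple[str, int]]      # key -> (value, version)
Compare = Callable[[Store], bool]
Op = Callable[[Store], None]


def value_of(store: Store, key: str) -> str:
    return store.get(key, ("", 0))[0]


def version_of(store: Store, key: str) -> int:
    # A missing key is treated as version 0 in this model.
    return store.get(key, ("", 0))[1]


def put(key: str, value: str) -> Op:
    def apply(store: Store) -> None:
        store[key] = (value, version_of(store, key) + 1)
    return apply


def delete(key: str) -> Op:
    def apply(store: Store) -> None:
        store.pop(key, None)
    return apply


def txn(store: Store, compares: List[Compare], then_ops: List[Op], else_ops: List[Op]) -> bool:
    """Evaluate all comparisons, then apply exactly one branch as a unit."""
    succeeded = all(check(store) for check in compares)
    staged = dict(store)  # work on a copy so a failing op leaves the store untouched
    for op in (then_ops if succeeded else else_ops):
        op(staged)
    store.clear()
    store.update(staged)
    return succeeded


if __name__ == "__main__":
    data: Store = {"key1": ("baz", 2), "key3": ("old", 1)}
    took_then = txn(
        data,
        compares=[lambda s: value_of(s, "key1") > "bar",
                  lambda s: version_of(s, "key1") == 2],
        then_ops=[put("key2", "valueX"), delete("key3")],
        else_ops=[put("key2", "valueY")],
    )
    print(took_then, data)
```

The return value tells the caller which branch ran, which is how clients typically detect that a guarded write lost a race.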

Figure: Transaction flow

Lease Mechanism

A lease represents a time‑bound contract identified by a lease ID. Keys attached to a lease are automatically removed when the lease expires. Clients keep a lease alive by periodically invoking KeepAlive. This pattern is useful for implementing TTL‑based caches, service registration, and distributed heartbeats.
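The lease lifecycle described above (grant, attach keys, keep alive, expire) can be modelled with an explicit clock so that the behaviour is deterministic. This is a simplified sketch: `ToyLeases` and its method names are hypothetical, and passing `now` by hand stands in for the server's own timer.

```python
from typing import Dict, Set


class ToyLeases:
    """Hypothetical lease model driven by an explicit clock (not the etcd API)."""

    def __init__(self) -> None:
        self.kv: Dict[str, str] = {}
        self.ttl: Dict[int, float] = {}        # lease ID -> time-to-live
        self.deadline: Dict[int, float] = {}   # lease ID -> expiry time
        self.attached: Dict[int, Set[str]] = {}
        self._next_id = 1

    def grant(self, ttl: float, now: float) -> int:
        lease_id = self._next_id
        self._next_id += 1
        self.ttl[lease_id] = ttl
        self.deadline[lease_id] = now + ttl
        self.attached[lease_id] = set()
        return lease_id

    def put(self, key: str, value: str, lease_id: int) -> None:
        if lease_id not in self.deadline:
            raise KeyError("lease not found")
        self.kv[key] = value
        self.attached[lease_id].add(key)

    def tick(self, now: float) -> None:
        """Expire every lease whose deadline has passed and delete its keys."""
        expired = [lid for lid, end in self.deadline.items() if end <= now]
        for lease_id in expired:
            for key in self.attached.pop(lease_id):
                self.kv.pop(key, None)
            del self.deadline[lease_id]
            del self.ttl[lease_id]

    def keep_alive(self, lease_id: int, now: float) -> bool:
        """Push the deadline out by one TTL; returns False if already expired."""
        self.tick(now)
        if lease_id not in self.deadline:
            return False
        self.deadline[lease_id] = now + self.ttl[lease_id]
        return True
```

A client that stops calling `keep_alive` (for example because it crashed) simply lets the deadline pass, and the next `tick` removes every key it had attached.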

Figure: Lease lifecycle

Typical Use Cases

Metadata Storage

Kubernetes stores its entire control‑plane state in etcd. By delegating consistency and high availability to etcd, the Kubernetes API server can remain simple and stateless.

Service Discovery

Services register their network address in etcd. API gateways or sidecar proxies watch the registration keys; when a service instance crashes, its lease expires and the entry is removed automatically, keeping the routing table up‑to‑date.
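On the consumer side, a gateway or proxy only has to fold the stream of watch events into its routing table. The sketch below assumes a hypothetical key layout of `/services/<name>/<instance>` and a simplified `Event` tuple; neither is prescribed by etcd.

```python
from typing import Dict, Iterable, NamedTuple


class Event(NamedTuple):
    type: str          # "PUT" or "DELETE"
    key: str
    value: str = ""


def apply_events(routes: Dict[str, str], events: Iterable[Event], prefix: str) -> Dict[str, str]:
    """Update an instance -> address table from events under `prefix`."""
    for event in events:
        if not event.key.startswith(prefix):
            continue  # a prefix watch would not deliver these at all
        instance = event.key[len(prefix):]
        if event.type == "PUT":
            routes[instance] = event.value
        elif event.type == "DELETE":
            # Sent on explicit deregistration and when the instance's lease expires.
            routes.pop(instance, None)
    return routes


if __name__ == "__main__":
    table: Dict[str, str] = {}
    stream = [
        Event("PUT", "/services/web/a", "10.0.0.1:80"),
        Event("PUT", "/services/web/b", "10.0.0.2:80"),
        Event("DELETE", "/services/web/a"),
    ]
    print(apply_events(table, stream, "/services/web/"))
```

The proxy never polls: a crashed instance disappears from the table as soon as the delete event generated by its expired lease arrives.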

Leader Election

Competing nodes each attempt to create a designated election key, attached to their own lease, using a transaction that succeeds only if the key does not exist yet. The node that succeeds writes its own address to the key and becomes leader. Followers read the key to discover the current leader. If the leader fails, its lease expires, the key is removed, and a new election can proceed.
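The election steps can be sketched as a create-if-absent operation on a single key. In this toy model a `threading.Lock` stands in for the atomicity that an etcd transaction would provide, and `ToyElection` with its methods is a hypothetical name used only for illustration.

```python
import threading
from typing import Dict, Optional


class ToyElection:
    """Hypothetical single-key election model (not the etcd API)."""

    def __init__(self, key: str) -> None:
        self.key = key
        self.kv: Dict[str, str] = {}
        self._lock = threading.Lock()  # stands in for transaction atomicity

    def campaign(self, address: str) -> bool:
        """Create the election key only if it is absent; True means 'I am leader'."""
        with self._lock:
            if self.key in self.kv:      # comparison failed: someone already leads
                return False
            self.kv[self.key] = address
            return True

    def leader(self) -> Optional[str]:
        """What a follower sees when it reads the election key."""
        with self._lock:
            return self.kv.get(self.key)

    def expire_leader_lease(self) -> None:
        """Model the leader's lease expiring, which removes the election key."""
        with self._lock:
            self.kv.pop(self.key, None)


if __name__ == "__main__":
    election = ToyElection("/election/leader")
    print(election.campaign("10.0.0.1:2380"), election.campaign("10.0.0.2:2380"))
    election.expire_leader_lease()
    print(election.campaign("10.0.0.2:2380"), election.leader())
```

In practice followers would watch the key rather than poll it, so they can campaign again as soon as the delete event arrives.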

Distributed Coordination & Concurrency Control

etcd can act as a distributed semaphore by using a lease‑protected key as a lock. Multiple processes attempt to acquire the lock via a transaction; the lease ensures that a crashed holder releases the lock automatically. Long‑running jobs can persist intermediate state in etcd, enabling fast recovery after failures.
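A counting variant of the same idea gives a semaphore: acquisition succeeds only while fewer than a fixed number of holders are registered. The sketch below is an in-process model under that assumption; `ToySemaphore` is a hypothetical name, the `threading.Lock` again stands in for a transaction, and `release` covers both an explicit release and a lease expiry.

```python
import threading
from typing import Set


class ToySemaphore:
    """Hypothetical counting-semaphore model (not the etcd API)."""

    def __init__(self, limit: int) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.holders: Set[str] = set()
        self._lock = threading.Lock()  # stands in for transaction atomicity

    def acquire(self, holder: str) -> bool:
        """Take a slot if one is free; a holder that already has a slot keeps it."""
        with self._lock:
            if holder in self.holders:
                return True
            if len(self.holders) >= self.limit:
                return False
            self.holders.add(holder)
            return True

    def release(self, holder: str) -> None:
        """Free the slot, either explicitly or because the holder's lease expired."""
        with self._lock:
            self.holders.discard(holder)
```

With `limit=1` this degenerates into the mutual-exclusion lock described above.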

Figure: Coordination patterns

In summary, etcd provides a Raft‑based, strongly consistent key‑value store with a simple gRPC/HTTP API. Its built‑in versioning, watch streams, transactions, and lease primitives enable core cloud‑native patterns such as metadata storage, service discovery, leader election, and distributed coordination.
