
How to Build a Highly Available etcd Cluster with SSL Security

This guide explains the fundamentals of etcd, its Raft‑based architecture, cluster planning, secure certificate generation, installation steps, service configuration, and verification commands to deploy a reliable, SSL‑protected etcd cluster for service discovery and configuration management.

MaGe Linux Operations

etcd is a distributed key‑value store developed by CoreOS that uses the Raft consensus algorithm to ensure data consistency and high availability. It provides a simple HTTP API for service discovery and configuration management.

etcd Overview

Simple installation and configuration with an HTTP API.

Secure communication via SSL certificates.

Fast performance: a single instance can handle over 2k reads per second.

Reliable: Raft ensures data availability and consistency across nodes.

etcd listens on port 2379 for client HTTP API requests and on port 2380 for peer communication. While single‑node deployment is possible, production environments should use a cluster of 3, 5, or 7 nodes to guarantee data replication and consistency.
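The preference for odd cluster sizes follows from Raft's majority rule. A minimal sketch of the arithmetic is shown below; the function names are illustrative and not part of etcd.

```shell
# quorum N: smallest majority of an N-member cluster
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: members that can fail while a majority still survives
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 4 5 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A four-member cluster tolerates one failure, the same as a three-member cluster, which is why even sizes add cost without adding resilience.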

How it Works

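Each etcd cluster consists of multiple members, each running an independent etcd instance. One member acts as the leader, replicating log entries to the followers and sending periodic heartbeats. Clients can connect to any member; writes that reach a follower are forwarded to the leader. The leader appends each write to its log, replicates it, and commits it once a majority of members (the leader included) have persisted the entry. Only then does it respond to the client.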

The etcd service comprises three main components: the Raft implementation, a Write‑Ahead Log (WAL) for persistent log storage, and the data storage/indexing layer. WAL files and snapshots are stored on the local disk (e.g., under --data-dir).
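To see these files on a member, something like the following could be used. It assumes the layout used by etcd v3 releases, where WAL segments live in member/wal and snapshots in member/snap below the data directory. The function names are made up for this sketch.

```shell
# count_files DIR PATTERN: number of matching files, 0 if DIR does not exist
count_files() {
  if [ -d "$1" ]; then
    find "$1" -name "$2" | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# data_dir_summary DIR: count WAL segments and snapshots under an etcd data dir
data_dir_summary() {
  dir=${1:-/var/lib/etcd}
  echo "wal_segments=$(count_files "$dir/member/wal" '*.wal') snapshots=$(count_files "$dir/member/snap" '*.snap')"
}
```

Running data_dir_summary /var/lib/etcd on a member prints one line with both counts. A directory that has never held etcd data reports zero for each.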

Cluster Planning

A typical three‑node cluster might be configured as follows:

Host    IP address       Member name
etcd01  192.168.255.131  master1
etcd02  192.168.255.132  master2
etcd03  192.168.255.133  master3

Installation

Node configuration requires knowledge of other members' IPs and ports. Deployment options include:

Static configuration using the --initial-cluster flag.

Dynamic discovery via an existing etcd cluster (e.g., discovery.etcd.io).

DNS‑based discovery.

In production, static configuration with SSL is recommended.
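With static configuration, every member is started with the same --initial-cluster value. As an illustration, it can be assembled from the cluster plan with a small helper. The helper name is hypothetical, and it assumes HTTPS peer URLs on port 2380.

```shell
# initial_cluster NAME=IP ...: build the value for --initial-cluster
initial_cluster() {
  out=""
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    ip=${pair#*=}      # text after the first "="
    out="${out:+$out,}${name}=https://${ip}:2380"
  done
  echo "$out"
}

initial_cluster master1=192.168.255.131 master2=192.168.255.132 master3=192.168.255.133
```

Generating the string once and reusing it on every node avoids the mismatched member lists that prevent a new cluster from forming.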

1. Install cfssl

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson

2. Create CA, server, client, and peer certificates

etcd acts as a server; etcdctl is the client. Both communicate over HTTPS.

CA certificate: self‑signed authority for signing other certificates.

Server certificate: used by etcd instances.

Client certificate: used by etcdctl.

Peer certificate: used for inter‑node communication.

Generate the necessary directories and configuration files:

mkdir -p /etc/etcd/pki
cd /etc/etcd/pki
cfssl print-defaults config > ca-config.json
cfssl print-defaults csr > ca-csr.json

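Replace the contents of ca-config.json with the signing profiles for the server, client, and peer certificates (each valid for 43800 hours, about five years):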

{
  "signing": {
    "default": { "expiry": "43800h" },
    "profiles": {
      "server": { "expiry": "43800h", "usages": ["signing","key encipherment","server auth","client auth"] },
      "client": { "expiry": "43800h", "usages": ["signing","key encipherment","client auth"] },
      "peer":   { "expiry": "43800h", "usages": ["signing","key encipherment","server auth","client auth"] }
    }
  }
}

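Generate the CA certificate and key. Edit the CN and names fields in ca-csr.json first, since the values written by print-defaults are placeholders:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca
ls ca*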

Generate client, server, and peer certificates using the appropriate profiles (client, server, peer) and the CA created above.
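The original does not list the commands for this step, so the following is only a sketch of one way to do it. It assumes it is run from /etc/etcd/pki with ca.pem, ca-key.pem, and ca-config.json in place. The CN values, the host list, and the function names are assumptions made for illustration.

```shell
# Hosts allowed in the server and peer certificates: the three members from
# the cluster plan plus loopback.
ETCD_HOSTS='"127.0.0.1", "192.168.255.131", "192.168.255.132", "192.168.255.133"'

# write_csr NAME [HOSTS]: write NAME.json, a cfssl certificate request.
# The CN is a placeholder chosen for this sketch.
write_csr() {
  cat > "$1.json" <<EOF
{
  "CN": "etcd-$1",
  "hosts": [${2:-}],
  "key": { "algo": "rsa", "size": 2048 }
}
EOF
}

# sign_cert NAME: sign NAME.json with the CA, using the profile called NAME
# from ca-config.json. Produces NAME.pem and NAME-key.pem.
sign_cert() {
  cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
    -profile="$1" "$1.json" | cfssljson -bare "$1"
}

# generate_all: create and sign the server, peer, and client certificates.
generate_all() {
  write_csr server "$ETCD_HOSTS" && sign_cert server
  write_csr peer   "$ETCD_HOSTS" && sign_cert peer
  write_csr client               && sign_cert client
}
```

Calling generate_all would leave server.pem, peer.pem, and client.pem (with their -key.pem counterparts) in the current directory. The resulting files then need to be copied to /etc/etcd/pki on every node.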

3. Install etcd binaries

wget https://github.com/coreos/etcd/releases/download/v3.1.5/etcd-v3.1.5-linux-amd64.tar.gz
tar -xvf etcd-v3.1.5-linux-amd64.tar.gz
mv etcd-v3.1.5-linux-amd64/etcd* /usr/local/bin

4. Create systemd service file

On each node, edit /usr/lib/systemd/system/etcd.service and replace the IP and name placeholders with the node’s specific values.

[Unit]
Description=Etcd Server
After=network.target network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
ExecStart=/usr/local/bin/etcd \
  --data-dir=/var/lib/etcd \
  --name=master1 \
  --cert-file=/etc/etcd/pki/server.pem \
  --key-file=/etc/etcd/pki/server-key.pem \
  --trusted-ca-file=/etc/etcd/pki/ca.pem \
  --peer-cert-file=/etc/etcd/pki/peer.pem \
  --peer-key-file=/etc/etcd/pki/peer-key.pem \
  --peer-trusted-ca-file=/etc/etcd/pki/ca.pem \
  --listen-peer-urls=https://192.168.255.131:2380 \
  --initial-advertise-peer-urls=https://192.168.255.131:2380 \
  --listen-client-urls=https://192.168.255.131:2379,http://127.0.0.1:2379 \
  --advertise-client-urls=https://192.168.255.131:2379 \
  --initial-cluster-token=etcd-cluster-0 \
  --initial-cluster=master1=https://192.168.255.131:2380,master2=https://192.168.255.132:2380,master3=https://192.168.255.133:2380 \
  --initial-cluster-state=new \
  --heartbeat-interval=250 \
  --election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
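Only five flags differ between nodes in the unit file above. As an illustration, a small helper (hypothetical, not part of etcd) can print the node-specific lines to paste into each node's copy:

```shell
# node_flags NAME IP: print the per-node flags for the etcd unit file
node_flags() {
  cat <<EOF
  --name=$1 \\
  --listen-peer-urls=https://$2:2380 \\
  --initial-advertise-peer-urls=https://$2:2380 \\
  --listen-client-urls=https://$2:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls=https://$2:2379 \\
EOF
}

node_flags master2 192.168.255.132
```

All other flags, including --initial-cluster and --initial-cluster-token, stay identical on every member.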

Key parameters:

--name : Unique node identifier.

--data-dir : Directory for persistent data.

--snapshot-count : Number of committed transactions that triggers a snapshot to disk. It is not set in the unit file above, so etcd's default applies.

--heartbeat-interval and --election-timeout : Leader heartbeat period and follower election timeout, both in milliseconds (250 and 2000 in the unit file above).

--listen-peer-urls / --listen-client-urls : Addresses for intra‑cluster and client communication.

--advertise-client-urls and --initial-advertise-peer-urls : URLs this member advertises to clients and to the other members, respectively.

--initial-cluster : Comma-separated list of all members in the form name=peer-URL, for example master1=https://192.168.255.131:2380.

--initial-cluster-state : Set to new for a fresh cluster.

--initial-cluster-token : Unique token to avoid clashes between clusters.
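The two timing flags are related: the election timeout has to be several heartbeats long so that a single delayed heartbeat does not trigger an election. The check below assumes the five-to-one minimum that etcd is understood to enforce at startup. Treat that factor as an assumption and confirm it against the tuning documentation for your release.

```shell
# check_timing HEARTBEAT_MS ELECTION_MS: verify election >= 5 x heartbeat
check_timing() {
  if [ "$2" -ge $(( $1 * 5 )) ]; then
    echo "ok: election timeout is $(( $2 / $1 ))x the heartbeat interval"
  else
    echo "too small: election timeout should be at least $(( $1 * 5 )) ms" >&2
    return 1
  fi
}

check_timing 250 2000
```

With the values from the unit file, the election timeout is eight heartbeat intervals, which leaves room for a few lost heartbeats before followers call an election.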

5. Start the cluster

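Run the following on all three nodes at about the same time. A member started alone waits for a majority of its peers before it reports ready, so the first systemctl start may appear to hang until a second node is up.

mkdir -p /var/lib/etcd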
systemctl daemon-reload && systemctl enable etcd && systemctl start etcd && systemctl status etcd

6. Verify the deployment

From any machine with etcdctl and the generated certificates, run:

etcdctl --ca-file=/etc/etcd/pki/ca.pem \
  --cert-file=/etc/etcd/pki/server.pem \
  --key-file=/etc/etcd/pki/server-key.pem \
  --endpoints=https://192.168.255.131:2379 cluster-health

The command should report each member as healthy and confirm the overall cluster health. Omitting the certificates results in an error such as “certificate signed by unknown authority”.

To list cluster members:

etcdctl --ca-file=/etc/etcd/pki/ca.pem \
  --cert-file=/etc/etcd/pki/server.pem \
  --key-file=/etc/etcd/pki/server-key.pem \
  --endpoints=https://192.168.255.131:2379 member list
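The commands above use the v2 etcdctl interface, which is the default in etcd 3.1. The v3 interface uses different flag names and subcommands. A sketch of a wrapper follows, assuming the client certificate from step 2 was saved as client.pem and client-key.pem (those file names are an assumption, as is the wrapper name):

```shell
# etcdctl3 ARGS...: run etcdctl against the v3 API with the cluster's TLS settings
etcdctl3() {
  ETCDCTL_API=3 etcdctl \
    --cacert=/etc/etcd/pki/ca.pem \
    --cert=/etc/etcd/pki/client.pem \
    --key=/etc/etcd/pki/client-key.pem \
    --endpoints=https://192.168.255.131:2379,https://192.168.255.132:2379,https://192.168.255.133:2379 \
    "$@"
}
```

With this in place, etcdctl3 endpoint health checks each listed endpoint and etcdctl3 member list prints the membership. Listing all three endpoints means the check still works when one member is down.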

Using etcd

HTTP API : Direct interaction via curl or any HTTP client.

etcdctl : Official command‑line tool built in Go, wrapping the HTTP API for easier use.
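As a sketch of the HTTP route, the helpers below call the v2 keys API with curl. They assume the unit file shown earlier, which does not require client certificates, so only the CA file is passed. The helper names and the ETCD_URL and CA_FILE variables are illustrative.

```shell
ETCD_URL=${ETCD_URL:-https://192.168.255.131:2379}
CA_FILE=${CA_FILE:-/etc/etcd/pki/ca.pem}

# etcd_put KEY VALUE: set a key through the v2 keys API
etcd_put() {
  curl --silent --cacert "$CA_FILE" -X PUT --data-urlencode "value=$2" \
    "$ETCD_URL/v2/keys/$1"
}

# etcd_get KEY: read the key back; the response is a JSON document
etcd_get() {
  curl --silent --cacert "$CA_FILE" "$ETCD_URL/v2/keys/$1"
}
```

For example, etcd_put message hello followed by etcd_get message should return a JSON document containing the stored value. Keys written through the v2 API are not visible to v3 clients, so pick one API version per application.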

Conclusion

By default, the etcd v2 store keeps only the most recent 1000 events for watchers, so clients that fall behind under heavy write traffic can miss changes. This makes etcd a poor fit for write-heavy workloads. Its typical use cases are configuration management and service discovery, both of which involve many reads and few writes. Compared with ZooKeeper, etcd is simpler to operate, but full service-discovery functionality often requires auxiliary tools such as registrator or confd. etcd is also widely used as the configuration store for Kubernetes.

Currently, there is no official graphical interface for etcd.
