
What Is etcd? Features, Use Cases, and How It Powers Kubernetes

This article explains etcd as a highly available distributed key‑value store, outlines its simple, secure, fast, and reliable characteristics, describes typical scenarios such as service discovery and distributed locking, and then provides a comprehensive overview of Kubernetes architecture, components, deployment methods, security, networking, storage, and operational best practices.

Open Source Linux

May 30, 2021

What is etcd and its characteristics?

etcd is an open-source distributed key-value store started by the CoreOS team and written in Go. It is commonly used for shared configuration management and service discovery.

Simple: supports a REST‑style HTTP+JSON API

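Secure: supports TLS (HTTPS) with optional client certificate authentication

Fast: benchmarked at 1,000 or more writes per second per instance (the project README currently cites 10,000 writes/sec)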

Reliable: uses a distributed structure with the Raft consensus algorithm
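Raft commits a write only after a majority of cluster members have acknowledged it, so the cluster size determines how many member failures etcd can survive. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of etcd.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for size in (1, 3, 4, 5):
    print(f"members={size} quorum={quorum(size)} tolerated_failures={fault_tolerance(size)}")
```

A four-member cluster tolerates one failure, the same as a three-member cluster, which is why odd cluster sizes are the usual choice.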

Typical scenarios for etcd

Because of its strong features, etcd is widely used in the following scenarios:

Service discovery – locating and connecting to services within a distributed cluster

Message publish/subscribe – a shared configuration center for dynamic updates

Load balancing – storing frequently accessed data to enable balanced access across nodes

Distributed notification and coordination – using the Watch mechanism for real‑time data change handling

Distributed locks – leveraging Raft’s strong consistency to implement exclusive or sequenced locks

Cluster monitoring and leader election – simple, real‑time monitoring via etcd
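The distributed lock scenario rests on an atomic "create this key only if it does not exist yet" operation. The sketch below imitates that idea with an in-memory dictionary. ToyStore and its methods are hypothetical stand-ins, not the etcd client API, and a real deployment would also attach a lease so the lock expires if its holder crashes.

```python
import threading


class ToyStore:
    """In-memory stand-in for a consistent key-value store (not etcd itself)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()  # makes each operation atomic

    def put_if_absent(self, key: str, value: str) -> bool:
        """Atomically create the key; return False if it already exists."""
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete_if_equals(self, key: str, value: str) -> bool:
        """Atomically delete the key only if it still holds the given value."""
        with self._mutex:
            if self._data.get(key) != value:
                return False
            del self._data[key]
            return True


store = ToyStore()
print(store.put_if_absent("/locks/job", "worker-a"))     # worker-a acquires the lock
print(store.put_if_absent("/locks/job", "worker-b"))     # worker-b is refused
print(store.delete_if_equals("/locks/job", "worker-a"))  # worker-a releases it
print(store.put_if_absent("/locks/job", "worker-b"))     # worker-b can now acquire
```

Comparing the stored value on release keeps one worker from deleting a lock that another worker holds.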

What is Kubernetes?

Kubernetes is a container-based distributed system platform open-sourced by Google and derived from its experience with the internal Borg system. It runs on top of a container runtime such as Docker and provides deployment, resource scheduling, service discovery, auto-scaling, multi-tenant support, built-in load balancing, fault detection, self-healing, rolling upgrades, and fine-grained quota management.

Relationship between Kubernetes and Docker

Docker manages the lifecycle of containers and builds container images, offering portability.

Kubernetes orchestrates and manages containers running across multiple hosts.

Kubernetes core tools

Minikube – runs a single‑node Kubernetes cluster locally.

Kubectl – command‑line tool to interact with the Kubernetes API (create, delete, update resources, view applications).

Kubelet – agent that runs on each node, communicating with the master.

Common Kubernetes deployment methods

kubeadm – the recommended deployment tool.

Binary installation.

Minikube – local single‑node cluster.

Kubernetes cluster management

The cluster consists of a master node and multiple worker nodes. The master runs kube‑apiserver, kube‑controller‑manager, and kube‑scheduler, providing resource management, pod scheduling, auto‑scaling, security control, monitoring, and error correction.

Kubernetes advantages, scenarios, and characteristics

Container orchestration

Lightweight

Open source

Elastic scaling

Load balancing

Typical scenarios include rapid application deployment, quick scaling, seamless integration of new features, and resource optimization.

Portable across public, private, hybrid, and multi‑cloud environments

Modular and extensible via plugins

Automated deployment, restart, replication, and scaling

Kubernetes drawbacks

Installation and configuration are complex

Management can be cumbersome

Runtime and compilation consume significant time

Higher cost compared with some alternatives

May be overkill for simple applications

Basic Kubernetes concepts

Master – control plane managing the cluster, running etcd, API server, controller manager, and scheduler.

Node (worker) – runs pods together with a container runtime (Docker in this article), kubelet, and kube-proxy.

Pod – the smallest deployable unit, a group of one or more containers sharing network namespace and storage.

Label – key/value pairs attached to objects for selection.

ReplicationController – ensures a specified number of pod replicas.

Deployment – manages ReplicaSets and provides rolling updates.

Horizontal Pod Autoscaler (HPA) – automatically scales pod replicas based on metrics.

Service – abstracts a set of pods and provides a stable network endpoint.

Volume – shared storage accessible by containers in a pod.

Namespace – logical isolation for multi‑tenant environments.

Kubernetes master components

Kubernetes API Server – central entry point exposing RESTful APIs.

Kubernetes Scheduler – selects suitable nodes for new pods.

Kubernetes Controller Manager – runs various controllers to maintain desired state.

ReplicationController, NodeController, NamespaceController, ServiceController, EndpointsController, ServiceAccountController, PersistentVolumeController, DaemonSetController, DeploymentController, JobController, PodAutoscalerController – each responsible for specific resources.

ReplicationController mechanism

RC ensures the actual number of pod replicas matches the desired count, creating or deleting pods as needed.
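The reconciliation described above can be sketched as a pure function that compares the desired count with the pods observed. This is a simplified illustration: the function name is hypothetical, and removing pods from the end of the list is an assumption, since the real controller ranks deletion candidates by more elaborate criteria.

```python
from typing import Dict, List


def reconcile(desired: int, running: List[str]) -> Dict[str, object]:
    """Return the actions that move the observed state to the desired state."""
    diff = desired - len(running)
    if diff > 0:
        # Too few replicas: ask for the missing pods to be created.
        return {"create": diff, "delete": []}
    if diff < 0:
        # Too many replicas: mark the surplus pods for deletion.
        return {"create": 0, "delete": running[diff:]}
    return {"create": 0, "delete": []}


print(reconcile(3, ["web-1"]))
print(reconcile(2, ["web-1", "web-2", "web-3"]))
print(reconcile(2, ["web-1", "web-2"]))
```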

Difference between ReplicaSet and ReplicationController

Both maintain a set number of pod replicas. ReplicationController supports only equality-based selectors, while ReplicaSet supports both equality-based and set-based selectors (In, NotIn, Exists, DoesNotExist).
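To make the two selector styles concrete, here is a small sketch of how each could be evaluated against a pod's labels. The operator names In, NotIn, Exists, and DoesNotExist are the ones Kubernetes uses in matchExpressions; the function names and the tuple layout are assumptions made for this example.

```python
def matches_equality(labels, selector):
    """Equality-based: every key in the selector must have exactly that value."""
    return all(labels.get(key) == value for key, value in selector.items())


def matches_expressions(labels, expressions):
    """Set-based: each (key, operator, values) requirement must hold."""
    for key, operator, values in expressions:
        if operator == "In":
            ok = labels.get(key) in values
        elif operator == "NotIn":
            # A missing key also satisfies NotIn.
            ok = labels.get(key) not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {operator}")
        if not ok:
            return False
    return True


pod_labels = {"app": "web", "tier": "frontend", "env": "prod"}
print(matches_equality(pod_labels, {"app": "web", "env": "prod"}))
print(matches_expressions(pod_labels, [("env", "In", ["prod", "staging"]),
                                       ("canary", "DoesNotExist", [])]))
```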

kube‑proxy role

Runs on every node, watches Service and Endpoint changes, creates routing rules to provide virtual IPs and load balancing, acting as a transparent proxy.

kube‑proxy iptables mode

Since Kubernetes 1.2, iptables is the default mode; kube‑proxy watches the API server and updates iptables rules, directing traffic directly to target pods.

kube‑proxy IPVS mode

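Introduced as alpha in Kubernetes 1.8 and GA since 1.11, IPVS mode offers high-performance load balancing: it keeps its rules in kernel hash tables and uses ipset for efficient matching, so it scales better than iptables when there are many Services.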

Static Pods

Managed directly by the kubelet on a specific node rather than created through the API server, so they cannot be controlled by an RC, Deployment, or DaemonSet.

Pod lifecycle phases

Pending – pod created but containers not yet running.

Running – at least one container is running.

Succeeded – all containers terminated successfully.

Failed – all containers have terminated, and at least one terminated with a failure.

Unknown – status cannot be obtained.

Pod creation workflow

Client submits pod manifest (YAML) to kube‑apiserver.

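Apiserver authenticates and validates the request, then persists the pod object to etcd.

Components that watch the apiserver (controller-manager, scheduler, kubelet) are notified of the change; when a pod is created through a Deployment or RC, it is the controller-manager that creates the pod objects.

Kube-scheduler notices that the pod has no node assigned, selects a suitable node, and creates a binding through the apiserver.

Kubelet on the chosen node pulls the image and starts the pod, reporting status back to the apiserver.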

Pod restart policies

Defined by RestartPolicy (Always, OnFailure, Never) and enforced by kubelet on the node.

Always – restart whenever a container exits, regardless of exit code (default).

OnFailure – restart only if exit code is non‑zero.

Never – never restart.

Controllers impose constraints: RC and DaemonSet require Always; Job uses OnFailure or Never.
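The three policies reduce to a small decision on the container's exit code. The sketch below only illustrates that decision; the function name is hypothetical and the kubelet's actual logic involves more state.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether an exited container is restarted under a RestartPolicy."""
    if policy == "Always":
        return True                # restart even after a clean exit
    if policy == "OnFailure":
        return exit_code != 0      # restart only after a failure
    if policy == "Never":
        return False
    raise ValueError(f"unknown restart policy: {policy}")


for policy in ("Always", "OnFailure", "Never"):
    print(policy, should_restart(policy, 0), should_restart(policy, 1))
```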

Pod health checks

Three probes are used:

LivenessProbe – determines if a container is alive; failing causes kubelet to kill and restart it.

ReadinessProbe – determines if a container is ready to serve traffic; failing removes the pod from Service endpoints.

StartupProbe – guards slow‑starting containers from being killed by other probes.

Pod scheduling strategies

Deployment/RC – automatically maintains desired replica count.

NodeSelector – schedules pods to nodes with matching labels.

NodeAffinity – hard (requiredDuringSchedulingIgnoredDuringExecution) or soft (preferredDuringSchedulingIgnoredDuringExecution) rules.

Taints and Tolerations – nodes can repel pods unless they tolerate the taint.
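NodeSelector and taints both act as filters on the set of candidate nodes. The following sketch shows that filtering step under simplifying assumptions: nodes and pods are plain dictionaries, only NoSchedule and NoExecute taints block placement, and the helper names are invented for this example rather than taken from the scheduler.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists ignores the value; an empty key matches every taint key.
        return toleration.get("key") in (None, taint["key"])
    return (toleration.get("key") == taint["key"]
            and toleration.get("value") == taint.get("value"))


def feasible_nodes(pod, nodes):
    """Names of nodes that pass the nodeSelector and taint checks."""
    result = []
    for node in nodes:
        labels = node.get("labels", {})
        selector = pod.get("nodeSelector", {})
        if any(labels.get(k) != v for k, v in selector.items()):
            continue  # node lacks a required label
        blocking = [t for t in node.get("taints", [])
                    if t["effect"] in ("NoSchedule", "NoExecute")]
        tolerations = pod.get("tolerations", [])
        if all(any(tolerates(tol, t) for tol in tolerations) for t in blocking):
            result.append(node["name"])
    return result


nodes = [
    {"name": "node-a", "labels": {"disk": "ssd"}},
    {"name": "node-b", "labels": {"disk": "ssd"},
     "taints": [{"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}]},
    {"name": "node-c", "labels": {"disk": "hdd"}},
]
pod = {"nodeSelector": {"disk": "ssd"}}
print(feasible_nodes(pod, nodes))
```

Adding a toleration for the dedicated=gpu taint to the pod would make node-b eligible as well.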

Init containers

Run sequentially before application containers; all must succeed before the pod starts its main containers.

Deployment upgrade process

Creating a Deployment creates a ReplicaSet with the desired number of pods.

Updating the Deployment creates a new ReplicaSet, scales it up, and scales down the old one.

The process repeats until the old ReplicaSet reaches zero replicas.

Deployment upgrade strategies

Two strategies are supported:

Recreate – kills all existing pods before creating new ones.

RollingUpdate (default) – gradually replaces pods while respecting maxUnavailable and maxSurge settings.
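During a rolling update, maxSurge caps how many pods may exist above the desired count and maxUnavailable caps how many may be missing below it. Both accept an absolute number or a percentage; percentages are rounded up for maxSurge and down for maxUnavailable, and both default to 25%. The sketch below works out the resulting bounds; the function names are illustrative.

```python
def resolve(value, replicas, round_up):
    """Convert an absolute number or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Upper bound on total pods and lower bound on available pods."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rollout_bounds(10))        # defaults: 25% / 25%
print(rollout_bounds(4, 1, 0))   # one extra pod at a time, never below 4 available
```

With 10 replicas and the defaults, at most 13 pods exist at once and at least 8 stay available throughout the rollout.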

DaemonSet characteristics

Ensures that one copy of the pod runs on every node (or on every node that matches its node selector); it has no replica count.

Automatic scaling (HPA)

HPA controller monitors CPU (or custom) metrics via the Metrics Server and adjusts pod replica counts accordingly.
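The scaling decision follows the formula documented for the HPA: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below applies that formula; the function name and the min/max bounds used in the example are assumptions.

```python
import math


def desired_replicas(current, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the HPA formula, clamped to [min, max]."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to the target: do not scale
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(4, 90, 60))   # CPU at 90% against a 60% target
print(desired_replicas(4, 62, 60))   # within tolerance
print(desired_replicas(4, 20, 60))   # load dropped
```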

Kubernetes Service types

ClusterIP – internal virtual IP for intra‑cluster access.

NodePort – exposes the service on a static port on each node.

LoadBalancer – provisions an external load balancer (typically in public clouds).

Service load‑balancing strategies

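RoundRobin – distributes requests evenly across pods; this is the default in the legacy userspace proxy mode, whereas iptables mode picks a backend at random.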

SessionAffinity – pins a client IP to a specific pod.

Headless Service

Creates no ClusterIP; returns the list of pod IPs directly to the client, useful for custom load balancing or service discovery.

External access to services

HostPort – map pod port to the host.

NodePort – expose service on a node port.

LoadBalancer – use cloud provider’s load balancer.

Ingress

Ingress resources define HTTP routing rules, and an Ingress controller implements them. The controller typically forwards traffic straight to the pod endpoints behind each backend Service, effectively bypassing kube-proxy.

Image pull policies

Always – always pull the image (default for :latest tags).

Never – never pull, use only local images.

IfNotPresent – pull only if the image is not present locally.
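The three policies, and the rule that an image tagged :latest (or carrying no tag) defaults to Always while any other tag defaults to IfNotPresent, can be sketched as two small functions. The names and the returned action strings are made up for this illustration; the kubelet's real code path is more involved.

```python
def default_pull_policy(image: str) -> str:
    """Policy applied when the pod spec leaves imagePullPolicy unset."""
    if "@" in image:
        return "IfNotPresent"          # pinned by digest
    last_component = image.rsplit("/", 1)[-1]  # skips a registry host:port
    tag = last_component.split(":", 1)[1] if ":" in last_component else "latest"
    return "Always" if tag == "latest" else "IfNotPresent"


def pull_action(policy: str, present_locally: bool) -> str:
    """What happens to the image before the container starts."""
    if policy == "Always":
        return "pull"
    if policy == "IfNotPresent":
        return "use-local" if present_locally else "pull"
    if policy == "Never":
        return "use-local" if present_locally else "fail"
    raise ValueError(f"unknown image pull policy: {policy}")


for image in ("nginx", "nginx:1.25", "localhost:5000/app"):
    policy = default_pull_policy(image)
    print(image, policy, pull_action(policy, present_locally=True))
```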

Kubernetes security mechanisms

Infrastructure isolation between containers and hosts.

Principle of least privilege for components.

RBAC for user and service account permissions.

API server authentication (HTTPS, tokens) and authorization (RBAC).

Secrets for sensitive data.

Admission controllers for request validation.

RBAC advantages

Comprehensive coverage of resource and non‑resource permissions.

Managed via standard API objects.

Adjustable at runtime without restarting the API server.

Kubernetes Secrets

Store confidential data such as passwords, tokens, and SSH keys, offering a safer alternative to embedding secrets in pods or images.
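Values in a Secret's data field are base64-encoded, which is an encoding and not encryption: anyone who can read the Secret object can decode it. A short sketch of the round trip, using hypothetical helper names and sample credentials:

```python
import base64


def encode_secret_data(plain: dict) -> dict:
    """Base64-encode each value the way a Secret's data field expects."""
    return {key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in plain.items()}


def decode_secret_data(encoded: dict) -> dict:
    """Reverse the encoding to recover the original strings."""
    return {key: base64.b64decode(value).decode("utf-8")
            for key, value in encoded.items()}


encoded = encode_secret_data({"username": "admin", "password": "s3cr3t"})
print(encoded)
print(decode_secret_data(encoded))
```

Because decoding is trivial, access to Secrets still has to be restricted through RBAC.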

Secret usage patterns

Automatically via ServiceAccount.

Mounted as files into a pod.

Referenced as ImagePullSecrets for pulling private images.

PodSecurityPolicy

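Enforces fine-grained security controls for pod creation; once enabled, explicit policies and matching RBAC permissions are required before pods can be created. PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25 in favor of Pod Security Admission.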

Network model

Each pod receives a unique IP address and can communicate directly with any other pod in a flat, routable network.

CNI model

CNI provides a plugin‑based interface for configuring container networking, handling IP allocation and teardown.

Network policies

Allow fine‑grained traffic control between pods, enforced by a policy controller and implemented via CNI plugins.

Flannel

Provides an overlay network assigning non‑conflicting IPs to containers and routing traffic between them.

Calico

Implements a pure‑L3 BGP‑based network, using a kernel‑level vRouter for efficient routing without overlay or NAT.

Persistent storage

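Supports workloads that need storage through volume types such as EmptyDir (ephemeral, tied to the pod's lifetime) and HostPath (a directory on the node), and through PersistentVolumes (PV) and PersistentVolumeClaims (PVC) for data that must outlive a pod.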

PV and PVC lifecycle

Available – not bound to any claim.

Bound – linked to a PVC.

Released – claim deleted but resource not yet reclaimed.

Failed – reclamation failed.

Storage supply modes

Static – admin manually creates PVs.

Dynamic – PVC triggers automatic PV creation via StorageClass.

CSI model

Container Storage Interface separates storage provider implementations (CSI Controller) from node‑level volume management (CSI Node).

Adding worker nodes

Install Docker, kubelet, and kube‑proxy on the node.

Configure them to point to the master URL and start the services.

Kubelet auto‑registers the node with the master.

The master adds the node to the scheduling pool.

Pod resource control

Pods request CPU and memory; the scheduler places a pod only on a node where the sum of existing requests plus the new pod's requests does not exceed the node's allocatable capacity.

Requests vs. Limits impact on scheduling

Schedulers consider only resource requests when placing pods; limits are enforced at runtime.
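The request-based fit check can be sketched for CPU alone. Kubernetes writes CPU quantities either as whole or fractional cores ("2", "0.5") or as millicores ("500m"); the helper names below are invented for this example, and limits are deliberately absent because they play no part in placement.

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '500m' or '0.5' into millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def fits(allocatable: str, scheduled_requests: list, pod_request: str) -> bool:
    """True if the new request fits next to the requests already on the node."""
    used = sum(cpu_millicores(r) for r in scheduled_requests)
    return used + cpu_millicores(pod_request) <= cpu_millicores(allocatable)


print(fits("2", ["500m", "1"], "500m"))   # exactly fills the node
print(fits("2", ["500m", "1"], "600m"))   # 100m too much
```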

Metrics service

Metrics Server (since v1.10) provides core metrics (CPU, memory) for nodes and pods; custom metrics are collected by Prometheus or similar.

EFK logging stack

Elasticsearch stores logs, Fluentd collects logs from each node and forwards them to Elasticsearch, and Kibana provides a web UI for searching and visualizing logs.

Graceful node maintenance

Use kubectl drain to evict pods before shutting down a node.

Cluster federation

Federation allows multiple Kubernetes clusters to be managed as a single logical cluster.

Helm

Helm is the package manager for Kubernetes, bundling related resources into a Chart for easy distribution, installation, upgrade, rollback, and removal.

Source: https://blog.csdn.net/estarhao/article/details/114703958

