Mastering Envoy: Service Discovery, Load Balancing, and Health Checks Explained
Envoy, the high-performance edge and service-mesh proxy, provides three core mechanisms: service discovery, load balancing, and health checking. Each has several configurable options, shown here with example configuration, so that operators and developers can tune distributed systems for scalability, reliability, and performance in cloud-native environments.
In the era of cloud‑native architecture, Envoy has become a high‑performance, extensible edge and service‑to‑service proxy that serves as a core component of service meshes such as Istio.
Three Core Mechanisms
Envoy provides service discovery, load balancing, and health checking (active checks plus passive outlier detection). Understanding how each one works helps operators and developers tune production systems.
Service Discovery Mechanism
When an upstream cluster is defined, Envoy must resolve the members of the cluster. This is the service discovery process.
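Static : The simplest type; the configuration explicitly lists the network address (IP address and port, Unix domain socket, etc.) of each upstream host.
Strict DNS/Logical DNS : Envoy resolves the configured DNS name (A/AAAA records) asynchronously and re-resolves it periodically. With Strict DNS, every IP address returned becomes a separate upstream host; with Logical DNS, only the first IP address returned is used when a new connection is needed.
Original destination : Used when inbound connections are redirected to Envoy via iptables REDIRECT or TPROXY, or arrive with the PROXY protocol. Envoy forwards each connection to the address the client originally targeted, with no explicit host configuration.
Endpoint Discovery Service (EDS) : An xDS API, served over gRPC or REST-JSON by a management server, from which Envoy fetches cluster members.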
# Static
clusters:
- name: static_cluster
  type: STATIC
  load_assignment:
    cluster_name: static_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 10.0.0.1
              port_value: 8080
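Strict DNS shares a list entry with Logical DNS above but differs in how the resolved addresses are used. A minimal sketch of a Strict DNS cluster, assuming a hostname that resolves to several backend IPs; the cluster name `strict_dns_cluster`, the hostname, and the refresh and lookup-family values are illustrative assumptions, not settings from the original article:

```yaml
# Strict DNS (sketch): every IP address returned for the name becomes an upstream host
clusters:
- name: strict_dns_cluster          # illustrative name
  type: STRICT_DNS
  dns_refresh_rate: 5s              # assumed value; how often the name is re-resolved
  dns_lookup_family: V4_ONLY        # assumed; restricts resolution to A records
  load_assignment:
    cluster_name: strict_dns_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: my-service.default.svc.cluster.local
              port_value: 80
```

Strict DNS tends to suit names that return one record per backend instance, since hosts that disappear from the DNS answer are drained from the cluster.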
# Logical DNS
clusters:
- name: logical_dns_cluster
  type: LOGICAL_DNS
  load_assignment:
    cluster_name: logical_dns_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: my-service.default.svc.cluster.local
              port_value: 80
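The original destination type needs no endpoint list at all, because the target address comes from the redirected connection itself. A minimal sketch, assuming traffic reaches Envoy through an iptables REDIRECT rule; the cluster name `original_dst_cluster` and the timeout value are illustrative assumptions:

```yaml
# Original destination (sketch)
# The listener that receives the redirected traffic also needs the
# envoy.filters.listener.original_dst listener filter so that Envoy can
# recover the address the client originally targeted.
clusters:
- name: original_dst_cluster        # illustrative name
  type: ORIGINAL_DST
  lb_policy: CLUSTER_PROVIDED       # this cluster type supplies its own host selection
  connect_timeout: 1s               # assumed value
```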
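# EDS (Endpoint Discovery Service)
clusters:
- name: eds_cluster
  type: EDS
  eds_cluster_config:
    eds_config:
      resource_api_version: V3
      api_config_source:
        api_type: GRPC
        transport_api_version: V3
        grpc_services:
        - envoy_grpc:
            # xds_cluster must be defined as a separate cluster that points at the management server
            cluster_name: xds_cluster
Load Balancing Mechanism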
Load balancing underpins high availability and performance in distributed systems. In Envoy, the policy is chosen per cluster with the lb_policy field.
ROUND_ROBIN : Distributes requests in a round‑robin fashion; the default algorithm.
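LEAST_REQUEST : Prefers the host with the fewest active requests; when hosts are equally weighted, Envoy samples choice_count random hosts (2 by default) and picks the least loaded of them.
RING_HASH : Consistent hashing based on request attributes (e.g., header or cookie), so that requests with the same key reach the same host.
MAGLEV : Consistent hashing with a fixed-size lookup table; typically faster and more evenly distributed than RING_HASH.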
RANDOM : Randomly assigns requests; simple and efficient.
# ROUND_ROBIN
lb_policy: ROUND_ROBIN
# LEAST_REQUEST
lb_policy: LEAST_REQUEST
least_request_lb_config:
  choice_count: 2
# RING_HASH
lb_policy: RING_HASH
ring_hash_lb_config:
  minimum_ring_size: 1024
  maximum_ring_size: 4096
# MAGLEV
lb_policy: MAGLEV
maglev_lb_config:
  table_size: 65537
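The two hashing policies (RING_HASH and MAGLEV) only give sticky routing when the route tells Envoy what to hash; without a hash policy, host selection is effectively random. A minimal sketch of a route-level hash policy, assuming a hypothetical `x-user-id` request header and a hypothetical cluster named `hashed_cluster`:

```yaml
# Route entry inside an HttpConnectionManager route_config (sketch)
routes:
- match:
    prefix: "/"
  route:
    cluster: hashed_cluster         # hypothetical cluster using RING_HASH or MAGLEV
    hash_policy:
    - header:
        header_name: x-user-id      # hypothetical header carrying the hash key
```

Requests carrying the same header value then land on the same upstream host for as long as cluster membership stays stable.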
# RANDOM
lb_policy: RANDOM
Active Health Checks
Envoy can perform health checks over different protocols:
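HTTP – sends an HTTP request to the upstream host; by default, a 200 response marks the host healthy.
gRPC – sends a request to the standard gRPC health-checking service on the upstream host.
L3/L4 – sends a configurable byte buffer and compares the reply against the configured expected bytes.
Redis – sends a Redis PING command and expects a PONG response.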
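The HTTP variant is configured in the example that follows. For the other three protocols in the list, the checker blocks might look like the sketch below. The service name, byte payloads, and timing values are illustrative assumptions. Envoy has historically accepted a single health checker per cluster, so the three documents are alternatives rather than a combined list.

```yaml
# gRPC (sketch): the cluster must speak HTTP/2 to its upstream hosts
health_checks:
- timeout: 1s
  interval: 5s
  unhealthy_threshold: 3
  healthy_threshold: 2
  grpc_health_check:
    service_name: my.package.MyService    # hypothetical; omit to query overall server health
---
# L3/L4 (sketch): payloads are written as hex-encoded bytes
health_checks:
- timeout: 1s
  interval: 5s
  unhealthy_threshold: 3
  healthy_threshold: 2
  tcp_health_check:
    send:
      text: "50494E47"                    # hex for "PING" (assumed payload)
    receive:
    - text: "504F4E47"                    # hex for "PONG" (assumed reply)
---
# Redis (sketch): configured through the custom health checker extension
health_checks:
- timeout: 1s
  interval: 5s
  unhealthy_threshold: 3
  healthy_threshold: 2
  custom_health_check:
    name: envoy.health_checkers.redis
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.health_checkers.redis.v3.Redis
```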
health_checks:
- timeout: 1s
  interval: 5s
  unhealthy_threshold: 3
  healthy_threshold: 2
  http_health_check:
    path: "/healthz"
    expected_statuses:
    # Half-open range [start, end): matches status codes 200 through 398
    - start: 200
      end: 399
Full Example Configuration
# Configuration file
$ cat /tmp/envoy-demo.yaml
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: simple_cluster
  clusters:
  - name: simple_cluster
    # Load balancing policy
    lb_policy: ROUND_ROBIN
    # Static service discovery
    type: STATIC
    load_assignment:
      cluster_name: simple_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 172.139.20.170, port_value: 8090 }
        - endpoint:
            address:
              socket_address: { address: 172.139.20.3, port_value: 8090 }
        - endpoint:
            address:
              socket_address: { address: 172.139.20.92, port_value: 8090 }
    # Health checks
    health_checks:
    - timeout: 5s
      interval: 15s
      no_traffic_interval: 60s
      unhealthy_threshold: 3
      healthy_threshold: 1
      http_health_check:
        path: /ping
        expected_statuses:
        - start: 200
          end: 399
      # Enable health‑check failure logging
      event_logger:
      - name: envoy.health_checkers.event_logger.file
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.health_check.event_sinks.file.v3.HealthCheckEventFileSink
          event_log_path: "/tmp/envoy_health_events.log"
      always_log_health_check_failures: true
    # Disable panic threshold
    common_lb_config:
      healthy_panic_threshold:
        value: 0
# Run Envoy
$ docker run -d --name envoy \
    -v /tmp/envoy-demo.yaml:/etc/envoy/envoy.yaml:ro \
    -p 10000:10000 \
    envoyproxy/envoy:v1.28.7
Conclusion
Service discovery lets Envoy track backend changes dynamically and integrate with various registries.
Load balancing offers rich policy choices to suit different business scenarios.
Active health checks improve overall system robustness.
For operators, a deep understanding of these mechanisms enables better configuration, tuning, fault isolation, capacity planning, and high‑availability design.
Linux Ops Smart Journey
