Unlocking Service Mesh: A Deep Dive into Envoy’s Architecture and Deployment
This article introduces Envoy, the core sidecar proxy of a service mesh. It covers Envoy's terminology, thread model, filters, routing, health checking, circuit breaking, and hot restart, then presents several deployment patterns for cloud-native microservice environments.
Introduction
This article comes from the ADDOPS team. The translator helped build the containerization and virtualization services of the 360 HULK cloud platform and brings that experience with microservices to the text. Istio, an open-source project launched jointly by Google, IBM, and Lyft, was a five-month-old service mesh solution at the time of writing.
Envoy Overview
In a service mesh, each service is paired with a sidecar proxy (Envoy) that handles inter-service communication without the application being aware of it. Together these sidecars form a mesh of lightweight network proxies, which is the service mesh itself.
Envoy Terminology
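Host: a logical network entity capable of network communication. One physical machine may run multiple hosts.
Downstream: a host that connects to Envoy, sends requests, and receives responses (the request initiator).
Upstream: a host that receives connections and requests from Envoy and returns responses (the request receiver).
Listener: a named network location, such as a port, that downstream clients can connect to.
Cluster: a group of logically similar upstream hosts that Envoy discovers and load-balances across.
Mesh: a collection of Envoy proxies forming a reliable request delivery network.
Runtime configuration: settings that can be changed while Envoy is running, without a restart.
Filter: a pluggable unit of logic in Envoy's connection and request processing pipeline.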
Basic Concepts
Thread Model
Envoy runs as a single process with one main thread and multiple worker threads. The main thread handles coordination tasks, while each worker independently listens for connections, runs filters, and forwards requests. The recommended number of worker threads equals the machine's hardware thread count.
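The thread model can be approximated with a small Python sketch. It assumes a shared queue stands in for the listening socket and that every connection is handled by exactly one worker. The names `worker_count` and `run_workers` are illustrative, not Envoy APIs, and Envoy itself is written in C++.

```python
import os
import queue
import threading
from typing import Dict, Iterable, List


def worker_count() -> int:
    """One worker per hardware thread, following the recommendation above."""
    return os.cpu_count() or 1


def run_workers(connections: Iterable[int], num_workers: int) -> Dict[int, List[int]]:
    """Let each worker pull connections independently from a shared accept queue."""
    accept_queue: "queue.Queue[int]" = queue.Queue()
    for conn in connections:
        accept_queue.put(conn)

    # Each worker only appends to its own list, so no lock is needed here.
    handled: Dict[int, List[int]] = {i: [] for i in range(num_workers)}

    def worker(index: int) -> None:
        while True:
            try:
                conn = accept_queue.get_nowait()
            except queue.Empty:
                return  # nothing left to accept
            handled[index].append(conn)  # the connection stays with this worker

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

The main thread here only starts and joins the workers, which loosely mirrors its coordinating role. All connection handling happens on the workers.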
Listeners
Listeners bind to ports, initialize their associated filters, and process downstream requests. A single Envoy process can host any number of listeners; the usual recommendation is to run one Envoy per machine regardless of how many listeners are configured.
Network (L3/L4) Filters
Three filter types exist: Read (invoked on inbound data), Write (invoked on outbound data), and Read/Write (bidirectional control).
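To make the three filter types concrete, here is a minimal sketch of a filter chain in Python. It is a simplified model, not Envoy's C++ filter interface: `FilterChain`, `ByteCounter`, and `reject_empty` are hypothetical names, and a filter returning `None` is assumed to mean "stop processing".

```python
from typing import Callable, List, Optional

# A filter receives a chunk of bytes and returns the (possibly modified)
# bytes, or None to stop the chain.
Filter = Callable[[bytes], Optional[bytes]]


class FilterChain:
    def __init__(self) -> None:
        self.read_filters: List[Filter] = []
        self.write_filters: List[Filter] = []

    def add_read_filter(self, f: Filter) -> None:
        self.read_filters.append(f)

    def add_write_filter(self, f: Filter) -> None:
        self.write_filters.append(f)

    def add_filter(self, f: Filter) -> None:
        """Register a read/write filter that sees traffic in both directions."""
        self.read_filters.append(f)
        self.write_filters.append(f)

    def on_read(self, data: bytes) -> Optional[bytes]:
        return self._run(self.read_filters, data)

    def on_write(self, data: bytes) -> Optional[bytes]:
        return self._run(self.write_filters, data)

    @staticmethod
    def _run(filters: List[Filter], data: bytes) -> Optional[bytes]:
        for f in filters:
            result = f(data)
            if result is None:
                return None  # a filter halted the chain
            data = result
        return data


class ByteCounter:
    """Read/write filter that counts bytes flowing in either direction."""

    def __init__(self) -> None:
        self.total = 0

    def __call__(self, data: bytes) -> Optional[bytes]:
        self.total += len(data)
        return data


def reject_empty(data: bytes) -> Optional[bytes]:
    """Read filter that stops the chain on empty input."""
    return data if data else None
```

Registering `ByteCounter` with `add_filter` and `reject_empty` with `add_read_filter` shows the difference: the counter observes both directions, while the rejecting filter only affects inbound data.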
HTTP Filters
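The built-in HTTP connection manager filter converts raw bytes into HTTP structures and handles logging, request IDs, header manipulation, routing, and statistics. HTTP-level filters plug into the connection manager and come in three types: Decoder (invoked on the request path), Encoder (invoked on the response path), and Decoder/Encoder (invoked on both).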
HTTP Protocols
Envoy natively supports HTTP/1.1, WebSockets, and HTTP/2 (no SPDY). It can translate HTTP/1.1 traffic into an HTTP/2‑like internal representation.
Access Log
Access logging is configurable: the log format can be customized, and filters can restrict which requests are logged.
Routing
Envoy includes an HTTP router filter that enables advanced routing, virtual hosts, path and host matching, TLS redirection, header rewriting, and WebSocket upgrades. Routing can be static or dynamically fetched via RDS (Route Discovery Service).
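The host and path matching described above can be sketched as follows. This is a simplified model under stated assumptions: only exact domains, a `*` catch-all, and prefix routes are handled, and the cluster names (`api_cluster`, `web_cluster`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    prefix: str   # path prefix to match
    cluster: str  # upstream cluster that receives matching requests


@dataclass
class VirtualHost:
    name: str
    domains: List[str]
    routes: List[Route] = field(default_factory=list)


def select_cluster(vhosts: List[VirtualHost], host: str, path: str) -> Optional[str]:
    """Pick a virtual host by domain, then the first route whose prefix matches."""
    # Prefer an exact domain match, then fall back to a "*" catch-all.
    vhost = next((v for v in vhosts if host in v.domains), None)
    if vhost is None:
        vhost = next((v for v in vhosts if "*" in v.domains), None)
    if vhost is None:
        return None
    for route in vhost.routes:  # routes are evaluated in order
        if path.startswith(route.prefix):
            return route.cluster
    return None


vhosts = [
    VirtualHost(
        name="vh01",
        domains=["test.foo.cn"],
        routes=[Route("/api", "api_cluster"), Route("/", "web_cluster")],
    )
]
```

With this table, a request for `test.foo.cn/api/v1` resolves to `api_cluster`. Because routes are checked in order, the broad `/` prefix belongs last.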
{
  "cluster": "...",
  "route_config_name": "route_config_example",
  "refresh_delay_ms": "3000"
}
route_config_example:
{
  "validate_clusters": "example",
  "virtual_hosts": [
    {
      "name": "vh01",
      "domains": ["test.foo.cn"],
      "routes": [],
      "require_ssl": "...",
      "virtual_clusters": [],
      "rate_limits": [],
      "request_headers_to_add": [
        {"key": "header1", "value": "value1"},
        {"key": "header2", "value": "value2"}
      ]
    }
  ],
  "internal_only_headers": [],
  "response_headers_to_add": [],
  "response_headers_to_remove": [],
  "request_headers_to_add": []
}
Advanced Concepts
Cluster Manager
The cluster manager oversees all upstream clusters, handling health checks, load balancing, connection types, and protocol selection. Clusters can be defined statically or discovered dynamically via CDS (Cluster Discovery Service).
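As a rough illustration of how health state feeds into load balancing, the sketch below does round-robin selection over only the healthy hosts of a cluster. The `Cluster` class and its method names are hypothetical, and round-robin is just one of the policies a real proxy might apply.

```python
from typing import Iterable, List, Optional, Set


class Cluster:
    """A named group of upstream hosts with a simple health-aware round robin."""

    def __init__(self, name: str, hosts: Iterable[str]) -> None:
        self.name = name
        self.hosts: List[str] = list(hosts)
        self.healthy: Set[str] = set(self.hosts)
        self._next = 0

    def mark_unhealthy(self, host: str) -> None:
        self.healthy.discard(host)

    def mark_healthy(self, host: str) -> None:
        if host in self.hosts:
            self.healthy.add(host)

    def pick(self) -> Optional[str]:
        """Return the next healthy host, or None if none is available."""
        candidates = [h for h in self.hosts if h in self.healthy]
        if not candidates:
            return None
        host = candidates[self._next % len(candidates)]
        self._next += 1
        return host
```

A host marked unhealthy by a health check drops out of rotation immediately and returns once it is marked healthy again.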
{
  "clusters": [],
  "sds": "{...}",
  "local_cluster_name": "...",
  "outlier_detection": "{...}",
  "cds": "{...}"
}
Service Discovery (SDS)
Discovery methods include static configuration, DNS‑based discovery, original destination, SDS, and eventually consistent discovery.
Health Checking
Envoy supports three active health‑check types (HTTP, L3/L4, Redis) with multiple strategies (pass‑through, no pass‑through, cached) and passive health checking via Outlier Detection. Outlier detection can eject hosts based on consecutive 5xx errors or success‑rate thresholds.
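The consecutive-5xx ejection rule can be sketched in a few lines of Python. This is a simplified model, not Envoy's implementation: the parameter names mirror the `consecutive_5xx`, `base_ejection_time_ms`, and `max_ejection_percent` settings, the values used are illustrative, success-rate ejection is omitted, and the ejection time is assumed to grow with the number of times a host has been ejected.

```python
from typing import Dict, Iterable, List


class OutlierDetector:
    def __init__(
        self,
        hosts: Iterable[str],
        consecutive_5xx: int,
        base_ejection_time_ms: int,
        max_ejection_percent: int,
    ) -> None:
        self.hosts: List[str] = list(hosts)
        self.consecutive_5xx = consecutive_5xx
        self.base_ejection_time_ms = base_ejection_time_ms
        self.max_ejection_percent = max_ejection_percent
        self.streak: Dict[str, int] = {h: 0 for h in self.hosts}
        self.ejections: Dict[str, int] = {h: 0 for h in self.hosts}
        self.ejected_until: Dict[str, int] = {}  # host -> time it may return (ms)

    def record(self, host: str, status: int, now_ms: int) -> None:
        """Track a response; eject the host after enough consecutive 5xx codes."""
        if 500 <= status <= 599:
            self.streak[host] += 1
            if self.streak[host] >= self.consecutive_5xx:
                self._try_eject(host, now_ms)
        else:
            self.streak[host] = 0  # any non-5xx response resets the streak

    def _try_eject(self, host: str, now_ms: int) -> None:
        if host in self.ejected_until:
            return
        # Never eject more than max_ejection_percent of the cluster.
        limit = len(self.hosts) * self.max_ejection_percent / 100
        if len(self.ejected_until) + 1 > limit:
            return
        self.ejections[host] += 1
        self.streak[host] = 0
        duration = self.base_ejection_time_ms * self.ejections[host]
        self.ejected_until[host] = now_ms + duration

    def sweep(self, now_ms: int) -> None:
        """Periodic check: return hosts whose ejection time has elapsed."""
        for host, until in list(self.ejected_until.items()):
            if now_ms >= until:
                del self.ejected_until[host]

    def is_ejected(self, host: str) -> bool:
        return host in self.ejected_until
```

The caller is expected to invoke `sweep` on a timer, which corresponds to the role of an interval setting such as `interval_ms`.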
{
  "consecutive_5xx": "...",
  "interval_ms": "...",
  "base_ejection_time_ms": "...",
  "max_ejection_percent": "...",
  "enforcing_consecutive_5xx": "...",
  "enforcing_success_rate": "...",
  "success_rate_minimum_hosts": "...",
  "success_rate_request_volume": "...",
  "success_rate_stdev_factor": "..."
}
Circuit Breaking
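Circuit breaking caps the load Envoy places on each upstream cluster: once a limit is reached, further connections or requests fail fast instead of piling up. The configurable limits are max connections, max pending requests, max total requests, and max retries. Global rate limiting is a separate Envoy feature, provided through a gRPC rate-limit service backed by Redis.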
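The four limits can be modelled as simple counters that refuse new work once a ceiling is hit. The sketch below is an illustration under that assumption. `CircuitBreaker`, `try_acquire`, and `release` are hypothetical names, and real enforcement involves per-cluster bookkeeping that is not shown here.

```python
from typing import Dict


class CircuitBreaker:
    """Per-cluster resource limits; exceeding one rejects the new work."""

    def __init__(
        self,
        max_connections: int,
        max_pending_requests: int,
        max_requests: int,
        max_retries: int,
    ) -> None:
        self.limits: Dict[str, int] = {
            "connections": max_connections,
            "pending_requests": max_pending_requests,
            "requests": max_requests,
            "retries": max_retries,
        }
        self.active: Dict[str, int] = {name: 0 for name in self.limits}

    def try_acquire(self, resource: str) -> bool:
        """Reserve one unit of a resource, or return False if the limit is hit."""
        if self.active[resource] >= self.limits[resource]:
            return False  # circuit open for this resource: fail fast
        self.active[resource] += 1
        return True

    def release(self, resource: str) -> None:
        """Give back one unit when the connection, request, or retry finishes."""
        if self.active[resource] > 0:
            self.active[resource] -= 1
```

For example, with `max_retries=3` a fourth concurrent retry is refused until one of the first three completes, which keeps retry storms from amplifying an outage.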
{
  "max_connections": "...",
  "max_pending_requests": "1024",
  "max_requests": "1024",
  "max_retries": "3"
}
Hot Restart
Envoy can restart without dropping connections by sharing statistics in shared memory, communicating between old and new processes via RPC, draining existing connections, and finally shutting down the old process.
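The handoff sequence can be illustrated with a toy in-process model. It only mirrors the order of the steps: a plain dict stands in for shared memory, direct attribute access stands in for the RPC between processes, and `ProxyProcess`, `hot_restart`, and the stats keys are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ProxyProcess:
    epoch: int
    accepting: bool = True
    running: bool = True
    connections: Set[str] = field(default_factory=set)


def hot_restart(
    old: ProxyProcess,
    stats: Dict[str, int],
    closed_per_tick: Dict[int, Set[str]],
    drain_ticks: int,
) -> ProxyProcess:
    """Start a new process, drain the old one, then shut the old one down."""
    # 1. The new process starts and uses the same stats store as the old one.
    new = ProxyProcess(epoch=old.epoch + 1)
    stats["restart_epoch"] = new.epoch

    # 2. The old process stops accepting and lets existing connections finish.
    old.accepting = False
    for tick in range(drain_ticks):
        old.connections -= closed_per_tick.get(tick, set())
        if not old.connections:
            break

    # 3. The drain period is over: close whatever remains and exit.
    stats["connections_closed_at_shutdown"] = len(old.connections)
    old.connections.clear()
    old.running = False
    return new
```

New connections go to the new process as soon as step 2 begins, so clients only notice the restart if a connection outlives the drain period.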
Deployment Options
Service‑to‑service only
The simplest mode where Envoy acts as an internal bus, exposing listeners for local and remote traffic.
Service‑to‑service + front proxy
An additional layer‑7 front proxy provides TLS termination, supports both HTTP/1.1 and HTTP/2, and offers full HTTP routing.
Service‑to‑service, front proxy, and double proxy
The double‑proxy architecture offloads TLS more efficiently and reuses established HTTP/2 connections for downstream traffic.
Conclusion
This article gives a foundational overview of Envoy as the data plane of a service mesh. Readers interested in Istio are encouraged to explore its official site and source code for a deeper understanding.
360 Zhihui Cloud Developer
