Why Varnish Beats Squid: Deep Dive into Architecture, VCL, and Performance
Varnish is a high-performance open-source reverse proxy and HTTP accelerator. It outperforms Squid through in-memory caching, a multi-threaded worker architecture, and flexible VCL scripting. This guide covers its advantages and disadvantages, configuration, VCL syntax changes, the request handling flow, and a practical deployment example.
Introduction
Varnish is a high-performance open-source reverse proxy server and HTTP accelerator. At the time of writing, the latest stable version is 4.0.0. Version 3.x is also stable enough for production, while the 2.x packages shipped in the default yum repositories are outdated and not recommended.
Comparison with Squid
Similarities
Both are reverse proxy servers.
Both are open‑source software.
Advantages of Varnish
Higher stability; Squid fails more often under the same load.
Faster access, because Varnish relies on the operating system's virtual memory and page cache to serve objects from memory, while Squid reads cached objects from disk.
Supports more concurrent connections; Varnish releases TCP connections faster.
Cache can be cleared in bulk using regular expressions via the management port.
Uses a pool of worker threads inside its cache process to utilize all CPU cores, unlike Squid's single-process, single-core model.
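Regex-based invalidation is issued as a ban, either from the management CLI or from VCL. The following is a minimal sketch for VCL 4.0 rather than the article's own configuration: the BAN request method, the placeholder backend, and the ACL name banners are all assumptions made for illustration.

```vcl
vcl 4.0;

# Hypothetical backend; host and port are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Hypothetical ACL of clients that may issue bans.
acl banners {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ banners) {
            return (synth(403, "Not allowed"));
        }
        # Invalidate every cached object for this host whose URL matches
        # the regular expression sent as the request URL.
        ban("req.http.host == " + req.http.host + " && req.url ~ " + req.url);
        # Answer directly so the request never reaches the backend.
        return (synth(200, "Ban added"));
    }
}
```

With this loaded, a BAN request for a URL pattern from an allowed address would invalidate all matching objects in one step.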
Disadvantages of Varnish
If the Varnish process hangs, crashes, or restarts, all cached data in memory is lost, causing a surge of requests to the backend.
When load‑balanced across multiple Varnish instances, the same URL may be cached on several servers, leading to cache duplication and performance degradation.
Mitigation for disadvantages
For cache loss on restart: under heavy traffic, run Varnish with memory-based storage and place several Squid servers behind it as a second-level cache, so that they absorb the load on the backend while Varnish refills its cache.
For cache duplication: apply URL hashing in the load balancer so that a given URL is always routed to the same Varnish instance.
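If the load-balancing tier is itself a Varnish instance, URL hashing can be sketched with the hash director from the directors VMOD. This is an illustration under assumptions, not the article's setup: the backend names cache1 and cache2 and their addresses are placeholders.

```vcl
vcl 4.0;

import directors;

# Hypothetical cache nodes; addresses are placeholders.
backend cache1 {
    .host = "192.0.2.11";
    .port = "6081";
}

backend cache2 {
    .host = "192.0.2.12";
    .port = "6081";
}

sub vcl_init {
    # The hash director picks a backend from a hash of the string
    # given to .backend(), so equal strings map to the same node.
    new cache_tier = directors.hash();
    cache_tier.add_backend(cache1, 1.0);
    cache_tier.add_backend(cache2, 1.0);
}

sub vcl_recv {
    # Route on the URL alone so each URL is cached on one node only.
    set req.backend_hint = cache_tier.backend(req.url);
    # Pass, so this routing tier does not keep a second copy.
    return (pass);
}
```

Hashing on req.url only means the same path on different hosts lands on the same node; concatenating req.http.host with req.url is a possible variation.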
Improvements in Varnish 4 over 3.x
Full support for streaming objects.
Background fetching of expired objects (client and backend handling are separated).
New varnishlog query language that supports automatic request grouping.
Detailed request timestamps and byte counters.
Security enhancements.
VCL Syntax Changes
VCL files must start with vcl 4.0; to declare the language version.
vcl_fetch is replaced by vcl_backend_response, and req.* is no longer available there; use bereq.* instead.
Directors are provided by a VMOD: add import directors; and create them in vcl_init.
Custom subroutine names must not start with vcl_; invoke them with call sub_name;.
error is replaced by return (synth(code, reason)).
return (lookup) is replaced by return (hash).
Set beresp.uncacheable = true to create hit_for_pass objects.
req.backend.healthy is replaced by std.healthy(req.backend_hint), which requires import std;.
req.backend is replaced by req.backend_hint.
req.request is replaced by req.method.
The purge; statement is replaced by return (purge) from vcl_recv.
The keyword remove is replaced by unset.
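The changes listed here can be seen side by side in a short file. This is a minimal sketch for VCL 4.0 with the 3.x spelling noted in comments; the backend address, the header name X-Tracking, the subroutine name strip_tracking_header, and the /private/ path are hypothetical.

```vcl
vcl 4.0;                            # version marker is mandatory

import std;                         # provides std.healthy()

# Hypothetical backend; host and port are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Custom subroutine: the name must not begin with "vcl_".
sub strip_tracking_header {
    unset req.http.X-Tracking;      # 3.x: remove req.http.X-Tracking;
}

sub vcl_recv {
    call strip_tracking_header;
    if (!std.healthy(req.backend_hint)) {       # 3.x: req.backend.healthy
        return (synth(503, "Backend sick"));    # 3.x: error 503 "Backend sick";
    }
    return (hash);                              # 3.x: return (lookup);
}

sub vcl_backend_response {                      # 3.x: sub vcl_fetch
    # req.* is unavailable here; bereq.* carries the backend request.
    if (bereq.url ~ "^/private/") {
        # Creates a hit-for-pass object; 3.x: return (hit_for_pass);
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
    }
}
```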
Architecture and Cache Workflow
Varnish consists of a master (management) process and a child (cache) process.
The master reads the storage configuration, creates or reads the cache file, initializes the storage structure, then forks the child process and monitors it, restarting it if it dies.
The child maps the storage file into memory, creates free-space structures, and attaches them to the storage manager.
Management interfaces include command-line, Telnet, and web interfaces.
At runtime, VCL is compiled to C and loaded into the child process as a shared object.
The child process runs several kinds of threads: the Accept thread receives new connections, Work threads process requests, the Epoll thread monitors events, and the Expire thread removes expired objects, which are tracked in a binary heap.
HTTP Request Processing Flow
Receive state (vcl_recv) – entry point; decides whether to pass, pipe, or proceed to lookup.
Lookup state – searches the hash table; on a hit it enters vcl_hit, otherwise vcl_miss.
Pass state (vcl_pass) – forwards the request directly to the backend without caching.
Fetch state – retrieves data from the backend (vcl_backend_fetch before the request is sent, vcl_backend_response once the response arrives).
Deliver state (vcl_deliver) – sends the response to the client.
Note: In Varnish 4, vcl_fetch no longer exists; backend-side logic lives in vcl_backend_fetch and vcl_backend_response.
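One way to watch these states is to record which subroutine handled a request and echo it back to the client. The sketch below assumes VCL 4.0; the header name X-Flow and the backend address are made up for illustration.

```vcl
vcl 4.0;

# Hypothetical backend; host and port are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Unknown methods: pipe the raw byte stream to the backend.
    if (req.method != "GET" && req.method != "HEAD" &&
        req.method != "PUT" && req.method != "POST" &&
        req.method != "TRACE" && req.method != "OPTIONS" &&
        req.method != "DELETE") {
        return (pipe);
    }
    # Only GET and HEAD go through the cache; the rest are passed.
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    # Continue to vcl_hash and the cache lookup.
    return (hash);
}

# Each state leaves a marker; with no explicit return, the
# built-in VCL for that subroutine still runs afterwards.
sub vcl_hit {
    set req.http.X-Flow = "hit";
}

sub vcl_miss {
    set req.http.X-Flow = "miss";
}

sub vcl_pass {
    set req.http.X-Flow = "pass";
}

sub vcl_deliver {
    # Expose the recorded state to the client.
    set resp.http.X-Flow = req.http.X-Flow;
}
```

Requesting the same cacheable URL twice should show X-Flow change from miss to hit, while a POST should report pass.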
Built‑in Subroutines
vcl_recv – receives and processes the request.
vcl_pipe – pipe mode; forwards the request unchanged.
vcl_pass – pass mode; forwards the request without caching.
vcl_hit – called when a cached object is found.
vcl_miss – called when no cached object exists.
vcl_hash – creates the hash key for the request.
vcl_purge – builds a response after a purge.
vcl_deliver – called before the response is sent to the client.
vcl_backend_fetch – modifies the request sent to the backend.
vcl_backend_response – processes the backend response.
vcl_backend_error – handles backend fetch failures.
vcl_init – runs when the VCL is loaded, often to initialise VMODs.
vcl_fini – runs when the VCL is discarded, for cleanup.
VCL Variables
Common variable scopes:
req – request object, available when the request arrives.
bereq – backend request object, used when contacting the backend.
beresp – backend response object.
resp – HTTP response object sent to the client.
obj – attributes of the object stored in memory.
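To make the scopes concrete, the sketch below touches each variable in a subroutine where VCL 4.0 exposes it. The header names X-Seen-By and X-Powered-By and the backend address are illustrative assumptions.

```vcl
vcl 4.0;

# Hypothetical backend; host and port are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # req: the client request as it arrives.
    set req.http.X-Seen-By = "varnish";
}

sub vcl_backend_fetch {
    # bereq: the copy of the request that goes to the backend.
    unset bereq.http.X-Seen-By;
}

sub vcl_backend_response {
    # beresp: the backend response, before it is stored.
    unset beresp.http.X-Powered-By;
}

sub vcl_deliver {
    # obj: the cached object; resp: the response sent to the client.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```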
Grace Mode (Stale Content)
When multiple clients request the same page, Varnish sends a single request to the backend and holds the others. To avoid thundering‑herd problems, Varnish can keep expired objects for a grace period and serve stale content:
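# Varnish 3.x syntax: VCL 4.0 has no req.grace, req.backend.healthy, or vcl_fetch.
sub vcl_recv {
    if (!req.backend.healthy) {
        # Backend is sick: accept objects up to 5 minutes past their TTL.
        set req.grace = 5m;
    } else {
        # Backend is healthy: accept at most 15 seconds of staleness.
        set req.grace = 15s;
    }
}
sub vcl_fetch {
    set beresp.grace = 30m;
}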
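In Varnish 4.0 the same policy is usually written in vcl_hit, because the decision is made against the cached object's remaining TTL. The following is a sketch under that assumption, not a definitive configuration; the backend address and probe settings are placeholders, and later Varnish releases changed which return values vcl_hit accepts.

```vcl
vcl 4.0;

import std;

# Hypothetical backend and probe; host, port and URL are placeholders.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/health.html";
        .interval = 3s;
        .window = 5;
        .threshold = 2;
    }
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Still fresh: an ordinary cache hit.
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        # Healthy backend: tolerate at most 15 seconds of staleness.
        if (obj.ttl + 15s > 0s) {
            return (deliver);
        }
    } else if (obj.ttl + 5m > 0s) {
        # Sick backend: accept objects up to 5 minutes past their TTL.
        return (deliver);
    }
    # Too stale for either case: fetch a fresh copy.
    return (fetch);
}

sub vcl_backend_response {
    # Keep objects for 30 minutes beyond their TTL.
    set beresp.grace = 30m;
}
```

The 15s and 5m limits mirror the req.grace values used in the 3.x version, and beresp.grace stays at 30m.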
Setting beresp.grace to 30m keeps expired objects for up to 30 minutes.
Installation and Configuration
# Install Varnish 4.0
yum localinstall --nogpgcheck \
varnish-4.0.0-1.el6.x86_64.rpm \
varnish-libs-4.0.0-1.el6.x86_64.rpm \
varnish-docs-4.0.0-1.el6.x86_64.rpm
# Edit /etc/sysconfig/varnish
VARNISH_STORAGE_SIZE=100M
VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
service varnish start # default listen port 6081, admin port 6082
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
# Commands:
# vcl.list # list configurations
# vcl.load test1 test.vcl # load new VCL
# vcl.use test1 # activate configuration
# vcl.show test1 # display VCL content
Example VCL Configuration
# This is an example VCL file for Varnish 4.0
vcl 4.0;
import directors;
# Health check shared by all backends.
probe backend_healthcheck {
    .url = "/health.html";
    .window = 5;
    .threshold = 2;
    .interval = 3s;
}
backend web1 {
    .host = "static1.lnmmp.com";
    .port = "80";
    .probe = backend_healthcheck;
}
backend web2 {
    .host = "static2.lnmmp.com";
    .port = "80";
    .probe = backend_healthcheck;
}
backend img1 {
    .host = "img1.lnmmp.com";
    .port = "80";
    .probe = backend_healthcheck;
}
backend img2 {
    .host = "img2.lnmmp.com";
    .port = "80";
    .probe = backend_healthcheck;
}
# vcl_init is a subroutine and needs the sub keyword.
sub vcl_init {
    # The random director takes a weight per backend; equal weights are assumed here.
    new web_cluster = directors.random();
    web_cluster.add_backend(web1, 1.0);
    web_cluster.add_backend(web2, 1.0);
    new img_cluster = directors.random();
    img_cluster.add_backend(img1, 1.0);
    img_cluster.add_backend(img2, 1.0);
}
# Clients allowed to send PURGE requests.
acl purgers {
    "127.0.0.1";
    "192.168.0.0"/24;
}
sub vcl_recv {
    # Select the backend first so that every return path below uses the right cluster.
    if (req.http.host ~ "(?i)^(www\.)?lnmmp\.com$") {
        set req.http.host = "www.lnmmp.com";
        set req.backend_hint = web_cluster.backend();
    } elsif (req.http.host ~ "(?i)^images\.lnmmp\.com$") {
        set req.backend_hint = img_cluster.backend();
    }
    # VCL 4.0 uses req.method (req.request in 3.x).
    # Cache GET requests even when they carry cookies.
    if (req.method == "GET" && req.http.cookie) {
        return (hash);
    }
    if (req.url ~ "test.html") {
        return (pass);
    }
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Method not allowed"));
        }
        # VCL 4.0 purges from vcl_recv; the 3.x purge; statement in vcl_hit and vcl_miss is gone.
        return (purge);
    }
    # Varnish 4 appends client.ip to X-Forwarded-For before vcl_recv runs,
    # so the header is not rewritten here.
    return (hash);
}
sub vcl_purge {
    # Runs after the object and its variants have been purged.
    return (synth(200, "Purged"));
}
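sub vcl_backend_response {
    # req.* is not available on the backend side; use bereq.url.
    # VCL strings do not process backslash escapes, so a single \. matches a literal dot.
    if (bereq.url ~ "\.(jpg|jpeg|gif|png)$") {
        set beresp.ttl = 7200s;
    }
    if (bereq.url ~ "\.(html|css|js)$") {
        set beresp.ttl = 1200s;
    }
    if (beresp.http.Set-Cookie) {
        # Assumed intent: responses that set cookies are delivered but not cached.
        set beresp.uncacheable = true;
        return (deliver);
    }
}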
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT from " + server.ip;
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
