
WebSocket Performance, Overhead, and Deployment Best Practices

This article explains how the WebSocket API provides bidirectional, message‑oriented communication, compares its latency and overhead with XHR and SSE, discusses compression and custom protocols, and offers practical deployment and tuning guidelines for long‑lived connections.

Architects Research Society

Jul 27, 2020


The WebSocket API offers a simple interface for bidirectional, message‑oriented text and binary communication between client and server: you create a WebSocket with a URL, set JavaScript callbacks, and the browser handles the rest, providing binary frames, extensibility, and sub‑protocol negotiation.
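As a minimal sketch of the callback side of that interface, the helper below builds an `onmessage` handler that routes text and binary payloads separately. The helper name `makeMessageHandler`, the two callbacks, and the URL in the comments are hypothetical and not part of the WebSocket API itself; the sketch also assumes `binaryType` has been set to `"arraybuffer"`.

```typescript
// Shape of the event data this sketch relies on; a browser MessageEvent fits it.
type MessageHandler = (event: { data: unknown }) => void;

// Hypothetical helper: dispatch incoming messages by payload type.
function makeMessageHandler(
  onText: (text: string) => void,
  onBinary: (bytes: Uint8Array) => void,
): MessageHandler {
  return (event) => {
    const data = event.data;
    if (typeof data === "string") {
      onText(data); // text frames arrive as JavaScript strings
    } else if (data instanceof ArrayBuffer) {
      onBinary(new Uint8Array(data)); // binary frames, given binaryType "arraybuffer"
    }
    // Other payload types (e.g. Blob when binaryType is "blob") are ignored here.
  };
}

// Browser usage (not executed here; the URL is a placeholder):
//   const ws = new WebSocket("wss://example.com/socket");
//   ws.binaryType = "arraybuffer";
//   ws.onmessage = makeMessageHandler(handleText, handleBytes);
```

Keeping the dispatch logic in a plain function makes it easy to exercise without a live connection.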

However, as with any performance discussion, the hidden complexity of the WebSocket protocol has important implications for when and how to use it: it is not a drop-in replacement for XHR or SSE, and each transport has its own strengths.

For detailed performance characteristics of each transport, see the XHR and SSE performance articles.

Request and Response Streams

WebSocket is the only transport that allows full duplex communication over a single TCP connection, enabling low‑latency delivery of both text and binary application data in both directions.

Figure 17‑2. Communication flow of XHR, SSE, and WebSocket.

XHR is optimized for transactional request‑response communication: the client sends a complete, well‑formed HTTP request and the server replies with a complete response. It does not support request streams and, before the Streams API, lacked a reliable cross‑browser response‑stream API.

SSE provides efficient, low‑latency server‑to‑client streaming of text data: the client opens an SSE connection and the server pushes updates via the EventSource protocol. After the initial handshake, the client cannot send data to the server.

Propagation and Queuing Latency

Switching from XHR to SSE or WebSocket does not reduce the round‑trip propagation delay; that remains constant. In addition, there is queuing latency: the time a message must wait on the client or server before being routed. With XHR polling, queuing latency depends on the client’s poll interval: messages may be ready on the server but not sent until the next poll. SSE and WebSocket use a persistent connection, allowing the server (or client, for WebSocket) to send messages as soon as they are available. Thus, “low‑latency delivery” for SSE and WebSocket means eliminating queuing latency. (No technology can make packets travel faster than light!)
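To make the queuing argument concrete, here is a small sketch that computes how long a message sits on the server under fixed-interval polling. It assumes polls fire at exact multiples of the interval starting at time zero, which is a simplification; the function name is illustrative only.

```typescript
// Time (ms) a message waits on the server before the next poll picks it up.
// Assumption: polls fire at t = 0, interval, 2 * interval, ... with no jitter.
function pollingQueueDelay(readyAtMs: number, pollIntervalMs: number): number {
  const nextPollMs = Math.ceil(readyAtMs / pollIntervalMs) * pollIntervalMs;
  return nextPollMs - readyAtMs;
}

// A message ready at 1.2 s with a 5 s poll interval waits 3.8 s.
const waited = pollingQueueDelay(1200, 5000);
```

Under this model the wait ranges from zero up to one full poll interval. With a persistent SSE or WebSocket connection the same message can be written to the socket as soon as it is ready, so only propagation delay remains.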

Message Overhead

Once a WebSocket connection is established, data is exchanged in frames; each frame adds 2‑14 bytes of overhead, and both UTF‑8 text and binary payloads are efficiently encoded.

SSE adds as little as 5 bytes per message, but it is restricted to UTF‑8 content.

HTTP/1.x requests (XHR or others) carry an extra 500‑800 bytes of HTTP metadata plus cookies.

HTTP/2 compresses HTTP metadata, reducing overhead dramatically; unchanged headers can be as low as 8 bytes.

These numbers do not include IP, TCP, and TLS framing overhead, which adds roughly 60‑100 bytes per message regardless of the application protocol.
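The 2-14 byte range quoted for WebSocket frames follows from the frame header layout in RFC 6455: a 2-byte base header, an optional 2- or 8-byte extended length, and a 4-byte masking key on client-to-server frames. The sketch below computes that overhead for a single unfragmented frame; the function name is an illustrative choice.

```typescript
// Header bytes for one WebSocket frame carrying `payloadBytes` of data.
// `masked` is true for client-to-server frames, which must carry a masking key.
function webSocketFrameOverhead(payloadBytes: number, masked: boolean): number {
  let overhead = 2; // FIN/opcode byte + mask bit and 7-bit length byte
  if (payloadBytes > 65535) {
    overhead += 8; // 64-bit extended payload length
  } else if (payloadBytes > 125) {
    overhead += 2; // 16-bit extended payload length
  }
  if (masked) {
    overhead += 4; // 32-bit masking key
  }
  return overhead;
}
```

A small server-to-client message therefore costs 2 bytes of framing, while a client-to-server message larger than 64 KB costs the full 14.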

Data Efficiency and Compression

Each XHR request can negotiate the optimal transport encoding (e.g., gzip for text). SSE, limited to UTF‑8, can also benefit from gzip compression over the entire session.

WebSocket is more complex: it can carry both text and binary data, so compressing the whole session is not always appropriate. Binary payloads may already be compressed, requiring a per‑message compression extension.

The HyBi working group standardized a per‑message compression extension, permessage-deflate (RFC 7692), which current major browsers negotiate during the handshake. When the extension is not negotiated, applications must implement their own compression logic for binary payloads and optionally for text messages.

Older versions of Chrome and some WebKit browsers support a per‑frame compression extension.

Custom Application Protocols

Browsers optimize HTTP data transfer with built‑in features such as authentication, caching, and compression, which XHR inherits for free.

In contrast, a custom protocol over a persistent stream bypasses many of these services: after the initial HTTP upgrade handshake, subsequent data is opaque to the browser, so the application must implement its own caching, state management, and metadata delivery.
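One common way to carry application metadata over an opaque WebSocket stream is to prefix each binary message with a small header. The sketch below assumes a made-up envelope of a 1-byte message type and a 4-byte big-endian message id followed by the payload; the layout and the names `encodeEnvelope` and `decodeEnvelope` are illustrative, not part of any standard.

```typescript
// Hypothetical envelope layout: [type: u8][id: u32 big-endian][payload bytes...]
const HEADER_BYTES = 5;

function encodeEnvelope(type: number, id: number, payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(HEADER_BYTES + payload.length);
  const view = new DataView(out.buffer);
  view.setUint8(0, type);
  view.setUint32(1, id); // DataView writes big-endian by default
  out.set(payload, HEADER_BYTES);
  return out;
}

function decodeEnvelope(frame: Uint8Array): { type: number; id: number; payload: Uint8Array } {
  // Respect byteOffset so views into a larger buffer decode correctly.
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  return {
    type: view.getUint8(0),
    id: view.getUint32(1),
    payload: frame.subarray(HEADER_BYTES),
  };
}
```

The encoded bytes can be passed to `send()` as a binary message, and the receiver can use the type and id fields to route the payload or correlate it with earlier state.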

The initial HTTP upgrade handshake still allows the server to use existing cookie mechanisms for authentication, and can reject the upgrade if verification fails.

Leveraging Browser and Intermediate Caching

Regular HTTP has obvious advantages: assets can be cached by the client or intermediate CDNs, reducing unnecessary traffic. WebSocket binary transfers (e.g., images) are not cacheable, so developers should reserve WebSocket for real‑time or control messages and use HTTP/XHR for cacheable assets.

Deploying WebSocket Infrastructure

HTTP is optimized for short‑lived, bursty transfers, so many servers, proxies, and load balancers time out idle HTTP connections, which is undesirable for long‑lived WebSocket sessions. Three areas need attention:

Routers, load balancers, and proxies in the provider’s network.

Transparent and explicit proxies in external networks (ISPs, carriers).

Routers, firewalls, and proxies in the client’s network.

Client‑side network policies are out of our control; some networks may block WebSocket entirely, so a fallback strategy is needed. TLS tunneling can bypass intermediate proxies, improving handshake success rates and extending idle timeouts.

TLS does not prevent middleboxes from timing out idle TCP connections, but in practice it greatly improves the likelihood of a successful WebSocket upgrade and often extends the timeout.

Server‑side infrastructure also requires tuning. For example, Nginx 1.3.13+ can proxy WebSocket traffic but defaults to a 60‑second timeout. The following configuration extends the timeouts:

location /websocket {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600;  # 60-minute read timeout
    proxy_send_timeout 3600;  # 60-minute write timeout
}

Similarly, HAProxy is often placed in front of Nginx. Its defaults also need explicit tunnel timeout configuration:

defaults http
  timeout connect 30s
  timeout client  30s
  timeout server  30s
  timeout tunnel  1h  # 60-minute idle tunnel timeout

These examples illustrate that most infrastructure components must be tuned explicitly to allow long‑lived sessions; with default settings, idle WebSocket connections are closed once the timeout expires.

Long‑lived and idle sessions occupy memory and socket resources on all intermediate servers, so short timeouts are often used as a safety and resource‑management measure.

Performance Checklist

Deploying a high‑performance WebSocket service requires careful client‑ and server‑side tuning. A concise checklist includes:

Use secure WebSocket (WSS over TLS) for reliable deployment.

Monitor polyfill performance if used.

Leverage sub‑protocol negotiation for application protocols.

Optimize binary payloads to minimize size.

Consider compressing UTF‑8 content.

Set correct binary type for received binary payloads.

Watch buffered data volume on the client.

Fragment large application messages to avoid head‑of‑line blocking.

Use alternative transports where appropriate.
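Two of the checklist items, watching buffered data and fragmenting large messages, can be sketched together. The example below splits a payload into application-level chunks and sends them only while the socket's `bufferedAmount` stays under a limit. The `SendTarget` interface, both function names, and the idea of resuming from the returned count are assumptions made for illustration; reassembly on the receiving side is left to the application protocol.

```typescript
// The parts of a WebSocket this sketch needs: the bufferedAmount property and send().
interface SendTarget {
  readonly bufferedAmount: number;
  send(data: Uint8Array): void;
}

// Split a large payload into chunks of at most `chunkBytes` (views, no copying).
function splitIntoChunks(payload: Uint8Array, chunkBytes: number): Uint8Array[] {
  if (chunkBytes < 1) throw new RangeError("chunkBytes must be at least 1");
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < payload.length; offset += chunkBytes) {
    chunks.push(payload.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}

// Send chunks until the socket's queue reaches the limit.
// Returns how many chunks were sent so the caller can resume with the rest later.
function sendWhileBelowLimit(
  socket: SendTarget,
  chunks: Uint8Array[],
  maxBufferedBytes: number,
): number {
  let sent = 0;
  for (const chunk of chunks) {
    if (socket.bufferedAmount >= maxBufferedBytes) break;
    socket.send(chunk);
    sent++;
  }
  return sent;
}
```

Because each chunk is a separate message, a small high-priority update can be interleaved between chunks instead of waiting behind one large transfer.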

Mobile optimization is also critical: real‑time pushes can drain battery, so WebSocket should be used for non‑cacheable, real‑time updates while other assets are fetched via HTTP.

Preserve battery life.

Eliminate periodic and inefficient data transfers.

Enable efficient server push.

Avoid unnecessary application keep‑alive traffic.

Source: http://jiagoushi.pro/node/1112
