Building a Scalable WebSocket Push Service in Go: From Basics to Million‑User Architecture

This article explains WebSocket fundamentals, compares pull and push models, details the WebSocket handshake flow, presents a complete Go server and client implementation, analyzes performance bottlenecks of a million‑user bullet‑screen system, and proposes concrete optimizations such as packet merging, lock granularity, JSON encoding reduction, and HTTP/2‑based clustering.

Go Development Architecture Practice

WebSocket Overview

WebSocket provides a full‑duplex, persistent channel over a single TCP connection. A browser keeps one long‑lived connection open to the server, so either side can send data at any time without the overhead of repeated HTTP requests.

Pull vs Push Model

Pull (polling) repeatedly queries the server at fixed intervals, wasting bandwidth when data changes infrequently and increasing server load.

Push sends data only when updates occur, but requires the server to keep many active connections and manage their lifecycle.

WebSocket Handshake

The client sends an HTTP GET request carrying the headers Upgrade: websocket and Connection: Upgrade. The server replies with status 101 Switching Protocols, which completes the handshake. From then on, both sides exchange WebSocket message frames over the same underlying TCP connection.
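As part of the handshake described above, RFC 6455 has the client send a random Sec-WebSocket-Key header, and the server proves it understood the upgrade by returning a derived Sec-WebSocket-Accept value. Libraries such as gorilla/websocket do this internally; the following is only a minimal sketch of that one derivation, and the helper name acceptKey is an assumption made for illustration.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// websocketGUID is the fixed GUID that RFC 6455 defines for the handshake.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey (hypothetical helper name) derives the Sec-WebSocket-Accept
// value from the client's Sec-WebSocket-Key: SHA-1 over key+GUID, then base64.
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Sample key taken from the example handshake in RFC 6455.
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="))
	// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
}
```

A client that receives a different Sec-WebSocket-Accept value must treat the handshake as failed, which is how it detects a server or proxy that does not actually speak WebSocket.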

Server Technology Choices

Node.js – single‑threaded event loop; limited push performance.

C/C++ – low‑level TCP handling; high implementation cost.

Go – native goroutine concurrency, compiled speed, and the mature gorilla/websocket library make it well‑suited for high‑throughput services.

Go WebSocket Server Implementation

package main

import (
    "net/http"
    "time"

    "github.com/gorilla/websocket"
    "github.com/myproject/gowebsocket/impl"
)

var upgrader = websocket.Upgrader{
    // Accepts every origin, which is fine for a local demo; restrict this in production.
    CheckOrigin: func(r *http.Request) bool { return true },
}

func wsHandler(w http.ResponseWriter, r *http.Request) {
    wsConn, err := upgrader.Upgrade(w, r, nil)
    if err != nil {
        return
    }
    conn, err := impl.InitConnection(wsConn)
    if err != nil {
        wsConn.Close()
        return
    }

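    // Send a heartbeat every second. The goroutine declares its own err
    // instead of writing to the handler's variable from another goroutine.
    go func() {
        for {
            if err := conn.WriteMessage([]byte("heartbeat")); err != nil {
                return
            }
            time.Sleep(time.Second)
        }
    }()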

    // Echo received messages
    for {
        data, err := conn.ReadMessage()
        if err != nil {
            break
        }
        if err = conn.WriteMessage(data); err != nil {
            break
        }
    }
    conn.Close()
}

func main() {
    http.HandleFunc("/ws", wsHandler)
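    // ListenAndServe only returns on failure, so surface the error.
    if err := http.ListenAndServe("0.0.0.0:7777", nil); err != nil {
        panic(err)
    }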
}

Simple HTML Client for Testing

<!DOCTYPE html>
<html>
<head>
    <title>Go WebSocket Test</title>
    <meta charset="utf-8"/>
</head>
<body>
<script>
    var wsUri = "ws://127.0.0.1:7777/ws";
    var output;

    function init() {
        output = document.getElementById("output");
        websocket = new WebSocket(wsUri);
        websocket.onopen = function() { writeToScreen("CONNECTED"); };
        websocket.onclose = function() { writeToScreen("DISCONNECTED"); };
        websocket.onmessage = function(evt) { writeToScreen("RESPONSE: " + evt.data); };
        // The error event carries no data payload, so log a fixed message.
        websocket.onerror = function() { writeToScreen("ERROR"); };
    }

    function writeToScreen(message) {
        var p = document.createElement("p");
        p.style.wordWrap = "break-word";
        // textContent keeps echoed user input from being interpreted as HTML.
        p.textContent = message;
        output.appendChild(p);
    }

    window.addEventListener("load", init, false);
</script>
<h2>WebSocket Test</h2>
<input type="text" id="input"/>
<button onclick="websocket.send(document.getElementById('input').value);">send</button>
<button onclick="websocket.close();">close</button>
<div id="output"></div>
</body>
</html>

Encapsulated WebSocket Logic (Go)

package impl

import (
    "errors"
    "sync"

    "github.com/gorilla/websocket"
)

type Connection struct {
    wsConnect *websocket.Conn
    inChan    chan []byte
    outChan   chan []byte
    closeChan chan struct{}
    mutex     sync.Mutex
    isClosed  bool
}

// InitConnection creates a Connection and starts read/write loops.
func InitConnection(wsConn *websocket.Conn) (*Connection, error) {
    conn := &Connection{
        wsConnect: wsConn,
        inChan:    make(chan []byte, 1000),
        outChan:   make(chan []byte, 1000),
        closeChan: make(chan struct{}),
    }
    go conn.readLoop()
    go conn.writeLoop()
    return conn, nil
}

// ReadMessage returns the next inbound payload or an error if the connection is closed.
func (c *Connection) ReadMessage() ([]byte, error) {
    select {
    case data := <-c.inChan:
        return data, nil
    case <-c.closeChan:
        return nil, errors.New("connection is closed")
    }
}

// WriteMessage queues an outbound payload.
func (c *Connection) WriteMessage(data []byte) error {
    select {
    case c.outChan <- data:
        return nil
    case <-c.closeChan:
        return errors.New("connection is closed")
    }
}

// Close shuts down the underlying websocket and releases resources.
func (c *Connection) Close() {
    c.wsConnect.Close()
    c.mutex.Lock()
    if !c.isClosed {
        close(c.closeChan)
        c.isClosed = true
    }
    c.mutex.Unlock()
}

// readLoop continuously reads from the websocket and forwards data to inChan.
func (c *Connection) readLoop() {
    defer c.Close()
    for {
        _, data, err := c.wsConnect.ReadMessage()
        if err != nil {
            return
        }
        select {
        case c.inChan <- data:
        case <-c.closeChan:
            // A bare break would only leave the select, not the for loop.
            return
        }
    }
}

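// writeLoop continuously reads from outChan and writes to the websocket.
// It is the only goroutine that writes to wsConnect.
func (c *Connection) writeLoop() {
    defer c.Close()
    for {
        select {
        case data := <-c.outChan:
            if err := c.wsConnect.WriteMessage(websocket.TextMessage, data); err != nil {
                // return, not break: break would only leave the select
                // and the loop would spin forever once closeChan is closed.
                return
            }
        case <-c.closeChan:
            return
        }
    }
}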
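Close above guards closeChan with a mutex and a flag so that the read loop, the write loop, and the handler can all call it safely. The same close-exactly-once behavior can also be expressed with sync.Once. The following is a minimal, self-contained sketch of that alternative; the closer type is hypothetical and stands in for Connection without the websocket parts.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a hypothetical stand-in for Connection, reduced to the
// parts that matter for shutting down exactly once.
type closer struct {
	closeChan chan struct{}
	once      sync.Once
}

// Close may be called from any number of goroutines; the channel is
// closed on the first call and later calls are no-ops.
func (c *closer) Close() {
	c.once.Do(func() { close(c.closeChan) })
}

func main() {
	c := &closer{closeChan: make(chan struct{})}

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Close() // would panic on the second call without the Once guard
		}()
	}
	wg.Wait()

	_, open := <-c.closeChan
	fmt.Println("channel open:", open) // prints "channel open: false"
}
```

Either approach works; sync.Once simply removes the separate flag and the explicit lock/unlock pair.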

Design Challenges for a Million‑User Bullet‑Screen System

When scaling to ~1 million concurrent connections, three primary bottlenecks appear:

Kernel bottleneck – As a rough estimate, the Linux TCP stack on one host handles about 100 k packets per second. Pushing 10 messages per second to 1 M users means about 10 M packets per second, roughly 100 times that figure.

Lock contention – A single global map of online users forces a global mutex during broadcast, causing severe contention.

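CPU bottleneck – Serializing a message to JSON separately for each recipient costs one encode per connection: a single broadcast to 1 M users already means 1 M JSON encodes, and 10 broadcasts per second means 10 M encodes per second.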

Optimization Strategies

Network packet merging – Aggregate N messages generated within a short interval (e.g., 1 s) into a single payload, reducing the number of packets sent to the kernel.

Sharded user maps – Split the global user map into multiple shards, each protected by its own lock, allowing parallel broadcasts without a single lock.

Read‑write lock – Replace a mutex with sync.RWMutex so multiple push workers can traverse the map concurrently for reads.

Pre‑encode JSON – Encode a message once and reuse the binary payload for all recipients, eliminating per‑connection JSON serialization.

Clustered gateway architecture – Deploy multiple gateway nodes behind a load balancer. Use HTTP/2 between gateways and a logical cluster to broadcast messages efficiently, while exposing an HTTP/1 API to external clients.

Overall Distributed Architecture

Clients call a public HTTP API, which forwards the request to the logical cluster. The cluster broadcasts the payload to every gateway node; each gateway pushes the message to its subset of online connections.
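Inside one gateway, the push step can combine three of the optimizations: sharded maps, read-write locks, and a payload that is encoded once. The following is a sketch under stated assumptions, not the article's code: Sender, chanSender, Registry, and the shard count are hypothetical names and values, and Send is assumed to be non-blocking (for example, a write into a connection's outbound channel).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// Sender is a hypothetical stand-in for a connection's non-blocking write.
type Sender interface {
	Send(payload []byte) error
}

// chanSender is a demo Sender backed by a buffered channel.
type chanSender chan []byte

func (c chanSender) Send(p []byte) error {
	select {
	case c <- p:
		return nil
	default:
		return errors.New("send queue full")
	}
}

// shard holds one slice of the online connections behind its own lock.
type shard struct {
	mu    sync.RWMutex
	conns map[uint64]Sender
}

// Registry spreads connections over several shards to avoid one global lock.
type Registry struct {
	shards []*shard
}

func NewRegistry(n int) *Registry {
	r := &Registry{shards: make([]*shard, n)}
	for i := range r.shards {
		r.shards[i] = &shard{conns: make(map[uint64]Sender)}
	}
	return r
}

// Add registers a connection in the shard selected by its ID.
func (r *Registry) Add(id uint64, s Sender) {
	sh := r.shards[id%uint64(len(r.shards))]
	sh.mu.Lock()
	sh.conns[id] = s
	sh.mu.Unlock()
}

// Broadcast encodes msg once and pushes the same bytes to every connection,
// one goroutine per shard. It returns how many sends succeeded.
func (r *Registry) Broadcast(msg interface{}) (int, error) {
	payload, err := json.Marshal(msg) // single encode; recipients must not mutate it
	if err != nil {
		return 0, err
	}
	counts := make([]int, len(r.shards)) // one slot per shard, so no shared counter
	var wg sync.WaitGroup
	for i, sh := range r.shards {
		wg.Add(1)
		go func(i int, sh *shard) {
			defer wg.Done()
			sh.mu.RLock() // read lock: concurrent broadcasts do not block each other
			defer sh.mu.RUnlock()
			for _, c := range sh.conns {
				if c.Send(payload) == nil {
					counts[i]++
				}
			}
		}(i, sh)
	}
	wg.Wait()
	total := 0
	for _, n := range counts {
		total += n
	}
	return total, nil
}

func main() {
	r := NewRegistry(4)
	conns := make([]chanSender, 10)
	for i := range conns {
		conns[i] = make(chanSender, 1)
		r.Add(uint64(i), conns[i])
	}
	n, err := r.Broadcast(map[string]string{"text": "hello"})
	fmt.Println(n, err, string(<-conns[0]))
}
```

Because Send never blocks, a slow client only fills its own queue and is counted as a failed send; it cannot stall the shard while the read lock is held.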

[Figure: Architecture diagram]

Key Takeaways

By applying packet merging, sharded locks, pre‑encoding, and an HTTP/2‑based clustering layer, a Go‑based WebSocket service can reliably handle millions of concurrent users and sustain high‑throughput real‑time push scenarios.
