Building a Ten‑Million‑Scale WebSocket Push Service with Go

This article explains the trade‑offs between pull and push models, why Go is chosen for a high‑concurrency WebSocket server, provides complete Go and HTML code examples, and details architectural and performance optimizations needed to support millions of simultaneous connections and messages per second.

Golang Shines

Pull vs Push Model

Pull (periodic polling) generates many unnecessary requests when data updates are infrequent, imposes high query load on the server, and adds latency because the client must wait for the next poll. Push (server‑initiated) sends data only when it changes, keeps long‑lived connections, and delivers updates immediately.

Server Technology Selection

Node.js – JavaScript runs on a single thread per process, so per-process push throughput stays limited even when clustering spreads connections across several processes.

C/C++ – low‑level TCP and WebSocket implementation incurs high development cost.

Go – goroutine‑based concurrency, compiled speed, and the mature gorilla/websocket library make it suitable for high‑throughput services.

Basic Go WebSocket Server

package main

import (
    "log"
    "net/http"
    "time"

    "github.com/gorilla/websocket"
    "github.com/myproject/gowebsocket/impl"
)

// upgrader accepts every origin, which is convenient for a local demo only.
var upgrader = websocket.Upgrader{
    CheckOrigin: func(r *http.Request) bool { return true },
}

func wsHandler(w http.ResponseWriter, r *http.Request) {
    // On failure, Upgrade has already replied to the client with an HTTP error.
    wsConn, err := upgrader.Upgrade(w, r, nil)
    if err != nil {
        return
    }
    conn, err := impl.InitConnection(wsConn)
    if err != nil {
        wsConn.Close() // conn is nil here, so close the raw socket instead
        return
    }
    defer conn.Close()

    // Heartbeat: push one message per second until the connection closes.
    // The goroutine keeps its own err so it does not race with the loop below.
    go func() {
        for {
            if err := conn.WriteMessage([]byte("heartbeat")); err != nil {
                return
            }
            time.Sleep(1 * time.Second)
        }
    }()

    // Echo every received message back to the client.
    for {
        data, err := conn.ReadMessage()
        if err != nil {
            return
        }
        if err := conn.WriteMessage(data); err != nil {
            return
        }
    }
}

func main() {
    http.HandleFunc("/ws", wsHandler)
    log.Fatal(http.ListenAndServe("0.0.0.0:7777", nil))
}

The server upgrades an HTTP request to a WebSocket connection, starts a heartbeat goroutine, and echoes received messages back to the client.

Front‑End Test Page

<!DOCTYPE html>
<html>
<head>
    <title>go websocket</title>
    <meta charset="utf-8"/>
</head>
<body>
<script type="text/javascript">
    var wsUri = "ws://127.0.0.1:7777/ws";
    var output, websocket;
    function init() {output = document.getElementById("output"); testWebSocket();}
    function testWebSocket() {
        websocket = new WebSocket(wsUri);
        websocket.onopen = function(evt){onOpen(evt);};
        websocket.onclose = function(evt){onClose(evt);};
        websocket.onmessage = function(evt){onMessage(evt);};
        websocket.onerror = function(evt){onError(evt);};
    }
    function onOpen(evt){writeToScreen("CONNECTED");}
    function onClose(evt){writeToScreen("DISCONNECTED");}
    function onMessage(evt){writeToScreen("RESPONSE: " + evt.data);}
    function onError(evt){writeToScreen("ERROR: WebSocket connection error");}
    function doSend(message){writeToScreen("SENT: " + message); websocket.send(message);}
    function writeToScreen(message){var p=document.createElement("p");p.style.wordWrap="break-word";p.textContent=message;output.appendChild(p);}
    window.addEventListener("load", init, false);
    function sendBtnClick(){var msg=document.getElementById("input").value;doSend(msg);document.getElementById("input").value='';}
    function closeBtnClick(){websocket.close();}
</script>
<h2>WebSocket Test</h2>
<input type="text" id="input"/>
<button onclick="sendBtnClick()">send</button>
<button onclick="closeBtnClick()">close</button>
<div id="output"></div>
</body>
</html>

WebSocket Wrapper Library (impl package)

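package impl

import (
    "errors"
    "sync"

    "github.com/gorilla/websocket"
)

// Connection wraps a gorilla/websocket connection with buffered channels so
// that callers can read and write from any goroutine.
type Connection struct {
    wsConnect *websocket.Conn
    inChan    chan []byte // messages read from the socket
    outChan   chan []byte // messages waiting to be written
    closeChan chan byte   // closed once to signal shutdown
    mutex     sync.Mutex  // protects isClosed
    isClosed  bool        // prevents closing closeChan twice
}

// InitConnection starts the read and write loops for wsConn.
func InitConnection(wsConn *websocket.Conn) (conn *Connection, err error) {
    conn = &Connection{
        wsConnect: wsConn,
        inChan:    make(chan []byte, 1000),
        outChan:   make(chan []byte, 1000),
        closeChan: make(chan byte, 1),
    }
    go conn.readLoop()
    go conn.writeLoop()
    return
}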

func (c *Connection) ReadMessage() (data []byte, err error) {
    select {
    case data = <-c.inChan:
    case <-c.closeChan:
        err = errors.New("connection is closed")
    }
    return
}

func (c *Connection) WriteMessage(data []byte) (err error) {
    select {
    case c.outChan <- data:
    case <-c.closeChan:
        err = errors.New("connection is closed")
    }
    return
}

func (c *Connection) Close() {
    c.wsConnect.Close()
    c.mutex.Lock()
    if !c.isClosed {close(c.closeChan); c.isClosed = true}
    c.mutex.Unlock()
}

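// readLoop moves frames from the socket into inChan until an error or Close.
func (c *Connection) readLoop() {
    defer c.Close()
    for {
        _, data, err := c.wsConnect.ReadMessage()
        if err != nil {
            return
        }
        select {
        case c.inChan <- data:
        case <-c.closeChan:
            return
        }
    }
}

// writeLoop is the only goroutine that writes to the socket, which respects
// gorilla/websocket's limit of one concurrent writer per connection.
func (c *Connection) writeLoop() {
    defer c.Close()
    for {
        select {
        case data := <-c.outChan:
            if err := c.wsConnect.WriteMessage(websocket.TextMessage, data); err != nil {
                return
            }
        case <-c.closeChan:
            return
        }
    }
}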
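
The mutex plus isClosed flag in Close guards against closing closeChan twice, which would panic. As a hedged alternative (not what the wrapper above uses), the same guarantee can be sketched with sync.Once from the standard library. The closer type and its method names are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a hypothetical stand-in for Connection that keeps only the
// shutdown signal, to show an idempotent Close built on sync.Once.
type closer struct {
	closeChan chan struct{}
	once      sync.Once
}

func newCloser() *closer {
	return &closer{closeChan: make(chan struct{})}
}

// Close may be called any number of times from any goroutine;
// the channel is closed exactly once.
func (c *closer) Close() {
	c.once.Do(func() { close(c.closeChan) })
}

// Closed reports whether Close has been called.
func (c *closer) Closed() bool {
	select {
	case <-c.closeChan:
		return true
	default:
		return false
	}
}

func main() {
	c := newCloser()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Close() // a second plain close(c.closeChan) would panic
		}()
	}
	wg.Wait()
	fmt.Println(c.Closed()) // true
}
```

Either approach works; sync.Once simply removes the separate flag and lock from the struct.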

Challenges of a Ten‑Million‑Scale Bullet‑Screen System

Kernel network limit: the Linux kernel can send on the order of 1 million TCP packets per second, while pushing 10 messages per second to 1 million online users requires 10 million pushes per second.

Lock contention: a single map holding 1 million online connections must stay locked while it is traversed, so each broadcast holds the lock for a long time and delays every other operation on the map.

CPU bottleneck: JSON encoding is expensive; encoding a message separately for each of 1 million online users means 1 million JSON encodings per push.
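
The arithmetic behind these figures can be checked with a small sketch. It assumes 1 million online connections and 10 messages per second, as in the scenario above; the function names are illustrative only.

```go
package main

import "fmt"

// writesPerSecond is the number of socket writes a broadcast workload needs:
// every message is written once to every online connection.
func writesPerSecond(onlineConns, msgsPerSecond int) int {
	return onlineConns * msgsPerSecond
}

// encodingsPerSecond counts JSON encodings. Encoding per connection repeats
// the work for every recipient; encoding once per message does not.
func encodingsPerSecond(onlineConns, msgsPerSecond int, perConnection bool) int {
	if perConnection {
		return onlineConns * msgsPerSecond
	}
	return msgsPerSecond
}

func main() {
	const online, rate = 1_000_000, 10
	fmt.Println(writesPerSecond(online, rate))           // 10000000
	fmt.Println(encodingsPerSecond(online, rate, true))  // 10000000
	fmt.Println(encodingsPerSecond(online, rate, false)) // 10
}
```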

Optimization Strategies

Network bottleneck: merge the N messages generated within one second into a single packet, reducing the number of small packets sent.

Lock bottleneck: split the global connection map into multiple shards, each with its own lock, so a broadcast never holds one lock over all connections; use a read-write lock instead of a mutex so multiple push workers can traverse the same shard concurrently.

CPU bottleneck: encode each merged message to JSON once, before broadcasting, and send the same bytes to every connection instead of encoding per connection.
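
The lock and CPU strategies can be sketched together. This is an illustration under assumptions, not the original implementation: the Sender interface, ShardedConns, the FNV hash used to pick a shard, and the recorder type are all hypothetical. Send is expected to be non-blocking (for example, enqueueing into a buffered channel), because it runs while the shard's read lock is held.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
	"sync"
)

// Sender is a stand-in for a WebSocket connection wrapper.
type Sender interface {
	Send(payload []byte)
}

type shard struct {
	mu    sync.RWMutex
	conns map[string]Sender
}

// ShardedConns splits the connection set across independently locked shards.
type ShardedConns struct {
	shards []*shard
}

func NewShardedConns(n int) *ShardedConns {
	s := &ShardedConns{shards: make([]*shard, n)}
	for i := range s.shards {
		s.shards[i] = &shard{conns: make(map[string]Sender)}
	}
	return s
}

// shardFor hashes the connection ID to choose a shard.
func (s *ShardedConns) shardFor(id string) *shard {
	h := fnv.New32a()
	h.Write([]byte(id))
	return s.shards[h.Sum32()%uint32(len(s.shards))]
}

// Add takes the write lock of one shard only.
func (s *ShardedConns) Add(id string, c Sender) {
	sh := s.shardFor(id)
	sh.mu.Lock()
	sh.conns[id] = c
	sh.mu.Unlock()
}

// Broadcast encodes v once, then walks each shard under its read lock,
// so several broadcasters can traverse the same shard at the same time.
func (s *ShardedConns) Broadcast(v any) error {
	payload, err := json.Marshal(v)
	if err != nil {
		return err
	}
	for _, sh := range s.shards {
		sh.mu.RLock()
		for _, c := range sh.conns {
			c.Send(payload)
		}
		sh.mu.RUnlock()
	}
	return nil
}

// recorder is a test double that remembers every payload it was sent.
type recorder struct {
	mu  sync.Mutex
	got [][]byte
}

func (r *recorder) Send(p []byte) {
	r.mu.Lock()
	r.got = append(r.got, p)
	r.mu.Unlock()
}

func main() {
	conns := NewShardedConns(4)
	a, b := &recorder{}, &recorder{}
	conns.Add("user-a", a)
	conns.Add("user-b", b)
	if err := conns.Broadcast(map[string]string{"msg": "hi"}); err != nil {
		fmt.Println("broadcast failed:", err)
		return
	}
	fmt.Println(string(a.got[0]), string(b.got[0])) // {"msg":"hi"} {"msg":"hi"}
}
```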

Cluster design: deploy multiple gateway nodes behind a load balancer, and put a logical cluster in front of them that broadcasts each message to all gateways. Internal communication between the logical cluster and the gateways uses HTTP/2, which multiplexes RPC-style requests over long-lived connections, while the external API remains HTTP/1 for compatibility.

The overall flow: a business service calls the HTTP API of the logical cluster, the cluster broadcasts the message to every gateway, and each gateway pushes the message to its subset of online connections.
