Simplifying a Layer-4 TCP Proxy in Go: From Custom Goroutine Loops to io.Copy

This article recounts the original, complex implementation of a layer-4 (transport-layer) TCP proxy in Easegress, explains why separate read/write goroutines and custom buffers made error handling and flow control difficult, and then shows how switching to Go's io.Copy (and its variants) dramatically simplifies the code while preserving performance through zero-copy techniques.

Tech Musings

1. Initial implementation

The first version wrapped each TCP connection in a Connection struct that maintained separate read and write buffers, a pool of stream buffers, and two goroutines: one for reading and one for writing. Every error had to be examined to decide whether to close only the read side, only the write side, or the whole connection, and flow control was handled manually via channels and timeouts.

package tcpproxy

import (
    "io"
    "net"
    "runtime/debug"
    "sync"
    "sync/atomic"
    "time"
    "github.com/megaease/easegress/pkg/logger"
    "github.com/megaease/easegress/pkg/util/fasttime"
    "github.com/megaease/easegress/pkg/util/iobufferpool"
    "github.com/megaease/easegress/pkg/util/timerpool"
)

const writeBufSize = 8

var tcpBufferPool = sync.Pool{
    New: func() interface{} {
        return make([]byte, iobufferpool.DefaultBufferReadCapacity)
    },
}

type Connection struct {
    closed uint32
    rawConn net.Conn
    localAddr net.Addr
    remoteAddr net.Addr
    readBuffer []byte
    writeBuffers net.Buffers
    ioBuffers []*iobufferpool.StreamBuffer
    writeBufferChan chan *iobufferpool.StreamBuffer
    mu sync.Mutex
    connStopChan chan struct{}
    listenerStopChan chan struct{}
    lastReadDeadlineTime time.Time
    lastWriteDeadlineTime time.Time
    onRead func(*iobufferpool.StreamBuffer)
    onClose func(event ConnectionEvent)
}

func NewClientConn(conn net.Conn, listenerStopChan chan struct{}) *Connection { /* ... */ }
func (c *Connection) SetOnRead(onRead func(*iobufferpool.StreamBuffer)) { c.onRead = onRead }
func (c *Connection) SetOnClose(onclose func(event ConnectionEvent)) { c.onClose = onclose }
func (c *Connection) Start() { /* launches read and write loops */ }
/* read/write loops, Write, Close, doReadIO, doWriteIO, etc. */

This code demonstrates the complexity of manually managing buffers, deadlines, and error propagation.

2. Using io.Copy

Following a suggestion from a mentor, the author replaced the custom loops with the standard library's io.Copy, which already handles efficient copying and edge cases. The simplified proxy is only a few dozen lines.

package main

import (
    "io"
    "log"
    "net"
    "sync"
)

type TCPProxy struct {
    ListenAddr string
    TargetAddr string
}

func (p *TCPProxy) Start() error {
    tcpAddr, err := net.ResolveTCPAddr("tcp", p.ListenAddr)
    if err != nil { return err }
    listener, err := net.ListenTCP("tcp", tcpAddr)
    if err != nil { return err }
    defer listener.Close()
    log.Printf("TCP proxy listening on %s, forwarding to %s", p.ListenAddr, p.TargetAddr)
    for {
        clientConn, err := listener.AcceptTCP()
        if err != nil { log.Printf("accept error: %v", err); continue }
        go p.handleConnection(clientConn)
    }
}

func (p *TCPProxy) handleConnection(clientConn *net.TCPConn) {
    defer clientConn.Close()
    conn, err := net.Dial("tcp", p.TargetAddr)
    if err != nil {
        log.Printf("dial %s failed: %v", p.TargetAddr, err)
        return
    }
    serverConn := conn.(*net.TCPConn)
    defer serverConn.Close()
    var wg sync.WaitGroup
    wg.Add(2)
    go func() { defer wg.Done(); io.Copy(serverConn, clientConn); serverConn.CloseWrite() }()
    go func() { defer wg.Done(); io.Copy(clientConn, serverConn); clientConn.CloseWrite() }()
    wg.Wait()
}

func main() {
    proxy := &TCPProxy{ListenAddr: ":8080", TargetAddr: "localhost:80"}
    if err := proxy.Start(); err != nil { log.Fatalf("proxy start failed: %v", err) }
}

Because *net.TCPConn implements io.ReaderFrom (and, since Go 1.22, io.WriterTo), io.Copy takes a fast path that can use zero-copy system calls such as splice or sendfile on Linux, making it both simple and fast.

2.1 Adding a custom buffer with io.CopyBuffer

The author experimented with io.CopyBuffer to specify a larger buffer, but discovered that when the source or destination implements the fast interfaces, the supplied buffer is ignored, so performance does not improve.

// bufferPool is assumed to be declared elsewhere, e.g.:
// var bufferPool = sync.Pool{New: func() interface{} { b := make([]byte, 64*1024); return &b }}

func (p *TCPProxy) handleConnection(clientConn *net.TCPConn) {
    // ... same setup as before ...
    go func() {
        defer wg.Done()
        defer serverConn.CloseWrite()
        bufPtr := bufferPool.Get().(*[]byte)
        defer bufferPool.Put(bufPtr)
        io.CopyBuffer(serverConn, clientConn, *bufPtr)
    }()
    // symmetric copy for the opposite direction
}

2.2 How io.CopyBuffer works

// Simplified excerpt from Go's src/io/io.go
func CopyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
    if buf != nil && len(buf) == 0 { panic("empty buffer in CopyBuffer") }
    return copyBuffer(dst, src, buf)
}

func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
    if wt, ok := src.(WriterTo); ok { return wt.WriteTo(dst) }
    if rf, ok := dst.(ReaderFrom); ok { return rf.ReadFrom(src) }
    if buf == nil { buf = make([]byte, 32*1024) }
    for {
        nr, er := src.Read(buf)
        if nr > 0 {
            nw, ew := dst.Write(buf[:nr])
            written += int64(nw)
            // handle short writes and ew here (elided)
        }
        if er != nil {
            if er != EOF {
                err = er
            }
            break
        }
    }
    return written, err
}

Since *net.TCPConn satisfies ReaderFrom (and, on recent Go, WriterTo), the fast path is taken and the custom buffer is never used.

2.3 ReadFrom and WriteTo implementations

// net.TCPConn implements ReaderFrom
func (c *TCPConn) ReadFrom(r io.Reader) (int64, error) {
    if !c.ok() { return 0, syscall.EINVAL }
    n, err := c.readFrom(r) // may use splice or sendfile internally
    if err != nil && err != io.EOF { err = &OpError{Op: "readfrom", Err: err} }
    return n, err
}

// net.TCPConn implements WriterTo
func (c *TCPConn) WriteTo(w io.Writer) (int64, error) {
    // similar logic using spliceTo or sendfile
}

These methods invoke OS‑level zero‑copy primitives, allowing io.Copy to transfer data without allocating intermediate buffers.

3. Remaining issues – idle timeout

While io.Copy simplifies data transfer, it provides no idle-timeout handling for long-lived proxy connections. net.Conn's SetDeadline sets an absolute point in time, which is suitable for short HTTP exchanges but inflexible for persistent layer-4 proxies, where a connection should stay open as long as data keeps flowing.

A possible improvement is to wrap the connection with custom Reader/Writer that updates the deadline before each read or write, thereby achieving true idle‑timeout semantics. However, this re‑introduces manual loop logic, bringing back the original complexity trade‑off.

For now, the author hopes future Go releases will add more flexible timeout controls, similar to how io.Copy leveraged ReadFrom and WriteTo for zero‑copy.