Handling 1 Million Requests per Minute with Go: A Scalable Backend Architecture

This article describes how a Go‑based backend, using a two‑tier job/worker pattern with buffered channels and a configurable worker pool, can ingest on the order of a million POST requests per minute, upload each payload to Amazon S3, and sharply reduce server count when deployed on Elastic Beanstalk with auto‑scaling.

Architecture Digest

Mar 11, 2018

The author, a veteran of anti‑ad and anti‑virus software, explains a real‑world challenge: processing POST requests from millions of endpoints, each containing a JSON payload that must be stored in Amazon S3 for downstream map‑reduce analysis.

Initially, a naive implementation launched a new goroutine for every payload (go payload.UploadToS3()), which quickly exhausted resources under high traffic because nothing limited the number of goroutines.
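To make that failure mode concrete, the following is a minimal sketch rather than the author's code. The Payload type and the peakGoroutines helper are hypothetical, and a barrier stands in for slow S3 uploads. It shows that with one goroutine per payload, the number of uploads in flight grows with the number of incoming payloads.

```go
package main

import (
	"fmt"
	"sync"
)

// Payload is a hypothetical stand-in for the article's payload type.
type Payload struct{ ID int }

// peakGoroutines mimics the naive design: one goroutine per payload with no
// upper bound. Each simulated upload waits until all n uploads have started
// (standing in for slow S3 calls), and the function returns the peak number
// of uploads that were in flight at the same time.
func peakGoroutines(n int) int {
	var (
		mu             sync.Mutex
		inFlight, peak int
		allStarted     sync.WaitGroup // barrier simulating slow uploads
		finished       sync.WaitGroup
	)
	allStarted.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func(p Payload) { // the equivalent of `go payload.UploadToS3()`
			defer finished.Done()
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			allStarted.Done()
			allStarted.Wait() // the "upload" ends only once every upload has begun

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(Payload{ID: i})
	}
	finished.Wait()
	return peak
}

func main() {
	fmt.Println("peak concurrent uploads:", peakGoroutines(1000))
}
```

Because no upload finishes before all of them have started, the peak equals the number of payloads. Under real traffic, the same happens whenever uploads are slower than arrivals.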

Switching to a single buffered channel to queue payloads still failed:

var Queue chan Payload

func init() {
    Queue = make(chan Payload, MAX_QUEUE)
}

The queue filled up faster than workers could upload, causing request handling to block.
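The blocking behaviour can be observed with a small sketch. The tryEnqueue helper and the tiny queue size are assumptions made for illustration and are not part of the original design. A non-blocking send is used only to show when a plain send would have stalled the request handler.

```go
package main

import "fmt"

// Payload is a hypothetical stand-in for the article's payload type.
type Payload struct{ ID int }

// tryEnqueue attempts a non-blocking send and reports whether the payload
// fit into the buffer. It is an illustrative helper, not the article's code.
func tryEnqueue(queue chan<- Payload, p Payload) bool {
	select {
	case queue <- p:
		return true
	default:
		return false // buffer full: a plain `queue <- p` would block here
	}
}

func main() {
	const maxQueue = 2 // deliberately tiny; stands in for MAX_QUEUE
	queue := make(chan Payload, maxQueue)

	// No consumer is running, which models uploads that cannot keep up.
	for i := 1; i <= 3; i++ {
		fmt.Printf("payload %d accepted: %v\n", i, tryEnqueue(queue, Payload{ID: i}))
	}
}
```

With a buffer of two and no consumer, the third payload is rejected. A buffered channel therefore only delays the point at which producers stall; it does not remove it.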

The final solution adopts a classic two‑tier job/worker pattern: a Job struct wraps a payload, each worker registers its own job channel with a shared WorkerPool, and a dispatcher routes jobs from a JobQueue to idle workers, respecting configurable limits (MAX_WORKERS and MAX_QUEUE) read from environment variables.

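// Job represents a single unit of work: one payload to upload.
type Job struct {
    Payload Payload
}

// JobQueue is the buffered channel on which the HTTP handler submits jobs.
var JobQueue chan Job

// Worker executes jobs received on its own JobChannel.
type Worker struct {
    WorkerPool chan chan Job // shared pool where idle workers register
    JobChannel chan Job      // this worker's private job channel
    quit       chan bool     // signals the worker to stop
}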

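// Start runs the worker loop in its own goroutine.
func (w Worker) Start() {
    go func() {
        for {
            // Register this worker's channel as available for work.
            w.WorkerPool <- w.JobChannel
            select {
            case job := <-w.JobChannel:
                // A job arrived: upload its payload.
                if err := job.Payload.UploadToS3(); err != nil {
                    log.Printf("error uploading to S3: %v", err)
                }
            case <-w.quit:
                // Stop signal received.
                return
            }
        }
    }()
}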

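// Dispatcher holds the pool of job channels registered by idle workers.
type Dispatcher struct {
    WorkerPool chan chan Job
}

// dispatch hands every queued job to the next idle worker.
func (d *Dispatcher) dispatch() {
    for job := range JobQueue {
        go func(job Job) {
            // Block until a worker is idle, then pass it the job.
            jobChannel := <-d.WorkerPool
            jobChannel <- job
        }(job)
    }
}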
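The pieces above can be wired together in a self-contained sketch. Several parts are assumptions made so the example runs on its own: the stub UploadToS3 only reports the payload ID instead of calling S3, and NewDispatcher, Run, envInt and runBatch are hypothetical helpers. The queue is passed in as a parameter rather than held in a global, and the worker's quit channel is left out for brevity.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Payload is a hypothetical stand-in; UploadToS3 only reports the ID on a
// results channel instead of calling S3.
type Payload struct {
	ID      int
	results chan<- int
}

func (p Payload) UploadToS3() error {
	p.results <- p.ID
	return nil
}

type Job struct{ Payload Payload }

type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
}

func (w Worker) Start() {
	go func() {
		for {
			w.WorkerPool <- w.JobChannel // register as idle
			job := <-w.JobChannel
			if err := job.Payload.UploadToS3(); err != nil {
				fmt.Println("upload failed:", err)
			}
		}
	}()
}

type Dispatcher struct {
	WorkerPool chan chan Job
	maxWorkers int
}

func NewDispatcher(maxWorkers int) *Dispatcher {
	return &Dispatcher{WorkerPool: make(chan chan Job, maxWorkers), maxWorkers: maxWorkers}
}

// Run starts the workers, then dispatches jobs from queue in the background.
func (d *Dispatcher) Run(queue <-chan Job) {
	for i := 0; i < d.maxWorkers; i++ {
		Worker{WorkerPool: d.WorkerPool, JobChannel: make(chan Job)}.Start()
	}
	go func() {
		for job := range queue {
			go func(job Job) {
				jobChannel := <-d.WorkerPool // wait for an idle worker
				jobChannel <- job
			}(job)
		}
	}()
}

// envInt reads a positive integer from the environment, or returns def.
func envInt(key string, def int) int {
	if v, err := strconv.Atoi(os.Getenv(key)); err == nil && v > 0 {
		return v
	}
	return def
}

// runBatch pushes n jobs through the pool and returns how many distinct
// payloads were "uploaded". Worker shutdown is omitted for brevity.
func runBatch(n, maxWorkers, maxQueue int) int {
	results := make(chan int, n)
	queue := make(chan Job, maxQueue)
	NewDispatcher(maxWorkers).Run(queue)
	for i := 0; i < n; i++ {
		queue <- Job{Payload: Payload{ID: i, results: results}}
	}
	close(queue)
	seen := make(map[int]bool)
	for i := 0; i < n; i++ {
		seen[<-results] = true
	}
	return len(seen)
}

func main() {
	fmt.Println("uploads completed:", runBatch(100, envInt("MAX_WORKERS", 4), envInt("MAX_QUEUE", 16)))
}
```

In this sketch the number of concurrent uploads is capped by MAX_WORKERS, while MAX_QUEUE only sizes the buffer in front of the dispatcher. The default values 4 and 16 are placeholders, not figures from the article.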

The HTTP handler now decodes the incoming JSON, creates a Job for each payload, and pushes it onto JobQueue, allowing the worker pool to process uploads concurrently while keeping request latency low.
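A handler along these lines might look like the sketch below. The JSON shape ({"payloads": [...]}), the field names, the body size limit and the enqueuePayloads helper are all assumptions made for illustration, since the summary does not show the original handler. An in-memory request from net/http/httptest is used so the example runs without opening a port.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Payload and PayloadCollection use a hypothetical JSON shape.
type Payload struct {
	ID int `json:"id"`
}

type PayloadCollection struct {
	Payloads []Payload `json:"payloads"`
}

type Job struct{ Payload Payload }

// JobQueue would normally be drained by the dispatcher; the size is arbitrary.
var JobQueue = make(chan Job, 100)

const maxBodyBytes = 1 << 20 // assumed request size limit (1 MiB)

// enqueuePayloads decodes a request body and queues one Job per payload.
func enqueuePayloads(data []byte, queue chan<- Job) (int, error) {
	var content PayloadCollection
	if err := json.Unmarshal(data, &content); err != nil {
		return 0, err
	}
	for _, p := range content.Payloads {
		queue <- Job{Payload: p} // blocks only while the queue is full
	}
	return len(content.Payloads), nil
}

func payloadHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		w.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	data, err := io.ReadAll(io.LimitReader(r.Body, maxBodyBytes))
	if err != nil {
		w.WriteHeader(http.StatusBadRequest)
		return
	}
	if _, err := enqueuePayloads(data, JobQueue); err != nil {
		w.WriteHeader(http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK) // respond as soon as the jobs are queued
}

func main() {
	body := strings.NewReader(`{"payloads":[{"id":1},{"id":2}]}`)
	req := httptest.NewRequest(http.MethodPost, "/payload", body)
	rec := httptest.NewRecorder()
	payloadHandler(rec, req)
	fmt.Println("status:", rec.Code, "queued jobs:", len(JobQueue))
}
```

In a running service the handler would instead be registered with http.HandleFunc and served with http.ListenAndServe. The handler returns once the jobs are queued, so request latency does not depend on the S3 upload itself.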

Deployed on Amazon Elastic Beanstalk with Docker, the system achieved near‑million‑requests‑per‑minute throughput, reduced the server fleet from 100 to as few as four instances, and leveraged auto‑scaling based on CPU utilization.

The author concludes that simplifying architecture—using Go’s lightweight concurrency and Elastic Beanstalk’s scaling—delivers high performance without the complexity of multiple queues or heavyweight frameworks.
