
Handling 1 Million Requests per Minute with Go: A Scalable Backend Architecture

The article describes how a Go‑based backend, using a two‑layer job/worker pattern with buffered channels and configurable worker pools, can reliably ingest roughly one million POST requests per minute, serialize payloads to Amazon S3, and dramatically reduce server count through Elastic Beanstalk auto‑scaling.

Architecture Digest

The author, a veteran of anti‑ad and anti‑virus software, explains a real‑world challenge: processing POST requests from millions of endpoints, each containing a JSON payload that must be stored in Amazon S3 for downstream map‑reduce analysis.

Initially, a naive implementation launched a new goroutine for every payload (`go payload.UploadToS3()`), which quickly exhausted resources under high traffic because the number of concurrent goroutines was unbounded.

The second attempt queued payloads through a single buffered channel:

```go
var Queue chan Payload

func init() {
	Queue = make(chan Payload, MAX_QUEUE)
}
```

This still failed: the queue filled up faster than the workers could upload to S3, and once the buffer was full, sends into the channel blocked request handling.
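The failure mode can be demonstrated with a deliberately tiny buffer; `tryEnqueue` is an illustrative helper (not from the article) that makes the full-buffer case observable instead of blocking, which is what the real handler did:

```go
package main

type Payload struct{ ID int }

const MAX_QUEUE = 2 // deliberately tiny to make the failure mode visible

var Queue = make(chan Payload, MAX_QUEUE)

// tryEnqueue performs the handler's channel send non-blockingly so a
// full buffer is observable; the article's handler simply stalled here.
func tryEnqueue(p Payload) bool {
	select {
	case Queue <- p:
		return true
	default:
		return false // buffer full: the real handler would block the request
	}
}
```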

The final solution adopts a classic job/worker pattern: a `Job` struct wraps a payload, each worker registers its own job channel with a shared `WorkerPool`, and a dispatcher routes jobs from a `JobQueue` to idle workers, respecting configurable limits (`MAX_WORKERS` and `MAX_QUEUE`) read from environment variables.

```go
type Job struct {
	Payload Payload
}

var JobQueue chan Job

type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
	quit       chan bool
}

func (w Worker) Start() {
	go func() {
		for {
			// Register this worker's channel with the pool, marking it idle.
			w.WorkerPool <- w.JobChannel
			select {
			case job := <-w.JobChannel:
				if err := job.Payload.UploadToS3(); err != nil {
					log.Errorf("Error uploading to S3: %s", err.Error())
				}
			case <-w.quit:
				return
			}
		}
	}()
}

// Dispatcher hands jobs from JobQueue to idle workers via the shared pool.
type Dispatcher struct {
	WorkerPool chan chan Job
}

func (d *Dispatcher) dispatch() {
	for {
		select {
		case job := <-JobQueue:
			go func(job Job) {
				// Block until a worker is idle, then hand it the job.
				jobChannel := <-d.WorkerPool
				jobChannel <- job
			}(job)
		}
	}
}
```

The HTTP handler now decodes the incoming JSON, creates a `Job` for each payload, and pushes it onto `JobQueue`, allowing the worker pool to process uploads concurrently while keeping request latency low.

Deployed on Amazon Elastic Beanstalk with Docker, the system achieved near‑million‑requests‑per‑minute throughput, reduced the server fleet from 100 to as few as four instances, and leveraged auto‑scaling based on CPU utilization.

The author concludes that simplifying architecture—using Go’s lightweight concurrency and Elastic Beanstalk’s scaling—delivers high performance without the complexity of multiple queues or heavyweight frameworks.

Tags: backend, concurrency, Go, scaling, S3, Elastic Beanstalk, job queue
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
