Handling 1 Million Requests per Minute with Go: A Scalable Backend Architecture
The article describes how a Go-based backend, using a two-layer job/worker pattern with buffered channels and a configurable worker pool, can reliably ingest close to one million POST requests per minute, serialize payloads to Amazon S3, and dramatically cut the server count through Elastic Beanstalk auto-scaling.
The author, a veteran of anti-ad and anti-virus software, explains a real-world challenge: processing POST requests from millions of endpoints, each containing a JSON payload that must be stored in Amazon S3 for downstream map-reduce analysis.
Initially, a naive implementation launched a new goroutine for every payload (`go payload.UploadToS3()`), which quickly exhausted resources under high traffic because the number of concurrent goroutines was unbounded.
Switching to a single buffered channel to queue payloads still failed; the queue filled up faster than the workers could upload, so request handling blocked as soon as the buffer was full:

```go
var Queue chan Payload

func init() {
	Queue = make(chan Payload, MAX_QUEUE)
}
```
The final solution adopts a classic job/worker pattern: a `Job` struct wraps a payload, a pool of workers registers with a `WorkerPool`, and a dispatcher routes jobs from a `JobQueue` to idle workers, respecting configurable limits (`MAX_WORKERS` and `MAX_QUEUE`) read from environment variables.
```go
type Job struct {
	Payload Payload
}

var JobQueue chan Job

type Dispatcher struct {
	// WorkerPool holds each idle worker's job channel.
	WorkerPool chan chan Job
}

type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
	quit       chan bool
}

func (w Worker) Start() {
	go func() {
		for {
			// Re-register this worker's job channel as idle.
			w.WorkerPool <- w.JobChannel

			select {
			case job := <-w.JobChannel:
				if err := job.Payload.UploadToS3(); err != nil {
					log.Errorf("Error uploading to S3: %s", err.Error())
				}
			case <-w.quit:
				return
			}
		}
	}()
}

func (d *Dispatcher) dispatch() {
	for {
		select {
		case job := <-JobQueue:
			go func(job Job) {
				// Wait for an idle worker, then hand the job off.
				jobChannel := <-d.WorkerPool
				jobChannel <- job
			}(job)
		}
	}
}
```

The HTTP handler now decodes the incoming JSON, creates a `Job` for each payload, and pushes it onto `JobQueue`, allowing the worker pool to process uploads concurrently while keeping request latency low.
Deployed on Amazon Elastic Beanstalk with Docker, the system achieved near‑million‑requests‑per‑minute throughput, reduced the server fleet from 100 to as few as four instances, and leveraged auto‑scaling based on CPU utilization.
The author concludes that simplifying the architecture, relying on Go's lightweight concurrency and Elastic Beanstalk's scaling, delivers high performance without the complexity of multiple queues or heavyweight frameworks.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.