Handling 1 Million Requests per Minute with Go: A Scalable Backend Architecture
The article describes how a Go‑based backend, using a two‑layer job/worker pattern with buffered channels and configurable worker pools, can reliably ingest on the order of a million POST requests per minute, serialize payloads to Amazon S3, and dramatically reduce server count through Elastic Beanstalk auto‑scaling.
The author, a veteran of anti‑ad and anti‑virus software, explains a real‑world challenge: processing POST requests from millions of endpoints, each containing a JSON payload that must be stored in Amazon S3 for downstream map‑reduce analysis.
Initially, a naive implementation launched a new goroutine for every payload (go payload.UploadToS3()). Under high traffic this quickly exhausted resources, because nothing limited how many goroutines could be running at once.
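The effect can be reproduced without touching S3. The following is a minimal sketch, not the author's code: the function peakInFlight is hypothetical, and the upload is simulated by blocking on a channel. It starts one goroutine per payload and reports how many are alive at the same moment while the "uploads" are stalled.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// peakInFlight starts one goroutine per payload, as the naive handler did,
// and returns how many of them are alive simultaneously while every
// simulated upload is still pending. No real S3 call is made.
func peakInFlight(payloads int) int64 {
	var inFlight int64
	var started, done sync.WaitGroup
	release := make(chan struct{}) // closed to let the simulated uploads finish

	for i := 0; i < payloads; i++ {
		started.Add(1)
		done.Add(1)
		go func() { // one goroutine per payload, with no upper bound
			defer done.Done()
			atomic.AddInt64(&inFlight, 1)
			started.Done()
			<-release // stands in for a slow UploadToS3 call
		}()
	}

	started.Wait() // every goroutine is now parked on its "upload"
	peak := atomic.LoadInt64(&inFlight)
	close(release)
	done.Wait()
	return peak
}

func main() {
	fmt.Println(peakInFlight(1000)) // prints 1000
}
```

The number of live goroutines equals the number of pending payloads, so memory and connection usage grow with traffic rather than with any limit the operator chose.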
Switching to a single buffered channel to queue payloads still failed:
    var Queue chan Payload

    func init() {
        Queue = make(chan Payload, MAX_QUEUE)
    }
The queue filled up faster than payloads could be uploaded, so request handling blocked once the buffer was full.
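The blocking behavior comes from how buffered channels work: a send succeeds immediately only while the buffer has free slots. The sketch below makes the full-buffer condition visible with a non-blocking send. The helper tryEnqueue and the Payload fields are illustrative assumptions, not part of the article. A plain Queue <- payload, as the handler would have used, simply waits at the point where tryEnqueue returns false.

```go
package main

import "fmt"

// Payload is a stand-in type; the real fields are not shown in the article.
type Payload struct{ ID int }

// tryEnqueue attempts a non-blocking send and reports whether the payload
// fit into the buffer. A plain `q <- p` would block instead of returning false.
func tryEnqueue(q chan Payload, p Payload) bool {
	select {
	case q <- p:
		return true
	default:
		return false
	}
}

func main() {
	const maxQueue = 3 // plays the role of MAX_QUEUE
	queue := make(chan Payload, maxQueue)

	accepted, rejected := 0, 0
	for i := 0; i < 5; i++ { // nothing drains the queue in this sketch
		if tryEnqueue(queue, Payload{ID: i}) {
			accepted++
		} else {
			rejected++
		}
	}
	fmt.Println(accepted, rejected) // prints 3 2
}
```

With a consumer that is slower than the producers, the buffer only delays the moment at which senders start to wait; it does not remove it.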
The final solution adopts a classic job/worker pattern: a Job struct wraps a payload, a pool of workers registers with a WorkerPool, and a dispatcher routes jobs from a JobQueue to idle workers, respecting configurable limits (MAX_WORKERS and MAX_QUEUE) read from environment variables.
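    // Job represents a single unit of work: one payload to upload.
    type Job struct {
        Payload Payload
    }

    // JobQueue is the buffered channel the HTTP handler pushes jobs onto.
    var JobQueue chan Job

    // Worker executes the jobs handed to it by the dispatcher.
    type Worker struct {
        WorkerPool chan chan Job // shared pool where idle workers register
        JobChannel chan Job      // this worker's private job channel
        quit       chan bool     // signals the worker to stop
    }

    // Start runs the worker loop in its own goroutine.
    func (w Worker) Start() {
        go func() {
            for {
                // Register this worker as idle.
                w.WorkerPool <- w.JobChannel

                select {
                case job := <-w.JobChannel:
                    // log is assumed to be a logging package that provides Errorf.
                    if err := job.Payload.UploadToS3(); err != nil {
                        log.Errorf("Error uploading to S3: %s", err.Error())
                    }
                case <-w.quit:
                    return
                }
            }
        }()
    }

    // dispatch hands each queued job to the next idle worker.
    // The Dispatcher type (not shown here) holds the shared WorkerPool.
    func (d *Dispatcher) dispatch() {
        for {
            select {
            case job := <-JobQueue:
                go func(job Job) {
                    // Wait for an idle worker, then hand over the job.
                    jobChannel := <-d.WorkerPool
                    jobChannel <- job
                }(job)
            }
        }
    }

The HTTP handler now decodes the incoming JSON, creates a Job for each payload, and pushes it onto JobQueue, allowing the worker pool to process uploads concurrently while keeping request latency low.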
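To see the pieces working together, here is a self-contained sketch under several assumptions. The request body is taken to be a JSON array of payloads. The Payload field, the helpers envInt, startPool, payloadHandler and run, and the default limits are all hypothetical. The upload is a callback rather than a real S3 call. Unlike the dispatcher above, this one waits for an idle worker inline instead of starting a goroutine per job, which keeps the sketch short; the workers are never stopped.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"strconv"
	"sync"
)

// Payload is a stand-in; the real fields are not shown in the article.
type Payload struct {
	ID string `json:"id"`
}

type Job struct{ Payload Payload }

// envInt reads a positive integer from the environment, or returns def.
func envInt(name string, def int) int {
	if n, err := strconv.Atoi(os.Getenv(name)); err == nil && n > 0 {
		return n
	}
	return def
}

// startPool starts the workers and a dispatcher, and returns the job queue.
// upload stands in for Payload.UploadToS3 and is called once per job.
func startPool(workers, queueSize int, upload func(Payload)) chan<- Job {
	jobQueue := make(chan Job, queueSize)
	pool := make(chan chan Job, workers)
	for i := 0; i < workers; i++ {
		jobs := make(chan Job)
		go func() {
			for {
				pool <- jobs // register as idle
				job := <-jobs
				upload(job.Payload)
			}
		}()
	}
	go func() {
		for job := range jobQueue {
			jobChannel := <-pool // wait for an idle worker
			jobChannel <- job
		}
	}()
	return jobQueue
}

// payloadHandler decodes a JSON array and queues one Job per payload.
func payloadHandler(jobQueue chan<- Job) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var payloads []Payload
		if err := json.NewDecoder(r.Body).Decode(&payloads); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		for _, p := range payloads {
			jobQueue <- Job{Payload: p}
		}
		w.WriteHeader(http.StatusOK)
	}
}

// run posts the given IDs to the handler and waits for every upload callback.
func run(ids []string) (status int, uploaded []string) {
	payloads := make([]Payload, len(ids))
	for i, id := range ids {
		payloads[i] = Payload{ID: id}
	}
	body, _ := json.Marshal(payloads) // cannot fail for this simple type

	var mu sync.Mutex
	var wg sync.WaitGroup
	wg.Add(len(ids))
	queue := startPool(envInt("MAX_WORKERS", 4), envInt("MAX_QUEUE", 16), func(p Payload) {
		mu.Lock()
		uploaded = append(uploaded, p.ID)
		mu.Unlock()
		wg.Done()
	})

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/", bytes.NewReader(body))
	payloadHandler(queue).ServeHTTP(rec, req)
	wg.Wait()
	return rec.Code, uploaded
}

func main() {
	status, uploaded := run([]string{"a", "b", "c"})
	fmt.Println(status, len(uploaded)) // prints 200 3
}
```

The handler returns as soon as the jobs are queued, so its latency depends on free space in the queue rather than on upload time. The number of concurrent uploads is capped by the worker count.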
Deployed on Amazon Elastic Beanstalk with Docker, the system handled close to one million requests per minute, cut the server fleet from 100 to as few as four instances, and relied on auto‑scaling driven by CPU utilization.
The author concludes that a simpler architecture, built on Go’s lightweight concurrency and Elastic Beanstalk’s scaling, delivers high performance without the complexity of multiple queues or heavyweight frameworks.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
