
How I Cut a 20‑Second API Call to Sub‑Second Speed with Three Simple Optimizations

This article walks through a real‑world case where a batch‑score query API that originally took up to 20 seconds was systematically optimized through index tuning, multithreaded execution, and request‑size limiting, ultimately achieving sub‑second response times.

macrozheng

Jan 5, 2026

Introduction

API performance is a common pain point for backend developers. The author describes a real‑world case in which a batch‑score query API that took up to 20 seconds was brought down to roughly 500 ms through three rounds of optimization.

1. Investigation

Each morning the team receives an email summarising slow queries, showing the endpoint, call count, max latency, average latency and traceId. One batch‑score query showed a max latency of 20 s and an average of 2 s. Using SkyWalking it was observed that most calls return within 500 ms, but a small fraction exceed 20 s.

The root cause was identified as the settlement‑order list page sending an excessively large request payload: the page requests the batch‑score API for every settlement order, resulting in hundreds or thousands of IDs in a single call.

2. Current Situation

Although a bulk primary‑key lookup would be fast, the batch‑score API contains complex logic that performs a remote call to fetch organization info and then iterates over each record to query scores, leading to two performance bottlenecks:

Remote call to another service inside the API.

Database query inside a for loop.
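
The shape of the original flow can be sketched as follows. This is a minimal illustration, not the author's code: `fetchOrgCode`, `queryScore` and the counters are hypothetical stand-ins, used only to show that a request with N IDs costs one remote call plus N database round trips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the original request flow; every name here is illustrative.
class SequentialScoreLookup {
    // Counters for simulated round trips, so the cost of the loop is visible.
    static final AtomicInteger remoteCalls = new AtomicInteger();
    static final AtomicInteger dbQueries = new AtomicInteger();

    // Stand-in for the remote call that fetches organization info.
    static String fetchOrgCode() {
        remoteCalls.incrementAndGet();
        return "ORG-1";
    }

    // Stand-in for a single score query against the database.
    static int queryScore(String orgCode, long businessId) {
        dbQueries.incrementAndGet();
        return (int) (businessId % 100);
    }

    // One remote call, then one database query per ID.
    static List<Integer> batchScores(List<Long> businessIds) {
        String orgCode = fetchOrgCode();
        List<Integer> scores = new ArrayList<>();
        for (long id : businessIds) {
            scores.add(queryScore(orgCode, id));
        }
        return scores;
    }
}
```

With hundreds or thousands of IDs in one request, the per‑ID round trips add up, which is consistent with the long tail seen in the traces.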

3. First Optimization – Index Tuning

The original table had a simple index on business_id, which did not improve the query. A composite index covering org_code, category_id, business_id and business_type was added:

alter table user_score add index `un_org_category_business` (org_code, category_id, business_id, business_type) USING BTREE;

After applying the composite index, the maximum latency dropped from 20 s to about 5 s.

4. Second Optimization – Multithreaded Query

Because each iteration performed a separate database query, the author switched to parallel execution using Java 8 CompletableFuture and a custom thread pool.

// Submit one asynchronous query per record to the custom thread pool.
CompletableFuture<?>[] futureArray = dataList.stream()
    .map(data -> CompletableFuture
        .supplyAsync(() -> query(data), asyncExecutor)
        .whenComplete((result, th) -> { /* handle */ }))
    .toArray(CompletableFuture[]::new);
// Block until every query has completed.
CompletableFuture.allOf(futureArray).join();
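
The snippet above leaves result handling open. One way the results could be gathered in input order is sketched below; `query` is a hypothetical stand‑in for the per‑record database lookup, and the fixed pool of 8 threads in `main` is only a placeholder for the custom pool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class ParallelScoreQuery {
    // Hypothetical stand-in for the per-record database query.
    static int query(long businessId) {
        return (int) (businessId * 2);
    }

    // Runs one query per ID on the executor and returns results in input order.
    static List<Integer> queryAll(List<Long> ids, ExecutorService executor) {
        // Collecting to a list first submits every task before any join() blocks.
        List<CompletableFuture<Integer>> futures = ids.stream()
            .map(id -> CompletableFuture.supplyAsync(() -> query(id), executor))
            .collect(Collectors.toList());
        // Wait for all tasks; join() rethrows the first failure, if any.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(8);
        try {
            System.out.println(queryAll(Arrays.asList(1L, 2L, 3L), executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

Keeping the futures in a list, rather than writing into a shared collection from `whenComplete`, avoids the need for a thread‑safe container and preserves ordering.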

The thread pool was configured as follows:

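// 8 core threads, at most 10 threads; idle extra threads retire after 60 s.
ExecutorService threadPool = new ThreadPoolExecutor(
    8, 10, 60, TimeUnit.SECONDS,
    new ArrayBlockingQueue<>(500),                // bounded queue of 500 waiting tasks
    new ThreadPoolExecutor.CallerRunsPolicy());   // when saturated, the submitting thread runs the task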
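
With these settings, `ThreadPoolExecutor` starts threads 9 and 10 only once the 500‑slot queue is full, and any task submitted beyond that runs on the calling thread instead of being rejected. The sketch below illustrates that fallback with a deliberately tiny pool (1 thread, 1 queue slot); the class and method names are made up for the demonstration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CallerRunsDemo {
    // Returns the name of the thread that executed the overflow task.
    static String overflowThreadName() {
        // Tiny pool for the demo: 1 worker thread and 1 queue slot.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 1, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(1),
            new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        String[] ranOn = new String[1];
        try {
            pool.execute(blocker); // occupies the only worker thread
            pool.execute(blocker); // fills the only queue slot
            // Worker and queue are both full, so this task runs on the submitting thread.
            pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return ranOn[0];
    }
}
```

The practical consequence is back‑pressure: under overload the request thread slows down by doing the work itself, rather than tasks being dropped.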

With multithreading, the latency improved another five‑fold, from ~5 s to ~1 s.

5. Third Optimization – Limiting Batch Size

Even after the first two steps, the API still sometimes exceeded 1 s because a single request could contain too many records. The team introduced a hard limit of 200 records per call, returning an error when the limit is exceeded.
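
A minimal sketch of such a guard is shown below. The 200‑record limit comes from the article; the class name, the exception type and the message wording are assumptions, since the article does not say how the error is reported.

```java
import java.util.List;

class BatchSizeGuard {
    // Limit stated in the article.
    static final int MAX_BATCH_SIZE = 200;

    // Rejects empty and oversized requests before any query is issued.
    static void validate(List<Long> businessIds) {
        if (businessIds == null || businessIds.isEmpty()) {
            throw new IllegalArgumentException("businessIds must not be empty");
        }
        if (businessIds.size() > MAX_BATCH_SIZE) {
            throw new IllegalArgumentException(
                "At most " + MAX_BATCH_SIZE + " records per call, got " + businessIds.size());
        }
    }
}
```

Checking the size at the API boundary keeps the failure cheap: the caller gets an immediate error instead of a slow response.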

5.1 Front‑end Pagination

Ideally the settlement list page would display only one order per settlement and paginate the rest, reducing the maximum number of IDs per request to 200. However, front‑end resources were unavailable, so this option was postponed.

5.2 Server‑side Batching

The back‑end was changed to split a large request into multiple smaller batches (e.g., five batches of 100 records for a total of 500). These batches can be processed in parallel using the same thread‑pool approach, achieving a final latency of roughly 500 ms.
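
The splitting step might look like the sketch below. It assumes each batch is handled as one task on the executor; `queryBatch` is a hypothetical stand‑in for whatever the service does per batch, and the batch size is passed in rather than fixed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.stream.Collectors;

class ServerSideBatching {
    // Splits items into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    // Hypothetical stand-in for the score lookup covering one batch.
    static List<Integer> queryBatch(List<Long> batch) {
        return batch.stream().map(id -> (int) (id % 100)).collect(Collectors.toList());
    }

    // Runs each batch as its own task and concatenates results in input order.
    static List<Integer> queryAll(List<Long> ids, int batchSize, ExecutorService executor) {
        // Collect the futures first so every batch is submitted before any join().
        List<CompletableFuture<List<Integer>>> futures = partition(ids, batchSize).stream()
            .map(batch -> CompletableFuture.supplyAsync(() -> queryBatch(batch), executor))
            .collect(Collectors.toList());
        return futures.stream()
            .flatMap(f -> f.join().stream())
            .collect(Collectors.toList());
    }
}
```

For the article's example, `partition` on 500 IDs with a batch size of 100 yields five batches, which the executor can then work through concurrently.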

Both the batch‑limit and multithreaded execution are temporary work‑arounds; a proper solution would involve redesigning the data model and API contract.

The complete micro‑service project referenced in the article is available at https://github.com/macrozheng/mall-swarm (≈ 11 K stars).


