How I Cut a 20‑Second API Call to Sub‑Second with Three Simple Optimizations

A backend engineer shares a step‑by‑step case study of reducing a batch‑score query API from 20 seconds to under 500 ms by analyzing the problem, adding a composite index, introducing multithreaded CompletableFuture calls, and limiting batch size with pagination and thread‑pool tuning.

ITPUB

Problem Discovery

Monitoring showed a batch‑score query endpoint with an occasional maximum latency of 20 s and an average latency of 2 s, while most calls returned in ~500 ms. The root cause was that the settlement‑order list page sent a very large list of IDs (hundreds to thousands) to the API, far more than the intended pagination limit of 100 records per request. For every request, the handler made a remote service call and then ran one database query per ID inside a for‑loop, so latency grew with the size of the ID list.

Current Implementation

The simplified method is:

public List<ScoreEntity> query(List<SearchEntity> list) {
    List<ScoreEntity> result = Lists.newArrayList();
    List<Long> orgIds = list.stream()
        .map(SearchEntity::getOrgId)
        .collect(Collectors.toList());
    // Remote call to obtain organization info
    List<OrgEntity> orgList = feignClient.getOrgByIds(orgIds);
    for (SearchEntity entity : list) {
        String orgCode = findOrgCode(orgList, entity.getOrgId());
        ScoreSearchEntity scoreSearchEntity = new ScoreSearchEntity();
        scoreSearchEntity.setOrgCode(orgCode);
        scoreSearchEntity.setCategoryId(entity.getCategoryId());
        scoreSearchEntity.setBusinessId(entity.getBusinessId());
        scoreSearchEntity.setBusinessType(entity.getBusinessType());
        List<ScoreEntity> resultList = scoreMapper.queryScore(scoreSearchEntity);
        if (CollectionUtils.isNotEmpty(resultList)) {
            result.add(resultList.get(0));
        }
    }
    return result;
}
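
The listing does not show `findOrgCode`. As a minimal sketch of one way such a lookup could work, the organization list can be indexed by ID once, so that each iteration of the loop does a map lookup instead of rescanning the list. The `Org` class and `indexById` method below are hypothetical stand-ins, not the article's actual types.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OrgCodeLookup {
    /** Hypothetical stand-in for OrgEntity; only the two fields used here. */
    static final class Org {
        final long id;
        final String code;

        Org(long id, String code) {
            this.id = id;
            this.code = code;
        }
    }

    /** Builds an id -> code index once; on duplicate IDs the first entry wins. */
    static Map<Long, String> indexById(List<Org> orgs) {
        Map<Long, String> byId = new HashMap<>();
        for (Org org : orgs) {
            byId.putIfAbsent(org.id, org.code);
        }
        return byId;
    }

    public static void main(String[] args) {
        Map<Long, String> codes = indexById(
            Arrays.asList(new Org(1L, "ORG-A"), new Org(2L, "ORG-B")));
        System.out.println(codes.get(2L));  // ORG-B
        System.out.println(codes.get(99L)); // null: unknown organization
    }
}
```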

Key pain points:

Remote service call to fetch organization data on every request (the simplified listing issues it once, before the loop).

Database query inside the loop for each record.

First Optimization – Index Tuning

A composite index covering the four columns used in the query's WHERE clause was added:

ALTER TABLE user_score ADD INDEX `un_org_category_business` (
    `org_code`,
    `category_id`,
    `business_id`,
    `business_type`
) USING BTREE;

After the index, the maximum latency dropped from 20 s to about 5 s.

Second Optimization – Multithreaded Query

Java 8 CompletableFuture was used to parallelise the per‑record queries with a custom thread pool.

// Start one asynchronous query per record on the custom thread pool.
CompletableFuture[] futureArray = dataList.stream()
    .map(data -> CompletableFuture.supplyAsync(() -> query(data), asyncExecutor)
        .whenComplete((result, th) -> { /* handle result or error */ }))
    .toArray(CompletableFuture[]::new);
// Block until every query has completed.
CompletableFuture.allOf(futureArray).join();

Thread‑pool configuration:

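ExecutorService asyncExecutor = new ThreadPoolExecutor(
    8,                     // corePoolSize
    10,                    // maximumPoolSize
    60,                    // keepAliveTime
    TimeUnit.SECONDS,      // unit for keepAliveTime
    new ArrayBlockingQueue<>(500),              // bounded work queue
    new ThreadPoolExecutor.CallerRunsPolicy()   // when saturated, run the task on the submitting thread
);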

This reduced latency from ~5 s to ~1 s.
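
The fragment above can be fleshed out into a runnable sketch. Here `lookupScore` is a hypothetical stand-in for the per-record `scoreMapper.queryScore` call, the pool sizes are illustrative rather than the production values, and `exceptionally` is swapped in for the article's `whenComplete` callback so that the failure handling is explicit.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

final class ParallelScoreQuery {
    /** Hypothetical stand-in for the per-record database query. */
    static String lookupScore(long businessId) {
        if (businessId < 0) {
            throw new IllegalArgumentException("invalid id " + businessId);
        }
        return "score-" + businessId;
    }

    /** Runs one lookup per ID on the executor and returns results in input order. */
    static List<String> queryAll(List<Long> ids, ExecutorService executor) {
        List<CompletableFuture<String>> futures = ids.stream()
            .map(id -> CompletableFuture
                .supplyAsync(() -> lookupScore(id), executor)
                // A failed lookup becomes null so one bad record does not fail the batch.
                .exceptionally(error -> null))
            .collect(Collectors.toList());

        // Wait for every lookup to finish before reading results.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        return futures.stream()
            .map(CompletableFuture::join)
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
    }

    /** Convenience wrapper that owns a small bounded pool for the duration of the call. */
    static List<String> queryAll(List<Long> ids) {
        ExecutorService executor = new ThreadPoolExecutor(
            4, 4, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            return queryAll(ids, executor);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(queryAll(Arrays.asList(1L, 2L, -3L, 4L)));
        // [score-1, score-2, score-4]
    }
}
```

Collecting the futures into a list before joining them keeps the results in the same order as the input IDs, which a shared result list filled from callbacks would not guarantee.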

Third Optimization – Batch Size Limiting & Pagination

To keep a single request from processing an excessive number of records, the API now accepts at most 200 IDs per call. Requests over the limit are rejected with an error, which prompts callers to split large payloads into multiple batches (e.g., five batches of 100 IDs) and process them concurrently. Where front‑end changes are possible, pagination should be enforced so that each page stays within the 100‑record limit.
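
On the caller side, splitting a large ID list into fixed-size batches might look like the sketch below. `BatchSplitter` and `partition` are illustrative names; since the service code already uses Guava (`Lists.newArrayList`), Guava's `Lists.partition` would likely serve the same purpose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

final class BatchSplitter {
    /** Splits ids into consecutive sublists of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Long> ids = LongStream.rangeClosed(1, 500).boxed().collect(Collectors.toList());
        List<List<Long>> batches = partition(ids, 100);
        System.out.println(batches.size());        // 5
        System.out.println(batches.get(4).size()); // 100
    }
}
```

Each batch can then be sent as its own request, with the responses merged once all of them have returned.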

Limiting to 200 records per call prevents overload, but coordination with business owners is required to avoid breaking existing workflows.
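
A server-side guard for the limit can be very small. The sketch below assumes the check runs at the top of the handler; the class name, constant name, and exception type are illustrative choices, not taken from the original service.

```java
import java.util.Collections;
import java.util.List;

final class BatchLimitGuard {
    /** Maximum number of IDs accepted in a single call. */
    static final int MAX_IDS_PER_CALL = 200;

    /** Returns the list unchanged, or throws if it exceeds the per-call limit. */
    static <T> List<T> requireWithinLimit(List<T> ids) {
        if (ids.size() > MAX_IDS_PER_CALL) {
            throw new IllegalArgumentException(
                "At most " + MAX_IDS_PER_CALL + " IDs per call, got " + ids.size());
        }
        return ids;
    }

    public static void main(String[] args) {
        requireWithinLimit(Collections.nCopies(200, 1L)); // accepted
        try {
            requireWithinLimit(Collections.nCopies(201, 1L));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // At most 200 IDs per call, got 201
        }
    }
}
```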

The thread pool's maximum size was increased accordingly, and the service can be scaled horizontally across multiple nodes to avoid a single point of failure.
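
One detail is worth checking when raising the maximum pool size: a `ThreadPoolExecutor` only creates threads beyond `corePoolSize` once its work queue is full. With a large bounded queue, such as the 500-slot `ArrayBlockingQueue` configured earlier, the extra threads are not started until that many tasks are already waiting. The sketch below demonstrates the behaviour with deliberately tiny, illustrative sizes (core 2, max 4, queue 2).

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {
    /** Submits blocked tasks one by one and records the pool size after each submission. */
    static int[] poolSizesWhileFilling(int core, int max, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            core, max, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        // Exactly fills the threads and the queue, so no task is rejected.
        int taskCount = max + queueCapacity;
        int[] sizes = new int[taskCount];
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                sizes[i] = pool.getPoolSize();
            }
        } finally {
            release.countDown(); // let all tasks finish
            pool.shutdown();
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(poolSizesWhileFilling(2, 4, 2)));
        // [1, 2, 2, 2, 3, 4]: threads 3 and 4 appear only after the queue is full
    }
}
```

If the goal is more parallelism under normal load, raising `corePoolSize` or shrinking the queue has a more direct effect than raising the maximum alone.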

Outcome & Lessons Learned

Across the three optimisations, the API's worst‑case response time improved step by step:

20 s → 5 s (index tuning)

5 s → 1 s (multithreading)

1 s → <500 ms (batch size limit & pagination)

These changes are quick, low‑risk fixes. A long‑term solution would involve redesigning the data model and business flow, which requires cross‑team coordination and a phased rollout.
