Optimizing a Slow Batch Scoring Query Interface: From 20 seconds to Sub‑500 ms

This article walks through a real-world backend performance case: why a batch scoring API took up to 20 seconds, and how three rounds of optimization (index tuning, multithreaded querying with CompletableFuture, and request-size limiting) brought the worst-case latency under half a second.

IT Services Circle

Preface

Interface performance is a critical topic for backend developers, and improving a slow API often means attacking the problem from several angles. This article shares a practical experience of optimizing a batch-scoring query whose worst-case latency started at 20 s and ended up under 500 ms.

1. Incident Scene

Every morning the team receives a summary email listing the batch-scoring interface address, call count, maximum latency, average latency, and traceId. One entry showed a maximum latency of 20 s and an average of 2 s. In SkyWalking we saw that most calls returned within 500 ms but a few took 20 s or longer, which pointed to a small number of abnormal requests rather than a uniformly slow endpoint.

The initial suspicion was that large data volume caused the slowdown (e.g., querying the root organization node). However, a colleague discovered that the settlement‑order list page sent a massive list of IDs to the batch‑scoring API, sometimes thousands, far exceeding the intended page‑size limits of 10‑100.

2. Current Situation

A lookup by a few hundred primary-key IDs would be cheap, because the database can serve it efficiently from an index. The batch-scoring logic, however, is more complex: it makes a remote call to fetch organization info and runs a query for each record inside a for loop.

Key pain points:

Remote call to another service inside the API.

Database query inside the for loop for each record.

The remote call is unavoidable because the evaluation table stores only the organization ID; if the organization code changes, the system must fetch the latest code to keep data consistent.
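The shape of the two pain points can be sketched in a few lines. This is a minimal illustration, not the production code: the class name, `fetchOrgCode`, `queryScore`, and the ID types are all hypothetical stand-ins, and it assumes the remote call is made once per request. The point is only that the number of database round trips grows linearly with the number of IDs in the request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the pre-optimization shape:
// one remote call, then one database query per record.
class SequentialScoring {
    // Counts simulated database round trips.
    static final AtomicInteger queryCount = new AtomicInteger();

    // Stand-in for the remote organization lookup (hypothetical).
    static String fetchOrgCode(long orgId) {
        return "ORG-" + orgId;
    }

    // Stand-in for the per-record score query (hypothetical).
    static int queryScore(String orgCode, long businessId) {
        queryCount.incrementAndGet();
        return (int) (businessId % 5) + 1;
    }

    // One query per ID, executed strictly one after another.
    static List<Integer> scoreAll(long orgId, List<Long> businessIds) {
        String orgCode = fetchOrgCode(orgId);
        List<Integer> scores = new ArrayList<>(businessIds.size());
        for (long id : businessIds) {
            scores.add(queryScore(orgCode, id));
        }
        return scores;
    }

    public static void main(String[] args) {
        List<Integer> scores = scoreAll(7L, Arrays.asList(1L, 2L, 3L));
        System.out.println(scores + " after " + queryCount.get() + " queries");
    }
}
```

With thousands of IDs in one request, the loop issues thousands of sequential queries, which is consistent with the latency outliers described above.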

3. First Optimization

The most straightforward improvement is to tune the database index. Adding a composite index on org_code, category_id, business_id, and business_type cut the maximum latency from 20 s to about 5 s.

alter table user_score add index `un_org_category_business` (`org_code`,`category_id`,`business_id`,`business_type`) USING BTREE;

4. Second Optimization

Since each record still required a separate query, we switched the single‑threaded loop to a multithreaded approach using Java 8 CompletableFuture and a custom thread pool.

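// Submit one query per record to the custom pool, then wait for all of them.
CompletableFuture<?>[] futureArray = dataList.stream()
    .map(data -> CompletableFuture
        .supplyAsync(() -> query(data), asyncExecutor)
        .whenComplete((result, th) -> { /* handle result, or th if the query failed */ }))
    .toArray(CompletableFuture[]::new);
// join() blocks until every future has completed.
CompletableFuture.allOf(futureArray).join();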

The thread pool is defined as:

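ExecutorService threadPool = new ThreadPoolExecutor(
    8,   // corePoolSize
    10,  // maximumPoolSize
    60,  // keepAliveTime, in the unit given below
    TimeUnit.SECONDS,
    new ArrayBlockingQueue<>(500),               // bounded queue of up to 500 pending tasks
    new ThreadPoolExecutor.CallerRunsPolicy());  // when saturated, the submitting thread runs the task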

With this change the latency dropped from ~5 s to ~1 s.
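For readers who want to run the pattern end to end, the two snippets above can be combined into one self-contained sketch. It reuses the pool parameters from the article, but the class name, the `query` stub (which sleeps to mimic I/O), and the `Long` ID type are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Hypothetical sketch: fan per-record queries out to a bounded thread pool.
class ParallelScoring {
    // Stand-in for the per-record query (hypothetical); sleeps to mimic I/O latency.
    static int query(long businessId) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (int) (businessId % 5) + 1;
    }

    // Same pool parameters as in the article.
    static ExecutorService newPool() {
        return new ThreadPoolExecutor(8, 10, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(500),
            new ThreadPoolExecutor.CallerRunsPolicy());
    }

    static List<Integer> scoreAll(List<Long> ids, ExecutorService executor) {
        // Collect the futures first so that every task is submitted
        // before we start waiting on any of them.
        List<CompletableFuture<Integer>> futures = ids.stream()
            .map(id -> CompletableFuture.supplyAsync(() -> query(id), executor))
            .collect(Collectors.toList());
        // Wait for every task, then read the results in input order.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = newPool();
        try {
            System.out.println(scoreAll(Arrays.asList(1L, 2L, 3L, 4L), pool));
        } finally {
            pool.shutdown(); // let the JVM exit once the tasks are done
        }
    }
}
```

One design point worth noting: the futures are collected into a list before any `join()` is called. Chaining `supplyAsync` and `join` in a single stream pipeline would wait on each task before submitting the next, which would bring back the sequential behavior.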

5. Third Optimization

Even after multithreading, the interface sometimes exceeded 1 s because a single request could still contain thousands of IDs. The final step was to limit the batch size to 200 records per call and encourage the caller to split larger workloads. Two practical options were discussed:

Front-end pagination: display only one order per settlement and fetch additional orders on demand.

Batch-calling the API: split a large request into multiple 100-record calls, optionally using multithreading on the client side.

Implementing the batch-size limit reduced the worst-case latency to under 500 ms. These measures are temporary fixes; a long-term solution would require redesigning the data model and possibly changing business processes.

Conclusion

Through three iterative optimizations (index tuning, multithreaded querying, and request-size limiting), the batch-scoring API's worst-case response time improved from 20 seconds to under 500 ms, a pragmatic approach to backend performance engineering.
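As a supplement to the third optimization, the batch-size limit and the client-side splitting can be sketched as follows. The 200-record cap and the 100-record chunk size come from the article; everything else is an assumption, including the class and method names and the choice to reject an oversized request with an exception rather than truncate it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a server-side batch cap plus a client-side splitter.
class BatchLimit {
    static final int MAX_BATCH_SIZE = 200; // server-side cap from the article

    // Server-side guard (assumed behavior): reject oversized requests up front.
    static void checkBatchSize(List<Long> ids) {
        if (ids.size() > MAX_BATCH_SIZE) {
            throw new IllegalArgumentException(
                "batch size " + ids.size() + " exceeds limit " + MAX_BATCH_SIZE);
        }
    }

    // Client-side helper: split items into consecutive chunks of at most chunkSize.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 250; i++) {
            ids.add(i);
        }
        // 250 IDs become chunks of 100, 100, and 50, each within the cap.
        for (List<Long> chunk : partition(ids, 100)) {
            checkBatchSize(chunk);
            System.out.println("would call the API with " + chunk.size() + " ids");
        }
    }
}
```

Each chunk can then be sent as its own request, sequentially or from a client-side thread pool, so that no single call carries more than the server is willing to process.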
