How to Cut API Latency from Seconds to Milliseconds: Elegant Optimization Techniques

This article walks through a series of practical backend techniques (batch processing, asynchronous execution, caching, pre-processing, pooling, parallelization, indexing, transaction management, pagination, and lock granularity) for reducing API response times from several seconds to a few milliseconds.

SpringMeng

Background

The legacy project suffered from excessively long API response times, prompting a focused effort to improve interface performance.

Batch Processing

Replace per‑record inserts with a single batch operation to reduce repeated I/O.

// Single-record insert: one database round trip per message
list.forEach(msg -> insert(msg));
// Batch insert: one round trip for the whole list
batchInsert(list);
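
For very large lists, a single batch can itself become too big, so a common variant is to split the list into fixed-size chunks. The following is a minimal sketch of that idea, not the project's actual code: `BatchingSketch`, `insertInBatches`, and the `batchInsert` callback are hypothetical names, and the callback stands in for whatever multi-row insert the data layer provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchingSketch {
    // Splits items into chunks of at most batchSize and hands each chunk to
    // batchInsert (a hypothetical stand-in for one multi-row INSERT).
    // Returns the number of batch calls, i.e. the number of round trips.
    static <T> int insertInBatches(List<T> items, int batchSize, Consumer<List<T>> batchInsert) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int calls = 0;
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the sub-list view so the callee may keep the chunk safely.
            batchInsert.accept(new ArrayList<>(items.subList(from, to)));
            calls++;
        }
        return calls;
    }
}
```

With five items and a batch size of two, the callback is invoked three times instead of five. The right batch size depends on the database and row size and would need to be measured.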

Asynchronous Execution

Move non‑critical, time‑consuming logic (e.g., accounting and file writing in a financial purchase API) to asynchronous processing using thread pools, message queues, or scheduling frameworks.
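
As a rough illustration of the thread-pool option, the sketch below completes the critical step on the request thread and hands the accounting and file-writing steps to a background pool. All names (`AsyncPurchaseSketch`, `purchase`, `SIDE_EFFECTS`) and the pool size are assumptions made for the example; the side effects are simulated with strings in a list.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncPurchaseSketch {
    // Shared background pool; the size 4 is an arbitrary assumption.
    // Daemon threads keep this demo from blocking JVM shutdown.
    static final ExecutorService BACKGROUND = Executors.newFixedThreadPool(4, runnable -> {
        Thread t = new Thread(runnable, "background-worker");
        t.setDaemon(true);
        return t;
    });

    // Stand-in for real side effects (ledger rows, files) so the sketch is observable.
    static final List<String> SIDE_EFFECTS = new CopyOnWriteArrayList<>();

    // Runs the critical step synchronously, then schedules the non-critical steps.
    // The returned future lets a caller or test wait for the background work.
    static CompletableFuture<Void> purchase(String orderId) {
        SIDE_EFFECTS.add("order-saved:" + orderId); // critical path
        CompletableFuture<Void> accounting = CompletableFuture.runAsync(
                () -> SIDE_EFFECTS.add("accounting:" + orderId), BACKGROUND);
        CompletableFuture<Void> fileWrite = CompletableFuture.runAsync(
                () -> SIDE_EFFECTS.add("file-written:" + orderId), BACKGROUND);
        return CompletableFuture.allOf(accounting, fileWrite);
    }
}
```

An API handler would return to the client right after the critical step and would not wait on the returned future. In-memory pools lose queued work if the process dies, which is one reason a message queue may be preferred for work that must not be dropped.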

Space‑for‑Time (Caching)

Cache frequently accessed, rarely changed data (e.g., weekly stock rotation info) to avoid repeated database queries and heavy calculations.
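
A minimal in-process version of this idea can be sketched with `ConcurrentHashMap.computeIfAbsent`. The class name, the week-key format, and the `loader` function (standing in for the database query plus calculation) are assumptions for illustration; a production setup might use a dedicated cache library or an external cache instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class RotationCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Hypothetical expensive lookup (database query plus calculation).
    private final Function<String, String> loader;
    // Counts loader invocations so the effect of caching is visible.
    final AtomicInteger loads = new AtomicInteger();

    RotationCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only on a miss.
    String get(String weekKey) {
        return cache.computeIfAbsent(weekKey, key -> {
            loads.incrementAndGet();
            return loader.apply(key);
        });
    }

    // Drops an entry when the underlying data changes (e.g. at the weekly rotation).
    void invalidate(String weekKey) {
        cache.remove(weekKey);
    }
}
```

Repeated calls to `get` with the same key hit the loader once until `invalidate` is called. The sketch has no expiry or size limit, so it only suits small, slowly changing data sets.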

Pre‑Processing

Pre‑compute derived values such as annualized returns from net values and store them, allowing the API to fetch ready‑made results directly.
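
The sketch below illustrates the split between an offline computation step and a cheap lookup on the request path. The source does not give the formula, so the compounding convention used here, (end / start)^(365 / days) - 1, is an assumption, as are the class and method names and the in-memory map that stands in for a results table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class AnnualizedReturnSketch {
    // Stand-in for a results table written by a scheduled job.
    static final Map<String, Double> PRECOMPUTED = new ConcurrentHashMap<>();

    // Assumed convention: compound the period return up to a 365-day year.
    static double annualized(double startNetValue, double endNetValue, int days) {
        if (startNetValue <= 0 || days <= 0) {
            throw new IllegalArgumentException("start value and days must be positive");
        }
        return Math.pow(endNetValue / startNetValue, 365.0 / days) - 1.0;
    }

    // Runs offline, for example after each new net value is published.
    static void precompute(String productCode, double startNetValue, double endNetValue, int days) {
        PRECOMPUTED.put(productCode, annualized(startNetValue, endNetValue, days));
    }

    // Request path: a plain lookup with no calculation. Returns null if not computed yet.
    static Double lookup(String productCode) {
        return PRECOMPUTED.get(productCode);
    }
}
```

The API then pays only for the lookup, while the cost of the calculation moves to whenever the input data changes.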

Pooling

Reuse resources like database connections and threads through pooling, following the principle of pre‑allocation and cyclic reuse.
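
To make the reuse visible, the following sketch runs many small tasks on a fixed-size thread pool and counts how many distinct threads executed them. The class and method names are hypothetical; the same principle applies to database connection pools, which are usually provided by a library rather than written by hand.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseSketch {
    // Runs taskCount small tasks on a pool of poolSize threads and returns
    // how many distinct threads actually executed them.
    static int distinctThreadsUsed(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown(); // stop accepting tasks; queued tasks still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return threadNames.size();
    }
}
```

Twenty tasks on a pool of two never use more than two threads, so the cost of creating a thread is paid at most twice rather than twenty times. In a real service the pool would be created once at startup and shared, not built per call as in this demo helper.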

Serial to Parallel

Parallelize independent queries (e.g., user account, product info, banner data) to reduce cumulative latency.
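
One way to sketch this in Java is with `CompletableFuture.supplyAsync`. The three query methods below are hypothetical placeholders that return fixed strings; in a real service each would be a database or remote call, and the pool size would be tuned to the workload.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelQuerySketch {
    // Shared pool for the fan-out; daemon threads so the demo never blocks JVM exit.
    static final ExecutorService QUERY_POOL = Executors.newFixedThreadPool(3, runnable -> {
        Thread t = new Thread(runnable, "query-worker");
        t.setDaemon(true);
        return t;
    });

    // Hypothetical independent lookups.
    static String queryAccount(String userId) { return "account:" + userId; }
    static String queryProduct(String productCode) { return "product:" + productCode; }
    static String queryBanner() { return "banner:home"; }

    // Starts all three queries at once, then combines the results.
    static String loadPage(String userId, String productCode) {
        CompletableFuture<String> account =
                CompletableFuture.supplyAsync(() -> queryAccount(userId), QUERY_POOL);
        CompletableFuture<String> product =
                CompletableFuture.supplyAsync(() -> queryProduct(productCode), QUERY_POOL);
        CompletableFuture<String> banner =
                CompletableFuture.supplyAsync(ParallelQuerySketch::queryBanner, QUERY_POOL);
        // join() blocks until each result is ready; the total wait is roughly
        // that of the slowest query instead of the sum of all three.
        return account.join() + "|" + product.join() + "|" + banner.join();
    }
}
```

This only helps when the queries do not depend on each other's results; dependent steps still have to run in sequence.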

Indexing

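Add appropriate indexes to speed up data retrieval, and verify with the query plan that they are actually used. Common cases where an index is not effective include wrapping the indexed column in a function, a LIKE pattern with a leading wildcard, implicit type conversion on the column, and a condition that skips the leading column of a composite index.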

Avoid Large Transactions

Long-running transactions hold database connections for their full duration, so other requests end up contending for the connection pool. The example below is a @Transactional method that performs several inserts and, in the elided part, a push RPC. Moving the RPC out of the transaction and limiting the amount of data processed per transaction mitigates the issue.

@Transactional(value = "taskTransactionManager", propagation = Propagation.REQUIRED, isolation = Isolation.READ_COMMITTED, rollbackFor = {RuntimeException.class, Exception.class})
public BasicResult purchaseRequest(PurchaseRecord record) {
    BasicResult result = new BasicResult();
    // insert tasks
    taskMapper.insert(...);
    // ...
    result.setInfo(ResultInfoEnum.SUCCESS);
    return result;
}

Recommendations: 1) Do not place RPC calls inside transactions; 2) Keep read‑only queries outside transactions; 3) Limit the amount of data processed within a transaction.
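
The first recommendation can be sketched without any framework. In the example below, `inTransaction` is a hypothetical stand-in for a transaction template, and the inserts and the RPC are simulated by recording events in a list; none of these names come from the original project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class ShortTransactionSketch {
    // Records what happened and in which order.
    final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for a transaction template: begin, run, commit or roll back.
    <T> T inTransaction(Supplier<T> work) {
        events.add("begin");
        try {
            T result = work.get();
            events.add("commit");
            return result;
        } catch (RuntimeException e) {
            events.add("rollback");
            throw e;
        }
    }

    String purchaseRequest(String recordId) {
        // Only the database writes run inside the transaction.
        String taskId = inTransaction(() -> {
            events.add("insert-task:" + recordId);
            return "task-" + recordId;
        });
        // The slow remote call runs after commit, so no connection is held while waiting.
        events.add("push-rpc:" + taskId);
        return taskId;
    }
}
```

The trade-off is that the RPC can now fail after the data is committed, so the caller needs a retry or compensation strategy for that case.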

Program Structure Refactoring

Iterative development can lead to tangled code with redundant queries and object creation; a systematic refactor evaluates each block’s purpose, reorders execution, and removes duplication.

Deep Pagination

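Using LIMIT offset, count forces the database to read and discard every row before the offset, so latency grows with page depth. Replace it with an indexed condition such as WHERE id > lastId ORDER BY id LIMIT count, where lastId is the largest id on the previous page, so the query seeks directly through the primary-key index. The ORDER BY is needed for each page to pick up exactly where the last one ended.

SELECT * FROM purchase_record WHERE productCode = 'PA9044' AND status = 4 AND id > 100000 ORDER BY id LIMIT 200;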
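
The client-side loop for this style of pagination carries the last seen id from one page to the next. The sketch below models it in memory: a `NavigableMap` keyed by id plays the role of the primary-key index, and `nextPage` is a hypothetical helper, not part of any database API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

class KeysetPaginationSketch {
    // Returns up to count ids strictly greater than lastId, in ascending order.
    static List<Long> nextPage(NavigableMap<Long, String> rowsById, long lastId, int count) {
        List<Long> page = new ArrayList<>();
        // tailMap(lastId, false) seeks directly past lastId, like WHERE id > lastId,
        // without walking the rows that came before it.
        for (Long id : rowsById.tailMap(lastId, false).keySet()) {
            if (page.size() == count) {
                break;
            }
            page.add(id);
        }
        return page;
    }
}
```

A caller starts with `lastId = 0`, then passes the last id of each returned page into the next call until an empty page comes back. Unlike offset pagination, this approach cannot jump straight to an arbitrary page number.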

SQL Optimization

Combine indexing, efficient pagination, and other SQL tuning techniques to improve query performance.

Lock Granularity

Avoid coarse‑grained locks that serialize unrelated work. Lock only the critical section that accesses shared resources.

// Wrong: locks both shared and non‑shared work
synchronized(this) {
    share();
    notShare();
}
// Correct: lock only shared part
notShare();
synchronized(this) {
    share();
}
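
A slightly fuller sketch of the same idea follows, with the computation that touches no shared state kept outside the lock and only the shared update inside it. The class, its methods, and the `runDemo` helper are hypothetical; a private lock object is used here instead of `this` so that outside code cannot contend on the same monitor.

```java
class FineGrainedLockSketch {
    private final Object lock = new Object();
    private long total; // shared state, guarded by lock

    // Purely local computation: touches no shared state, so it needs no lock.
    static long expensiveLocalWork(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    void record(int n) {
        long value = expensiveLocalWork(n); // runs concurrently across threads
        synchronized (lock) {               // only the shared update is serialized
            total += value;
        }
    }

    long total() {
        synchronized (lock) {
            return total;
        }
    }

    // Demo helper: several threads call record(10) and the final total is returned.
    static long runDemo(int threads, int callsPerThread) {
        FineGrainedLockSketch sketch = new FineGrainedLockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    sketch.record(10);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return sketch.total();
    }
}
```

The total stays exact under concurrency because every update to the shared field is still guarded, while the threads spend most of their time outside the critical section.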

Conclusion

Performance problems often accumulate over multiple iterations. By rethinking design, applying the above techniques, and focusing on efficient interface design, teams can achieve substantial latency reductions and cost savings.
