
Optimizing a High‑Concurrency Backend Interface: Reducing Response Time from 30 seconds to 0.8 seconds

This article presents a real‑world case study of a high‑traffic backend API that originally took 30 seconds to process over two million records, detailing problem diagnosis, SQL and Java code analysis, and a series of optimizations—including SQL rewrites, database‑side aggregation, and Caffeine caching—that ultimately cut the response time to under one second.

Top Architect

In a high‑concurrency data‑processing scenario, an interface that queried more than 2 million rows originally required about 30 seconds to respond, which was unacceptable for a transaction‑interception system.

Problem Diagnosis

The latency was traced to the SQL execution layer after ruling out network and server hardware issues. The MyBatis query List<Map<String, Object>> list = transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql); caused a 30‑second execution when fetching 2 million rows.

SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Running the raw SQL in PostgreSQL took only ~800 ms, indicating the bottleneck was in the Java‑side processing of the result set.
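Confirming such a split between database time and application time is easiest by timing the mapper call in isolation. A minimal, stdlib-only sketch of the idea (the real mapper call is stood in for by a `Supplier`, since the mapper itself is not reproduced here):

```java
import java.util.function.Supplier;

public class QueryTimer {
    // Runs the supplied query, prints how long it took, and returns its result.
    static <T> T timed(String label, Supplier<T> query) {
        long start = System.nanoTime();
        T result = query.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + " took " + elapsedMs + " ms");
        return result;
    }

    public static void main(String[] args) {
        // In the real service this would wrap
        // transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql)
        int rows = timed("selectDataTransHandleFlowAdd", () -> 2_000_000);
        System.out.println("rows = " + rows);
    }
}
```

Comparing this figure with the raw SQL timing (here ~30 s vs. ~800 ms) pinpoints the result-set handling, not the query, as the bottleneck.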

SQL‑Level Analysis

Using EXPLAIN ANALYZE revealed that the query itself was efficient, but the way the result set was handled caused excessive overhead.

Code‑Level Analysis

Each of the two million rows was materialized into a Map<String, Object>, creating millions of short-lived objects; the resulting allocation and garbage-collection pressure, rather than the query itself, consumed the CPU time.
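To make the contrast concrete, counting occurrences needs only one map entry per distinct element instead of one Map object per row. A stdlib-only sketch of that counting pass (the row shape is simplified here to a single string per row, purely for illustration):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowCounting {
    // Counts occurrences with a single HashMap instead of materializing
    // one Map<String, Object> per row: allocations scale with the number
    // of distinct elements, not with the number of rows.
    static Map<String, Long> countWithoutRowMaps(List<String> rows) {
        Map<String, Long> counts = new HashMap<>();
        for (String elem : rows) {
            counts.merge(elem, 1L, Long::sum); // one entry per distinct element
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "a", "c", "a");
        System.out.println(countWithoutRowMaps(rows));
    }
}
```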

Optimization Measures

1. SQL Optimization

Transform the 2 million rows into a single aggregated result using PostgreSQL array functions:

SELECT array_agg(elem) AS concatenated_array
FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
    WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub;

This reduces the result set to one row, cutting the data transferred to the application.

2. Move Business Logic to the Database

Count occurrences of each element directly in SQL, avoiding post‑processing in Java:

SELECT elem, COUNT(*) AS count
FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
    WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub
GROUP BY elem;

This query completes in about 0.7 seconds, close to the ~800 ms raw query time measured earlier, which leaves little room for further SQL-side improvement.
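For intuition, the work this query performs inside PostgreSQL corresponds to the following Java-side equivalent, sketched over hypothetical in-memory rows; the point of the optimization is precisely that this logic no longer runs in the JVM:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class UnnestGroupBy {
    // Each row holds an array column (programhandleidlist); in stream terms,
    // unnest(...) is flatMap and GROUP BY elem is groupingBy/counting.
    static Map<String, Long> elementCounts(List<List<String>> rows) {
        return rows.stream()
                .flatMap(List::stream)                     // unnest(programhandleidlist)
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        Collectors.counting()));           // GROUP BY elem, COUNT(*)
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(List.of("p1", "p2"), List.of("p1"));
        System.out.println(elementCounts(rows)); // counts: p1 -> 2, p2 -> 1
    }
}
```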

3. Introduce Caching

Use Caffeine as a local cache to store historical statistics, reducing repeated database hits. Add the dependency:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version>
</dependency>

Configure the cache:

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.concurrent.TimeUnit;

@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(60, TimeUnit.MINUTES));
        return cacheManager;
    }
}

Apply the cache in the service method:

@Autowired
private CacheManager cacheManager;

private static final String YESTERDAY_HIT_RATE_CACHE = "hitRateCache";

@Override
public RuleHitRateResponse ruleHitRate(LocalDate currentDate) {
    // Yesterday's statistics no longer change, so they are safe to cache.
    Double hitRate = cacheManager.getCache(YESTERDAY_HIT_RATE_CACHE)
        .get(currentDate.minusDays(1), () -> {
            List<Map<String, Object>> hitRateList =
                dataTunnelClient.selectTransHandleFlowByTime(currentDate.minusDays(1));
            // compute hit rate from the queried statistics
            return compute(hitRateList);
        });
    return new RuleHitRateResponse(hitRate);
}

Result

After applying the SQL rewrite, moving aggregation to PostgreSQL, and adding Caffeine caching, the interface response time dropped from roughly 30 seconds to about 0.8 seconds, meeting the performance requirement.

Conclusion

The case demonstrates that for massive data‑intensive queries, relational databases may become a bottleneck; leveraging set‑based SQL operations, database‑side aggregation, and appropriate caching can dramatically improve latency. For even larger scales, column‑oriented stores such as ClickHouse or Hive may be more suitable.

Tags: Backend · Java · Performance · MyBatis · PostgreSQL · SQL Optimization · Caffeine Cache
Written by Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
