
Optimizing a High‑Concurrency Transaction Statistics Interface from 30 seconds to Sub‑Second Performance

This article presents a real‑world case study of a high‑concurrency data‑processing interface whose response time was reduced from 30 seconds to under one second by diagnosing SQL bottlenecks, refactoring MyBatis result handling, applying PostgreSQL array aggregation, and introducing a Caffeine cache, while also discussing the limits of relational databases for large‑scale analytics.

Top Architect

In a high‑concurrency data‑processing scenario, an interface that handled over 2 million rows originally took about 30 seconds to respond; the article shares a complete troubleshooting and optimization process that brings the response time down to under 0.8 seconds.

Problem Diagnosis

The initial latency was extremely high (≈30 s). Network and server hardware were ruled out, and timing logs pointed to the SQL execution as the culprit.
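
Timing logs of this kind can be produced with a small wrapper like the sketch below; the class and label names are our own, not from the article, and in the original setting the supplier would wrap the mapper call:

```java
import java.util.function.Supplier;

class Timed {
    // Run a block, log its wall-clock duration in milliseconds,
    // and pass the block's result through unchanged.
    static <T> T timed(String label, Supplier<T> call) {
        long t0 = System.nanoTime();
        T result = call.get();
        long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
        System.out.println(label + " took " + elapsedMs + " ms");
        return result;
    }
}
```

Wrapping the database call and the subsequent Java processing separately is what lets you attribute the latency to one layer or the other.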

MyBatis query used to fetch the data:

List<Map<String, Object>> list = transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql);

SQL statement retrieving 2 million rows:

SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Running the query directly in PostgreSQL took only about 800 ms, revealing that the delay was introduced by the Java layer.

SQL‑Level Analysis

Using EXPLAIN ANALYZE confirmed that the SQL itself was efficient.

EXPLAIN ANALYZE SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Code‑Level Analysis

Each row was converted into a Map, creating a massive number of short‑lived objects and causing heavy GC and memory‑copy overhead.
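
Concretely, the result shape resembles the sketch below: one HashMap per row even though only a single column is ever read. The row count and values here are stand‑ins for illustration; only the column name comes from the query above:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RowShape {
    // Simulate what a List<Map<String, Object>> mapper result looks like:
    // one HashMap allocated per row, each holding the single selected column.
    static List<Map<String, Object>> materialize(int rowCount) {
        List<Map<String, Object>> rows = new ArrayList<>(rowCount);
        for (int i = 0; i < rowCount; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("programhandleidlist", "{id" + (i % 10) + "}");
            rows.add(row);
        }
        return rows;
    }
}
```

At 2 million rows this means 2 million map objects plus their entry arrays, which is where the GC pressure comes from.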

Optimization Measures

1. SQL Optimization

Aggregate the 2 million rows into a single array using PostgreSQL's array_agg and unnest:

SELECT array_agg(elem) AS concatenated_array FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
    WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub;

This collapses the result set to a single row, cutting the per‑row transfer and object‑mapping cost.

2. Move Business Logic to the Database

Count the occurrences of each ID directly in SQL:

SELECT elem, COUNT(*) AS count FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
    WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub GROUP BY elem;

This eliminates the need to process the array in Java.
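
For comparison, the counting the database now performs is equivalent to this Java‑side aggregation over the unnested IDs (the class name and sample data are ours, purely for illustration):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class HitCounter {
    // Java equivalent of: SELECT elem, COUNT(*) ... GROUP BY elem,
    // applied after the arrays have been flattened (unnest).
    static Map<String, Long> countByElem(List<List<String>> rows) {
        return rows.stream()
                .flatMap(List::stream)                  // unnest(programhandleidlist)
                .collect(Collectors.groupingBy(
                        Function.identity(),            // GROUP BY elem
                        Collectors.counting()));        // COUNT(*)
    }
}
```

Pushing this aggregation into the database means only the small (elem, count) result set crosses the wire, instead of 2 million raw rows.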

3. Introduce Caching (Caffeine)

Add the Caffeine dependency:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version>
</dependency>

Configure a Spring cache manager:

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.concurrent.TimeUnit;

@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
            .maximumSize(500)
            .expireAfterWrite(60, TimeUnit.MINUTES));
        return cacheManager;
    }
}

Use the cache in the rule‑hit‑rate method:

@Autowired
private CacheManager cacheManager;

private static final String YESTERDAY_HIT_RATE_CACHE = "hitRateCache";

@Override
public RuleHitRateResponse ruleHitRate(LocalDate currentDate) {
    double hitRate = cacheManager.getCache(YESTERDAY_HIT_RATE_CACHE)
        .get(currentDate.minusDays(1), () -> {
            Map hitRateList = dataTunnelClient.selectTransHandleFlowByTime(currentDate.minusDays(1));
            // calculate the hit rate from hitRateList ...
            double rate = 0.0; // stands in for the elided calculation
            return rate;
        });
    // further processing ...
}
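
The get‑or‑compute contract the code above relies on can be sketched in plain Java with a ConcurrentHashMap; eviction (maximumSize) and expiry (expireAfterWrite) are precisely what Caffeine adds on top of this. The class and method names here are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class NaiveCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();

    // Compute the value at most once per key and reuse it afterwards --
    // the same contract as Cache.get(key, loader), minus bounds and expiry.
    V get(K key, Function<K, V> loader) {
        return store.computeIfAbsent(key, loader);
    }
}
```

Because yesterday's hit rate never changes, a 60‑minute expireAfterWrite is generous; the cache turns the expensive query into a once‑per‑hour cost at most.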

Results

After applying the above optimizations, the interface response time dropped from 30 seconds to under 0.8 seconds on cache hits. On a cache miss, the SQL execution takes ~0.7 seconds and the Java processing adds ~1.8 seconds, leaving room for further improvement.

Conclusion

The case demonstrates that relational databases (MySQL, PostgreSQL, etc.) are not ideal for massive data‑analysis workloads; column‑oriented stores such as ClickHouse or Hive are better suited for millisecond‑level queries.


Tags: Backend Development, Caching, Performance Tuning, MyBatis, Caffeine, PostgreSQL, SQL Optimization
Written by Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, as well as architecture evolution with internet technologies. Idea‑driven, sharing‑oriented architects are welcome to exchange and learn together.
