Optimizing a High‑Concurrency Data Processing API: Reducing Response Time from 30 seconds to 0.8 seconds

This article presents a real‑world case study of a high‑concurrency transaction‑interception API handling over two million records, detailing the diagnosis of a 30‑second latency, SQL and Java‑level bottlenecks, and a series of optimizations—including PostgreSQL array aggregation, query refactoring, and Caffeine caching—that reduced response time to under one second.

Architect's Guide
Architect's Guide
Architect's Guide
Optimizing a High‑Concurrency Data Processing API: Reducing Response Time from 30 seconds to 0.8 seconds

Scenario – In a high‑concurrency transaction‑interception system, an API that aggregates rule‑hit rates for over 2 million records took about 30 seconds to respond. The goal was to bring the latency down to sub‑second levels.

Problem Diagnosis – Network and server hardware were ruled out. Timing logs showed the bottleneck was in the SQL query, which took ~30 seconds to fetch 2 million rows. Direct execution of the same SQL in psql took only ~800 ms, indicating additional overhead in the application layer.

SQL‑level Analysis – Using EXPLAIN ANALYZE revealed that the query scanned a large table without efficient aggregation. The original MyBatis call was:

List<Map<String, Object>> list = transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql);

The table anti_transhandle stored a column programhandleidlist as an array. Fetching the array for each row and processing it in Java caused massive object creation overhead.

Code‑level Analysis – Each row was mapped to a Map, creating over 2 million map objects, which dramatically slowed down processing.

Optimization Measures

1. SQL Optimization

Transform the 2 million rows into a single aggregated result using PostgreSQL array_agg and unnest:

SELECT array_agg(elem) AS concatenated_array
FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
    WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub;

This reduced the SQL execution time to ~0.8 seconds, but the Java processing still took ~1.8 seconds.

2. Push Business Logic into the Database

Count occurrences of each ID directly in SQL:

SELECT elem, COUNT(*) AS count
FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
    WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub
GROUP BY elem;

This query ran in ~0.7 seconds but still highlighted the limits of relational databases for massive analytical workloads.

3. Introduce Caching (Caffeine)

Cache historical query results using Caffeine to avoid repeated database hits. Dependency:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version>
</dependency>

Cache configuration:

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.concurrent.TimeUnit;

@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
            .maximumSize(500)
            .expireAfterWrite(60, TimeUnit.MINUTES));
        return cacheManager;
    }
}

Usage in the service method:

@Autowired
private CacheManager cacheManager;

private static final String YESTERDAY_HIT_RATE_CACHE = "hitRateCache";

@Override
public RuleHitRateResponse ruleHitRate(LocalDate currentDate) {
    double hitRate = cacheManager.getCache(YESTERDAY_HIT_RATE_CACHE)
        .get(currentDate.minusDays(1), () -> {
            Map<String, String> hitRateList = dataTunnelClient.selectTransHandleFlowByTime(currentDate.minusDays(1));
            // compute hitRate
            return hitRate;
        });
    // other logic
}

Result – After applying SQL aggregation, moving part of the logic to the database, and caching historical data, the API response time dropped from 30 seconds to under 0.8 seconds.

Conclusion – Relational databases excel at transactional workloads but struggle with massive analytical queries. For such scenarios, column‑oriented stores like ClickHouse or Hive are more appropriate. Nonetheless, careful SQL refactoring, reducing object creation in Java, and leveraging in‑memory caches can dramatically improve backend performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

SQL Optimization
Architect's Guide
Written by

Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.