How We Cut a 30‑Second Database Query to Under 1 Second: Real‑World Optimizations
This article details a real‑world performance case where a high‑concurrency interface processing over two million rows was diagnosed and optimized through SQL tuning, Java code refactoring, and caching, ultimately reducing response time from 30 seconds to under 0.8 seconds.
In high‑concurrency data processing scenarios, interface response time is crucial. The sections below walk through how the bottleneck was diagnosed and removed step by step.
Problem Diagnosis
Initially the interface latency was about 30 seconds. Network and server factors were ruled out, and timing logs pointed to the SQL execution as the bottleneck. The MyBatis query that fetched 2 million rows took the full 30 seconds.
List<Map<String, Object>> list = transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql);

The relevant SQL was:
SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Running the SQL alone took about 800 ms, revealing that the heavy cost was in the Java processing.
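The split between SQL time and Java time can be confirmed with a simple stopwatch around each stage. A minimal sketch (the wrapper below is illustrative, not from the original code):

```java
import java.util.function.Supplier;

public class Timed {
    // Runs the call, prints how long it took, and returns its result.
    static <T> T timed(String label, Supplier<T> call) {
        long start = System.nanoTime();
        T result = call.get();
        System.out.printf("%s took %d ms%n", label, (System.nanoTime() - start) / 1_000_000);
        return result;
    }

    public static void main(String[] args) {
        // In the real service this would wrap the MyBatis call and the raw SQL
        // separately, making the 800 ms vs. 30 s gap visible in the logs.
        int rows = Timed.timed("query", () -> 42);
        System.out.println(rows);
    }
}
```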
SQL Layer Analysis
We used EXPLAIN ANALYZE to examine the query execution plan.
EXPLAIN ANALYZE
SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Code Layer Analysis
Each row was materialized as a Map object, which for over 2 million rows caused massive overhead and slowed the interface.
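For context, the slow path looked roughly like this (a reconstruction: the column name comes from the SQL above, everything else is assumed, and the array column is treated as a comma-separated string purely for illustration). Each of the ~2 million rows is boxed into a `Map`, and all the ID counting happens in Java:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SlowPath {
    // Counts occurrences of each ID across all rows, entirely in Java.
    // With 2M+ rows, the per-row Map allocation alone dominates the runtime.
    static Map<String, Long> countIds(List<? extends Map<String, ?>> rows) {
        Map<String, Long> counts = new HashMap<>();
        for (Map<String, ?> row : rows) {
            for (String id : String.valueOf(row.get("programhandleidlist")).split(",")) {
                counts.merge(id.trim(), 1L, Long::sum);
            }
        }
        return counts;
    }
}
```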
Optimization Measures
1. SQL Optimization
We transformed the 2 million‑row result into a single row using PostgreSQL's array_agg and unnest functions.
SELECT array_agg(elem) AS concatenated_array
FROM (
SELECT unnest(programhandleidlist) AS elem
FROM anti_transhandle
WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub;

This aggregates the array elements into one array, reducing the amount of data transferred.
2. Move Business Logic to Database
We let the database count occurrences of each ID directly, avoiding Java‑side aggregation.
SELECT elem, COUNT(*) AS count
FROM (
SELECT unnest(programhandleidlist) AS elem
FROM anti_transhandle
WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub
GROUP BY elem;

The query returns each element and its frequency.
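On the Java side, the service then only has to fold the (elem, count) rows into a map. A sketch (row shapes assumed, since the original mapper code is not shown):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupedResult {
    // Collapses the GROUP BY rows into an id -> count map; the work is now
    // proportional to the number of distinct IDs, not the number of source rows.
    static Map<String, Long> toCounts(List<? extends Map<String, ?>> rows) {
        Map<String, Long> counts = new HashMap<>();
        for (Map<String, ?> row : rows) {
            counts.put(String.valueOf(row.get("elem")),
                       ((Number) row.get("count")).longValue());
        }
        return counts;
    }
}
```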
3. Introduce Cache Mechanism
To avoid repeated database hits, we added a local cache using Caffeine, which integrates easily with Spring.
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version>
</dependency>

Cache configuration:
import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.concurrent.TimeUnit;
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(60, TimeUnit.MINUTES));
        return cacheManager;
    }
}

Using the cache in the hit‑rate calculation:
@Autowired
private CacheManager cacheManager;

private static final String YESTERDAY_HIT_RATE_CACHE = "hitRateCache";

@Override
public RuleHitRateResponse ruleHitRate(LocalDate currentDate) {
    double hitRate = cacheManager.getCache(YESTERDAY_HIT_RATE_CACHE)
            .get(currentDate.minusDays(1), () -> {
                Map<String, String> hitRateList = dataTunnelClient.selectTransHandleFlowByTime(currentDate.minusDays(1));
                // additional processing
                return computeHitRate(hitRateList);
            });
    // further processing
}

Conclusion
After applying the above optimizations, the interface response time dropped from 30 seconds to under 0.8 seconds.
The experience highlighted the limitations of row‑oriented relational databases for massive analytical workloads; analytical stores such as ClickHouse or Hive are better suited when millisecond‑level responses over such volumes are required.
Relational databases (MySQL, Oracle, PostgreSQL) excel at transactional workloads requiring strong consistency.
Columnar databases (ClickHouse, HBase) provide superior read performance for large‑scale analytics.
