How We Cut a 30‑Second Database Query to Under 1 Second: Real‑World Optimizations
This article details a real‑world performance case where a high‑concurrency interface processing over two million rows was diagnosed and optimized through SQL tuning, Java code refactoring, and caching, ultimately reducing response time from 30 seconds to under 0.8 seconds.
In high‑concurrency data processing scenarios, interface response time is crucial. The sections below walk through how the bottleneck was diagnosed and removed step by step.
Problem Diagnosis
Initially the interface latency was about 30 seconds. Network and server factors were ruled out, and timing logs pointed to the SQL execution as the bottleneck. The MyBatis query that fetched 2 million rows took the full 30 seconds.
List<Map<String, Object>> list = transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql);

The relevant SQL was:
SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Running the SQL alone took about 800 ms, revealing that the heavy cost was in the Java processing.
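The split between SQL time and Java time can be confirmed with a simple stopwatch around each stage. A minimal sketch (the wrapper below is illustrative, not from the original code):

```java
import java.util.function.Supplier;

public class Timed {
    // Runs the call, prints how long it took, and returns its result.
    static <T> T timed(String label, Supplier<T> call) {
        long start = System.nanoTime();
        T result = call.get();
        System.out.printf("%s took %d ms%n", label, (System.nanoTime() - start) / 1_000_000);
        return result;
    }

    public static void main(String[] args) {
        // In the real service this would wrap the MyBatis call and the raw SQL
        // separately, making the 800 ms vs. 30 s gap visible in the logs.
        int rows = Timed.timed("query", () -> 42);
        System.out.println(rows);
    }
}
```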
SQL Layer Analysis
We used EXPLAIN ANALYZE to examine the query execution plan.
EXPLAIN ANALYZE
SELECT programhandleidlist FROM anti_transhandle WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0';

Code Layer Analysis
Each row was materialized as a Map object, which for over 2 million rows caused massive overhead and slowed the interface.
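For context, the slow path looked roughly like this (a reconstruction: the column name comes from the SQL above, everything else is assumed, and the array column is treated as a comma-separated string purely for illustration). Each of the ~2 million rows is boxed into a `Map`, and all the ID counting happens in Java:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SlowPath {
    // Counts occurrences of each ID across all rows, entirely in Java.
    // With 2M+ rows, the per-row Map allocation alone dominates the runtime.
    static Map<String, Long> countIds(List<? extends Map<String, ?>> rows) {
        Map<String, Long> counts = new HashMap<>();
        for (Map<String, ?> row : rows) {
            for (String id : String.valueOf(row.get("programhandleidlist")).split(",")) {
                counts.merge(id.trim(), 1L, Long::sum);
            }
        }
        return counts;
    }
}
```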
Optimization Measures
1. SQL Optimization
We transformed the 2 million‑row result into a single row using PostgreSQL's array_agg and unnest functions.
SELECT array_agg(elem) AS concatenated_array
FROM (
SELECT unnest(programhandleidlist) AS elem
FROM anti_transhandle
WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub;

This aggregates the array elements into one array, reducing the amount of data transferred.
2. Move Business Logic to Database
We let the database count occurrences of each ID directly, avoiding Java‑side aggregation.
SELECT elem, COUNT(*) AS count
FROM (
SELECT unnest(programhandleidlist) AS elem
FROM anti_transhandle
WHERE create_time BETWEEN '2024-01-08 00:00:00.0' AND '2024-01-09 00:00:00.0'
) sub
GROUP BY elem;

The query returns each element and its frequency.
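On the Java side, the service then only has to fold the (elem, count) rows into a map. A sketch (row shapes assumed, since the original mapper code is not shown):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupedResult {
    // Collapses the GROUP BY rows into an id -> count map; the work is now
    // proportional to the number of distinct IDs, not the number of source rows.
    static Map<String, Long> toCounts(List<? extends Map<String, ?>> rows) {
        Map<String, Long> counts = new HashMap<>();
        for (Map<String, ?> row : rows) {
            counts.put(String.valueOf(row.get("elem")),
                       ((Number) row.get("count")).longValue());
        }
        return counts;
    }
}
```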
3. Introduce Cache Mechanism
To avoid repeated database hits, we added a local cache using Caffeine, which integrates easily with Spring.
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version>
</dependency>

Cache configuration:
import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.concurrent.TimeUnit;
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(60, TimeUnit.MINUTES));
        return cacheManager;
    }
}

Using the cache in the hit‑rate calculation:
@Autowired
private CacheManager cacheManager;

private static final String YESTERDAY_HIT_RATE_CACHE = "hitRateCache";

@Override
public RuleHitRateResponse ruleHitRate(LocalDate currentDate) {
    double hitRate = cacheManager.getCache(YESTERDAY_HIT_RATE_CACHE)
            .get(currentDate.minusDays(1), () -> {
                Map<String, String> hitRateList = dataTunnelClient.selectTransHandleFlowByTime(currentDate.minusDays(1));
                // additional processing
                return computeHitRate(hitRateList);
            });
    // further processing
}

Conclusion
After applying the above optimizations, the interface response time dropped from 30 seconds to under 0.8 seconds.
The experience highlighted the limitations of row‑oriented relational databases for massive analytical workloads; analytical stores such as ClickHouse or Hive are better suited when millisecond‑level responses over such volumes are required.
Relational databases (MySQL, Oracle, PostgreSQL) excel at transactional workloads requiring strong consistency.
Columnar databases (ClickHouse, HBase) provide superior read performance for large‑scale analytics.
