Optimizing Large-Scale Log Reporting in a Backend System Using Archival and Redis Caching
This article describes how a legacy SSM‑based backend reporting service was refactored by archiving log data, leveraging Redis for hourly counters, and scheduling synchronization tasks to dramatically reduce query time for millions of records without adding external processing frameworks.
Background – A client using a WAF (web application firewall) product saw report pages load extremely slowly as log data grew, which degraded the user experience. The original code contained a single method of nearly 1,000 lines, with unclear variable names and minimal documentation.
Technology Stack – The project is built with SSM (Spring, Spring MVC, MyBatis) plus Gateway, Redis, Kafka, and MySQL. The Gateway records request metadata and forwards it to Kafka for persistence.
Optimization Idea – Instead of querying the raw log table directly for each report, the author introduced an archival approach: log events are categorized by hour and status before storage. When a new event arrives, the system checks whether a record for that hour and status exists; if so, it increments that record's counter, otherwise it inserts a new row. Because reports then read pre-aggregated hourly rows rather than raw log rows, queries scan far less data.
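The check-then-increment rule described above can be sketched with an in-memory map standing in for the archive table. This is a minimal illustration, not the project's code: the class name `HourlyArchive`, the key layout, and the inclusion of a site ID in the key are all assumptions made for the example.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: a map stands in for the hourly archive table. */
final class HourlyArchive {

    // One entry per (site, hour, status); the value is the event count for that bucket.
    private final Map<String, Long> rows = new ConcurrentHashMap<>();

    /** Hypothetical key layout: site ID, hour truncated from the event time, status. */
    private static String bucketKey(String siteId, LocalDateTime time, int status) {
        LocalDateTime hour = time.truncatedTo(ChronoUnit.HOURS);
        return siteId + "|" + hour + "|" + status;
    }

    /** Increment the matching row, or create it with a count of 1 if it is absent. */
    void record(String siteId, LocalDateTime eventTime, int status) {
        rows.merge(bucketKey(siteId, eventTime, status), 1L, Long::sum);
    }

    /** Read the count for the hour that contains the given time. */
    long count(String siteId, LocalDateTime time, int status) {
        return rows.getOrDefault(bucketKey(siteId, time, status), 0L);
    }

    /** Number of archive rows, which is what a report query would have to scan. */
    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        HourlyArchive archive = new HourlyArchive();
        LocalDateTime t = LocalDateTime.of(2024, 1, 1, 10, 5);
        archive.record("site-1", t, 0);
        archive.record("site-1", t.plusMinutes(30), 0); // same hour, same row
        archive.record("site-1", t.plusHours(1), 0);    // next hour, new row
        System.out.println("rows=" + archive.rowCount()
                + " count=" + archive.count("site-1", t, 0));
    }
}
```

In MySQL, the same insert-or-increment step can be expressed as a single `INSERT ... ON DUPLICATE KEY UPDATE` statement against a unique key on the bucket columns, although the source does not say which form the project uses.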
if (pageResultDTO.isPresent()) {
    List<SecurityIncidentDTO> data = pageResultDTO.get().getData();
    Long count = Long.parseLong(pageResultDTO.get().getCount().toString());
    // "正常" is the stored type value meaning "normal"
    long normalCount = data.stream().filter(log -> log.getType().equals("正常")).count();
    response.setTotalCount(count);
    response.setNormalCount(normalCount);
    response.setAbNormalCount(count - normalCount);
    // Group the normal events in memory by hour ("yyyy-MM-dd HH")
    Map<String, List<SecurityIncidentDTO>> collect = data.stream()
            .filter(log -> log.getType().equals("正常"))
            .collect(Collectors.groupingBy(item -> new SimpleDateFormat("yyyy-MM-dd HH").format(
                    com.payegis.antispider.admin.common.utils.DateUtil.pars2Calender(item.getTime()).getTime())));
    // ... additional grouping logic omitted for brevity ...
}
Redis Caching – To avoid frequent database writes, the hourly counters are stored in Redis using atomic increment operations. A scheduled task synchronizes the cached counts back to MySQL every 30 minutes, and another hourly task pre‑loads the last 23 hours of aggregated data into Redis.
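As a rough picture of the counter-plus-sync flow described above, the sketch below uses two plain maps in place of Redis and MySQL so that it runs without any external service. The class `HourlyCounterCache`, the key layout, and the overwrite-on-sync strategy are assumptions for illustration; the source does not show how the project's sync services write their rows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: maps stand in for Redis and for the MySQL archive table. */
final class HourlyCounterCache {

    // Stand-in for Redis: key -> running count for one (site, hour, status) bucket.
    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    // Stand-in for MySQL: key -> last persisted count.
    private final Map<String, Long> database = new HashMap<>();

    /** Hypothetical key layout; the project's real key formats are not shown in the source. */
    static String key(String siteId, String hour, int status) {
        return "report:web:" + siteId + ":" + hour + ":" + status;
    }

    /**
     * Atomic add-or-create in one step. Redis INCR behaves similarly:
     * a missing key is treated as 0 before the increment is applied.
     */
    long increment(String key) {
        return cache.merge(key, 1L, Long::sum);
    }

    /**
     * Periodic sync (assumed strategy): overwrite the persisted count with the
     * cached total, so running the sync twice leaves the same result.
     */
    synchronized int syncToDatabase() {
        int written = 0;
        for (Map.Entry<String, Long> entry : cache.entrySet()) {
            database.put(entry.getKey(), entry.getValue());
            written++;
        }
        return written;
    }

    synchronized long persisted(String key) {
        return database.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        HourlyCounterCache counters = new HourlyCounterCache();
        String k = key("site-1", "2024-01-01 10", 0);
        counters.increment(k);
        counters.increment(k);
        counters.syncToDatabase();
        System.out.println("persisted=" + counters.persisted(k));
    }
}
```

Because the cached value is the full count for its hour, overwriting on each sync keeps the job idempotent; a design that adds deltas instead would need to reset or track what had already been written.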
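/**
 * Archive event details by status and cache the count in Redis for report queries
 */
@Override
public void handleWebEventStatus(Log log) {
    // Read the site ID from the method parameter (the original listing referenced an undeclared variable)
    String siteId = log.getSiteId();
    Date curr = new Date();
    // Truncate to the start of the current hour; this is the archive bucket
    DateTime beginOfHour = DateUtil.beginOfHour(curr);
    // Collapse the rule value into a binary status: 0 stays 0, anything else becomes 1
    Integer eventStatus = log.getAntispiderRule().intValue() == 0 ? 0 : 1;
    // One key per site, plus one key without a site ID, for the same hour and status
    String cacheKey = StrUtil.format(RedisConstant.REPORT_WEB_TIME_EXIST, siteId, DateUtil.format(beginOfHour, timeFormat), eventStatus);
    String cacheKeyAll = StrUtil.format(RedisConstant.REPORT_WEN_TIME_ALL, DateUtil.format(beginOfHour, timeFormat), eventStatus);
    // Increment the counter if the key exists; otherwise create it with a value of 1.
    // Note: exist() followed by increment() or set is not atomic as a pair.
    if (redisService.exist(cacheKeyAll)) {
        redisService.increment(cacheKeyAll, 1L);
    } else {
        redisService.setValueByHour(cacheKeyAll, 1, 2L);
    }
    if (redisService.exist(cacheKey)) {
        redisService.increment(cacheKey, 1L);
    } else {
        redisService.setValueByHour(cacheKey, 1, 2L);
    }
}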
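// Fires at minutes 0 and 30 of every hour
@Scheduled(cron = "0 0/30 * * * ?")
public void synRedisDataToDB() {
    // Serialize runs on a shared lock so executions in this process do not overlap
    synchronized (lock) {
        reportWebEventStatusService.synRedisDataToDB();
        reportWebEventTopService.synRedisDataToDB();
        reportWebIpTopService.synRedisDataToDB();
    }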
}
Parallel Report Assembly – The final dashboard endpoint now assembles each report segment concurrently using Future tasks, reducing overall response time while keeping the original API contract unchanged.
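The concurrent assembly described above can be tried in isolation with a small, self-contained sketch. Everything here is hypothetical: `ParallelDashboard`, its `Dashboard` type, and the two loader methods are stand-ins that sleep briefly to simulate queries, and the 2-second timeout is an addition for the example rather than something the source shows.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative only: assembles two report segments in parallel. */
final class ParallelDashboard {

    /** Hypothetical response type with two segments. */
    static final class Dashboard {
        String top;
        String status;
    }

    // Simulate a slow query without leaking a checked exception.
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static String loadTop(String siteId) {
        pause(100);
        return "top:" + siteId;
    }

    static String loadStatus(String siteId) {
        pause(100);
        return "status:" + siteId;
    }

    /** Submit both segments first, then wait; the waits overlap instead of adding up. */
    static Dashboard assemble(ExecutorService pool, String siteId) {
        Future<String> top = pool.submit(() -> loadTop(siteId));
        Future<String> status = pool.submit(() -> loadStatus(siteId));
        try {
            Dashboard dashboard = new Dashboard();
            // A bounded wait keeps one stuck segment from hanging the whole request.
            dashboard.top = top.get(2, TimeUnit.SECONDS);
            dashboard.status = status.get(2, TimeUnit.SECONDS);
            return dashboard;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while assembling dashboard", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("dashboard segment failed", e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Dashboard dashboard = assemble(pool, "site-1");
            System.out.println(dashboard.top + " " + dashboard.status);
        } finally {
            pool.shutdown();
        }
    }
}
```

The key point is ordering: all tasks are submitted before any `get` call, so total latency approaches that of the slowest segment rather than the sum of all segments, provided the pool has enough threads.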
@Override
public ApiDashboardResponse webDashboardV2(DashboardRequest request) throws Exception {
    ApiDashboardResponse response = new ApiDashboardResponse();
    // Submit every segment first so the queries run concurrently
    Future<ReportWebEventTopVo> topFuture = reportTaskExecutor.submit(() -> {
        return reportWebEventTopService.getWebEventTopVo(request.getSiteId(), request.getTimeType());
    });
    Future<ReportWebEventStatusVo> statusFuture = reportTaskExecutor.submit(() -> {
        return reportWebEventStatusService.getReportWebEventStatus(request.getSiteId(), request.getTimeType());
    });
    // ... other futures omitted ...
    // get() blocks until the corresponding task has finished
    response.setTopVo(topFuture.get());
    response.setStatusVo(statusFuture.get());
    // ... assemble remaining data ...
    return response;
}
Conclusion – By archiving logs, caching hourly aggregates in Redis, and synchronizing data in scheduled jobs, the system reduced query time for 1.5 million log entries to under one second, a several‑fold performance improvement, without introducing new middleware.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.