Optimizing Large-Scale Log Reporting in a Backend System Using Archival and Redis Caching

This article describes how a legacy SSM‑based backend reporting service was refactored by archiving log data, leveraging Redis for hourly counters, and scheduling synchronization tasks to dramatically reduce query time for millions of records without adding external processing frameworks.

Top Architect

Background – A client using a WAF (web application firewall) product saw report pages load extremely slowly as log data grew, which hurt the user experience. The original code contained a single method of nearly 1,000 lines, with unclear variable names and minimal documentation.

Technology Stack – The project is built with SSM (Spring, Spring MVC, MyBatis) plus Gateway, Redis, Kafka, and MySQL. The Gateway records request metadata and forwards it to Kafka for persistence.

Optimization Idea – Instead of querying the database directly for each report, the author introduced an archival approach: log events are categorized by hour and status before storage. When a new event arrives, the system checks if a record for that hour and status exists; if so, it increments a counter, otherwise it inserts a new row. This reduces the amount of data scanned during queries.

if (pageResultDTO.isPresent()) {
    List<SecurityIncidentDTO> data = pageResultDTO.get().getData();
    Long count = Long.parseLong(pageResultDTO.get().getCount().toString());
    // "正常" is the stored type value meaning "normal"
    long normalCount = data.stream().filter(log -> log.getType().equals("正常")).count();
    response.setTotalCount(count);
    response.setNormalCount(normalCount);
    response.setAbNormalCount(count - normalCount);
    // Group the normal entries by hour ("yyyy-MM-dd HH") in memory
    Map<String, List<SecurityIncidentDTO>> collect = data.stream()
        .filter(log -> log.getType().equals("正常"))
        .collect(Collectors.groupingBy(item -> new SimpleDateFormat("yyyy-MM-dd HH").format(
            com.payegis.antispider.admin.common.utils.DateUtil.pars2Calender(item.getTime()).getTime())));
    // ... additional grouping logic omitted for brevity ...
}
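
To make the hour-and-status archival idea concrete, here is a minimal in-memory sketch. It is not the article's code: the class `HourlyArchive`, its methods, and the string key layout are hypothetical, and a `ConcurrentHashMap` stands in for the archive table. It only shows the "increment if the bucket exists, otherwise insert" step described above.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the archive table: one row per (site, hour, status). */
class HourlyArchive {
    private final Map<String, Long> rows = new ConcurrentHashMap<>();

    /** Assumed key layout; a real table would use a unique index on these columns. */
    private static String key(String siteId, LocalDateTime time, int status) {
        return siteId + "|" + time.truncatedTo(ChronoUnit.HOURS) + "|" + status;
    }

    /** Increment the bucket if it exists, otherwise create it with a count of 1. */
    void record(String siteId, LocalDateTime eventTime, int status) {
        rows.merge(key(siteId, eventTime, status), 1L, Long::sum);
    }

    /** Number of events archived for the hour that contains the given time. */
    long count(String siteId, LocalDateTime time, int status) {
        return rows.getOrDefault(key(siteId, time, status), 0L);
    }

    /** Number of archive rows, which is what a report query has to scan. */
    int rowCount() {
        return rows.size();
    }
}
```

With this shape, a report scans at most one row per site, hour, and status rather than one row per request. In MySQL the same step could probably be written as an `INSERT ... ON DUPLICATE KEY UPDATE` against a unique index, although the article does not say which statement the project uses.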

Redis Caching – To avoid frequent database writes, the hourly counters are stored in Redis using atomic increment operations. A scheduled task synchronizes the cached counts back to MySQL every 30 minutes, and another hourly task pre‑loads the last 23 hours of aggregated data into Redis.

/**
 * Archive event details by status and cache the count in Redis for report queries
 */
@Override
public void handleWebEventStatus(Log log) {
    String siteId = log.getSiteId();
    Date curr = new Date();
    // All events within the same hour share one counter key
    DateTime beginOfHour = DateUtil.beginOfHour(curr);
    // Rule 0 maps to status 0; any other rule maps to status 1
    Integer eventStatus = log.getAntispiderRule().intValue() == 0 ? 0 : 1;
    // Per-site counter key and all-sites counter key for this hour and status
    String cacheKey = StrUtil.format(RedisConstant.REPORT_WEB_TIME_EXIST, siteId, DateUtil.format(beginOfHour, timeFormat), eventStatus);
    String cacheKeyAll = StrUtil.format(RedisConstant.REPORT_WEN_TIME_ALL, DateUtil.format(beginOfHour, timeFormat), eventStatus);
    // Increment the all-sites counter, creating it on the first event of the hour
    if (redisService.exist(cacheKeyAll)) {
        redisService.increment(cacheKeyAll, 1L);
    } else {
        redisService.setValueByHour(cacheKeyAll, 1, 2L);
    }
    // Same for the per-site counter
    if (redisService.exist(cacheKey)) {
        redisService.increment(cacheKey, 1L);
    } else {
        redisService.setValueByHour(cacheKey, 1, 2L);
    }
}
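
Because the key built in handleWebEventStatus embeds the start of the hour, a report over the last 24 hours comes down to enumerating 24 hour labels (the 23 pre-loaded hours plus the current one) and summing their counters. The sketch below illustrates that lookup under stated assumptions: `HourWindow` and its methods are hypothetical, the `yyyy-MM-dd HH` pattern is a guess at what `timeFormat` holds, and a plain `Map` stands in for the cached counters.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that enumerates hour buckets for a report window. */
class HourWindow {
    // Assumed hour label format; the article does not show the value of timeFormat.
    private static final DateTimeFormatter HOUR = DateTimeFormatter.ofPattern("yyyy-MM-dd HH");

    /** Labels for the current hour and the (hours - 1) hours before it, oldest first. */
    static List<String> lastHours(LocalDateTime now, int hours) {
        LocalDateTime current = now.truncatedTo(ChronoUnit.HOURS);
        List<String> labels = new ArrayList<>();
        for (int i = hours - 1; i >= 0; i--) {
            labels.add(HOUR.format(current.minusHours(i)));
        }
        return labels;
    }

    /** Sum the counters of every hour in the window; hours without a counter count as 0. */
    static long total(Map<String, Long> countsByHour, LocalDateTime now, int hours) {
        long sum = 0;
        for (String label : lastHours(now, hours)) {
            sum += countsByHour.getOrDefault(label, 0L);
        }
        return sum;
    }
}
```

The cost of the query then depends on the number of hours in the window, not on the number of raw log rows.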

@Scheduled(cron = "0 0/30 * * * ?")
public void synRedisDataToDB() {
    synchronized (lock) {
        reportWebEventStatusService.synRedisDataToDB();
        reportWebEventTopService.synRedisDataToDB();
        reportWebIpTopService.synRedisDataToDB();
    }
}
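
The division of labour between the cache and the database can be sketched without Redis or MySQL. In the example below, which is an illustration and not the project's `synRedisDataToDB` implementation, a `ConcurrentHashMap` of `AtomicLong` values plays the role of the Redis counters and a `HashMap` plays the role of the archive table. The class `CounterSync` and the choice to flush deltas (as opposed to overwriting absolute counts) are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical write-behind counter: cheap atomic increments, periodic flush. */
class CounterSync {
    // Stand-in for the Redis counters.
    private final ConcurrentHashMap<String, AtomicLong> cache = new ConcurrentHashMap<>();
    // Stand-in for the MySQL archive table.
    private final Map<String, Long> database = new HashMap<>();

    /** Hot path: one atomic increment per event, no database write. */
    void increment(String key) {
        cache.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    /** Scheduled path: move what accumulated since the last run into the database. */
    synchronized void syncToDatabase() {
        for (Map.Entry<String, AtomicLong> entry : cache.entrySet()) {
            // getAndSet drains the counter atomically, so concurrent increments are not lost.
            long delta = entry.getValue().getAndSet(0L);
            if (delta > 0) {
                database.merge(entry.getKey(), delta, Long::sum);
            }
        }
    }

    /** Count currently persisted for a key. */
    synchronized long persisted(String key) {
        return database.getOrDefault(key, 0L);
    }
}
```

One detail worth knowing when applying this to real Redis: the `INCR` command creates a missing key with the value 0 before incrementing, so an existence check before the increment is not needed for correctness and can race under concurrency. Setting the expiry is then a separate step.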

Parallel Report Assembly – The final dashboard endpoint now assembles each report segment concurrently using Future tasks, reducing overall response time while keeping the original API contract unchanged.

@Override
public ApiDashboardResponse webDashboardV2(DashboardRequest request) throws Exception {
    ApiDashboardResponse response = new ApiDashboardResponse();
    Future<ReportWebEventTopVo> topFuture = reportTaskExecutor.submit(() -> {
        return reportWebEventTopService.getWebEventTopVo(request.getSiteId(), request.getTimeType());
    });
    Future<ReportWebEventStatusVo> statusFuture = reportTaskExecutor.submit(() -> {
        return reportWebEventStatusService.getReportWebEventStatus(request.getSiteId(), request.getTimeType());
    });
    // ... other futures omitted ...
    response.setTopVo(topFuture.get());
    response.setStatusVo(statusFuture.get());
    // ... assemble remaining data ...
    return response;
}
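
One thing the excerpt leaves open is what happens when a segment is slow or fails, since `Future.get()` without arguments blocks indefinitely. The following self-contained sketch shows the same fan-out pattern with a timeout and cancellation. It is a variation under stated assumptions, not the article's code: `DashboardAssembler`, `Dashboard`, and the string payloads are hypothetical placeholders for the real services and view objects.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical assembler that runs two report queries concurrently. */
class DashboardAssembler {
    /** Placeholder for the response object. */
    static final class Dashboard {
        final String top;
        final String status;

        Dashboard(String top, String status) {
            this.top = top;
            this.status = status;
        }
    }

    static Dashboard assemble(ExecutorService executor,
                              Callable<String> topQuery,
                              Callable<String> statusQuery,
                              long timeoutMillis)
            throws ExecutionException, TimeoutException, InterruptedException {
        // Submit both queries first so they run in parallel.
        Future<String> top = executor.submit(topQuery);
        Future<String> status = executor.submit(statusQuery);
        try {
            // Each get waits at most timeoutMillis for its own result.
            return new Dashboard(
                    top.get(timeoutMillis, TimeUnit.MILLISECONDS),
                    status.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (ExecutionException | TimeoutException | InterruptedException e) {
            // Stop whatever is still running before propagating the failure.
            top.cancel(true);
            status.cancel(true);
            throw e;
        }
    }
}
```

When both queries take about the same time, the total latency is close to that of the slowest one rather than the sum of the two, provided the executor has at least as many free threads as there are segments.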

Conclusion – By archiving logs, caching hourly aggregates in Redis, and synchronizing data in scheduled jobs, the system reduced query time for 1.5 million log entries to under one second, a several‑fold performance improvement achieved without introducing new middleware.
