How to Supercharge a Weekly Report Service Handling 100M Records

This article walks through a step-by-step performance overhaul of a weekly report service whose table is expected to exceed 100 million rows. It covers indexing, weekly sharding, Redis caching, and data pre-warming, which together cut query latency and reduce database load.

Java Interview Crash Guide

Oct 7, 2021

Background

A new requirement was assigned: build a weekly report service that will ingest tens of millions of new records each week and is expected to exceed one hundred million rows within two months. This article records how the service's performance was optimized.

Details

Requirement Overview

The service provides a weekly summary of each user's usage time and last active time. The backend only needs to expose an API that returns one user's data for a given week, but the user base is in the tens of millions, and about ten million of those users need a weekly report. That means roughly ten million new rows per week and over one hundred million rows within about two months.

Problem Statement

The data is stored in a single database table. At this volume, a naive full-table scan on every request would degrade performance severely, and the service could not keep up with the incoming request load.

Optimization 1: Add Index

Creating a composite index on user_id and week_start_date reduces query time from over 10 seconds to under 10 milliseconds when tested with ten million rows.
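As a rough illustration of the index and the lookup it serves, here is a minimal JDBC sketch. The base table name t_weekly_info is inferred from the weekly table name quoted later in the article; the index name idx_user_week and the columns usage_seconds and last_active_time are assumptions made for this example, not the original schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.LocalDate;
import java.util.Optional;

class WeeklyInfoQueries {
    // Composite index on the two columns the lookup filters by.
    // The index name is hypothetical.
    static final String CREATE_INDEX =
        "CREATE INDEX idx_user_week ON t_weekly_info (user_id, week_start_date)";

    // usage_seconds and last_active_time are assumed column names.
    static final String FIND_REPORT =
        "SELECT usage_seconds, last_active_time FROM t_weekly_info "
        + "WHERE user_id = ? AND week_start_date = ?";

    /** Returns the user's usage seconds for the given week, if a row exists. */
    static Optional<Long> findUsageSeconds(Connection conn, long userId, LocalDate weekStart)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(FIND_REPORT)) {
            ps.setLong(1, userId);
            ps.setDate(2, java.sql.Date.valueOf(weekStart));
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? Optional.of(rs.getLong("usage_seconds")) : Optional.empty();
            }
        }
    }
}
```

Both predicates are equality filters on the indexed columns, so the database can locate the row through the index instead of scanning the table.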

Optimization 2: Sharding by Week

Because data is added once per week, the table is partitioned into separate weekly tables (e.g., t_weekly_info_20210705). Queries target the specific week’s table, dramatically shrinking the scanned data set. After sharding, the index can be simplified to only user_id since the week column is constant within each table.
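Routing a query to the right weekly table comes down to mapping a date to a table name. The sketch below assumes weeks start on Monday (the example suffix 20210705 falls on a Monday) and that the suffix is the week's start date in yyyyMMdd form; the class name, method names, and selected columns are hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;

class WeeklyTables {
    static final String PREFIX = "t_weekly_info_";

    /** Maps any day to the table of the week containing it (Monday start is an assumption). */
    static String tableNameFor(LocalDate anyDayInWeek) {
        LocalDate weekStart =
            anyDayInWeek.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        // BASIC_ISO_DATE formats a LocalDate as yyyyMMdd, e.g. 20210705.
        return PREFIX + weekStart.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    /** Builds the per-week lookup; only user_id remains in the WHERE clause. */
    static String findByUserSql(LocalDate anyDayInWeek) {
        // The table name is derived from a date, never from user input,
        // so string concatenation does not open an injection path here.
        return "SELECT usage_seconds, last_active_time FROM "
            + tableNameFor(anyDayInWeek) + " WHERE user_id = ?";
    }
}
```

For example, `WeeklyTables.tableNameFor(LocalDate.of(2021, 7, 7))` resolves to `t_weekly_info_20210705`.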

Optimization 3: Cache with Redis

Redis is used to cache weekly reports. On a request, the service first checks Redis; if the data is missing, it falls back to the database and then populates Redis. Empty results are cached as well, so repeated requests for a user with no report do not hit the database every time. Cache expiration is set to about one day, matching typical weekly-report access patterns. Cache breakdown remains a risk: when a popular entry is missing or has just expired, many concurrent requests can miss the cache together and all fall through to the database. Data pre-warming mitigates this.
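The read path described above is the cache-aside pattern with empty-result caching. The following is a minimal sketch, not the service's actual code: ReportCache is a hypothetical stand-in for a Redis client so the example runs without a server, and the key format, the EMPTY_MARKER sentinel, and the string-serialized report are all assumptions.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical stand-in for a Redis client; not a real library API. */
interface ReportCache {
    String get(String key); // returns null when the key is absent
    void set(String key, String value, Duration ttl);
}

/** In-memory implementation for illustration only; it ignores the TTL. */
class InMemoryReportCache implements ReportCache {
    private final Map<String, String> store = new HashMap<>();
    public String get(String key) { return store.get(key); }
    public void set(String key, String value, Duration ttl) { store.put(key, value); }
}

class WeeklyReportReader {
    static final String EMPTY_MARKER = "__EMPTY__"; // sentinel for "no report exists"
    static final Duration TTL = Duration.ofDays(1); // about one day, as in the article

    private final ReportCache cache;
    private final Function<String, Optional<String>> dbLoader; // hypothetical DB lookup by key

    WeeklyReportReader(ReportCache cache, Function<String, Optional<String>> dbLoader) {
        this.cache = cache;
        this.dbLoader = dbLoader;
    }

    Optional<String> read(long userId, String weekStart) {
        String key = "weekly:" + weekStart + ":" + userId;
        String cached = cache.get(key);
        if (cached != null) {
            // Cache hit: either a real report or the cached "empty" marker.
            return EMPTY_MARKER.equals(cached) ? Optional.empty() : Optional.of(cached);
        }
        // Cache miss: fall back to the database, then populate the cache,
        // storing the marker when no row exists.
        Optional<String> fromDb = dbLoader.apply(key);
        cache.set(key, fromDb.orElse(EMPTY_MARKER), TTL);
        return fromDb;
    }
}
```

With a real Redis client, the `set` call would map to a write with an expiry, and a second read for the same user and week would be served from Redis whether or not a report exists.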

Optimization 4: Data Pre‑warming

Data pre-warming (asynchronous cache refresh) loads recent weeks' data into Redis ahead of time, before users request it. Most users only view the last one or two weeks, so caching just this slice keeps responses fast while keeping memory usage reasonable: each week takes about 700 MB, which fits comfortably in a Redis instance of about 4 GB.
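A pre-warming job can be sketched as: work out the start dates of the most recent weeks, load each week's rows, and write them into the cache. In the sketch below, a plain Map stands in for Redis, loadWeek is a hypothetical loader returning user ID to serialized report for one week, and the Monday week start and key format are assumptions.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class CachePrewarmer {
    /** Start dates of the most recent weeks, newest first (Monday start is an assumption). */
    static List<LocalDate> recentWeekStarts(LocalDate today, int weeks) {
        LocalDate monday = today.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        List<LocalDate> result = new ArrayList<>();
        for (int i = 0; i < weeks; i++) {
            result.add(monday.minusWeeks(i));
        }
        return result;
    }

    /**
     * Copies every report from the recent weeks into the cache and
     * returns the number of entries written.
     */
    static int prewarm(LocalDate today, int weeks,
                       Function<LocalDate, Map<Long, String>> loadWeek,
                       Map<String, String> cache) {
        int written = 0;
        for (LocalDate weekStart : recentWeekStarts(today, weeks)) {
            String week = weekStart.format(DateTimeFormatter.BASIC_ISO_DATE);
            for (Map.Entry<Long, String> row : loadWeek.apply(weekStart).entrySet()) {
                cache.put("weekly:" + week + ":" + row.getKey(), row.getValue());
                written++;
            }
        }
        return written;
    }
}
```

In a real deployment this would typically run as a scheduled background task after each week's data lands, and it would read the weekly table in batches rather than loading ten million rows in one call.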

Conclusion

Four simple yet effective techniques—indexing, weekly sharding, Redis caching, and data pre‑warming—significantly improve the service’s performance by minimizing database queries and reducing the amount of data scanned per request.
