Implementing Cumulative Aggregation per Code and Date in MySQL Using Variables and the WITH Clause
This article demonstrates how to calculate a running total for each code and date in a MySQL table by using user-defined variables, addresses the need to reset totals per code, and shows an optimized version with the WITH clause and indexing recommendations.
A colleague raised a requirement to compute cumulative statistics in a MySQL table tt that contains three columns: code (identifier), cdate (date), and ctotal (numeric value). The original query simply grouped by code and cdate to obtain the sum for each month.
The business need, however, is to display a running total for each code across dates, e.g., the first row shows total = 10 , the second row total = 10+9 = 19 , the third row total = 10+9+11 = 30 , and so on.
In MySQL, this can be achieved by defining a user variable that accumulates the sum row‑by‑row. An example implementation is:
SELECT code, cdate, ctotal, @running := IF(@prev_code = code, @running + ctotal, ctotal) AS total, @prev_code := code FROM tt JOIN (SELECT @running := 0, @prev_code := NULL) vars ORDER BY code, cdate;
This approach works but fails to reset the accumulator when the code changes, resulting in incorrect totals for subsequent groups (e.g., a total of 35 instead of the expected 5 for a new code).
To fix the reset issue, the query can be extended to detect a change in code and re‑initialize the accumulator:
SELECT code, cdate, ctotal, @g_total := IF(@prev_code = code, @g_total + ctotal, ctotal) AS g_total, @prev_code := code FROM tt JOIN (SELECT @g_total := 0, @prev_code := NULL) vars ORDER BY code, cdate;
With MySQL 8.0, the same result can be expressed more cleanly using a Common Table Expression (CTE) and window functions, eliminating the need to read the table twice:
WITH ordered AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY code ORDER BY cdate) AS rn FROM tt ) SELECT code, cdate, ctotal, SUM(ctotal) OVER (PARTITION BY code ORDER BY cdate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS g_total FROM ordered;
For better performance, especially in large datasets, adding appropriate indexes on the code and cdate columns is recommended to speed up the ordering and partitioning operations.
In summary, implementing the required cumulative aggregation involves understanding the business logic, mapping it to SQL constructs (variables, CTEs, window functions), and considering non‑functional aspects such as indexing and query efficiency.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.