Estimating Daily Active Users (DAU) Using New Users and Retention Modeling
This article explains how to estimate future daily active users (DAU) for an app by modeling the accumulation of new users and their retention decay. It addresses the problem of retention rates that have changed over the product's history, and proposes combining recent averages with curve‑fitted functions to predict long‑term user activity.
Introduction
New users, retention, and daily active users (DAU) are common product metrics. The article poses two typical forecasting questions: (1) How many DAU can we expect next quarter based on current trends? (2) Is a proposed daily new‑user target realistic for achieving a specific quarterly DAU goal?
DAU as an Accumulation Process
Stacking Concept
Any day's active users are the sum of historical daily new users that have survived through retention decay. New users start to decay from the second day after acquisition; the earlier the acquisition, the smaller the remaining proportion.
The accompanying diagram (not shown) illustrates how each day's new users contribute to the current DAU as vertical slices.
Notation
To keep the formulas compact, the following symbols are defined:
t – current day in the product’s history.
T – future day whose DAU we want to estimate.
N(t) – new users on day t.
R(t, d) – retention rate of users acquired on day t after d days.
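Using these symbols, the target estimate is the sum, over every acquisition day i up to T, of that day's new users multiplied by their retention at age T − i: $\mathrm{DAU}(T) = \sum_{i \le T} N(i)\, R(i, T - i)$, with $R(i, 0) = 1$ on the acquisition day.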
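The accumulation described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the team's implementation: the function name `estimate_dau`, the zero-based day indexing, and the example retention curve are all assumptions made for the sake of the example.

```python
from typing import Callable, Sequence


def estimate_dau(
    new_users: Sequence[float],
    retention: Callable[[int, int], float],
    target_day: int,
) -> float:
    """Sum N(i) * R(i, target_day - i) over cohorts acquired on days 0..target_day.

    new_users[i] is the number of new users on day i (hypothetical indexing).
    retention(i, d) is the share of the day-i cohort still active d days later,
    and is expected to return 1.0 for d == 0.
    """
    total = 0.0
    for day, cohort_size in enumerate(new_users[: target_day + 1]):
        age = target_day - day  # how old this cohort is on the target day
        total += cohort_size * retention(day, age)
    return total


# Toy example: every cohort halves each day (an assumed curve, for illustration only).
def halving(day: int, age: int) -> float:
    return 0.5 ** age


example_dau = estimate_dau([100.0, 100.0, 100.0], halving, target_day=2)
```

With three cohorts of 100 users and a retention that halves daily, the day‑2 estimate is 25 + 50 + 100 = 175: the oldest cohort contributes the thinnest slice.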
Required Input Data
Daily new‑user count (historical values are known; future planned values are input directly).
Retention rate for each cohort after a given number of days (unknown and needs to be modeled).
The core problem reduces to determining the retention‑decay function.
Challenges with Historical Retention
Two issues arise:
Retention rates have changed over the product’s three‑year history, making simple averages unreliable.
Long‑term forecasts (e.g., six months ahead) require retention data beyond the observed history.
Short‑Term vs. Long‑Term Retention
Short‑Term Variation
Caused by channel expansion or operational campaigns; can be approximated with recent averages.
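A recent average of this kind might be computed as follows. This is a sketch under stated assumptions: the cohort matrix layout (one row per acquisition day, oldest first; one column per cohort age; NaN where the age has not been observed yet), the function name, and the window size are hypothetical and not taken from the original article.

```python
import numpy as np


def recent_average_retention(cohort_retention: np.ndarray, window: int) -> np.ndarray:
    """Average retention by cohort age over the most recent `window` cohorts.

    Rows are cohorts (oldest first), columns are ages in days.
    Cells that have not been observed yet are NaN and are ignored.
    """
    recent = cohort_retention[-window:]
    return np.nanmean(recent, axis=0)


# Toy matrix: the newest cohort has no day-2 observation yet.
matrix = np.array([
    [1.0, 0.4, 0.2],
    [1.0, 0.5, 0.3],
    [1.0, 0.7, np.nan],
])
avg_curve = recent_average_retention(matrix, window=2)
```

Restricting the window to recent cohorts keeps the average close to current channel mix and campaign effects, at the cost of fewer observations for older ages.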
Long‑Term Variation
Driven by product iterations and evolving user habits; maintaining a full cohort‑by‑cohort retention matrix is computationally heavy and may not reflect future trends.
Proposed Solution
Separate the retention contribution into two parts using the current day as a split point. The first part uses recent (e.g., one‑year) average retention, while the second part models the decay beyond the stable period.
For the stable period, the day‑over‑day retention ratio $R_i / R_{i+1}$ becomes approximately constant, suggesting an exponential decay. Before the stable period, a power‑law function fits better.
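The step from a constant ratio to exponential decay can be spelled out as follows. The symbols $q$ (the constant ratio), $d_0$ (the first day of the stable period), and $\lambda$ are introduced here for illustration and do not appear in the original.

```latex
\frac{R_i}{R_{i+1}} = q \quad (q > 1,\ i \ge d_0)
\;\Longrightarrow\;
R_{d_0 + k} = R_{d_0}\, q^{-k} = R_{d_0}\, e^{-\lambda k},
\qquad \lambda = \ln q .
```

In other words, losing the same fraction of remaining users each day is exactly what an exponential curve describes.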
Both segments are fitted to historical samples:
Apply a logarithmic transformation to obtain linear relationships.
Estimate parameters via least‑squares regression.
Optimize using Adjusted R‑squared and grid search.
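The three steps above can be sketched with NumPy. The article does not say what the grid search ranges over or how the two Adjusted R‑squared values are combined, so this sketch assumes the grid is the split day between the two segments and the score is the mean of the two Adjusted R‑squared values. The function names and the parameter names a, b, c, and lam are likewise hypothetical.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, adjusted R-squared). Needs at least 3 points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = float(np.sum((y - (intercept + slope * x)) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    n = len(x)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # one predictor
    return intercept, slope, adj_r2


def fit_piecewise_retention(days, rates, min_points=3):
    """Grid-search the split index; power law before it, exponential from it on.

    days must be >= 1 and rates > 0 so that the logarithms are defined.
    Returns None when there are too few points for two segments.
    """
    days = np.asarray(days, dtype=float)
    rates = np.asarray(rates, dtype=float)
    best = None
    for k in range(min_points, len(days) - min_points + 1):
        # Early stage, R = a * d**(-b):  log R = log a - b * log d
        log_a, neg_b, adj_early = fit_line(np.log(days[:k]), np.log(rates[:k]))
        # Stable stage, R = c * exp(-lam * d):  log R = log c - lam * d
        log_c, neg_lam, adj_stable = fit_line(days[k:], np.log(rates[k:]))
        score = (adj_early + adj_stable) / 2.0  # assumed selection criterion
        if best is None or score > best["score"]:
            best = {
                "split_day": float(days[k]),  # first day of the stable segment
                "a": float(np.exp(log_a)), "b": float(-neg_b),
                "c": float(np.exp(log_c)), "lam": float(-neg_lam),
                "score": score,
            }
    return best
```

The logarithmic transformation is what makes ordinary least squares applicable: a power law is linear in (log d, log R), and an exponential is linear in (d, log R). On real data the residuals are in log space, so the fit weights relative rather than absolute errors.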
The resulting piecewise function captures both stages of decay: a power law in the early stage and an exponential in the stable stage.
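A generic form consistent with this description is given below. The fitted values and the exact parameterization from the original figure are not reproduced here; the names $a$, $b$, $c$, $\lambda$, and the split day $d_0$ are assumptions.

```latex
R(d) \approx
\begin{cases}
a\, d^{-b}, & 1 \le d < d_0 \quad \text{(early stage, power law)} \\
c\, e^{-\lambda d}, & d \ge d_0 \quad \text{(stable stage, exponential)}
\end{cases}
```

The power law drops steeply in the first days and then flattens, while the exponential segment lets the curve be extrapolated to ages beyond the observed history.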
Final Model
The model outputs the estimated DAU for any future day T as a function of:
The current day's DAU, decomposed into the surviving cohorts that make it up.
Planned daily new‑user numbers for the forecast horizon.
Recent retention‑rate series (e.g., one‑year sample).
Compared with the simple heuristic “DAU / New Users”, the model provides controlled inputs and reduces sensitivity to historical fluctuations.
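One way the three inputs listed in this section could be combined is sketched below. It rests on assumptions that the article does not state: a single retention curve R(d) shared by all cohorts, existing cohorts decaying by the conditional ratio R(age + h) / R(age), and new users planned for each day after the current one. All function and parameter names are hypothetical.

```python
from typing import Callable, Dict, Sequence


def forecast_dau(
    active_by_age: Dict[int, float],
    planned_new_users: Sequence[float],
    retention: Callable[[int], float],
    horizon: int,
) -> float:
    """Forecast DAU `horizon` days after the current day.

    active_by_age maps cohort age (in days, today) to users from that cohort
    who are active today. planned_new_users[j - 1] is the plan for j days from now.
    retention(d) is the share of a cohort still active at age d, with retention(0) == 1.
    """
    if horizon < 1 or horizon > len(planned_new_users):
        raise ValueError("horizon must be between 1 and len(planned_new_users)")
    # Existing users: a cohort aged `age` today will be aged `age + horizon` on the
    # target day, so it shrinks by the conditional ratio R(age + horizon) / R(age).
    existing = sum(
        count * retention(age + horizon) / retention(age)
        for age, count in active_by_age.items()
    )
    # Future cohorts: users acquired j days from now are horizon - j days old on the target day.
    future = sum(
        planned_new_users[j - 1] * retention(horizon - j)
        for j in range(1, horizon + 1)
    )
    return existing + future


# Toy example with an assumed curve that halves every day.
forecast = forecast_dau(
    active_by_age={0: 100.0, 1: 50.0},
    planned_new_users=[100.0, 100.0],
    retention=lambda d: 0.5 ** d,
    horizon=2,
)
```

Here the two existing cohorts contribute 25 + 12.5 and the two planned cohorts contribute 50 + 100, for a forecast of 187.5. Because planned new users enter as an explicit input, the same function can be rerun with different plans to check whether a proposed daily new‑user target is enough to reach a DAU goal.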
Conclusion
The presented methodology enables more reliable DAU forecasting by explicitly modeling new‑user accumulation and retention decay, handling both short‑term variations and long‑term trends through a combination of recent averages and curve‑fitted decay functions.
Liulishuo Tech Team