How to Build a Real‑Time Page Performance Monitoring System
This article explains why page performance monitoring is crucial for user experience and SEO, then walks through the design of a complete end‑to‑end monitoring system in three parts: front‑end data reporting via the Navigation Timing API, server‑side log collection and storage with Nginx (including sampling and aggregation), and visual dashboards.
Background
Why monitor page performance? Poor performance hurts revenue because users may abandon a slow page, especially on mobile where tolerance for latency is low. Slow loading also harms SEO; high bounce rates lead Google to lower rankings. Since performance degrades over iterations, a continuous monitoring system is needed to evaluate, alert, and guide optimization.
Existing tools like GTmetrix provide static analysis but cannot reflect real‑world user conditions, regional speeds, or functional timings such as time to first click or ad display. Therefore we embed JavaScript on pages to collect real user data, report it to a server, aggregate, process, and visualize it.
Design of the Monitoring System
The system consists of three parts:
Front‑end reporting: how to record timing points, how to report the data, and how to sample it
Data processing and storage
Data presentation
Front‑end Reporting
We inject a JavaScript snippet to capture performance metrics that reflect user experience, such as white‑screen time, first‑screen time, and time to interactive.
Determining the Start Point
The start point is when the user presses Enter after entering the URL. Modern browsers provide the Navigation Timing API to obtain this timestamp.
In Chrome, open the console and inspect `performance.timing` to see a list of timestamps measured in milliseconds since the Unix epoch. A zero value indicates the event did not occur.
The `navigationStart` property marks the moment the browser begins the request (i.e., the user hits Enter or refreshes the page).
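As a sketch, the common latency metrics can be derived from these timestamps by subtracting `navigationStart`; the helper below runs on any object shaped like `performance.timing`:

```javascript
// Derive common latency metrics from a Navigation Timing object.
// Timestamps are milliseconds since the Unix epoch; each delta is
// measured from navigationStart (the user hitting Enter / refresh).
function computeMetrics(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    tcp: t.connectEnd - t.connectStart,
    firstByte: t.responseStart - t.navigationStart,
    domReady: t.domContentLoadedEventEnd - t.navigationStart,
    load: t.loadEventEnd - t.navigationStart,
  };
}

// In a browser: computeMetrics(performance.timing)
```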
White‑Screen Time
White‑screen time is the interval until the first visual element appears. It is not simply the time to first byte because the page may still be blank while header resources load.
Three scenarios are considered:
Static pages without JavaScript rendering: white‑screen ends after header resources load. A script placed at the end of the `<head>` can log the time.
Pages built with frameworks like Vue or React: rendering occurs after JavaScript execution or asynchronous data fetching, so white‑screen ends after the loading indicator disappears.
<code><!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<!-- Header resources -->
<link rel="stylesheet" href="style.css">
<title>Document</title>
<script>
// Record white‑screen end time
var time = +new Date() - performance.timing.navigationStart;
</script>
</head>
<body>
</body>
</html></code>
First‑Screen Time
First‑screen time is the moment when all resources required for the initial viewport are fully rendered. For image‑heavy pages, it is after the last image loads; for data‑driven pages, it is after data insertion.
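One way to approximate this for image‑heavy pages is to record the load time of every image in the initial viewport and take the latest one. A minimal sketch (the tracker name and wiring are illustrative; in a browser you would call `recordImageLoad(Date.now())` from each first‑screen image's onload handler):

```javascript
// Track first-screen time: the first screen is considered rendered
// once the last image inside the initial viewport has loaded.
function createFirstScreenTracker(navigationStart) {
  var lastImageLoad = navigationStart;
  return {
    // Call from each first-screen image's onload handler.
    recordImageLoad: function (now) {
      if (now > lastImageLoad) lastImageLoad = now;
    },
    // Latest first-screen image load, relative to navigationStart.
    firstScreenTime: function () {
      return lastImageLoad - navigationStart;
    },
  };
}
```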
Reporting Method
After measuring timestamps, the data must be sent to the backend with minimal impact on the page. An `<img>` tag with a GET request is used because it avoids CORS issues and works across browsers.
Its advantages: no AJAX cross‑origin problems (it can request different origins), and universal support even in old browsers.
<code>var i = new Image();
i.onload = i.onerror = i.onabort = function () {
i = i.onload = i.onerror = i.onabort = null;
};
i.src = url;</code>
Modern browsers also support `navigator.sendBeacon`, which sends small data asynchronously and even works when the page is closed. The final strategy prefers `sendBeacon` when available, otherwise falls back to the `<img>` method.
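Putting the two transports together, a minimal sketch of a reporting function (the injectable `nav` and `ImageCtor` parameters are only there to keep the sketch testable outside a browser):

```javascript
// Report metrics, preferring navigator.sendBeacon and falling back to
// an <img> GET request. nav and ImageCtor default to the browser
// globals and can be injected for testing.
function report(url, data, nav, ImageCtor) {
  if (nav === undefined) nav = typeof navigator !== 'undefined' ? navigator : null;
  if (ImageCtor === undefined) ImageCtor = typeof Image !== 'undefined' ? Image : null;
  var query = Object.keys(data)
    .map(function (k) { return encodeURIComponent(k) + '=' + encodeURIComponent(data[k]); })
    .join('&');
  if (nav && nav.sendBeacon) {
    nav.sendBeacon(url, query);
  } else if (ImageCtor) {
    var img = new ImageCtor();
    img.onload = img.onerror = img.onabort = function () {
      img = img.onload = img.onerror = img.onabort = null; // release references
    };
    img.src = url + '?' + query;
  }
  return query; // returned so callers (and tests) can inspect what was sent
}
```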
<code>navigator.sendBeacon(url, data ? $.param(data) : null);</code>
Sampling
Because the volume of reported data is huge, sampling is applied on the client side. The sampling rate is indicated by a `rate` parameter (e.g., `rate=10` for 1/10 sampling).
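A sketch of the client‑side sampling decision, assuming `rate=10` means roughly one in ten page views reports:

```javascript
// Decide whether this page view should report its data, given a
// sampling rate of 1/rate (rate = 10 keeps roughly 10% of views).
// The optional `random` argument replaces Math.random() in tests.
function shouldSample(rate, random) {
  var r = random === undefined ? Math.random() : random;
  return Math.floor(r * rate) === 0;
}
```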
Data Collection and Storage
An Nginx server records each reporting request in logs, capturing request headers, IP, parameters, etc. Logs are rotated every five minutes using a custom configuration rather than the daily `logrotate` interval.
<code>if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{1})[0-4]") {
set $logname $1-$2-$3-$4-$50;
}
if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{1})[5-9]") {
set $logname $1-$2-$3-$4-$55;
}
access_log logs/stat.y.qq.com.sp.access.$logname.log spdata;
log_format spdata '$time_local ~|^ $http_x_forwarded_for ~|^ $request ~|^ $http_referer ~|^ $status ~|^ $http_user_agent ~|^ $cookie_ptisp ~|^ $cookie_uin';</code>The log fields include timestamp (5‑minute bucket), IP, reported data, product ID, project ID, page ID, measurement points, sampling rate, referer, parsed user‑agent info, ISP, and user ID.
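On the ingestion side, each line can be split back into named fields using the ` ~|^ ` separator from the log_format above; a sketch (the record field names are illustrative):

```javascript
// Split one access-log line into named fields using the " ~|^ "
// separator defined in the spdata log_format.
function parseLogLine(line) {
  var fields = line.split(' ~|^ ');
  return {
    time: fields[0],
    ip: fields[1],
    request: fields[2],
    referer: fields[3],
    status: fields[4],
    userAgent: fields[5],
    isp: fields[6],
    uin: fields[7],
  };
}
```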
Data Ingestion
To reduce server load, reporting machines and ingestion servers are separated. The ingestion server periodically pulls log files from the reporting machines for processing.
Database Design
Given billions of daily page views, data is partitioned by date into separate tables. Three tables are used:
Statistics table: stores 5‑minute average latency per page.
Raw data table: holds the original records.
Index table: provides fast lookup into the raw data.
The statistics table enables quick queries for trends, while the raw and index tables support complex multi‑dimensional queries (e.g., by country, ISP, network type). Each raw table is kept under ten million rows to maintain MySQL performance.
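The statistics table's 5‑minute averages can be sketched as a simple aggregation over raw records (the `timestamp`/`latency` field names are illustrative):

```javascript
// Aggregate raw latency records into 5-minute buckets, producing the
// average latency per bucket (what the statistics table stores).
function aggregateByBucket(records) {
  var sums = {};
  records.forEach(function (r) {
    var bucket = Math.floor(r.timestamp / 300000) * 300000; // 5 min in ms
    if (!sums[bucket]) sums[bucket] = { total: 0, count: 0 };
    sums[bucket].total += r.latency;
    sums[bucket].count += 1;
  });
  var result = {};
  Object.keys(sums).forEach(function (b) {
    result[b] = sums[b].total / sums[b].count;
  });
  return result;
}
```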
Threshold Alerts
If a data interface becomes slow, the system triggers an alert when the 5‑minute average exceeds a configurable threshold (default 10 seconds), notifying developers to investigate.
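A sketch of the threshold check, with the notifier injected so the logic stays testable; the 10‑second default comes from the text above:

```javascript
// Trigger an alert when a 5-minute average exceeds the threshold
// (default 10 seconds). notify receives the alert message.
function checkThreshold(avgMs, notify, thresholdMs) {
  var limit = thresholdMs === undefined ? 10000 : thresholdMs;
  if (avgMs > limit) {
    notify('5-min average ' + avgMs + 'ms exceeds ' + limit + 'ms');
    return true;
  }
  return false;
}
```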
Data Presentation
The UI shows a bar chart of all monitoring points for a page, daily trends for a single point, and multi‑dimensional analysis tables.
Overall Page Overview
The chart helps developers quickly locate bottlenecks.
Detail of a Monitoring Point
Average latency
Request count
Slow‑user proportion
Latency distribution
Additional dimensions for analysis include country, province, ISP, network type, and operating system.
Abnormal Data Handling
Outliers (e.g., a single report taking >30 minutes) can distort averages. Points exceeding 10 minutes are filtered out to keep charts reliable.
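A sketch of the outlier filter, assuming latencies in milliseconds and the 10‑minute cutoff described above:

```javascript
// Drop outliers before averaging: any point slower than 10 minutes is
// treated as abnormal (backgrounded tab, broken clock, etc.).
var OUTLIER_MS = 10 * 60 * 1000;

function filteredAverage(latencies) {
  var kept = latencies.filter(function (ms) { return ms <= OUTLIER_MS; });
  if (kept.length === 0) return 0;
  return kept.reduce(function (a, b) { return a + b; }, 0) / kept.length;
}
```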
Conclusion
We described a three‑layer monitoring system covering front‑end reporting, data collection and storage, and visualization. Continuous performance monitoring is essential for delivering a smooth user experience.
References
https://fex.baidu.com/blog/2014/05/build-performance-monitor-in-7-days/
https://www.qcloud.com/community/article/655542
http://javascript.ruanyifeng.com/bom/performance.html
QQ Music Frontend Team