How to Build a Front‑End User Behavior Tracing System for Faster Issue Diagnosis

This article walks through the design and implementation of a front‑end user behavior tracing system: the common causes of external network issues, why it pays to collect the runtime environment, page data, JS errors, and interaction logs, and how the SDK collects and reports data, how the server processes it, and how the query platform visualizes it.

QQ Music Frontend Team

Current Situation Analysis

When diagnosing external network issues (problems reported by real users in the live environment), the hardest cases are those that cannot be reproduced or that appear only intermittently. We cannot capture packets, set breakpoints, or read logs on the user's device, so we rely on screenshots and brief user descriptions, work by guesswork and elimination, and often end with a generic suggestion to clear the cache or reinstall the app.

This inefficiency stems from a lack of clues and from users' limited technical knowledge: their reports may omit key details or include misleading ones.

Common Causes of External Network Issues

The backend returns abnormal data or responses with empty fields.

Pages lack proper fault‑tolerance for edge cases, causing errors.

User network environment or app version problems.

Missing parameters when navigating from a previous entry point.

Issues triggered by specific user operation steps.

Although we have script exception monitoring (e.g., Sentry), many user‑reported external issues are not caused by script exceptions and therefore cannot trigger automatic reports. A secondary reporting mechanism is needed for these scenarios.

Importance of User Behavior Trace

Collecting the following data greatly improves the ability to locate external issues:

Page runtime environment.

Data loaded by the page.

Page JS error information.

User operation logs (timeline).

By linking these data points with timestamps, we can create a clear timeline similar to a crime‑scene video, making analysis and problem localization much easier.

Design Overview

What to Report: Content and Protocol

Each page visit is treated as a basic query unit. For a user who visits page A three times, three records are stored, each containing multiple child records that share common base information.

const log = {
  baseInfo: {},   // runtime environment shared by every child record
  childLogs: []   // child records: AJAX, user interaction, and error entries
};

Base Information

The baseInfo field records the page's runtime environment, such as browser, OS, and other contextual data.
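As a rough illustration, baseInfo could be assembled along the following lines. This is a sketch, not the SDK's actual code: the function name collectBaseInfo and every field name in it are hypothetical, since the article does not list the real fields. The browser globals are passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Hypothetical sketch: builds the shared baseInfo for one page visit.
// Field names are illustrative; the article does not list the real ones.
function collectBaseInfo(env) {
  const nav = env.navigator || {};
  const loc = env.location || {};
  const scr = env.screen || {};
  return {
    FuserAgent: nav.userAgent || '',   // browser and OS can be derived from this
    Flanguage: nav.language || '',
    Fonline: nav.onLine !== false,     // treated as online unless explicitly false
    Furl: loc.href || '',
    Freferrer: (env.document && env.document.referrer) || '',
    Fscreen: scr.width && scr.height ? `${scr.width}x${scr.height}` : '',
    FenterTime: Date.now()             // when the page visit started
  };
}

// In a browser: const baseInfo = collectBaseInfo(window);
```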

Child Record Types

Type 1: AJAX Communication

Records all AJAX requests to help determine whether backend data is the root cause of an issue.

Type 2: User Interaction

Records click events and DOM attributes associated with user actions.

Type 3: Error Reporting

Records JavaScript errors and manually thrown exceptions.
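A minimal sketch of how error records might be captured follows. The helper names toErrorRecord and installErrorCapture, the Ftype and Fsource tags, and the Fmessage and Fstack fields are assumptions made for illustration. The browser events used (error and unhandledrejection) are standard.

```javascript
// Hypothetical sketch: normalizes a thrown value into an error child record.
function toErrorRecord(FtraceId, source, err) {
  const isError = err instanceof Error;
  return {
    FtraceId,
    Ftype: 'error',       // assumed type tag for error records
    Fsource: source,      // 'onerror', 'unhandledrejection', or 'manual'
    Fmessage: isError ? err.message : String(err),
    Fstack: isError && err.stack ? err.stack : '',
    FtimeStamp: Date.now()
  };
}

// Wires the global error events of `target` (normally window) to `report`.
function installErrorCapture(target, FtraceId, report) {
  target.addEventListener('error', (e) => {
    const cause = e.error || e.message || 'unknown error';
    report(toErrorRecord(FtraceId, 'onerror', cause));
  });
  target.addEventListener('unhandledrejection', (e) => {
    report(toErrorRecord(FtraceId, 'unhandledrejection', e.reason));
  });
}

// In a browser: installErrorCapture(window, traceId, sendRecord);
// Manually thrown exceptions can call toErrorRecord(traceId, 'manual', err).
```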

How to Report: SDK Data Collection and Reporting Strategy

Data is collected by loading a JavaScript SDK on the page. Collection is performed only for logged‑in users; unauthenticated pages are ignored.

Data Collection Methods

A unique FtraceId (UUID) is generated when the user enters a page and is shared by all subsequent child records.
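One way to generate and share the trace ID is sketched below. It assumes crypto.randomUUID() is available (modern browsers in secure contexts) and falls back to a Math.random based version-4 UUID otherwise. The helper names createTraceId and createPageTrace are hypothetical.

```javascript
// Returns a version-4 UUID string for use as FtraceId.
function createTraceId() {
  const c = globalThis.crypto;
  if (c && typeof c.randomUUID === 'function') {
    return c.randomUUID();
  }
  // Fallback for older environments. Math.random is not cryptographically
  // strong, which is acceptable for a log correlation key.
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, (ch) => {
    const r = Math.floor(Math.random() * 16);
    const v = ch === 'x' ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

// Hypothetical helper: creates one trace per page visit and stamps every
// child record with the same FtraceId plus the current time.
function createPageTrace() {
  const FtraceId = createTraceId();
  return {
    FtraceId,
    stamp: (record) => ({ ...record, FtraceId, FtimeStamp: Date.now() })
  };
}
```

Calling createPageTrace() once on page entry and passing every record through stamp() keeps all child records of a visit under a single key.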

AJAX Hook

hookAjax({
  open: this.handleOpen,
  onreadystatechange: this.handleStage
});

In the open phase we capture the request time, method, and parameters, skipping our own reporting requests. In the send phase we capture POST bodies. In the onreadystatechange phase we record the response time, HTTP status, and response data. All collected fields are attached to the current xhr object.
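The three phases could be implemented roughly as follows. This sketch assumes the hook library calls the open and send handlers with (args, xhr) and the onreadystatechange handler with (xhr), and that a falsy return value lets the original call proceed; check this against the hook library you actually use. The names createAjaxRecorder, REPORT_PATH, and the _trace property are hypothetical.

```javascript
// Assumed path of our own reporting endpoint, excluded from collection.
const REPORT_PATH = '/trace/';

// Hypothetical sketch: returns the three phase handlers for one page visit.
function createAjaxRecorder(FtraceId, onRecord) {
  return {
    handleOpen(args, xhr) {
      const [method, url] = args;
      if (String(url).includes(REPORT_PATH)) return; // skip our own reports
      xhr._trace = {
        FtraceId,
        Ftype: 'ajax',
        Fmethod: String(method).toUpperCase(),
        Furl: String(url),
        FrequestTime: Date.now()
      };
    },
    handleSend(args, xhr) {
      // Only string bodies are kept; FormData and Blob bodies are ignored.
      if (xhr._trace && typeof args[0] === 'string') {
        xhr._trace.Fbody = args[0];
      }
    },
    handleStateChange(xhr) {
      if (!xhr._trace || xhr.readyState !== 4) return; // 4 = DONE
      const record = xhr._trace;
      xhr._trace = null; // report each request once
      record.FresponseTime = Date.now();
      record.Fstatus = xhr.status;
      // responseText throws for non-text response types, so guard the read.
      const isText = !xhr.responseType || xhr.responseType === 'text';
      record.Fresponse = isText ? xhr.responseText : '';
      onRecord(record);
    }
  };
}
```

The handlers would then be passed to the hook, for example as open, send, and onreadystatechange.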

User Interaction Tracking

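$(document).on('click', '.js_qm_trace', e => {
  const target = e.currentTarget;
  const FtimeStamp = getNowDate();
  // e.path is a non-standard, Chrome-only property that has since been removed;
  // prefer composedPath() on the native event and fall back to path.
  const nativeEvent = e.originalEvent || e;
  const path = nativeEvent.composedPath ? nativeEvent.composedPath() : nativeEvent.path;
  const FdomPath = _getDomPath(path);
  let Fattr = null, FtraceContent = null;
  if (target.hasAttributes()) {
    // Split the element's attributes into a plain map and the trace label.
    const processed = _processAttrMap(target.attributes);
    Fattr = processed.Fattr;
    FtraceContent = processed.FtraceContent;
  }
  // ...report action...
});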
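The article does not show _getDomPath or _processAttrMap. The sketch below gives one plausible shape for each under different names (describeDomPath and describeAttributes) to make clear that these are stand-ins, not the original helpers. The data-trace attribute used as the content label is likewise an assumption.

```javascript
// Hypothetical stand-in for _getDomPath: turns an event path (innermost
// element first) into a readable selector string, outermost element first.
function describeDomPath(path) {
  return path
    // document and window appear in the path but have no tagName
    .filter((node) => node && typeof node.tagName === 'string')
    .map((node) => {
      let part = node.tagName.toLowerCase();
      if (node.id) part += `#${node.id}`;
      const classes = typeof node.className === 'string' ? node.className.trim() : '';
      if (classes) part += '.' + classes.split(/\s+/).join('.');
      return part;
    })
    .reverse()
    .join(' > ');
}

// Hypothetical stand-in for _processAttrMap: copies the attributes into a
// plain object and reads an assumed data-trace attribute as the label.
function describeAttributes(attributes) {
  const Fattr = {};
  for (const { name, value } of Array.from(attributes)) {
    Fattr[name] = value;
  }
  return { Fattr, FtraceContent: Fattr['data-trace'] || null };
}
```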

Reporting Strategy

Collected data is first cached locally in IndexedDB, chosen for its large capacity, asynchronous API, and support for custom indexes. If the user and page URL are on a whitelist, the cached data is uploaded automatically; otherwise it stays in the cache and is uploaded only on demand. Errors bypass the cache and are reported immediately.
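The routing rules can be sketched as a small reporter. This is a simplification under stated assumptions: a plain array stands in for the IndexedDB cache (the real one is asynchronous), send stands in for the network upload, and the Ftype tag and createReporter name are hypothetical. The article does not say when whitelisted uploads fire, so this sketch simply flushes after each record.

```javascript
// Hypothetical sketch of the routing rules. `store` (an array here) stands
// in for the IndexedDB cache and `send` for the upload request.
function createReporter({ store, send, isWhitelisted }) {
  return {
    // Called for every collected child record; returns what was done.
    report(record) {
      if (record.Ftype === 'error') {
        send([record]);     // errors bypass the cache
        return 'sent';
      }
      store.push(record);   // everything else is cached first
      if (isWhitelisted) {
        this.flush();       // whitelisted user and page: upload the cache
        return 'flushed';
      }
      return 'cached';      // others wait for an on-demand upload
    },
    // Uploads and clears the cache; this is also the on-demand path.
    flush() {
      if (store.length === 0) return 0;
      const batch = store.splice(0, store.length);
      send(batch);
      return batch.length;
    }
  };
}
```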

Server‑Side Data Processing

Data is posted to an Nginx server, which logs the request body using a custom log format.

http {
  # Log only the raw request body, one request per line.
  log_format trace '$request_body';
  server {
    location /trace/ {
      # $request_body is only populated when the body fits in the memory buffer.
      client_body_buffer_size 1000m;
      client_max_body_size 1000m;
      proxy_pass http://127.0.0.1:6699/env/;
      access_log /data/qmtrace/log/access.log trace;
    }
  }
  server {
    listen 6699;
    location /env/ {
      client_max_body_size 1000m;
      alias /data/qmtrace/;
    }
  }
}

A cron job runs every five minutes to rotate the access log, rename it with a timestamp, and signal Nginx to reopen the log file. A Node.js script then parses the rotated logs and stores the records in a database.
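The parsing step of that Node.js script might look like the sketch below. It assumes each request body is a single JSON document and that the log uses nginx's default escaping, where double quotes, backslashes, and non-printable or non-ASCII bytes are written as \xXX sequences. The function names are hypothetical, and the database write is left out.

```javascript
// Reverses nginx's default access-log escaping (\xXX, one sequence per byte).
function decodeNginxEscapes(line) {
  const bytes = [];
  for (let i = 0; i < line.length; i++) {
    const hex = line.slice(i + 2, i + 4);
    if (line[i] === '\\' && line[i + 1] === 'x' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 3; // skip the rest of the \xXX sequence
    } else {
      bytes.push(line.charCodeAt(i) & 0xff); // unescaped characters are ASCII
    }
  }
  // Decode the collected bytes as UTF-8 so multi-byte characters survive.
  return Buffer.from(bytes).toString('utf8');
}

// Parses the text of one rotated log file into records, skipping bad lines.
function parseTraceLog(text) {
  const records = [];
  for (const line of text.split('\n')) {
    if (line === '' || line === '-') continue; // "-" marks an empty body
    try {
      records.push(JSON.parse(decodeNginxEscapes(line)));
    } catch (err) {
      // Malformed or truncated body: skip it rather than abort the batch.
    }
  }
  return records;
}
```

Skipping unparseable lines keeps one bad upload from blocking the rest of a five-minute batch.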

Data Presentation

The internal query platform allows searching by user UIN and page URL. Results are displayed as a list of trace IDs on the left and a detailed timeline view on the right, showing the sequence of user actions during a single page visit.
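The timeline view boils down to grouping records by trace ID and ordering each group by time. A minimal sketch follows, assuming FtimeStamp is a numeric millisecond value (the article's getNowDate may return a different format) and using the hypothetical name buildTimelines.

```javascript
// Hypothetical sketch: groups records by FtraceId (the left-hand list) and
// sorts each visit's records by FtimeStamp (the right-hand timeline).
function buildTimelines(records) {
  const byTrace = new Map();
  for (const record of records) {
    if (!byTrace.has(record.FtraceId)) byTrace.set(record.FtraceId, []);
    byTrace.get(record.FtraceId).push(record);
  }
  return Array.from(byTrace, ([FtraceId, events]) => ({
    FtraceId,
    // Copy before sorting so the caller's array is left untouched.
    events: events.slice().sort((a, b) => a.FtimeStamp - b.FtimeStamp)
  }));
}
```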

Conclusion

We described what to report (content and protocol), how to report (SDK collection and reporting strategy), server‑side processing, and data visualization, building an initial user behavior tracing system that significantly improves efficiency in handling external network issues. The system is extensible and can be refined further.
