How to Build a Frontend User‑Behavior Tracing System for Debugging External Network Issues

This article analyzes the challenges of reproducing external‑network bugs, outlines common failure causes, and presents a complete design for a JavaScript SDK that records environment data, AJAX calls, errors, and user actions, stores them in IndexedDB, and visualizes the timeline for efficient troubleshooting.

Tencent Music Tech Team

Problem Overview

When debugging external‑network issues, developers often cannot reproduce the problem on their own devices. They must rely on screenshots and vague user descriptions, which leads to low‑efficiency guesswork and generic advice such as clearing cache or reinstalling the app.

The main obstacles are the lack of concrete clues and the fact that users usually do not understand the technical context, so key information may be missing or misleading.

Typical Causes of External‑Network Issues

Backend data errors or missing fields.

Missing fault‑tolerance handling for edge cases, causing page errors.

User network environment or app version problems.

Missing parameters when navigating from a previous page.

Issues triggered by specific user operation steps.

Even with script‑exception monitoring (e.g., Sentry), many user‑reported problems are not caused by JavaScript errors, so a separate monitoring mechanism is required.

Importance of User‑Behavior Traces

Collecting the following data dramatically improves debugging:

Page runtime environment.

Data loaded by the page.

JavaScript error information.

User operation logs (timeline).

By stitching these pieces together via timestamps, the whole execution flow becomes as clear as a surveillance video, making root‑cause analysis much easier.

Design Overview

What to Report: Content and Protocol

The SDK records a baseInfo object describing the environment and a childLogs array for each sub‑event. The data structure is:

const log = {
  baseInfo: {},
  childLogs: [{...}, {...}, ...]
};
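
To make the shape concrete, here is a minimal sketch of how such a log might be created and filled. Only baseInfo and childLogs come from the structure above; the helper names (createLog, addChildLog) and every field inside baseInfo are illustrative assumptions, not the SDK's actual schema.

```javascript
// Hypothetical factory for one page-visit log. Field names other than
// baseInfo and childLogs are illustrative assumptions.
function createLog(env) {
  return {
    baseInfo: {
      Furl: env.url,
      FuserAgent: env.userAgent,
      FenterTime: env.now
    },
    childLogs: []
  };
}

// Append a sub-event, tagging it with its kind and arrival order.
function addChildLog(log, type, fields) {
  const record = Object.assign({ Ftype: type, Forder: log.childLogs.length }, fields);
  log.childLogs.push(record);
  return record;
}

const log = createLog({ url: 'https://example.com/page', userAgent: 'demo-agent', now: 0 });
addChildLog(log, 'ajax', { FajaxUrl: '/api/list' });
addChildLog(log, 'click', { FdomPath: '/html/body/button' });
console.log(log.childLogs.length); // 2
```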

Record Types

AJAX Communication

Records every AJAX request to help determine whether a backend data issue is involved.

User Interaction

Captures click events, the DOM xpath of the target element, and all data‑* attributes of the element.

Error Reporting

Logs JavaScript errors and any manually thrown exceptions.
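
The article does not show the error-capture code, so the following is only a sketch of one common approach: listening for the browser's error and unhandledrejection events and exposing a helper for manually reported exceptions. The function names and all F-prefixed fields other than FtimeStamp are assumptions.

```javascript
// Hypothetical installer: forwards uncaught errors and unhandled promise
// rejections to `push`. In a page, `target` would normally be `window`.
function installErrorCapture(target, push) {
  target.addEventListener('error', (event) => {
    push({
      Ftype: 'error',
      FtimeStamp: Date.now(),
      Fmessage: event.message || 'resource load error',
      Fsource: event.filename || '',
      Fline: event.lineno || 0,
      Fcolumn: event.colno || 0,
      Fstack: event.error && event.error.stack ? String(event.error.stack) : ''
    });
  }, true); // capture phase also sees resource load failures, which do not bubble

  target.addEventListener('unhandledrejection', (event) => {
    const reason = event.reason;
    push({
      Ftype: 'unhandledrejection',
      FtimeStamp: Date.now(),
      Fmessage: reason && reason.message ? reason.message : String(reason),
      Fstack: reason && reason.stack ? String(reason.stack) : ''
    });
  });
}

// Manually reported exceptions can reuse the same record shape.
function reportError(push, err) {
  push({
    Ftype: 'manual',
    FtimeStamp: Date.now(),
    Fmessage: err.message,
    Fstack: String(err.stack || '')
  });
}
```

In a page this might be wired up as installErrorCapture(window, record => log.childLogs.push(record)), where log is the SDK's log object.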

How to Report: SDK Data Collection and Upload Strategy

The SDK only activates in logged‑in contexts; unauthenticated pages are ignored.

Data Collection Steps

Generate a unique FtraceId (UUID) per page entry; all child records share this ID.
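
The article does not say how the UUID is generated. One possible sketch, assuming a version-4 UUID is acceptable, uses crypto.randomUUID() where the environment provides it and falls back to building the ID by hand. The helper name createTraceId is hypothetical.

```javascript
// Hypothetical helper: one trace ID per page entry.
// Uses crypto.randomUUID() where available, else builds a version-4 UUID
// from crypto.getRandomValues(), else falls back to Math.random().
function createTraceId() {
  const c = typeof globalThis.crypto !== 'undefined' ? globalThis.crypto : null;
  if (c && typeof c.randomUUID === 'function') return c.randomUUID();
  const bytes = new Uint8Array(16);
  if (c && typeof c.getRandomValues === 'function') {
    c.getRandomValues(bytes);
  } else {
    for (let i = 0; i < 16; i++) bytes[i] = Math.floor(Math.random() * 256);
  }
  bytes[6] = (bytes[6] & 0x0f) | 0x40; // version 4
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC 4122 variant
  const hex = Array.from(bytes, b => b.toString(16).padStart(2, '0')).join('');
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`;
}

const FtraceId = createTraceId();
// Every child record created during this page visit carries the same ID.
const childRecord = { FtraceId, Ftype: 'click' };
```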

Hook XMLHttpRequest using the lightweight ajax‑hook library (≈639 B gzipped) to capture open, send, and readystatechange stages.

hookAjax({
  open: handleOpen,
  onreadystatechange: handleStage
});

During open the SDK records request time, method, URL and query parameters (filtering out its own reporting requests). During send it captures the POST body. During readystatechange it records response time, HTTP status and response body (if JSON).

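function handleOpen(arg, xhr) {
  // arg holds the arguments passed to xhr.open(): [method, url, ...].
  // String() guards against a missing URL or a URL object.
  const urlPath = String(arg[1] || '').split('?');
  // Skip the SDK's own reporting requests so they are not traced.
  if (/stat\.y\.qq\.com/.test(urlPath[0])) return;
  const now = getNowDate();
  const curAjaxFields = {
    FtimeStamp: now,
    FajaxSendTime: now,
    FajaxMethod: arg[0].toUpperCase(),
    FajaxUrl: urlPath[0],
    FajaxParam: urlPath[1] || '',
    Forder: logger.order++
  };
  // Attach the record to the xhr so later stages can extend it.
  xhr.curAjaxFields = curAjaxFields;
}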

function handleStage({ xhr }) {
  // handleOpen attaches no record to the SDK's own reporting requests,
  // so a missing curAjaxFields means this request must be skipped.
  const fields = xhr.curAjaxFields;
  if (!fields) return;
  if (xhr.readyState === 2) {
    // HEADERS_RECEIVED: the status code is available.
    $.extend(fields, { FajaxReceiveTime: getNowDate(), FajaxHttpCode: xhr.status });
  } else if (xhr.readyState === 4) {
    // DONE: the full response body is available.
    const resp = xhr.response || xhr.responseText;
    let jsonRes = '';
    try { jsonRes = JSON.parse(resp); } catch (e) { console.error(e); }
    $.extend(fields, {
      FajaxReceiveData: resp,
      FajaxStateCode: jsonRes ? getStateCode(jsonRes).join(',') : ''
    });
  }
}
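
The hookAjax call shown earlier registers handlers for open and onreadystatechange only, while the text says the POST body is captured at the send stage. A handler for that stage might look like the sketch below. It assumes ajax-hook invokes method hooks for send with the same (arg, xhr) signature used by handleOpen; the field name FajaxBody and the length cap are illustrative assumptions.

```javascript
// Hypothetical send-stage handler. Assumes ajax-hook calls it with the
// arguments of xhr.send() and the underlying xhr, as it does for open.
const MAX_BODY_LENGTH = 2000; // illustrative cap to keep records small

function handleSend(arg, xhr) {
  const fields = xhr.curAjaxFields;
  if (!fields) return; // request was skipped in handleOpen
  const body = arg[0];
  if (body === undefined || body === null) {
    fields.FajaxBody = '';
  } else if (typeof body === 'string') {
    fields.FajaxBody = body.slice(0, MAX_BODY_LENGTH);
  } else {
    // FormData, Blob, ArrayBuffer, etc. are not serialized here; record the type only.
    fields.FajaxBody = `[${Object.prototype.toString.call(body).slice(8, -1)}]`;
  }
}
```

Under the same assumption, it would be registered by adding send: handleSend to the options object passed to hookAjax.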

User Interaction Capture

Event delegation on document listens for elements with class .js_qm_trace. For each click the SDK records a timestamp, the DOM xpath, and all data‑* attributes.

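$(document).on('click', '.js_qm_trace', e => {
  const target = e.currentTarget;
  const FtimeStamp = getNowDate();
  // Event.path is non-standard and has been removed from Chrome; read the
  // standard composedPath() from the native event, with path as a legacy fallback.
  const nativeEvent = e.originalEvent || e;
  const path = nativeEvent.composedPath ? nativeEvent.composedPath() : nativeEvent.path;
  const FdomPath = _getDomPath(path);
  let Fattr = null, FtraceContent = null;
  if (target.hasAttributes()) {
    const processedData = _processAttrMap(target.attributes);
    Fattr = processedData.Fattr;
    FtraceContent = processedData.FtraceContent;
  }
  // push to childLogs and optionally upload
});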
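
The helpers _getDomPath and _processAttrMap are not shown in the article. The sketch below illustrates what they might do, under the assumption that the path is an array of nodes with the click target first (as event.composedPath() returns it). Both function names and the output formats are hypothetical stand-ins, not the SDK's real implementation.

```javascript
// Hypothetical stand-in for _getDomPath: turns an event path
// (target first) into an XPath-like string.
function getDomPath(path) {
  const segments = [];
  for (const node of path) {
    if (!node || node.nodeType !== 1) continue; // skip document and window
    const tag = node.tagName.toLowerCase();
    const parent = node.parentNode;
    const siblings = parent && parent.children
      ? Array.from(parent.children).filter(el => el.tagName === node.tagName)
      : [node];
    // XPath positions are 1-based; omit the index when the tag is unique.
    const index = siblings.length > 1 ? `[${siblings.indexOf(node) + 1}]` : '';
    segments.unshift(tag + index);
  }
  return '/' + segments.join('/');
}

// Hypothetical stand-in for the data-* part of _processAttrMap:
// collects data-* attributes from a NamedNodeMap-like list.
function collectDataAttrs(attributes) {
  const result = {};
  for (const attr of Array.from(attributes)) {
    if (attr.name.startsWith('data-')) result[attr.name.slice(5)] = attr.value;
  }
  return result;
}
```

For the second of two buttons directly under body, getDomPath would yield /html/body/button[2], which is enough to tell which of several similar elements the user actually clicked.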

Reporting Strategy

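Data is first cached in IndexedDB, which is asynchronous, supports custom indexes, and offers far more space than localStorage (roughly 500 MB by the authors' estimate; actual quotas vary by browser and available disk space). A whitelist service determines whether cached data should be uploaded immediately. Critical JavaScript errors bypass the cache and are sent instantly.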
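
As a rough sketch of this strategy, the code below separates the routing decision from the IndexedDB write. It assumes the whitelist service's answer is already available as a boolean and that critical errors carry an Fcritical flag; those two details, the function names, and the database, store, and index names are all illustrative assumptions.

```javascript
// Hypothetical routing rule for one record: critical errors and whitelisted
// users upload immediately, everything else is cached locally.
function chooseRoute(record, { isWhitelisted }) {
  if (record.Ftype === 'error' && record.Fcritical) return 'upload';
  return isWhitelisted ? 'upload' : 'cache';
}

// Hypothetical cache writer using the standard IndexedDB API.
// Database, store, and index names are illustrative.
function cacheRecord(record, idb = globalThis.indexedDB) {
  return new Promise((resolve, reject) => {
    const openReq = idb.open('qmtrace', 1);
    openReq.onupgradeneeded = () => {
      // Runs only when the database is created or its version is bumped.
      const store = openReq.result.createObjectStore('logs', { keyPath: 'id', autoIncrement: true });
      store.createIndex('byTraceId', 'FtraceId', { unique: false });
    };
    openReq.onerror = () => reject(openReq.error);
    openReq.onsuccess = () => {
      const db = openReq.result;
      const tx = db.transaction('logs', 'readwrite');
      tx.objectStore('logs').add(record);
      tx.oncomplete = () => { db.close(); resolve(); };
      tx.onerror = () => { db.close(); reject(tx.error); };
    };
  });
}
```

Indexing on FtraceId is one plausible choice because all records from a page visit share that ID, so a whole visit can be read back and uploaded in one query.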

Server‑Side Data Processing

Incoming reports pass through an Nginx proxy that logs the raw request body using a custom trace log format. The log is written to /data/qmtrace/log/access.log, with the body buffer and size limits set to 1000 MB (about 1 GB) to avoid truncation.

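http {
  # Log only the raw request body, one report per line.
  log_format trace '$request_body';
  server {
    location /trace/ {
      # $request_body is only populated when the body is read into memory,
      # so the buffer must be large enough to hold a whole report.
      client_body_buffer_size 1000m;
      client_max_body_size 1000m;
      # Proxying makes Nginx read the body before the request is logged.
      proxy_pass http://127.0.0.1:6699/env;
      access_log /data/qmtrace/log/access.log trace;
    }
  }
  # Local upstream that exists only to receive the proxied request.
  server {
    listen 6699;
    location /env/ {
      client_max_body_size 1000m;
      alias /data/qmtrace/;
    }
  }
}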

A cron job moves the log file every five minutes to a directory named by the current hour, then signals Nginx (kill -USR1 $(cat ${nginx_pid})) to reopen the log. A Node.js script parses the rotated access_*.log files and inserts the records into a database.
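
The parsing script itself is not shown. The sketch below covers only the parsing step and assumes each report body is a single JSON document, which the article does not state. It also relies on Nginx's default log escaping, where double quotes, backslashes, and bytes outside printable ASCII are written as \xXX and an empty value is logged as "-". The function names are hypothetical.

```javascript
// Decode nginx's default log escaping: '"', '\' and bytes outside the
// printable ASCII range are written as \xXX.
function unescapeNginx(line) {
  const bytes = [];
  for (let i = 0; i < line.length; i++) {
    const hex = line.slice(i + 2, i + 4);
    if (line[i] === '\\' && line[i + 1] === 'x' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 3; // skip the rest of the \xXX sequence
    } else {
      bytes.push(line.charCodeAt(i) & 0xff);
    }
  }
  // Reassemble multi-byte UTF-8 characters from the decoded bytes.
  return Buffer.from(bytes).toString('utf8');
}

// Parse the text of one rotated log file into records, skipping empty
// bodies (logged as "-") and lines that are not valid JSON.
function parseLogText(text) {
  const records = [];
  for (const line of text.split('\n')) {
    if (!line || line === '-') continue;
    try {
      records.push(JSON.parse(unescapeNginx(line)));
    } catch (e) {
      // Malformed or truncated body: skip it rather than abort the batch.
    }
  }
  return records;
}
```

A real script would read each rotated file (for example with fs.readFileSync), pass its text to parseLogText, and hand the resulting records to whatever database client is in use.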

Data Visualization – Query Platform

The internal portal allows searching by user ID (UIN) or page URL. Each page visit is a distinct trace identified by FtraceId. The left pane lists visits; selecting one loads detailed timeline data on the right.
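
One way the query layer might turn flat records into this two-pane view is sketched below: records are grouped by FtraceId and each group is sorted into a timeline. It assumes FtimeStamp is a millisecond number (the article does not specify what getNowDate returns) and uses Forder to break ties; the function name groupByTrace is hypothetical.

```javascript
// Hypothetical query helper: group flat records into page visits by FtraceId
// and order each visit's records into a timeline.
// Assumes FtimeStamp is a millisecond number; Forder breaks ties.
function groupByTrace(records) {
  const visits = new Map();
  for (const record of records) {
    if (!visits.has(record.FtraceId)) visits.set(record.FtraceId, []);
    visits.get(record.FtraceId).push(record);
  }
  const result = [];
  for (const [FtraceId, items] of visits) {
    items.sort((a, b) => (a.FtimeStamp - b.FtimeStamp) || (a.Forder - b.Forder));
    result.push({ FtraceId, startTime: items[0].FtimeStamp, timeline: items });
  }
  // Newest visit first, as a list pane would typically show it.
  return result.sort((a, b) => b.startTime - a.startTime);
}
```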

Conclusion

The system consists of four pillars: what to report, how to report, server‑side processing, and data presentation. The prototype shows that collecting environment info, AJAX traces, user interactions, and error logs in a unified timeline greatly speeds up external‑network debugging. Future work includes refining the data schema, optimizing storage, and extending the UI.
