Design and Implementation of a Front‑End Monitoring Platform and SDK
This article presents a comprehensive guide to building a front‑end monitoring system—including pain points, error‑reconstruction techniques, data collection methods, performance metrics, user‑behavior tracking, and a modular SDK architecture—illustrated with detailed code examples for Vue, React, XHR, fetch, and cross‑origin handling.
Preface
Developers are often asked in interviews to present a standout piece of work, and many feel their projects have nothing to show. Front‑end monitoring is a strong candidate: most large companies build it in‑house, and shipping a project without monitoring means flying blind in production.
Pain Points
Examples such as a user unable to place an order or an advertisement failing on mobile illustrate the difficulty of reproducing bugs reported by users, which monitoring aims to solve.
Error Restoration
The platform provides three ways to reconstruct what happened around an error: source‑code location, screen‑recording playback, and user‑behavior recording.
Source‑Code Location
Production code is bundled and minified, and source maps are not deployed alongside it, so a raw stack trace points into unreadable code. The platform uses the source‑map library to map each minified line and column back to the original source.
Screen Playback
Using rrweb, the platform records the 10 seconds leading up to an error (the window is configurable) and replays that recording, including mouse trajectories.
User‑Behavior Recording
When screen recording is insufficient, the platform logs clicks, API calls, resource loads, route changes, and errors to help reproduce the issue.
Advantages of a Self‑Developed Monitoring Solution
Unified SDK covering monitoring, tracking, recording, and advertising.
Multiple error‑restoration methods with correlation to custom business metrics.
Personalized metrics such as long‑task, memory usage, and first‑screen load time.
Resource cache‑rate statistics to evaluate caching strategies.
White‑screen detection using a sampling‑comparison and polling correction mechanism.
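The white-screen detection in the last item can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the names isWhiteScreen and pollWhiteScreen, the container selectors, the number of sample points, and the polling interval are all assumptions. The idea is to sample points along the viewport's center lines, look up the topmost element at each point, and treat the page as blank only if every sample hits a wrapper element. Polling then re-checks a page that looked blank, so a slow render is not misreported.

```javascript
// Sketch under assumptions: selectors, sample count and interval are illustrative.
// topElementAt(x, y) should return the topmost element at a viewport point,
// e.g. (x, y) => document.elementsFromPoint(x, y)[0] in a browser.
function isWhiteScreen({ width, height, topElementAt, containers = ['html', 'body', '#app'], samplesPerAxis = 9 }) {
  for (let i = 1; i <= samplesPerAxis; i++) {
    const step = i / (samplesPerAxis + 1);
    // One point on the horizontal center line, one on the vertical center line
    const points = [[width * step, height / 2], [width / 2, height * step]];
    for (const [x, y] of points) {
      const el = topElementAt(x, y);
      const isContainer = !el || containers.some((selector) => el.matches(selector));
      if (!isContainer) return false; // real content found at this point
    }
  }
  return true; // every sampled point hit only a wrapper element
}

// Polling correction: keep re-sampling while the page still looks blank.
function pollWhiteScreen(check, onResult, { interval = 1000, maxChecks = 5 } = {}) {
  let checks = 0;
  const timer = setInterval(() => {
    checks += 1;
    const blank = check();
    if (!blank || checks >= maxChecks) {
      clearInterval(timer);
      onResult(blank); // true only if the page stayed blank for every check
    }
  }, interval);
  return () => clearInterval(timer); // lets the caller cancel polling
}
```

In a browser, check would typically wrap isWhiteScreen with window.innerWidth, window.innerHeight, and document.elementsFromPoint.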
Design Overview
A complete monitoring platform consists of three parts: data collection & reporting, data analysis & storage, and data visualization.
Monitoring Goals
Visual diagrams illustrate the objectives of error detection, performance tracking, and user‑behavior analysis.
Exception Analysis (5W1H)
What – type of error (JS, async, resource, API).
When – timestamp.
Who – affected users, occurrence count, IP.
Where – page and device information.
Why – stack trace, source‑map, replay.
How – location, alerting, prevention.
Error Data Collection
Basic error categories include JS runtime errors, async errors, static‑resource load errors, and API request errors.
Capture Methods
1) try/catch – catches only synchronous runtime errors.
// Example 1: synchronous runtime error – ✅ caught
try {
  let a = undefined;
  if (a.length) console.log('111');
} catch (e) {
  console.log('Caught exception:', e);
}
// Example 2: syntax error – ❌ not caught (the script fails to parse, so the try block never runs)
try {
  const notdefined,
} catch (e) {
  console.log('Cannot catch syntax error:', e);
}
// Example 3: async error – ❌ not caught (the callback throws after the try block has exited)
try {
  setTimeout(() => console.log(notdefined), 0);
} catch (e) {
  console.log('Cannot catch async error:', e);
}
2) window.onerror – catches runtime and async errors but not resource errors.
window.onerror = function(message, source, lineno, colno, error) {
  console.log('Captured error:', message, source, lineno, colno, error);
};
3) window.addEventListener('error') – registered in the capture phase, it also captures resource loading failures.
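window.addEventListener('error', (event) => {
  const target = event.target; // the failing element for resource errors, window for script errors
  if (target !== window) console.log('Captured resource error:', target.tagName, target.src || target.href);
}, true); // capture phase: resource error events do not bubble
4) unhandledrejection – captures promise rejections that have no rejection handler.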
window.addEventListener('unhandledrejection', function(event) {
  console.log('Caught promise rejection:', event.reason);
  event.preventDefault(); // suppress the browser's default console error
});
Vue Error Handling
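Errors thrown inside Vue components are handled by Vue itself and do not reach window.onerror; use Vue.config.errorHandler (Vue 2; app.config.errorHandler in Vue 3) to intercept and report them.
Vue.config.errorHandler = (err, vm, info) => {
  // info names the lifecycle hook or handler in which the error occurred
  console.log('Vue error captured:', err, info);
  // report to backend
};
React Error Boundary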
Since React 16, a class component that defines static getDerivedStateFromError or componentDidCatch acts as an error boundary: it catches errors thrown while rendering its child tree, and componentDidCatch is the place to report them.
class ErrorBoundary extends React.Component {
  constructor(props) { super(props); this.state = { hasError: false }; }
  // Switch to the fallback UI on the next render
  static getDerivedStateFromError(error) { return { hasError: true }; }
  // reportError stands for the SDK's reporting hook
  componentDidCatch(error, errorInfo) { reportError(error, errorInfo); }
  render() { return this.state.hasError ? <h1>Something went wrong.</h1> : this.props.children; }
}
Cross‑Domain Script Errors
When a script loaded from another origin throws, browser security restrictions reduce the report to a bare "Script error." with no message, file, or line information. Adding the crossorigin attribute to the script tag and having the script's server send an Access-Control-Allow-Origin header restores the full error details.
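As a small illustration of the client side of this setup, the sketch below injects a third-party script with the crossorigin attribute set. The function name loadCrossOriginScript and the doc parameter are assumptions made for the example; the server-side header still has to be configured separately.

```javascript
// Sketch: load a third-party script so its errors reach window.onerror with full details.
// The doc parameter exists only so the function can be exercised outside a browser.
function loadCrossOriginScript(src, doc = document) {
  const script = doc.createElement('script');
  script.crossOrigin = 'anonymous'; // same effect as <script crossorigin="anonymous">
  script.src = src;
  doc.head.appendChild(script);
  return script;
}
// The server hosting the script must also respond with a header such as:
//   Access-Control-Allow-Origin: *
```

Note that once crossorigin is set, a script whose server does not send the CORS header will fail to load at all, so both sides need to be deployed together.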
Interface Error Monitoring
By AOP‑style monkey‑patching, the platform intercepts XMLHttpRequest and fetch to collect request/response data and report errors.
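The snippets in this section call a replaceAop helper that is not shown, along with a reportData function that stands for the SDK's reporting entry point. A minimal sketch of what such a helper might look like, assuming the signature implied by its call sites (a target object, a method name, and a function that receives the original method and returns the wrapper):

```javascript
// Hypothetical helper: swaps source[name] for a wrapper built from the original method.
function replaceAop(source, name, replacement) {
  if (source == null || !(name in source)) return; // nothing to patch
  const original = source[name];
  const wrapped = replacement(original);
  if (typeof wrapped === 'function') {
    source[name] = wrapped;
  }
}
```

Because the wrapper receives the original function, it can run its own logic before or after delegating, which is what lets the interceptors record timing without changing the behavior callers see.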
function xhrReplace() {
  if (!('XMLHttpRequest' in window)) return;
  const originalProto = XMLHttpRequest.prototype;
  // Wrap open() to record the method, URL, and start time on the instance
  replaceAop(originalProto, 'open', (originalOpen) => function(...args) {
    this._xhr = { method: args[0].toUpperCase(), url: args[1], startTime: Date.now(), type: 'xhr' };
    return originalOpen.apply(this, args);
  });
  // Wrap send() to report once the request finishes, whether it succeeded or failed
  replaceAop(originalProto, 'send', (originalSend) => function(...args) {
    this.addEventListener('loadend', () => {
      this._xhr.status = this.status;
      this._xhr.elapsedTime = Date.now() - this._xhr.startTime;
      reportData(this._xhr);
    }, { once: true });
    return originalSend.apply(this, args);
  });
}
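function fetchReplace() {
  if (!('fetch' in window)) return;
  replaceAop(window, 'fetch', (originalFetch) => function(input, config) {
    const sTime = Date.now();
    // fetch accepts a URL string, a URL object, or a Request; normalize for the report
    const url = input instanceof Request ? input.url : String(input);
    const method = ((config && config.method) || (input instanceof Request && input.method) || 'GET').toUpperCase();
    const handler = { type: 'fetch', method, url };
    return originalFetch.apply(window, [input, config]).then(
      res => {
        handler.elapsedTime = Date.now() - sTime;
        handler.status = res.status;
        // Read a clone in the background so the caller is not kept waiting for the body
        res.clone().text().then(data => { handler.responseText = data; reportData(handler); }, () => reportData(handler));
        return res;
      },
      // Network failure: report with status 0, then rethrow so callers still see the error
      err => { handler.elapsedTime = Date.now() - sTime; handler.status = 0; reportData(handler); throw err; }
    );
  });
}
Performance Data Collection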
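Performance data was traditionally read from window.performance.timing, which is now deprecated; modern browsers expose the same information, and more, through PerformanceObserver. The web‑vitals library wraps these APIs to report metrics such as FCP, LCP, CLS, TTFB, and FID (FID has since been replaced by INP as a Core Web Vital).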
Long‑Task Monitoring
const observer = new PerformanceObserver(list => {
  // Each entry is a task that blocked the main thread for more than 50 ms
  for (const entry of list.getEntries()) console.log('Long task:', entry);
});
observer.observe({ entryTypes: ['longtask'] });
Memory Usage
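performance.memory (a non‑standard API available in Chromium‑based browsers) provides jsHeapSizeLimit, totalJSHeapSize, and usedJSHeapSize. usedJSHeapSize should never exceed totalJSHeapSize; a value that does, or one that keeps growing across samples, may indicate a memory leak.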
First‑Screen Load Time
Using MutationObserver to track DOM changes within the viewport, combined with document.readyState, yields the first‑screen time: the interval from navigationStart to the last visible DOM mutation.
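A rough sketch of this approach is shown below. It is an illustration under assumptions rather than the platform's implementation: createFirstScreenTracker and its options are hypothetical names, and the viewport test only checks vertical overlap. The tracker remembers the last moment an element was added inside the viewport; performance.now() is measured from the page's time origin, which approximates navigationStart.

```javascript
// Sketch: remember the last time an element was added inside the viewport.
// `now` and `viewportHeight` are injectable so the logic can run outside a browser.
function createFirstScreenTracker({
  now = () => performance.now(), // ms since the page's time origin
  viewportHeight = () => window.innerHeight,
} = {}) {
  let lastVisibleChange = 0;

  // Vertical overlap with the viewport; zero-height nodes are ignored
  const isInViewport = (rect) => rect.height > 0 && rect.top < viewportHeight() && rect.bottom > 0;

  // Callback for MutationObserver: scan the element nodes that were added
  function handleMutations(mutations) {
    for (const mutation of mutations) {
      for (const node of mutation.addedNodes) {
        if (node.nodeType === 1 && isInViewport(node.getBoundingClientRect())) {
          lastVisibleChange = now();
        }
      }
    }
  }

  return { handleMutations, getFirstScreenTime: () => lastVisibleChange };
}
```

In a browser, the tracker would be wired up with new MutationObserver(tracker.handleMutations).observe(document, { childList: true, subtree: true }). Once document.readyState is 'complete' and mutations have settled, getFirstScreenTime() gives the value to report and the observer can be disconnected.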
User‑Behavior Data Collection
A Breadcrumb class stores up to 20 recent actions (route changes, clicks, XHR, etc.) and attaches them to error reports.
class Breadcrumb {
  maxBreadcrumbs = 20;
  stack = [];
  push(data) {
    // Drop the oldest entry once the cap is reached
    if (this.stack.length >= this.maxBreadcrumbs) this.stack.shift();
    this.stack.push(data);
    // Keep entries in chronological order
    this.stack.sort((a, b) => a.time - b.time);
  }
}
const breadcrumb = new Breadcrumb();
breadcrumb.push({ type: 'Route', from: '/home', to: '/about', url: location.href, time: Date.now() });
breadcrumb.push({ type: 'Click', dom: '<button>Button</button>', time: Date.now() });
breadcrumb.push({ type: 'Xhr', url: '/api/pushData', time: Date.now() });
reportData({ uuid: 'xxxx', stack: breadcrumb.stack });
Route Changes
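The History API fires no event when history.pushState or history.replaceState is called, so the SDK overrides both methods to report route transitions; back/forward navigation and hash changes are picked up through the popstate and hashchange events. Together these cover both history and hash routing modes.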
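A minimal sketch of this override, assuming a hypothetical patchHistory function that takes the history object, a function returning the current URL, and a reporting callback (all three names are illustrative, not the SDK's API):

```javascript
// Sketch: wrap pushState/replaceState on a history-like object and report each transition.
function patchHistory(historyObj, getUrl, onRoute) {
  let lastUrl = getUrl();
  for (const method of ['pushState', 'replaceState']) {
    const original = historyObj[method];
    if (typeof original !== 'function') continue;
    historyObj[method] = function (...args) {
      const from = lastUrl;
      const result = original.apply(this, args);
      lastUrl = getUrl(); // read after the original call so the URL has changed
      onRoute({ type: 'Route', method, from, to: lastUrl, time: Date.now() });
      return result;
    };
  }
  // For back/forward navigation and hash changes: call this from popstate/hashchange listeners
  return function notifyExternalChange(eventType) {
    const from = lastUrl;
    lastUrl = getUrl();
    if (from !== lastUrl) onRoute({ type: 'Route', method: eventType, from, to: lastUrl, time: Date.now() });
  };
}
```

In a browser this might be wired up as const notify = patchHistory(history, () => location.href, reportData), followed by window.addEventListener('popstate', () => notify('popstate')) and the same for 'hashchange'. The from !== lastUrl guard avoids a duplicate report when a hash navigation fires both events.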
Click Events
document.addEventListener('click', ({ target }) => {
  const tag = target.tagName.toLowerCase();
  if (tag === 'body') return;
  // Serialize the clicked element as a short HTML-like string
  const dom = `<${tag} id="${target.id}" class="${target.className}">${target.innerText}</${tag}>`;
  reportData({ type: 'Click', dom });
}, true);
Resource Loading
Using performance.getEntriesByType('resource') the SDK builds a waterfall chart, determines cache status, and reports load durations.
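One way to derive cache status from these entries is sketched below. The rule used here (a transferSize of 0 with a non-empty decoded body suggests a cache hit) is a common heuristic rather than something the article specifies, and summarizeResources is a hypothetical name.

```javascript
// Heuristic sketch: classify PerformanceResourceTiming-like entries and compute a cache hit rate.
function summarizeResources(entries) {
  const resources = entries.map((entry) => ({
    name: entry.name,
    type: entry.initiatorType,
    duration: Math.round(entry.duration),
    // transferSize 0 with a non-empty body suggests the response came from cache.
    // Cross-origin entries without Timing-Allow-Origin report 0 for both, so they stay unknown.
    cached: entry.decodedBodySize > 0 ? entry.transferSize === 0 : null,
  }));
  const known = resources.filter((r) => r.cached !== null);
  const hits = known.filter((r) => r.cached).length;
  return { resources, cacheHitRate: known.length ? hits / known.length : null };
}
```

In a browser it would be called as summarizeResources(performance.getEntriesByType('resource')); the per-resource durations can feed the waterfall chart and cacheHitRate the caching statistics.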
Monitoring SDK Architecture
The SDK follows a publish‑subscribe pattern, exposing an init entry point that registers framework‑specific hooks (Vue errorHandler, React ErrorBoundary) and installs the data‑collection modules.
Event Publishing & Subscribing
Modules such as load.js replace native APIs with wrapped versions that emit events for the core system.
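The publish-subscribe core that such modules emit into can be very small. The sketch below is illustrative only; createEventBus, subscribe, and publish are assumed names, not the SDK's actual API.

```javascript
// Minimal publish-subscribe sketch; names are illustrative, not the SDK's actual API.
function createEventBus() {
  const handlers = new Map(); // event type -> array of callbacks
  return {
    subscribe(type, callback) {
      if (!handlers.has(type)) handlers.set(type, []);
      handlers.get(type).push(callback);
    },
    publish(type, payload) {
      for (const callback of handlers.get(type) || []) {
        try {
          callback(payload);
        } catch (err) {
          // A failing subscriber must not break the page or the other subscribers
          console.error('monitor subscriber failed:', err);
        }
      }
    },
  };
}
```

A wrapped API would then call something like bus.publish('xhr', data) from inside its wrapper, while the breadcrumb and reporting modules subscribe to the event types they care about. This keeps the patched browser APIs decoupled from what is done with the data.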
User‑Behavior Module
Implemented in core/breadcrumb.js, it maintains a bounded stack of actions and merges it into error payloads.
Data Transport
Both image‑beacon and fetch transports are supported; image beacons are cross‑origin friendly and fire‑and‑forget, while fetch provides richer payloads.
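As an illustration of how the two transports might be combined, the sketch below picks one by payload size. The 2000-character cutoff, the data query parameter, and the function names are assumptions for the example; the article does not state how the SDK chooses.

```javascript
// Sketch: choose a transport by payload size. The cutoff is an assumed, conservative URL limit.
const MAX_BEACON_URL_LENGTH = 2000;

function buildBeaconUrl(endpoint, data) {
  const separator = endpoint.includes('?') ? '&' : '?';
  return `${endpoint}${separator}data=${encodeURIComponent(JSON.stringify(data))}`;
}

// createImage and fetchFn are injectable so the function can be exercised outside a browser.
function sendReport(endpoint, data, { createImage = () => new Image(), fetchFn = fetch } = {}) {
  const url = buildBeaconUrl(endpoint, data);
  if (url.length <= MAX_BEACON_URL_LENGTH) {
    createImage().src = url; // GET request; the response is ignored
    return 'image';
  }
  fetchFn(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data),
  }).catch(() => {}); // reporting failures must never surface to the page
  return 'fetch';
}
```

Small events such as clicks fit in an image URL, while larger payloads such as breadcrumb stacks or recordings go through fetch as a JSON body.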
Reporting Timing
The SDK prefers requestIdleCallback to send data during idle periods, falling back to micro‑tasks when unavailable.
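The scheduling rule described above can be sketched in a few lines. The function name scheduleReport, the env parameter, and the 3-second timeout are assumptions made for the example.

```javascript
// Sketch: run a reporting task when the browser is idle, otherwise as a microtask.
function scheduleReport(task, env = globalThis) {
  if (typeof env.requestIdleCallback === 'function') {
    // The timeout (an assumed value) forces the task to run even if the page never becomes idle
    env.requestIdleCallback(() => task(), { timeout: 3000 });
    return 'idle';
  }
  Promise.resolve().then(task); // microtask fallback: runs after the current script finishes
  return 'microtask';
}
```

Either way the report is deferred until after the code that triggered it has finished, so data collection does not add latency to the user interaction being measured.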
Sohu Tech Products