How to Build a Full‑Featured Front‑End Monitoring System
This article explains how to design and implement a comprehensive front‑end monitoring solution that captures errors, performance metrics, and client data, covering data collection, tracing, transmission, storage, and analysis to help developers quickly locate and resolve issues.
Background
Although the company had solid server‑side monitoring, client‑side monitoring was lacking. A custom front‑end monitoring system was therefore built to capture errors, reconstruct the scenario in which each error occurred, and analyze page performance so that problems are detected early.
The system is organized around the three pillars of observability: tracing, metrics, and logging.
Getting Started
Design begins with deciding which data to collect: page performance data, error data, and client information.
Performance Data
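The Core Web Vitals, Largest Contentful Paint (LCP), First Input Delay (FID), and Cumulative Layout Shift (CLS), are the primary metrics, complemented by Time to First Byte (TTFB) and First Contentful Paint (FCP) for deeper diagnostics. Note that Google replaced FID with Interaction to Next Paint (INP) as a Core Web Vital in March 2024.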
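As a rough illustration of how collected values might be bucketed before reporting, the sketch below rates a metric against the publicly documented "good" and "poor" boundaries. The names `rateMetric` and `THRESHOLDS` are made up for this example and are not part of the original system; INP is included only because it succeeded FID as a Core Web Vital.

```typescript
// Time-based metrics are in milliseconds; CLS is a unitless score.
type MetricName = "LCP" | "FID" | "INP" | "CLS" | "TTFB" | "FCP";
type Rating = "good" | "needs-improvement" | "poor";

// [upper bound of "good", upper bound of "needs-improvement"]
const THRESHOLDS: Record<MetricName, [number, number]> = {
  LCP: [2500, 4000],
  FID: [100, 300],
  INP: [200, 500], // successor to FID
  CLS: [0.1, 0.25],
  TTFB: [800, 1800],
  FCP: [1800, 3000],
};

// Hypothetical helper: classify one sample of one metric.
function rateMetric(name: MetricName, value: number): Rating {
  const [good, poor] = THRESHOLDS[name];
  if (value <= good) return "good";
  if (value <= poor) return "needs-improvement";
  return "poor";
}
```

Storing the rating alongside the raw value keeps later aggregation simple, since the share of "good" page loads can be computed without re-applying thresholds in every query.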
Error Data
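Browser errors are mainly instances of Error and DOMException. They can be captured in three ways: script errors via window.onerror, resource loading errors via the failing element's onerror handler (or a capturing error listener on window, because these events do not bubble) together with the Performance API, and unhandled Promise rejections via the unhandledrejection event.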
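The three capture paths can be funneled into one record shape. The following is a minimal sketch, not the original implementation: `ErrorRecord`, `fromErrorEvent`, `fromRejection`, and `installHandlers` are hypothetical names, and the event target is passed in so the logic can be exercised outside a browser.

```typescript
interface ErrorRecord {
  kind: "script" | "resource" | "promise";
  message: string;
  stack?: string;
  source?: string; // script URL or URL of the failing resource
}

// Only the parts of an `error` event that this sketch reads.
interface ErrorEventLike {
  message?: string;
  filename?: string;
  error?: unknown;
  target?: unknown;
}

function fromErrorEvent(ev: ErrorEventLike): ErrorRecord {
  const t = ev.target as { tagName?: string; src?: string; href?: string } | null | undefined;
  // Resource failures are dispatched on the element itself (img, script, link, ...).
  if (t && typeof t.tagName === "string") {
    return {
      kind: "resource",
      message: `Failed to load <${t.tagName.toLowerCase()}>`,
      source: t.src || t.href,
    };
  }
  const err = ev.error;
  return {
    kind: "script",
    message: err instanceof Error ? err.message : String(ev.message ?? "Unknown error"),
    stack: err instanceof Error ? err.stack : undefined,
    source: ev.filename,
  };
}

function fromRejection(reason: unknown): ErrorRecord {
  if (reason instanceof Error) {
    return { kind: "promise", message: reason.message, stack: reason.stack };
  }
  return { kind: "promise", message: String(reason) };
}

// `report` stands for whatever function forwards records to the backend.
function installHandlers(
  target: { addEventListener(type: string, cb: (ev: any) => void, capture?: boolean): void },
  report: (record: ErrorRecord) => void,
): void {
  // Capture phase: resource `error` events do not bubble up to window.
  target.addEventListener("error", (ev) => report(fromErrorEvent(ev)), true);
  target.addEventListener("unhandledrejection", (ev) => report(fromRejection(ev.reason)));
}
```

In a page this would typically be wired up once, as early as possible, with something like `installHandlers(window, send)`, where `send` is the application's own reporting function.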
Client Data
Collect browser name/version, OS version, WebView container info, network details, and optional user identifiers, while respecting data‑privacy principles.
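One way to gather this information is to derive it from the user agent string. The sketch below is a heuristic under stated assumptions: the regular expressions cover only a few common browsers, the `; wv)` token is the marker Android WebView places in its user agent, and `collectClientInfo` and the `NavigatorLike` shape are names invented for this example. User agent strings are easy to spoof and increasingly reduced by browsers, so the result should be treated as a hint.

```typescript
interface ClientInfo {
  browser: string;
  browserVersion: string;
  os: string;
  webView: boolean;
  networkType?: string;
  userId?: string; // optional, only when privacy rules allow it
}

interface NavigatorLike {
  userAgent: string;
  connection?: { effectiveType?: string };
}

// Order matters: Edge UAs also contain "Chrome", and Chrome UAs contain "Safari".
const BROWSER_PATTERNS: Array<[string, RegExp]> = [
  ["Edge", /Edg\/([\d.]+)/],
  ["Firefox", /Firefox\/([\d.]+)/],
  ["Chrome", /Chrome\/([\d.]+)/],
  ["Safari", /Version\/([\d.]+).*Safari/],
];

const OS_PATTERNS: Array<[string, RegExp]> = [
  ["Android", /Android ([\d.]+)/],
  ["iOS", /(?:iPhone|iPad).*? OS ([\d_]+)/],
  ["Windows", /Windows NT ([\d.]+)/],
  ["macOS", /Mac OS X ([\d_.]+)/],
];

function firstMatch(ua: string, patterns: Array<[string, RegExp]>): [string, string] {
  for (const [name, re] of patterns) {
    const m = re.exec(ua);
    if (m) return [name, (m[1] ?? "").replace(/_/g, ".")];
  }
  return ["unknown", ""];
}

function collectClientInfo(nav: NavigatorLike, userId?: string): ClientInfo {
  const [browser, browserVersion] = firstMatch(nav.userAgent, BROWSER_PATTERNS);
  const [osName, osVersion] = firstMatch(nav.userAgent, OS_PATTERNS);
  return {
    browser,
    browserVersion,
    os: osVersion ? `${osName} ${osVersion}` : osName,
    webView: /; wv\)/.test(nav.userAgent), // Android WebView heuristic only
    networkType: nav.connection?.effectiveType,
    userId,
  };
}
```

In a browser the call would be `collectClientInfo(navigator)`; passing the navigator in rather than reading a global keeps the function easy to test.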
Overall Design
The system consists of three major components: collector, transmission & storage, and query & analysis.
Collector
The collector registers listeners for the error and unhandledrejection events and sends the captured records to the backend. To avoid overload when the same error fires repeatedly, records pass through a simple queue that has a length limit and deduplicates entries within a time window.
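A queue with those two properties might look like the following sketch. `ErrorQueue` and its parameters are hypothetical; the clock is injectable so the time window can be tested deterministically.

```typescript
class ErrorQueue<T> {
  private items: T[] = [];
  private lastSeen = new Map<string, number>();
  private readonly maxLength: number;
  private readonly windowMs: number;
  private readonly keyOf: (item: T) => string;
  private readonly now: () => number;

  constructor(
    maxLength: number,
    windowMs: number,
    keyOf: (item: T) => string, // e.g. message + first stack frame
    now: () => number = Date.now,
  ) {
    this.maxLength = maxLength;
    this.windowMs = windowMs;
    this.keyOf = keyOf;
    this.now = now;
  }

  // Returns true when the item was queued, false when it was dropped.
  push(item: T): boolean {
    const t = this.now();
    const key = this.keyOf(item);
    const seen = this.lastSeen.get(key);
    if (seen !== undefined && t - seen < this.windowMs) return false; // duplicate inside the window
    if (this.items.length >= this.maxLength) return false; // queue is full
    this.lastSeen.set(key, t);
    this.items.push(item);
    return true;
  }

  // Hands the current batch to the caller and empties the queue.
  drain(): T[] {
    const batch = this.items;
    this.items = [];
    return batch;
  }
}
```

A sender would call `drain()` on a timer or when the queue fills up. The `lastSeen` map grows with the number of distinct errors, so a production version would also want to evict keys older than the window.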
Stack traces are normalized using tools like TraceKit, then mapped back to original sources via SourceMap files uploaded during the Webpack build.
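To show what normalization involves, here is a hand-written sketch that parses the two most common stack formats into a uniform frame list. It is not TraceKit's API, the regular expressions are simplified assumptions that ignore edge cases such as eval frames, and `parseStack` is a name chosen for this example.

```typescript
interface StackFrame {
  functionName: string;
  file: string;
  line: number;
  column: number;
}

// V8 (Chrome, Node): "at fn (https://host/app.js:10:15)" or "at https://host/app.js:10:15"
const V8_FRAME = /^at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?$/;
// Firefox and Safari: "fn@https://host/app.js:10:15"
const GECKO_FRAME = /^(.*?)@(.+?):(\d+):(\d+)$/;

function parseStack(stack: string): StackFrame[] {
  const frames: StackFrame[] = [];
  for (const raw of stack.split("\n")) {
    const text = raw.trim();
    const m = V8_FRAME.exec(text) ?? GECKO_FRAME.exec(text);
    if (!m) continue; // e.g. the leading "Error: message" line
    frames.push({
      functionName: m[1] || "<anonymous>",
      file: m[2] ?? "",
      line: Number(m[3]),
      column: Number(m[4]),
    });
  }
  return frames;
}
```

Each frame's file, line, and column in the minified bundle is the input a source map lookup needs to recover the original position, so this step has to run before the mapping.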
Performance data is gathered by listening to page load events and creating a timeline of spans (resource loading, rendering, network requests, user interactions). Transactions start on page load or navigation and end when a configurable heartbeat detects inactivity.
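The transaction lifecycle described above can be sketched as a small class: spans extend the transaction, and a periodic heartbeat closes it once nothing has happened for a configurable idle period. `Transaction`, `Span`, and `idleMs` are illustrative names rather than the original code.

```typescript
interface Span {
  name: string; // e.g. "resource", "render", "xhr", "click"
  start: number;
  end: number;
}

class Transaction {
  readonly name: string;
  readonly start: number;
  readonly spans: Span[] = [];
  end: number | undefined = undefined;
  private lastActivity: number;
  private readonly idleMs: number;

  constructor(name: string, start: number, idleMs: number) {
    this.name = name;
    this.start = start;
    this.idleMs = idleMs;
    this.lastActivity = start;
  }

  addSpan(span: Span): void {
    if (this.end !== undefined) return; // a finished transaction no longer changes
    this.spans.push(span);
    this.lastActivity = Math.max(this.lastActivity, span.end);
  }

  // Call on every heartbeat tick. Returns true once the transaction is closed.
  heartbeat(now: number): boolean {
    if (this.end === undefined && now - this.lastActivity >= this.idleMs) {
      // Close at the last observed activity, not at the tick that noticed the idleness.
      this.end = this.lastActivity;
    }
    return this.end !== undefined;
  }
}
```

In a page, the heartbeat could be driven by a timer that calls `tx.heartbeat(performance.now())` every few hundred milliseconds and hands the transaction to the sender when it returns true.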
Transmission & Storage
Data can be sent via an image request (GET) for small payloads or Ajax POST for larger payloads. To avoid CORS preflight, the POST can be a simple request with Content-Type: text/plain. For efficiency, Protobuf can replace JSON, with binary data optionally Base64‑encoded for browser transport.
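The choice between the two transports can be expressed as a pure function that decides based on the resulting URL length. This is a sketch under assumptions: the `d` query parameter, the 2000 character cut-off, and the name `buildRequest` are all invented for illustration, and JSON is used rather than Protobuf to keep the example dependency-free.

```typescript
type BeaconRequest =
  | { method: "GET"; url: string }
  | { method: "POST"; url: string; headers: { "Content-Type": "text/plain" }; body: string };

function buildRequest(endpoint: string, payload: unknown, maxUrlLength = 2000): BeaconRequest {
  const json = JSON.stringify(payload);
  const url = `${endpoint}?d=${encodeURIComponent(json)}`;
  // Small payloads fit in a query string and can be sent as an image request.
  if (url.length <= maxUrlLength) return { method: "GET", url };
  // text/plain is a CORS-safelisted Content-Type, so this POST triggers no preflight.
  return {
    method: "POST",
    url: endpoint,
    headers: { "Content-Type": "text/plain" },
    body: json,
  };
}
```

The caller would then dispatch a GET by assigning the URL to `new Image().src`, and a POST with `fetch` or `XMLHttpRequest`. Because the body is labeled text/plain, the server has to parse it as JSON explicitly.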
To handle page unload scenarios, the fetch API with the keepalive option is preferred over navigator.sendBeacon.
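A flush-on-unload routine using keepalive might look like the sketch below. Browsers cap the combined body size of in-flight keepalive requests at 64 KiB, so the function sends as many records as fit in one request and reports how many were sent. `sendOnUnload` and `FetchLike` are hypothetical names, and the fetch function is injected so the logic can run outside a browser.

```typescript
type FetchLike = (
  url: string,
  init: { method: string; keepalive: boolean; headers: Record<string, string>; body: string },
) => Promise<unknown>;

const KEEPALIVE_LIMIT_BYTES = 64 * 1024;

// Returns the number of records included in the request.
function sendOnUnload(fetchFn: FetchLike, endpoint: string, records: object[]): number {
  const encoder = new TextEncoder();
  const parts: string[] = [];
  let size = 2; // the enclosing "[" and "]"
  for (const record of records) {
    const json = JSON.stringify(record);
    const extra = encoder.encode(json).length + (parts.length > 0 ? 1 : 0); // +1 for the comma
    if (size + extra > KEEPALIVE_LIMIT_BYTES) break; // the rest will not fit
    parts.push(json);
    size += extra;
  }
  if (parts.length === 0) return 0;
  fetchFn(endpoint, {
    method: "POST",
    keepalive: true, // lets the request outlive the page
    headers: { "Content-Type": "text/plain" },
    body: `[${parts.join(",")}]`,
  }).catch(() => {
    // The page is going away; there is nothing useful to do on failure.
  });
  return parts.length;
}
```

A typical trigger is a `visibilitychange` listener that calls `sendOnUnload(fetch, endpoint, queue.drain())` when the document becomes hidden. Unlike sendBeacon, this path leaves the method and headers under the caller's control.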
Collected records are stored in a document‑oriented or log‑oriented database (e.g., Alibaba Cloud SLS). High‑throughput scenarios may also use a message queue such as Kafka.
Query & Analysis
Data is split into multiple tables (transactions, exceptions, spans) to enable multidimensional analysis. Using SLS’s query capabilities, developers can compute metrics, drill into specific transactions, visualize timelines, and perform error analysis across browsers and WebView containers.
Metric Analysis: Aggregate transaction metrics to spot performance issues.
Transaction Analysis: Reconstruct the full operation timeline for a given transaction.
Error Analysis: Identify and reproduce errors, especially in cross‑platform H5 pages.
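To make the metric analysis step concrete, the sketch below computes the 75th percentile duration per transaction name in memory, using the nearest-rank method. In the system described here this aggregation would run as a query in the log store; the code is only an in-process illustration, and `TransactionRow`, `percentile`, and `p75ByName` are hypothetical names.

```typescript
interface TransactionRow {
  name: string; // e.g. route or page name
  durationMs: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function p75ByName(rows: TransactionRow[]): Map<string, number> {
  const groups = new Map<string, number[]>();
  for (const row of rows) {
    const list = groups.get(row.name) ?? [];
    list.push(row.durationMs);
    groups.set(row.name, list);
  }
  const result = new Map<string, number>();
  groups.forEach((values, name) => result.set(name, percentile(values, 75)));
  return result;
}
```

Percentiles are usually more informative than averages for this kind of data, because a handful of very slow page loads can distort a mean while leaving the typical experience unchanged.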
Conclusion
The monitoring system comprises tracing, metrics, and logging, and can be further refined using mature libraries such as Sentry.
