Building a Real‑Time Log Tracker for Phone SDKs Using Cloud‑Native Design
This article describes the design and implementation of a comprehensive log tracking system for a phone SDK, covering client‑side logging, colored classification, plugin mechanisms, cloud‑native architecture, serverless functions, Elasticsearch storage, and real‑time visual debugging to enable rapid issue identification and resolution.
Background
Qidian Phone SDK provides telephone services on web pages; with millions of minutes of daily outbound calls, high stability and rapid fault response are essential.
Common troubleshooting methods
Analyze logs to locate issues, though log readability can be poor.
Ask users for environment details and try to reproduce locally; tools like rrweb can record user actions but may not capture all failure contexts.
Remote or onsite inspection, which often requires customer cooperation and can be time‑consuming.
Tracker system design and implementation
The Tracker system provides three core functions: log classification recording, storage, and remote viewing.
Log classification recording
Uses colored console output to distinguish log categories. Example:
console.log('%cLog content %c2021-12-27', 'color: green;', 'color: red;background-color: yellow;');
Six common log data forms are supported: text, data‑source, tag, list, alarm, and error.
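As a rough sketch of how a log category could be mapped to console styling, the helper below builds the `%c` format string and its matching CSS arguments. The `CATEGORY_STYLES` table, its colors, and the `buildConsoleArgs` name are assumptions made for illustration; they are not the SDK's actual API.

```javascript
// Hypothetical style table: the colors are illustrative, not Tracker's real palette.
const CATEGORY_STYLES = {
  text: 'color: gray;',
  tag: 'color: green;',
  alarm: 'color: orange; font-weight: bold;',
  error: 'color: red; font-weight: bold;',
};

// Build the argument list for console.log: one format string containing two
// %c directives, followed by one CSS string per directive.
function buildConsoleArgs(category, message) {
  // Unknown categories fall back to the plain text style.
  const style = CATEGORY_STYLES[category] || CATEGORY_STYLES.text;
  // The second %c with an empty style resets formatting for the message body.
  return [`%c[${category}]%c ${message}`, style, ''];
}

console.log(...buildConsoleArgs('alarm', 'ticket about to expire'));
```

Keeping the styles in one table means a new category only needs a new entry rather than a new logging function.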
Tracker instances are created with optional labels and colors:
let t1 = new T(options);
let t2 = new T('labelName'); // quick way
t1.$log('msg1'); // color 1
t2.$log('msg2'); // color 2
Random color generation ensures sufficient contrast:
export const getRandomColor = function (dark) {
let hue = Math.floor(Math.random() * 360);
return `hsl(${hue}, 100%, ${dark ? 30 : 60}%)`;
};
Plugin mechanism
Plugins are registered via T.$use:
import pluginName from '@tencent/Tracker/path/to/plugin';
T.$use(pluginName, { /* options */ });
Implemented plugins:
logData: local storage (WebSQL, IndexedDB, localStorage) with expiration control.
logViewer: reproduces console output, supports filtering, and works on mobile H5 and hybrid pages.
logReport: near‑real‑time upload to the server with buffering; the upload success rate reaches six nines (99.9999%).
When upload is disabled, logs can still be fetched on demand through a single pull triggered by EventPush.
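The plugin mechanism and the buffered upload can be sketched together. This is a minimal illustration, not the Tracker source: `MiniTracker`, the `install()` contract, the `onLog` hook, and the `batchSize` and `send` options are all hypothetical names chosen for the example.

```javascript
// Minimal tracker with a static plugin registry. MiniTracker, install(), and
// the onLog hook are hypothetical names chosen for this sketch.
class MiniTracker {
  static plugins = [];

  // Register a plugin; its install() receives the options and returns hooks.
  static $use(plugin, options = {}) {
    MiniTracker.plugins.push(plugin.install(options));
  }

  constructor(label) {
    this.label = label;
  }

  $log(message) {
    const entry = { label: this.label, message, time: Date.now() };
    // Fan the entry out to every registered plugin.
    for (const hooks of MiniTracker.plugins) hooks.onLog(entry);
  }
}

// Example plugin: buffer entries and hand them to `send` in batches.
const bufferedReport = {
  install({ batchSize = 3, send }) {
    const buffer = [];
    return {
      onLog(entry) {
        buffer.push(entry);
        // Flush the whole buffer once it reaches the batch size.
        if (buffer.length >= batchSize) send(buffer.splice(0, buffer.length));
      },
    };
  },
};

const sent = [];
MiniTracker.$use(bufferedReport, { batchSize: 2, send: (batch) => sent.push(batch) });
const t = new MiniTracker('call');
t.$log('dial'); // buffered
t.$log('ringing'); // second entry triggers a flush of both
```

Batching trades a small delay for fewer network requests; a production reporter would presumably also flush on a timer and on page unload, which this sketch omits.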
Log system architecture
Four layers: interface, processing, storage, and data display.
Interface layer handles high concurrency, uses load balancing, and forwards raw logs to a message queue.
Processing layer parses, normalizes, enriches logs, and forwards them to storage.
Storage layer (e.g., Elasticsearch) stores logs by time for efficient query and periodic cleanup.
Data display layer provides query APIs and dashboards such as Kibana.
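The processing layer's parse, normalize, and enrich steps might look roughly like the function below. The raw field names (`ts`, `level`, `msg`, `callid`) and the output shape are assumptions for this sketch; the article does not specify the actual log schema.

```javascript
// Turn one raw queue message into a normalized document.
// Input field names (ts, level, msg, callid) are assumed for this sketch.
function normalizeLog(rawLine, receivedAt) {
  let parsed = null;
  try {
    parsed = JSON.parse(rawLine);
  } catch (err) {
    parsed = null;
  }
  // Keep unparseable input as an error record instead of dropping it silently.
  if (parsed === null || typeof parsed !== 'object') {
    return {
      timestamp: receivedAt,
      level: 'error',
      message: String(rawLine),
      callid: null,
      parseError: true,
      receivedAt,
    };
  }
  const date = new Date(parsed.ts);
  return {
    // Normalize: ISO timestamp (falling back to the receive time), lowercase level.
    timestamp: Number.isNaN(date.getTime()) ? receivedAt : date.toISOString(),
    level: String(parsed.level || 'info').toLowerCase(),
    message: String(parsed.msg ?? ''),
    callid: parsed.callid ?? null,
    parseError: false,
    // Enrich: record when the server received the log.
    receivedAt,
  };
}
```

Giving every document the same fields, including malformed ones, keeps the storage mapping stable and makes parse failures searchable.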
Technology choices
Traditional solution: ELK + Kafka with load balancer, Logstash, and Elasticsearch.
Cloud‑native solution: leverage cloud services for load balancing, message queues, Elasticsearch, and serverless functions to reduce operational overhead.
Serverless functions
Interface function triggered by API Gateway to receive logs, validate, and push to Kafka.
Processing & storage function triggered by Kafka to format logs and bulk insert into Elasticsearch.
Query function triggered by API Gateway for the logViewer plugin.
Index‑management function runs periodically to delete old indices based on disk usage.
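Two pieces of the functions listed above lend themselves to a short sketch: building the newline-delimited body that Elasticsearch's `_bulk` endpoint expects, and choosing which old indices to drop when storage exceeds a budget. The `tracker-logs` prefix, the daily index naming, and the `sizeBytes` input shape are assumptions; the article only says that logs are stored by time and cleaned up based on disk usage.

```javascript
// Daily index name such as "tracker-logs-2021.12.27". The prefix is hypothetical.
function indexNameFor(isoTimestamp, prefix = 'tracker-logs') {
  return `${prefix}-${isoTimestamp.slice(0, 10).replace(/-/g, '.')}`;
}

// Build the body for the _bulk endpoint: one action line, then the document,
// each terminated by a newline.
function buildBulkBody(docs) {
  const lines = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: indexNameFor(doc.timestamp) } }));
    lines.push(JSON.stringify(doc));
  }
  return lines.map((line) => line + '\n').join('');
}

// Pick the oldest indices to delete until the total size fits the budget.
// `indices` is [{ name, sizeBytes }]; date-suffixed names sort chronologically.
function pickIndicesToDelete(indices, maxTotalBytes) {
  const sorted = [...indices].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  let total = sorted.reduce((sum, idx) => sum + idx.sizeBytes, 0);
  const toDelete = [];
  for (const idx of sorted) {
    if (total <= maxTotalBytes) break;
    toDelete.push(idx.name);
    total -= idx.sizeBytes;
  }
  return toDelete;
}
```

With one index per day, cleanup becomes a cheap whole-index delete rather than a delete-by-query over individual documents.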
Clue log visualization
logViewer allows users to specify a time range and outbound number to locate issues. Logs are grouped by main account, staff account, and callid, each with its own query page.
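A query of this kind could be expressed in Elasticsearch's query DSL roughly as follows, with the results then grouped by call. The field names `timestamp`, `outboundNumber`, and `callid` are assumed for illustration, since the article does not show the index mapping.

```javascript
// Build an Elasticsearch search body for one outbound number in a time range.
// The field names `timestamp` and `outboundNumber` are assumptions.
function buildClueQuery({ from, to, outboundNumber, size = 200 }) {
  return {
    size,
    sort: [{ timestamp: 'asc' }],
    query: {
      bool: {
        // Filters skip relevance scoring, which suits exact log lookups.
        filter: [
          { range: { timestamp: { gte: from, lte: to } } },
          { term: { outboundNumber } },
        ],
      },
    },
  };
}

// Group returned documents by call ID, preserving their order within a call.
function groupByCallId(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.callid)) groups.set(doc.callid, []);
    groups.get(doc.callid).push(doc);
  }
  return groups;
}
```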
User‑side visual replay and troubleshooting
Six categories of logs (page interaction, request/response, event push, call status, websocket status, WebRTC detection) enable replay of user behavior. Playback can be paused, debugged, and resumed to pinpoint failures.
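The pause and resume behavior can be illustrated with a small cursor over time-ordered logs. This is only a sketch of the idea: the `LogReplay` class and the numeric `time` field are hypothetical, and the real viewer presumably drives a UI rather than returning entries one at a time.

```javascript
// Step through logs in timestamp order; pausing simply stops advancing.
class LogReplay {
  constructor(logs) {
    // Copy before sorting so the caller's array is left untouched.
    this.logs = [...logs].sort((a, b) => a.time - b.time);
    this.position = 0;
    this.paused = false;
  }

  pause() {
    this.paused = true;
  }

  resume() {
    this.paused = false;
  }

  // Return the next log entry, or null when paused or finished.
  next() {
    if (this.paused || this.position >= this.logs.length) return null;
    return this.logs[this.position++];
  }
}

const replay = new LogReplay([
  { time: 3, category: 'call status', message: 'hangup' },
  { time: 1, category: 'page interaction', message: 'click dial' },
  { time: 2, category: 'request/response', message: 'POST /call' },
]);
```

Because the position is kept while paused, resuming continues from the entry where inspection stopped.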
Real‑world case study
On 27 Dec a client reported frequent seat disconnections. Using Tracker, logs were filtered for warnings and errors; ticket‑expiration responses were found but no disconnection events. Further analysis showed missing ticket‑refresh calls after expiration, leading to a fix in the client’s environment.
Future roadmap
Tracker aims to cover the entire development lifecycle: automatic capture of API inputs, outputs, latency, error tracing, and alerting. Planned enhancements include dashboards for real‑time client status and richer visualizations.
Tencent Qidian Tech Team
Official account of Tencent Qidian R&D team, dedicated to sharing and discussing technology for enterprise SaaS scenarios.
