
Automated Error Log Cleanup and Monitoring Mechanism for QA

This article describes how a QA team collaborated with developers to create an automated error‑log cleanup and monitoring system, detailing the background, offline follow‑up process, identified pain points, the design of a scheduled statistics solution, platform capabilities, observed benefits, and future improvement plans.

转转QA

Software quality assurance covers both the external quality perceived by users and internal quality such as architecture and code quality; the latter is easily overlooked when the business grows rapidly.

Previously, QA focused on business-level quality (requirements, test readiness, online issues), while system-level problems such as runtime and business exceptions quietly accumulated. This prompted the team to bring system-level quality under daily control together with developers.

An offline follow-up mechanism was established in which QA handles error statistics and confirmation, while developers handle error cleanup and feedback.

The original offline workflow involved maintaining an exception type list, weekly QA statistics via Grafana, developer confirmation of errors, and weekly summary emails.

Pain points of the offline process included the high cost of QA statistics (2–3 hours per week across 13 clusters), long statistical cycles that delayed issue resolution, and fragmented documentation that made data analysis difficult.

Automated Statistics were introduced to reduce QA effort and enable near‑real‑time monitoring. A scheduled task collects exception types from the Venus service, stores required exceptions in a MySQL table, and notifies teams via an enterprise WeChat robot, while a front‑end page displays the data.
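The collection-and-notify loop described above can be sketched in a few lines. This is a minimal illustration, not the team's actual implementation: the `ExceptionStat` record, the `select_reportable` helper, and the payload shape are assumptions standing in for data pulled from the Venus service, rows stored in MySQL, and the enterprise WeChat robot's message format.

```python
from dataclasses import dataclass

# Hypothetical record for one exception type in a statistics window.
# In the real system these would be collected from the Venus service
# by the scheduled task and persisted to a MySQL table.
@dataclass
class ExceptionStat:
    cluster: str        # business cluster the exception came from
    exc_type: str       # e.g. "NullPointerException"
    count: int          # occurrences in the statistics window
    threshold: int      # report only when count exceeds this value
    must_clear: bool    # mandatory-clear flag: always report

def select_reportable(stats: list[ExceptionStat]) -> list[ExceptionStat]:
    """Keep exceptions that exceed their threshold or are flagged mandatory-clear."""
    return [s for s in stats if s.must_clear or s.count > s.threshold]

def build_robot_message(stats: list[ExceptionStat]) -> dict:
    """Build an enterprise-WeChat robot payload (markdown message type)."""
    lines = [f"- {s.cluster} / {s.exc_type}: {s.count} occurrences" for s in stats]
    content = "Daily exception report\n" + "\n".join(lines)
    return {"msgtype": "markdown", "markdown": {"content": content}}

# A scheduler would run this periodically and POST the payload to the
# robot's webhook URL; the front-end page reads the same stored rows.
```

In production the same stored rows drive both the robot notification and the front-end display, so the two views never disagree.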

The platform provides several functions:

Exception data statistics and storage with configurable thresholds and mandatory‑clear flags.

Real‑time monitoring and alerts via daily group notifications, weekly unresolved reminders, and weekly summaries of resolved exceptions.

Data visualization showing new exceptions each cycle, with color‑coded status for quick identification.

Extensibility through Apollo configuration for adding new business clusters, owners, and robot links.
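The extensibility point above amounts to keeping cluster metadata in configuration rather than code. The sketch below assumes a JSON-shaped Apollo value; the key names (`clusters`, `owner`, `robot_url`) and sample entries are illustrative, not the real schema.

```python
import json

# Hypothetical shape of the Apollo configuration value that registers
# business clusters; field names and sample data are illustrative only.
RAW_CONFIG = """
{
  "clusters": [
    {"name": "trade", "owner": "zhangsan", "robot_url": "https://example.invalid/robot-a"},
    {"name": "user",  "owner": "lisi",     "robot_url": "https://example.invalid/robot-b"}
  ]
}
"""

def load_clusters(raw: str) -> dict[str, dict]:
    """Index cluster entries by name so the scheduled task can look up
    the owner to mention and the robot webhook to notify."""
    cfg = json.loads(raw)
    return {c["name"]: c for c in cfg["clusters"]}

clusters = load_clusters(RAW_CONFIG)
# Onboarding a new business line is then a pure configuration change:
# append one entry to the Apollo value, with no code deployment needed.
```

This is why the article can report growing from 2 to 4 business lines without platform changes: each addition is one more configuration entry.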

Usage effects include an 80% gain in weekly statistical efficiency, fewer manual errors, daily rather than weekly statistics, more than 70 statistical runs covering 900+ exceptions with nearly 300 issues resolved, and expansion from 2 to 4 business lines, supporting long-term data governance.

Future plans aim to increase the granularity of statistics for faster issue localization, add new detection types such as Sonar scans, and further automate the developer follow-up process.
