Tagged articles
578 articles
DaTaobao Tech
Jul 21, 2022 · Frontend Development

Front‑end Performance Optimization: Testing, Analysis, and Sustainable Improvement Loop

The team implements a sustainable “discover‑analyze‑validate” loop that uses a performance blacklist, user‑centric metrics such as interactive time and second‑open rate, automated SDK data collection, and continuous trend monitoring to pinpoint front‑end bottlenecks, apply targeted fixes, and verify measurable load‑time improvements.

Metrics · optimization · testing
0 likes · 10 min read
21CTO
Jun 28, 2022 · Operations

Master Prometheus: From Metrics Collection to Alerts and Grafana Visualization

This comprehensive guide walks you through Prometheus fundamentals, including metric exposure, scraping, storage, querying with PromQL, custom exporter creation in Go, dynamic configuration reloading, and visualizing data with Grafana, while also covering alerting with Alertmanager and best practices for accurate histogram bucket design.

Alerting · Grafana · Metrics
0 likes · 20 min read
IT Architects Alliance
Jun 27, 2022 · Operations

Comprehensive Guide to Prometheus: Metrics Collection, Storage, Querying, Alerting and Visualization

This article provides a detailed overview of Prometheus, covering its architecture, metric exposure, scraping models, storage format, metric types, custom exporter implementation in Go, PromQL query language, built‑in functions, Grafana integration, and alerting with Alertmanager, offering practical code examples throughout.

Alerting · Go · Grafana
0 likes · 20 min read
Alibaba Cloud Developer
Jun 23, 2022 · Product Management

Mastering Growth: Frameworks, Strategies, and Sustainable Scaling

This comprehensive guide defines growth, distinguishes it from development, explains how to assess and sustain growth, outlines systematic methodologies, experimentation, growth models, organizational capabilities, and strategic leverage to achieve lasting, scalable product success.

Growth · Metrics · business scaling
0 likes · 23 min read
Zuoyebang Tech Team
Jun 17, 2022 · Big Data

How FlinkSQL Auto‑Tuning Saves Resources and Guarantees SLA

This article describes the design and implementation of an automated FlinkSQL tuning system that monitors metrics, evaluates task health with rule‑based logic, calculates optimal resource adjustments, and performs fast scaling to reduce cluster waste, lower operational costs, and maintain SLA compliance.

Akka · Auto Scaling · Flink
0 likes · 15 min read
dbaplus Community
Jun 13, 2022 · Operations

How We Built a Mini‑Program Observability Platform to Slash Incident Resolution Time

After a three‑day, ten‑person investigation into a mini‑program image‑upload failure, we designed and implemented an end‑to‑end observability platform using MDD and SRE principles, defining SLI/SLO, instrumenting client, network, gateway and backend layers, and visualizing metrics with Grafana, ClickHouse and Prometheus.

Grafana · MDD · Metrics
0 likes · 18 min read
Selected Java Interview Questions
Jun 13, 2022 · Backend Development

Guide to Setting Up Spring Boot Admin for Monitoring Spring Boot Applications

This article provides a step‑by‑step tutorial on installing and configuring Spring Boot Admin, including Maven dependencies, server and client setup, YML properties, security, Nacos registration, email notifications, custom health indicators, and Micrometer metrics to monitor Spring Boot services.

Backend Development · Configuration · Metrics
0 likes · 14 min read
Java Baker
Jun 12, 2022 · Operations

System Capacity Checklist: Key Metrics Every Architect Should Track

Architects should treat system capacity like a pre‑flight checklist, using this comprehensive guide to monitor resource usage across services, databases, and queues, and to define business metrics and state‑machine indicators that reveal bottlenecks and guide scaling decisions.

Metrics · Operations · architecture
0 likes · 5 min read
Baidu Geek Talk
Jun 6, 2022 · Industry Insights

Why Video Quality Matters and How Lingjing Revolutionizes Its Evaluation

This article explores the factors influencing video quality, explains why continuous improvement is essential, examines the importance and challenges of both subjective and objective quality assessments, and introduces Lingjing—a comprehensive, standards‑based video quality evaluation platform that addresses confidence issues and supports diverse testing scenarios.

Lingjing · Metrics · assessment
0 likes · 15 min read
NetEase Yanxuan Technology Product Team
Jun 6, 2022 · Backend Development

How a Unified Precision Testing Platform Boosted Coverage and Efficiency Across Multiple Business Units

After a year of cross‑BU experimentation, the Tianji precision testing platform was built to centralize coverage data, automate incremental analysis, and provide actionable insights, resulting in higher test coverage, faster release gating, and measurable productivity gains for both developers and QA teams.

Automation · Backend · DevOps
0 likes · 16 min read
Tencent Cloud Developer
May 30, 2022 · Cloud Native

An Introduction to Prometheus: Metrics Collection, Storage, Querying, Visualization and Alerting

Prometheus is an open‑source monitoring system that scrapes metrics from services or exporters, stores them in a time‑series database, lets users query with PromQL, visualizes data via its web UI or Grafana, and sends alerts through Alertmanager, supporting custom Go metrics, various discovery methods, and four metric types.

Alerting · Go · Grafana
0 likes · 21 min read
Meituan Technology Team
May 12, 2022 · Operations

Systematic Data Governance Framework and Practices at Meituan Accommodation

The Meituan Accommodation data governance team shares how they evolved from ad‑hoc, single‑point fixes to a systematic, automated governance framework—covering management, standards, capability, execution, evaluation, and vision—using standardization, digitization, and systematization to achieve measurable quality, cost and efficiency gains across thousands of data assets.

Automation · Data Governance · Digitization
0 likes · 33 min read
IT Architects Alliance
May 6, 2022 · R&D Management

Optimizing IT System Performance and R&D Workflow: From Metrics to DevOps Value‑Stream

The article explains how technical leaders can apply quantitative analysis and systematic measurement to optimize both software system performance and organizational workflows, using a picture‑recognition service example and a real‑world DevOps incident to illustrate the need for end‑to‑end process mapping, bottleneck identification, and continuous improvement.

Metrics · Performance Optimization · R&D management
0 likes · 11 min read
Bilibili Tech
Apr 26, 2022 · Operations

Bilibili's SRE Practice for Business Stability: Theory, Metrics, and Operational Implementation

Bilibili’s SRE team combines stability theory, detailed fault‑stage and operational metrics, and a unified emergency‑response platform—including on‑call scheduling, fault‑command incident commanders, automated fault portraits, and rapid post‑mortems—to transform frequent incidents into data‑driven, collaborative recoveries and lay groundwork for AI‑assisted self‑healing.

Business Stability · Metrics · Oncall
0 likes · 23 min read
dbaplus Community
Apr 25, 2022 · Operations

From Monitoring to Observability: Expert Insights on Evolving Cloud‑Native Operations

In this interview series, three industry experts explain how monitoring differs from observability, the shifts required for ops, developers, and architects, the core methodologies and technologies behind metrics, traces, and logs, and practical guidance for selecting and integrating observability tools in cloud‑native environments.

Metrics · Observability · Operations
0 likes · 16 min read
ELab Team
Apr 16, 2022 · Frontend Development

Master Front‑End Monitoring: From Data Collection to Performance Metrics

This article outlines the end‑to‑end workflow for front‑end monitoring in an APM platform, covering data collection, reporting, cleaning, storage, and consumption, and dives deep into environment info, exception handling, performance metrics, and efficient data upload strategies.

APM · Metrics · Web
0 likes · 18 min read
YunZhu Net Technology Team
Apr 15, 2022 · Operations

Design and Architecture of a Cloud‑Native Monitoring Platform for Business Systems

The document outlines the background, vision, current status, technical research, value, product and technical architecture, and functional design of a cloud‑native monitoring platform that integrates SkyWalking and Prometheus to provide comprehensive APM, resource utilization, alerting, and rapid fault localization for business and technical middle‑platform services.

APM · Metrics · Observability
0 likes · 10 min read
Alibaba Cloud Native
Apr 13, 2022 · Cloud Native

From Dapper to OpenTelemetry: A Practical Guide to Distributed Tracing and Observability

This article explains the challenges of long request chains in micro‑service architectures, reviews Google’s Dapper tracing requirements, introduces OpenTracing and OpenCensus standards, compares their strengths, and details how OpenTelemetry unifies tracing, metrics and logs with practical integration steps and best‑practice guidance.

Cloud Native · Distributed Tracing · Metrics
0 likes · 24 min read
High Availability Architecture
Mar 28, 2022 · Cloud Native

Best Practices for Building an Integrated Monitoring Platform with Prometheus in a Microservice Architecture

This article explains the monitoring challenges introduced by microservice and container evolution, why Prometheus is the preferred observability solution in the cloud‑native era, and presents a comprehensive, multi‑tenant, high‑availability architecture with practical techniques for data collection, storage, query optimization, security, and future trends.

Cloud Native · Metrics · Prometheus
0 likes · 19 min read
HomeTech
Mar 16, 2022 · Cloud Native

Understanding Kubernetes Horizontal Pod Autoscaler (HPA): Mechanism, Core Source Code, and Practical Insights

This article explains how Kubernetes Horizontal Pod Autoscaler (HPA) balances resource demand and workload by automatically scaling pod replicas, describes the different metric types it supports, walks through the core controller code (Run, worker, reconcile, and replica calculation), highlights current limitations, and shares practical observations from real‑world usage.
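
As a rough illustration of the replica calculation mentioned above, the sketch below applies the scaling formula given in the Kubernetes documentation, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, together with the default 10% tolerance. It is not the controller code the article walks through: the function name and sample values are assumptions made for this example, and min/max replica bounds, unready pods, and stabilization windows are ignored.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Return the replica count suggested by the documented HPA formula.

    Hypothetical helper for illustration only; it leaves out replica
    bounds, missing-metric handling, and scale-down stabilization.
    """
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 pods averaging 200m CPU against a 100m target
    print(desired_replicas(4, 200.0, 100.0))
    # 5% over target falls within the tolerance band
    print(desired_replicas(4, 105.0, 100.0))
```

Rounding up rather than to the nearest integer means the calculation errs on the side of extra capacity when load exceeds the target.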

Horizontal Pod Autoscaler · Kubernetes · Metrics
0 likes · 11 min read
DeWu Technology
Mar 4, 2022 · Operations

Three-Level Indicator System for Engineering Quality Management

The article outlines a three‑level indicator system that quantifies engineering quality across efficiency, quality, stability, and resource dimensions, using high‑level result metrics, detailed level‑2 breakdowns, and actionable level‑3 measures to enable drill‑down analysis, risk‑warning, and continuous, data‑driven improvement.

Engineering · Indicator System · Metrics
0 likes · 10 min read
Youzan Coder
Mar 3, 2022 · Operations

How Standard Deviation Uncovers Hidden Bottlenecks in Software R&D Throughput

The article introduces a new R&D efficiency metric—throughput standard deviation—explains its statistical basis, shows how it was derived from annual reports, illustrates its application across multiple teams, and discusses practical insights and limitations for software development operations.

Metrics · Operations · R&D efficiency
0 likes · 7 min read
Efficient Ops
Mar 2, 2022 · Operations

Mastering System & Application Monitoring with the USE Method and Prometheus

This article explains how to build a comprehensive monitoring system for both infrastructure and applications, introducing the USE (Utilization‑Saturation‑Errors) method, key performance metrics, and practical components such as Prometheus, Grafana, full‑link tracing, and the ELK stack to detect and diagnose performance bottlenecks.

Metrics · Prometheus · USE method
0 likes · 13 min read
DevOps Cloud Academy
Mar 2, 2022 · Operations

Key DevOps Metrics for Effective Software Delivery

This article explains the most important DevOps metrics—such as deployment frequency, lead time, automated test pass rate, change failure rate, MTTR, and others—and how tracking them helps teams improve software delivery speed, quality, and operational efficiency.

Automation · DevOps · Metrics
0 likes · 10 min read
DaTaobao Tech
Feb 21, 2022 · Frontend Development

Focused Gray Release Monitoring and Alert Configuration for Frontend Quality

To raise front‑end quality, the team implements gray‑release monitoring that triggers log analysis at a 5 % rollout, automatically generates reports within ten minutes, and uses dynamic thresholds and noise‑reduction tactics to detect errors early, enabling rapid rollback or expansion and markedly improving stability and release efficiency.

Alerting · Metrics · frontend
0 likes · 9 min read
Baobao Algorithm Notes
Feb 15, 2022 · Industry Insights

Why Your Algorithm Gains May Still Drag Down Overall Business: 6 Hidden Pitfalls

Even when individual algorithm modules show higher accuracy or revenue, the overall platform can decline due to factors like competitor encroachment, macro‑economic shifts, concept drift, overlapping marginal returns, attribution errors, and coupled A/B experiments, all of which require careful analysis and mitigation.

AB testing · Metrics · algorithm
0 likes · 7 min read
Ops Development Stories
Jan 24, 2022 · Cloud Native

Deploy and Configure vmagent on Kubernetes for Efficient Metrics

This guide explains what vmagent is, its key features, and provides step‑by‑step instructions to install, configure, and verify vmagent on a Kubernetes cluster, including namespace and RBAC setup, custom scrape configs, monitoring endpoints, and troubleshooting tips.

Kubernetes · Metrics · VictoriaMetrics
0 likes · 15 min read
Efficient Ops
Jan 20, 2022 · Operations

Mastering Prometheus Metrics: Best Practices for Effective Monitoring

This article outlines practical guidelines for designing Prometheus metrics, covering how to define monitoring targets, choose appropriate vectors and labels, name metrics and labels correctly, select histogram buckets, and leverage Grafana features to visualize and troubleshoot data effectively.

Grafana · Metrics · Observability
0 likes · 11 min read
Baidu MEUX
Jan 17, 2022 · Product Management

How the GSM Model Boosts Product Design: From Goals to Metrics

The article explains why and how to use the GSM (Goal‑Signal‑Metric) model in product design, detailing its definition, applicable scenarios, step‑by‑step implementation, and a real‑world case study that demonstrates its value for aligning business, user, and design objectives.

GSM model · Metrics · Product Design
0 likes · 8 min read

Common Pitfalls in User Churn Data Analysis

This article explains three frequent mistakes in churn analysis—misinterpreting churn rates, falling into Simpson's paradox, and incorrectly inferring causality—illustrated with game‑related examples and emphasizes the need to combine multiple metrics for accurate conclusions.
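
A small worked example may make the Simpson's paradox pitfall concrete. All figures below are invented for illustration and do not come from the article; the version and segment names are hypothetical. In this made-up data the new game version shows lower churn than the old one within each player segment, yet higher churn overall, because its player base is dominated by the high-churn segment.

```python
# (churned, total) per player segment; numbers are fabricated for this sketch.
DATA = {
    "new_version": {"veteran": (6, 87), "newcomer": (71, 263)},
    "old_version": {"veteran": (36, 270), "newcomer": (25, 80)},
}


def segment_rate(version: str, segment: str) -> float:
    """Churn rate of one segment within one version."""
    churned, total = DATA[version][segment]
    return churned / total


def overall_rate(version: str) -> float:
    """Churn rate of a version with all segments pooled together."""
    churned = sum(c for c, _ in DATA[version].values())
    total = sum(t for _, t in DATA[version].values())
    return churned / total


if __name__ == "__main__":
    for segment in ("veteran", "newcomer"):
        print(segment,
              round(segment_rate("new_version", segment), 3),
              round(segment_rate("old_version", segment), 3))
    print("overall",
          round(overall_rate("new_version"), 3),
          round(overall_rate("old_version"), 3))
```

Reading only the pooled rate would suggest the new version is worse, while the per-segment rates point the other way, which is why segment mix has to be checked before comparing aggregate churn.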

Metrics · Simpson's paradox · user churn
0 likes · 5 min read
DevOps
Jan 6, 2022 · R&D Management

Self‑Organization in Agile Teams: Shifting Management Roles and Enhancing Control

The article explains how adopting self‑organization in agile teams changes the responsibilities of technical managers, makes work visible, introduces servant‑leadership and technical‑expert roles, and uses metrics and coaching to strengthen team accountability while preventing loss of control.

Metrics · servant leadership
0 likes · 6 min read
Zhuanzhuan Tech
Jan 5, 2022 · Operations

Design and Implementation of a Multi‑Dimensional Monitoring Platform Based on Prometheus and M3DB

This article details the background, research, architecture, performance testing, and deployment of a comprehensive monitoring system that leverages Prometheus, Grafana, and M3DB to provide flexible metric collection, automatic dashboard generation, and a custom alerting service for large‑scale business services.

Alerting · Metrics · Time Series
0 likes · 16 min read
DataFunTalk
Dec 28, 2021 · Artificial Intelligence

Evaluation Framework and Methodology for OPPO XiaoBu AI Assistant

This article presents a comprehensive evaluation framework for OPPO's XiaoBu AI assistant, covering evaluation concepts, objectives, five key elements, sampling methods, dimension selection, annotation scoring, report generation, and a detailed Q&A that illustrates practical metrics and processes for voice and search services.

AI Evaluation · Metrics · OPPO
0 likes · 23 min read
DataFunSummit
Dec 27, 2021 · Artificial Intelligence

Evaluation Framework and Methodology for OPPO XiaoBu AI Assistant

This article presents a comprehensive evaluation framework for OPPO's XiaoBu AI assistant, covering the concept and purpose of evaluation, the five key evaluation elements, data sampling strategies, dimension and rule selection, annotation scoring, reporting guidelines, and detailed procedures for assessing wake‑up, ASR, NLU, and TTS performance.

AI Evaluation · Metrics · Reporting
0 likes · 20 min read
Beike Product & Technology
Dec 17, 2021 · Operations

Practices for Monitoring, Resource Optimization, and Containerization of Large-Scale Flink Jobs at Beike

This article describes Beike's real‑time computing team's end‑to‑end practices for collecting and storing Flink metrics, building visual monitoring dashboards, implementing multi‑level alerting, analyzing logs, estimating CPU and memory resources, and deploying Flink on Kubernetes with containerization and storage separation to improve stability, resource utilization, and operational efficiency.

Flink · Kubernetes · Metrics
0 likes · 25 min read
ByteDance SE Lab
Dec 17, 2021 · Mobile Development

Douyin's Metric‑Driven Optimization: Boosting Creation Experience and Performance

This article details Douyin's systematic approach to improving creation experience by defining measurable goals, building a comprehensive metric system, performing relevance analysis, and implementing concrete Android and iOS performance optimizations—including album loading, component architecture, and small‑screen video quality—while outlining monitoring, tooling, and internal platform support that together deliver significant user‑facing gains.

Android · Metrics · Performance Optimization
0 likes · 24 min read
DevOps
Dec 15, 2021 · R&D Management

Using Metrics Wisely in Software Development: Avoiding Counterproductive Behaviors and Driving Real Improvement

The article explains how managers' love of metrics can unintentionally promote harmful behaviors in software teams, illustrates the problem with real-world stories, and provides practical guidelines—linking metrics to goals, tracking trends, using shorter cycles, and adapting metrics—to ensure they support, rather than hinder, the delivery of valuable software.

Metrics · double-loop learning · organizational behavior
0 likes · 21 min read
MaGe Linux Operations
Nov 19, 2021 · Databases

Essential Redis Monitoring Metrics Every Engineer Should Know

This guide outlines the key Redis monitoring metrics—including performance, memory, activity, persistence, and error indicators—provides detailed descriptions, command examples, and practical tips for using tools like redis-cli, redis-benchmark, and info commands to effectively track and troubleshoot your Redis instances.

DevOps · Metrics · redis
0 likes · 7 min read
Java Architecture Diary
Nov 19, 2021 · Backend Development

What’s New in Spring Boot 2.6? Key Features and Configuration Changes

Spring Boot 2.6 introduces Cookie SameSite support, reactive session timeout, custom data‑masking rules, automatic Redis pool configuration, richer runtime Java metrics, build‑info personalization, new startup and disk metrics, enhanced Docker image building, and many deprecated properties removed or renamed, improving security and performance.

Configuration · Docker · Java
0 likes · 7 min read
DevOps
Nov 15, 2021 · R&D Management

Key Indicators for Evaluating Genuine Agile Practices

This article outlines six practical metrics—unified backlog, user‑centric requirement description, consistent requirement flow, short lead time, high iteration completion rate, and stable iteration velocity—to help teams distinguish superficial agile ceremonies from truly effective agile execution.

Backlog · Metrics · R&D
0 likes · 5 min read
Open Source Linux
Nov 14, 2021 · Databases

Essential Redis Monitoring Metrics Every Engineer Should Know

This guide outlines the key Redis monitoring metrics—including performance, memory, basic activity, persistence, and error indicators—explains their meanings, shows how to retrieve them with Redis commands, and provides practical tips for effective performance and health tracking.

Error · Metrics · monitoring
0 likes · 6 min read
Efficient Ops
Nov 3, 2021 · Operations

How to Visualize JMeter Performance Data with Grafana, InfluxDB, and Prometheus

This article explains step‑by‑step how to collect JMeter test metrics via Backend Listener, store them in InfluxDB, and display real‑time performance charts—including TPS, response time, and error rates—in Grafana, while also covering node_exporter integration with Prometheus for system‑level monitoring.

Grafana · InfluxDB · JMeter
0 likes · 15 min read
Alimama Tech
Nov 3, 2021 · Product Management

Common Pitfalls in AB Testing: Design and Analysis Issues

AB testing often fails because practitioners skip power analysis, peek at interim results, set unrealistic null hypotheses, randomize at inappropriate units, ignore sample‑ratio mismatches, choose misleading metrics, and fall prey to segmentation errors like Simpson’s paradox, any of which can invalidate conclusions.
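
One of the pitfalls listed above, sample-ratio mismatch, can be checked with a chi-square goodness-of-fit test on the group sizes. The following is a minimal sketch under stated assumptions, not the procedure from the article: the function name, the sample counts, and the 0.001 alert threshold are all chosen for this example. With two groups the test has one degree of freedom, so the p-value can be computed with the standard library alone.

```python
import math


def srm_p_value(control: int, treatment: int,
                expected_control_share: float = 0.5) -> float:
    """P-value of a chi-square test that the observed split matches the plan.

    Hypothetical helper: with two groups (one degree of freedom) the
    chi-square survival function equals erfc(sqrt(chi2 / 2)).
    """
    total = control + treatment
    expected = (total * expected_control_share,
                total * (1.0 - expected_control_share))
    observed = (control, treatment)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Planned 50/50 split, but 50,000 vs 48,000 users were actually logged.
    p = srm_p_value(50_000, 48_000)
    print("possible SRM" if p < 0.001 else "split looks consistent", p)
```

A very small p-value suggests the assignment or logging pipeline is dropping users unevenly, in which case the experiment's metric comparison should not be trusted until the cause is found.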

AB testing · Metrics · Sample Ratio Mismatch
0 likes · 15 min read
Open Source Linux
Oct 31, 2021 · Operations

Designing Effective Metrics: From Requirements to Labels and Buckets

This guide explains how to define, name, and organize monitoring metrics—covering Google’s four golden indicators, system‑specific measurement objects, vector selection, label conventions, bucket design, and practical Grafana tips—for reliable observability of diverse services.

Metrics · Observability · labeling
0 likes · 10 min read
DevOps
Oct 22, 2021 · Operations

Making KPI Work Positively in Agile Transformation: A Three‑Step Approach

This article shares a practical three‑step method for turning KPI from a hindrance into a catalyst during agile and DevOps transformation, covering urgency creation, pilot projects, and performance‑assessment redesign, while illustrating how integrated tools and feedback loops boost delivery efficiency and business value.

DevOps · KPI · Metrics
0 likes · 9 min read
Programmer DD
Oct 18, 2021 · Databases

Master Redis Monitoring: Key Metrics, Commands, and Performance Tips

This article explains how to monitor Redis by categorizing essential performance, memory, activity, persistence, and error metrics, provides detailed tables of metric names and descriptions, lists common monitoring tools, shows configuration snippets, and demonstrates useful Redis CLI commands for real‑time insight.

CLI · Metrics · database
0 likes · 7 min read
DevOps
Oct 14, 2021 · R&D Management

Why Metrics Fail: Historical Lessons, Industry Examples, and Common Pitfalls in R&D Efficiency Measurement

The article examines why measurement systems often backfire by recounting historical tax‑related mis‑metrics, modern corporate examples like Haidilao, and a series of fundamental mistakes in software R&D efficiency metrics, urging a shift from metric‑driven thinking to purpose‑driven measurement.

Metrics · R&D efficiency · measurement pitfalls
0 likes · 12 min read
dbaplus Community
Oct 10, 2021 · Databases

Transform MySQL Slow Queries from Passive Fixes to Proactive Risk Scoring

This article presents a comprehensive MySQL slow‑query risk‑scoring model that quantifies each slow query's impact using metrics such as query count, execution time, lock wait, bytes sent and rows examined, assigns weighted scores up to 100, and demonstrates how the model enables proactive, business‑aligned remediation.

Database Performance · Metrics · mysql
0 likes · 15 min read
Alibaba Cloud Native
Oct 10, 2021 · Cloud Native

How to Detect Service and Workload Anomalies in Kubernetes with Advanced Monitoring

This article explains the common pain points of locating anomalies in Kubernetes environments and presents a multi‑layer monitoring framework—trace, metrics, events, and alerts—along with best‑practice scenarios such as network performance, DNS issues, full‑link stress testing, external MySQL access, and multi‑tenant architectures.

DNS · Kubernetes · Metrics
0 likes · 20 min read
dbaplus Community
Oct 7, 2021 · Databases

How to Measure and Eliminate Slow SQL in Large‑Scale MySQL Deployments

This article explains what MySQL slow queries are, why they cause system failures, proposes multi‑dimensional metrics to assess their severity, outlines concrete guidelines and change standards, and shares real‑world optimization cases and daily operational practices for eliminating slow SQL.

Database Performance · Metrics · Operations
0 likes · 13 min read
phodal
Sep 27, 2021 · User Experience Design

Boosting Developer Experience: A Deep Dive into Documentation Engineering

This article examines the challenges faced by documentation engineers, proposes a documentation‑centric workflow, outlines key elements such as edit‑publish separation, automation, formalization, open collaboration, and versioning, and suggests metrics and practices to measure and improve the overall developer experience.

Automation · Developer Experience · Metrics
0 likes · 14 min read
DevOpsClub
Sep 16, 2021 · R&D Management

Why Measuring R&D Efficiency Is Hard—and How to Do It Right

This article explores the fundamental difficulties of quantifying software development efficiency, outlines common measurement pitfalls and anti‑patterns, and offers practical guidance for building a systematic, data‑driven R&D performance framework that truly drives improvement.

DevOps · Metrics · R&D efficiency
0 likes · 27 min read
Efficient Ops
Sep 9, 2021 · Operations

How China Everbright Bank Reached Level‑3 DevOps Continuous Delivery: Insights and Metrics

China Everbright Bank’s interview reveals how adopting DevOps standards and a continuous delivery pipeline boosted development speed, quality, and efficiency, with measurable improvements such as 90% build success, 60 daily integrations, and faster release cycles, illustrating the strategic value of standardized, tool‑enabled software delivery.

Agile Transformation · Banking · Continuous Delivery
0 likes · 16 min read
DevOps
Sep 7, 2021 · Operations

Effective Automation Testing: Metrics and Measurement Approaches

The article examines why many automation testing initiatives fail to deliver value, introduces a set of practical metrics such as test case count, execution frequency, success rate, coverage, EMTE, ROI, and bug‑detection efficiency, and explains how to combine them to assess and improve automation effectiveness within software development processes.

EMTE, Metrics, ROI
0 likes · 10 min read
dbaplus Community
Sep 6, 2021 · Frontend Development

Building a Scalable Frontend Performance Monitoring System at 哈啰

This article details 哈啰's front‑end performance monitoring architecture, covering the background of rapid growth, a three‑step optimization workflow, data collection, cleaning, aggregation, visualization, and practical techniques like pre‑rendering and offline packages to dramatically improve page load metrics.

Metrics, data pipeline, frontend
0 likes · 30 min read
iQIYI Technical Product Team
Sep 3, 2021 · Operations

Optimizing Gray Release for iQIYI Mobile Backend Using Dogfooding

iQIYI’s mobile backend employs dogfooding‑driven gray releases with cloud‑controlled traffic, gray‑tag propagation, comprehensive front‑end and back‑end metrics, device white‑lists, and downstream service integration, allowing internal users to quickly verify code and configuration changes and catch issues before full production rollout.

Backend, Configuration, Deployment
0 likes · 9 min read
DataFunSummit
Aug 8, 2021 · Artificial Intelligence

Diversity as a Means, Not an End, in Recommendation Systems

The article argues that diversity in recommendation systems should be treated as a means rather than an ultimate goal, explains why it is hard to quantify, suggests using real performance metrics such as click‑through rate and dwell time, and offers practical strategies to improve listwise ranking.

Diversity, Metrics, listwise
0 likes · 7 min read
Qunhe Technology User Experience Design
Aug 6, 2021 · Product Management

How to Systematically Prioritize Design Work in Fast‑Paced Product Iterations

This article outlines a systematic approach for designers to identify key priorities during rapid product iterations, covering business analysis, user segmentation, strategy formulation, implementation tactics, and data‑driven validation, using the real‑world case of the Ku Dashi (Cool Master) platform.

Metrics, design strategy, product iteration
0 likes · 10 min read
Baidu Intelligent Testing
Aug 3, 2021 · Operations

Stability Governance and Observability in Baidu Search: From Kepler 1.0 to Kepler 2.0

This article examines how Baidu Search achieves five‑nine‑plus availability by analyzing stability challenges, introducing the Kepler 1.0 observability stack, evolving to Kepler 2.0 with full‑trace collection, custom compression, and practical use‑cases that dramatically improve fault diagnosis and capacity management in a massive micro‑service environment.

Backend, Metrics, large-scale systems
0 likes · 18 min read
MaGe Linux Operations
Aug 1, 2021 · Operations

Master Prometheus PQL: Essential Queries, Functions, and Tips

This article provides a comprehensive guide to PromQL, the Prometheus query language (called PQL here), covering instant and range vectors, metric types, label selectors, offsets, arithmetic and logical operators, as well as a wide range of built‑in functions with practical code examples for effective monitoring.

Metrics, PQL, Prometheus
0 likes · 11 min read
Alibaba Cloud Developer
Jul 30, 2021 · Databases

How to Tackle MySQL Slow Queries: Metrics, Strategies, and Real Cases

This article explains what constitutes a MySQL slow query and why slow queries cause failures, defines quantitative metrics such as micro‑average and macro‑average to assess severity, outlines target goals, presents concrete optimization examples, and shares operational practices for ongoing slow‑SQL governance.

Database Performance, Index Optimization, Metrics
0 likes · 13 min read
Java Interview Crash Guide
Jul 23, 2021 · Operations

How to Build a Scalable APM System: Inside the Dog Architecture

This article explains what an APM system is, compares logs, traces and metrics, reviews popular tools, and then details the design and implementation of the in‑house Dog APM platform—including client data models, Kafka pipelines, processing pipelines, storage in ClickHouse/Cassandra, and UI visualizations.

APM, ClickHouse, Java
0 likes · 28 min read
Selected Java Interview Questions
Jul 7, 2021 · Operations

Redis Monitoring Metrics and Commands Guide

This article provides a comprehensive overview of Redis monitoring metrics—including performance, memory, basic activity, persistence, and error indicators—along with recommended monitoring tools, configuration settings, and command-line examples for gathering and interpreting these metrics in production environments.

Metrics, Operations, database
0 likes · 7 min read
Baidu Geek Talk
Jun 30, 2021 · Operations

How Baidu Achieves 5‑9+ Availability: Inside Its Stability Engineering and Observability

This article dissects Baidu Search's ultra‑large micro‑service architecture, detailing the challenges of maintaining five‑nine‑plus availability, the diverse failure modes, and the step‑by‑step evolution of its observability stack: from early log‑only analysis to Kepler 1.0 and Kepler 2.0 tracing, full‑log indexing, custom span‑id generation, and compression techniques that together enable rapid root‑cause diagnosis at massive scale.

Baidu Search, Distributed Tracing, Metrics
0 likes · 21 min read
Liangxu Linux
Jun 29, 2021 · Operations

Mastering System Metrics: QPS, TPS, PV, UV, DAU, and MAU Explained

This article clarifies core web‑service metrics—QPS, TPS, PV, UV, DAU, MAU—explains their differences, shows how concurrency and throughput relate, and outlines key performance‑testing concepts and evaluation methods for modern system capacity planning.

Metrics, QPS, System Design
0 likes · 9 min read
Alibaba Cloud Developer
Jun 21, 2021 · R&D Management

Can Growth‑Hacking Principles Supercharge Software Development Efficiency?

By adapting growth‑hacking concepts such as north‑star metrics, conversion funnels, and A/B testing to the software development lifecycle, this article proposes a data‑driven “efficiency hacker” model that visualizes demand delivery paths, classifies tasks with the RIW framework, and guides teams toward faster, more transparent project outcomes.

AB testing, Growth Hacking, Metrics
0 likes · 10 min read
Big Data Technology & Architecture
May 26, 2021 · Big Data

Comprehensive Guide to Data Warehouse Concepts, Modeling, and Data Governance

This article provides an extensive overview of data warehouse fundamentals, including its purpose, core characteristics, layered architecture, and modeling methods such as dimensional and normalized modeling, as well as detailed discussions of data governance, metric systems, security standards, and practical implementation strategies for enterprise data management.

Data Warehouse, Metrics
0 likes · 70 min read
Efficient Ops
May 21, 2021 · Operations

How Anxin Securities Achieved Leading DevOps Level‑3 Continuous Delivery: Insights and Metrics

Anxin Securities’ CIO discusses how the company’s Internet Customer Service Platform passed the DevOps Standard Continuous Delivery Level 3 assessment, detailing the motivations, implementation challenges, measurable improvements, and future plans for scaling DevOps across its technology stacks.

Continuous Delivery, DevOps, Digital Transformation
0 likes · 13 min read
dbaplus Community
May 18, 2021 · Operations

Mastering End‑to‑End Monitoring: From Purpose to Prometheus Implementation

This guide explains why monitoring is essential throughout a product lifecycle, outlines monitoring modes and methods, compares health checks, logs, tracing and metric solutions, and provides a detailed Prometheus‑based monitoring architecture with concrete metric definitions, alerting rules, and incident‑response procedures.

Alerting, Metrics, Operations
0 likes · 25 min read