Topic: monitoring

Collection size: 1674 articles
JD Tech Talk
Dec 4, 2024 · Operations

Gray Release, Verification, and Rollback Strategies in Software Deployment

The article outlines a comprehensive release management framework that emphasizes gray (canary) deployments, detailed verification steps, monitoring practices, and rollback procedures to mitigate risks and ensure system stability for production rollouts.

Deployment · Software Operations · Verification
0 likes · 13 min read
JD Tech Talk
Oct 21, 2024 · Operations

Observability and Quality Assurance: Strategies for Test Teams

This article examines how test teams can enhance application observability and quality assurance by distinguishing observability from traditional monitoring, defining goals, outlining a monitoring foundation, and proposing module‑level and system‑level strategies for proactive fault detection, data analysis, and alerting.

Testing · monitoring · observability
0 likes · 12 min read
Beike Product & Technology
Jul 31, 2020 · Mobile Development

Design and Evolution of a Mobile Live‑Streaming Platform at Beike

This article describes how Beike built, refined, and scaled a mobile live‑streaming platform—detailing early challenges, architectural pain points of version 1.0, and the systematic improvements introduced in version 2.0 such as clear boundaries, functional aggregation, layered platform design, dynamic configuration, monitoring, and zero‑cost integration to support diverse business scenarios.

Mobile Architecture · Platform Engineering · SDK
0 likes · 11 min read
Test Development Learning Exchange
Jan 11, 2025 · Operations

Python System Administration Scripts for DevOps Engineers

This article provides comprehensive Python scripts for system administration tasks including CPU monitoring, memory usage tracking, log analysis, file backup, system updates, network monitoring, service management, user administration, and system information collection.

DevOps · Network Monitoring · Python
0 likes · 6 min read
Refining Core Development Skills
Oct 19, 2020 · Operations

Linux Network Packet Monitoring and Tuning: Tools, RingBuffer, Interrupts, and SoftIRQ Optimization

This article explains how to monitor and tune Linux network packet reception using tools such as ethtool, ifconfig, and procfs, covering RingBuffer inspection, hardware and soft interrupt analysis, multi‑queue configuration, interrupt coalescing, and GRO settings to improve throughput and reduce packet loss.

Kernel · Linux · Network
0 likes · 17 min read
Watermelon Video Tech Team
Jan 31, 2024 · Mobile Development

Optimizing Android Process Startup in Xigua Video: Strategies, Implementation, and Benefits

This article details how Xigua Video analyzed and optimized the startup of multiple Android subprocesses—including push, mini‑app, sandboxed, and exec processes—by applying on‑demand loading, SDK integration, and monitoring techniques, resulting in measurable performance and quality improvements.

Android · Mobile Development · Multi-process
0 likes · 23 min read
政采云技术
Oct 28, 2021 · Backend Development

HikariCP Overview (Part 1): Initialization, Core Components, Monitoring and Configuration

This article provides a detailed analysis of HikariCP’s initialization, core components, startup flow, connection acquisition logic, monitoring metrics, and key configuration parameters, illustrating how Spring Boot 2.x leverages this high‑performance JDBC connection pool and offering guidance for tuning and extending it.

Connection Pool · HikariCP · Spring Boot
0 likes · 14 min read
JD Tech
Oct 10, 2023 · Operations

Technical Case Study of JDV Visual Dashboard Platform for the 618 Promotion

This article details how JDV, JD.com’s internal visual dashboard platform, tackled the massive data‑intensive 618 promotion by implementing real‑time updates, cross‑midnight count stops, request‑state control, heartbeat monitoring, proxy data sources, and a suite of developer tools to ensure stability, performance, and rapid feature delivery.

data platform · large-scale · monitoring
0 likes · 18 min read
Ctrip Technology
Sep 23, 2024 · Frontend Development

Intelligent Alert Attribution System for Ctrip Hotel Frontend: Design, Implementation, and Outcomes

This article details the design and deployment of an intelligent alert attribution system for Ctrip Hotel's front‑end, describing the background challenges, the unified data pool, weighted alert rules, three attribution algorithms, achieved improvements in accuracy and troubleshooting speed, and future enhancement plans.

Alert · Attribution · Frontend
0 likes · 18 min read
Ctrip Technology
Oct 10, 2018 · Operations

Design and Implementation of Ctrip's Fourth-Generation Full-Link Performance Testing System

This article outlines the evolution of Ctrip’s performance testing approaches across three generations, analyzes their limitations, and presents the design, architecture, data construction, request tracing, monitoring, and operational considerations of the fourth-generation full‑link testing platform, including case studies and future outlook.

Capacity Planning · full-link testing · load testing
0 likes · 14 min read
Ctrip Technology
Aug 17, 2017 · Operations

Design, Evolution, and Future of Ctrip's Operations Workflow Platform

This article details the challenges, architectural evolution, key components, implementation experiences, and future directions of Ctrip's operations workflow platform, illustrating how a multi‑stage, layered design and standardized services have transformed manual IT operations into an automated, observable, and scalable system.

Platform Architecture · monitoring · operations automation
0 likes · 16 min read
360 Tech Engineering
Sep 6, 2019 · Operations

StackStorm-Based ChatOps Solution for Automated Monitoring Alert Self‑Healing

This article introduces a StackStorm‑driven ChatOps framework that consolidates monitoring alerts, applies rule‑based root‑cause analysis, and automatically executes self‑healing actions, outlining its architecture, components, workflow definitions, and practical deployment results within an enterprise operations environment.

ChatOps · StackStorm · monitoring
0 likes · 6 min read
360 Smart Cloud
Jul 3, 2024 · Operations

Practical Practices for Enhancing Kafka Cluster Stability at 360

This article details 360's comprehensive approach to improving Apache Kafka cluster stability through proactive operations, capacity assessment, parameter tuning, monitoring, version upgrades, and traffic control, offering concrete guidelines and best‑practice recommendations for large‑scale message‑queue deployments.

Cluster · Kafka · Stability
0 likes · 33 min read
58 Tech
Nov 27, 2024 · Operations

Building an Observability System for Cloud Authentication: Practices, Metrics, and Lessons Learned

This article details how 58 Group’s cloud authentication service introduced an observability framework—optimizing logs, employing distributed tracing, defining SLO/SLA metrics, and implementing burn‑rate alerts—to improve fault detection, reduce false alarms, and achieve faster root‑cause analysis across the system.

Distributed Tracing · Error Budget · SLO
0 likes · 16 min read
58 Tech
Apr 21, 2022 · Frontend Development

Interview with Li Yi on Building 58 Group’s Large Front‑End Technology Service System

In this interview, Li Yi, head of 58 Group’s Front‑End Technology Department, explains how the company built its large‑scale front‑end service system—including a Hybrid permission platform, a React Native hot‑update platform, and the Beidou monitoring system—while discussing cross‑platform frameworks, performance challenges, low‑code adoption, and advice for newcomers.

Frontend · React Native · cross‑platform
0 likes · 11 min read
Aikesheng Open Source Community
May 13, 2024 · Databases

Profiling Memory Usage in MySQL Queries

This article explains how to use MySQL's performance_schema to monitor and analyze per‑connection memory consumption, provides SQL queries to list memory instruments, shows Python scripts for sampling and visualizing memory usage over time, and demonstrates practical usage with example commands and output.

Memory Profiling · MySQL · Performance Schema
0 likes · 14 min read
Aikesheng Open Source Community
Jun 9, 2021 · Databases

Monitoring MySQL Full-Text Indexes: Parameters, Metadata Tables, and Practical Demonstrations

This article explains how to monitor MySQL full-text indexes by describing relevant InnoDB parameters, the metadata tables that expose index activity, and step‑by‑step examples that create a sample table, configure monitoring, observe cache behavior, and manage index maintenance operations.

Database performance · Full-Text Index · InnoDB
0 likes · 13 min read
Architecture and Beyond
Jul 28, 2024 · Frontend Development

Comprehensive Guide to Front‑End Stability: Observability, Full‑Chain Monitoring, High‑Availability Architecture, Performance Management, Risk Governance, Process Mechanisms, and Engineering Practices

This extensive article presents a systematic approach to front‑end stability, covering observability systems, full‑chain monitoring, high‑availability design, performance management, risk governance, process mechanisms, and engineering practices to ensure reliable user experiences and business continuity.

Frontend · High Availability · Stability
0 likes · 44 min read
IT Architects Alliance
Feb 5, 2025 · Cloud Native

Performance Optimization Strategies for Cloud‑Native Applications

This article examines the rapid adoption of cloud‑native architectures and presents a comprehensive guide to identifying performance bottlenecks and applying architectural, resource‑management, caching, networking, and tooling techniques—such as Kubernetes, Prometheus, Grafana, and JMeter—to achieve high‑performance, scalable cloud‑native systems.

CI/CD · Kubernetes · Performance optimization
0 likes · 22 min read