Tag

Change Management


Efficient Ops
Jan 6, 2025 · Operations

Treat Every Ops Change Like a Project: Lessons from a Simple RAID Rebuild

The article uses a real‑world RAID‑rebuild incident to illustrate why operations teams must understand a change's background, schedule, and risk, act as project managers, follow a formal change process, and treat production environments with the utmost respect.

Change Management · Production · Risk Assessment
0 likes · 9 min read
JD Tech Talk
Oct 17, 2024 · Operations

Comprehensive Guide to Change Management: Compatibility Design, Release Planning, Gray Deployment, Data Migration, Rollback, and Configuration Control

This article presents a detailed overview of change management practices, covering compatibility design across hardware, base software, and applications, release strategies, gray‑deployment techniques, data migration analysis, rollback planning, configuration change control, and verification procedures to ensure system stability and reliability.

Change Management · Compatibility · Release Planning
0 likes · 26 min read
Efficient Ops
Aug 21, 2024 · Operations

10 Proven Practices to Prevent System Failures in Operations

This article shares ten practical operations strategies—ranging from change‑rollback procedures and cautious handling of destructive commands to robust backup verification, alerting, and meticulous hand‑over practices—that together help teams dramatically reduce system outages and maintain high availability.

Backup · Change Management · Linux
0 likes · 17 min read
Bilibili Tech
Aug 9, 2024 · Operations

Design and Implementation of Bilibili's Change Control Platform

Bilibili’s Change Prevention Platform consolidates data from over 60 systems to proactively detect and block more than 100 risky changes daily. It reduces change‑related incidents through a four‑pillar framework of technical support, rollout, cross‑domain enablement, and cultural safeguards, and is evolving toward AI‑driven, end‑to‑end change defense.

Bilibili · Change Management · DevOps
0 likes · 20 min read
DevOps Cloud Academy
May 30, 2024 · Operations

Case Study: Overcoming Resistance in a Large Manufacturing Company's IT Department During DevOps Transformation

This case study describes how a large manufacturing company's IT department, led by Michael, overcame strong internal resistance from senior staff to transition from a traditional waterfall development model to an agile and DevOps approach through personalized communication, stakeholder engagement, and transparent implementation planning.

Agile · Change Management · DevOps
0 likes · 8 min read
Cognitive Technology Team
Apr 15, 2024 · Operations

Tencent Cloud Service Outage on April 8: Root Cause, Impact, and Improvement Measures

On April 8, Tencent Cloud experienced a major service outage caused by a cloud API failure that prevented console login and disrupted several public cloud services for 87 minutes, prompting a detailed post‑mortem that outlines the root cause, impact, and a series of operational and change‑management improvements.

Change Management · Tencent Cloud · cloud API
0 likes · 4 min read
Bilibili Tech
Jan 5, 2024 · Cloud Native

ChangePilot: Bilibili’s Unified Change Management Platform and Practices

ChangePilot is Bilibili’s unified change‑management platform that standardizes change definition, lifecycle, and risk governance through a platform‑scenario model and five control levels (G0‑G4), offering built‑in checks, searchable records, subscription alerts, intelligent correlation, and emergency channels to boost production stability while maintaining operational efficiency.

Change Management · SRE · cloud native
0 likes · 29 min read
Bilibili Tech
Dec 22, 2023 · Cloud Native

Safe Change Management in Bilibili's Cloud‑Native Container Platform Caster

The paper describes Bilibili’s Caster platform, which implements standardized workflows, left‑shifted pre‑checks, tiered release checkpoints, and an emergency green‑channel to safely manage containerized application changes, providing real‑time observability, automated rollback, and capacity‑aware scaling that together cut change‑induced incidents and improve production stability.

CI/CD · Change Management · cloud native
0 likes · 17 min read
AntTech
Dec 18, 2023 · Cloud Native

AlterShield Open‑Source Change Risk Control Platform: Architecture, Features, and Future Roadmap

AlterShield is an open‑source change‑risk prevention solution originally built by Ant Group that provides lifecycle‑aware change defense, cloud‑native operator integration, KDE‑based anomaly detection, and extensible plug‑in frameworks, with detailed module descriptions, recent v1.0 releases, and a roadmap for advanced monitoring and noise‑reduction capabilities.

Change Management · Kubernetes · SRE
0 likes · 13 min read
AntTech
Jul 20, 2023 · Operations

AlterShield: An Open‑Source Change Management Platform for Risk Control and Observability

AlterShield is an open‑source, end‑to‑end change‑control platform that systematizes change perception, risk analysis, and defense across distributed cloud‑native environments, enabling SRE teams to mitigate stability risks through standardized protocols, incremental rollout, and automated observability checks.

Change Management · Observability · Open-source
0 likes · 24 min read
DevOps
Jun 9, 2023 · R&D Management

Preparing for Organizational Change: Building Urgency, Leadership, Team Participation, Goals, Research, and Action Plans

The article explains how to prepare for successful organizational change by creating urgency and recognition, establishing a change leadership team, guiding team participation, defining clear goals, conducting research interviews, and developing detailed action plans, all supported by practical examples and visual illustrations.

Change Management · R&D · leadership
0 likes · 11 min read
Efficient Ops
Jun 7, 2023 · Artificial Intelligence

How Guangdong Mobile Scaled AIOps: From Manual Ops to Intelligent Automation

This article details Guangdong Mobile's evolution of IT systems and operations, explains the four domain architecture, chronicles the AIOps adoption timeline, showcases intelligent anomaly detection, change assessment, fault diagnosis, and operation robots, and shares practical promotion methods and future outlook for AI‑driven IT operations.

AIOps · Artificial Intelligence · Change Management
0 likes · 19 min read
DevOps
May 8, 2023 · Operations

Key Strategies for Successful Digital Transformation and Overcoming Organizational Resistance

The article outlines why many digital transformation initiatives fail, emphasizes the importance of bottom‑up empowerment over top‑down mandates, and provides practical guidance on building small elite pilot teams, addressing dissent, and sustaining change to achieve long‑term organizational success.

Change Management · Digital Transformation · leadership
0 likes · 7 min read
Model Perspective
Dec 21, 2022 · Fundamentals

What the Kubler‑Ross Change Curve Reveals About Our COVID‑19 Reactions

The article explains how the Kubler‑Ross change curve's seven emotional stages—shock, denial, frustration, depression, experiment, decision, and integration—map onto public responses to the evolving COVID‑19 pandemic, offering insights for coping and adaptation.

COVID-19 · Change Management · Kubler-Ross
0 likes · 7 min read
DeWu Technology
Oct 17, 2022 · Operations

High Availability: Principles and Practices for System Stability

High availability—measured in nines of uptime—requires partitioning systems, decoupling components, choosing robust technologies, deploying redundant instances with automatic failover, capacity planning, rapid scaling, traffic shaping, resource isolation, global protection, observability, and disciplined change management to achieve stable, resilient services.

Change Management · High Availability · Observability
0 likes · 10 min read
Top Architect
Sep 4, 2022 · Backend Development

Designing Fault‑Tolerant Microservices Architecture

The article explains how to build highly available microservice systems by isolating failures, applying graceful degradation, change‑management, health checks, self‑healing, fallback caches, circuit breakers, retry policies, rate limiting and testing strategies, while acknowledging the cost and operational complexity involved.

Change Management · Health Checks · Rate Limiting
0 likes · 16 min read
Architects Research Society
May 22, 2022 · Operations

Designing Resilient Microservices: Fault‑Tolerance Patterns and Practices

This article explains how to build highly available microservice systems by defining clear service boundaries, employing graceful degradation, change‑management strategies, health checks, self‑healing, cache failover, retry logic, rate limiting, bulkheads, circuit breakers, and testing techniques to mitigate failures in distributed environments.

Change Management · Rate Limiting · circuit breaker
0 likes · 15 min read
Architects Research Society
Aug 17, 2021 · Fundamentals

The Critical Role of Enterprise Architecture in Successful Business Transformations

The article explains how enterprise architecture, when focused on agile change processes and integrated with strategy, risk, compliance, and portfolio management, becomes a vital knowledge hub that enables organizations to accelerate digital transformation, reduce costs, and improve customer satisfaction.

Agile · Change Management · Digital Transformation
0 likes · 9 min read
DevOps
Aug 4, 2021 · R&D Management

Five Key Lessons for Successful Digital Transformation

The article analyzes why many digital transformation initiatives fail and presents five practical lessons to help leaders drive effective change: aligning business strategy, leveraging internal capabilities, designing customer experience from the outside in, addressing employee concerns, and adopting a Silicon Valley‑style entrepreneurial culture.

Change Management · Digital Transformation · business strategy
0 likes · 10 min read