Surviving DevOps: Key Practices to Eliminate Waste and Ensure Reliability
This article shares practical DevOps insights—covering waste elimination, automation, disaster coverage, comprehensive documentation, and change control—to help developers and operators build resilient, collaborative, and efficient production pipelines.
21CTO community guide: Whether you are just starting your development career, have a solid project portfolio, or are considering a shift to DevOps, this is my view on how to survive in this demanding world.
I have been reading "The Phoenix Project" and recognized many patterns from my own career. I don’t aim to become a DevOps specialist, but I see the value of a DevOps team that spares developers from handling server deployment, configuration, and change management.
First Goal
Eliminate waste through automation, system monitoring, and a culture of information sharing across business, development, and operations.
This goal is crucial: IT operations and development must collaborate rather than operate in silos, which is harder in practice than in theory. The idea comes from Lean Six Sigma, which improves performance by systematically removing waste and variation.
You must communicate with management at every level, senior and junior, to identify and eliminate sources of friction and waste.
Stability is paramount; maintain high standards or you’ll spend nights in the server room. You act as the glue binding business, development, and operations—if that glue thins, the business suffers.
Ensuring Coverage When Disasters Strike
Disasters are inevitable—whether caused by human error, malicious insiders, buggy software, or external factors. Complex systems mean failure.
As a rule of thumb, a system counts as complex once it contains more than seven interacting nodes or components (e.g., UI, web app, backend service, database server, OS), because humans can effectively keep track of only a limited number of items at once.
To meet these challenges, everything must be automated.
Run tests: all test cases, including unit tests, should run as part of the pipeline so that failures are caught there instead of interrupting production.
Collect metrics: gather logs and monitoring data in a non‑blocking way, storing them securely.
Run reports and alerts.
Self‑repair: software running on servers should adjust itself to recover from failures automatically.
Backup: this is the minimum requirement.
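The "collect metrics", "alerts", and "self-repair" items above can be sketched as one small supervisor loop. This is a minimal illustration, not a production tool: the `Watchdog` class, its `check`, `repair`, and `alert` callables, and the in-memory `events` list are all hypothetical names introduced here, standing in for whatever health probe, restart command, paging system, and metrics store your environment actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Watchdog:
    """Hypothetical self-repair loop: check, record, repair, alert."""

    check: Callable[[], bool]      # returns True when the service is healthy
    repair: Callable[[], None]     # e.g. restart the service (assumed action)
    alert: Callable[[str], None]   # e.g. notify the on-call operator
    max_repairs: int = 3
    events: List[str] = field(default_factory=list)  # stand-in for a metrics store

    def run_once(self) -> bool:
        """Run one check cycle; return True if the service ends up healthy."""
        if self.check():
            self.events.append("healthy")
            return True
        for attempt in range(1, self.max_repairs + 1):
            self.events.append(f"repair_attempt_{attempt}")
            self.repair()
            if self.check():
                self.events.append("recovered")
                return True
        # Automation gave up: escalate to a human instead of looping forever.
        self.events.append("unrecovered")
        self.alert(f"service still failing after {self.max_repairs} repair attempts")
        return False


# Demo with a fake service that comes back after the second restart.
demo_state = {"up": False, "restarts": 0}


def demo_check() -> bool:
    return demo_state["up"]


def demo_repair() -> None:
    demo_state["restarts"] += 1
    demo_state["up"] = demo_state["restarts"] >= 2


demo_alerts: List[str] = []
demo_dog = Watchdog(check=demo_check, repair=demo_repair, alert=demo_alerts.append)
print(demo_dog.run_once(), demo_dog.events)
```

Passing the check, repair, and alert steps in as plain callables keeps the loop testable without a real server. Capping repairs with `max_repairs` means a failure that automation cannot fix reaches a person rather than being retried silently.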
Record Everything
Automation: keep everything in automated scripts, and use APIs to maintain an up-to-date wiki of configs, requirements, validation rules, and so on.
Continuity: regenerate documentation on every change and keep it in sync with the current environment.
Traceability: documentation must include metadata—who updated, when, and why.
Detail: provide thorough information, not just vague statements like "updated 10 terminals with latest OS patches"; include patch versions, steps, and observed effects.
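The traceability and detail points above could be enforced with a structured change record. The sketch below is only one possible shape: `ChangeRecord`, its field names, `render_entry`, and `missing_details` are hypothetical and chosen for illustration, and the sample values are invented placeholders, not real patch data.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical documentation entry carrying who/when/why metadata."""

    who: str
    when: str               # ISO 8601 timestamp, supplied by the caller
    why: str
    targets: List[str]      # machines or services that were changed
    patch_version: str
    steps: List[str]        # what was actually done, in order
    observed_effects: str


def render_entry(record: ChangeRecord) -> str:
    """Serialize the record as JSON, ready to push to a wiki or log store."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def missing_details(record: ChangeRecord) -> List[str]:
    """Return the names of fields left empty, so vague entries can be rejected."""
    return [name for name, value in asdict(record).items() if not value]


# Placeholder values for illustration only.
entry = ChangeRecord(
    who="example-operator",
    when="2024-01-01T00:00:00+00:00",
    why="apply OS security patches",
    targets=["terminal-01", "terminal-02"],
    patch_version="example-patch-1.2.3",
    steps=[],
    observed_effects="",
)
print(missing_details(entry))
```

An automated script could refuse to apply or publish a change while `missing_details` returns anything, which keeps entries like "updated 10 terminals" from slipping through without versions, steps, and effects.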
Control Change Requirements
Developers must ensure new code integrates seamlessly, while operators need tighter collaboration and automated deployment processes.
This leads to frequent releases.
Frequent releases are not a problem; they indicate a healthy pipeline. Keep track of business needs and development status to plan releases on time.
Conclusion
We have explored the expectations of the DevOps role from a developer’s perspective. Remember, this is just one viewpoint; as you define your own role, you will discover additional best practices.
