Raymond Ops
Author

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.

607 articles · 0 likes · 2.1k views · 0 comments
Recent Articles

Latest from Raymond Ops

100 recent articles max
Raymond Ops
Mar 7, 2026 · Cloud Native

Master Kubernetes Troubleshooting: From Pod Crashes to Network Failures

This guide walks you through Kubernetes troubleshooting: it covers the core components, classifies six major failure types, presents a three‑step troubleshooting methodology, and details six real‑world case studies with commands, manifests, monitoring setups, and preventive best practices.

Network · Pod · Storage
0 likes · 36 min read
Raymond Ops
Mar 7, 2026 · Operations

7 Hidden Traps in Nginx+Lua Gray Releases and How to Fix Them

This article reveals seven critical pitfalls that can cripple Nginx+Lua gray‑release deployments: memory leaks, blocking I/O, uneven traffic hashing, configuration reload races, cross‑datacenter latency, session stickiness issues, and monitoring blind spots. It provides concrete Lua scripts, Nginx configurations, monitoring commands, and step‑by‑step remediation strategies.

DevOps · Lua · Nginx
0 likes · 43 min read
Raymond Ops
Mar 6, 2026 · Cloud Native

Scaling Kubernetes from 1k to 5k Nodes: Complete Performance Tuning Playbook

This article presents a comprehensive, real‑world guide for expanding a Kubernetes cluster from 1,000 to 5,000 nodes, covering control‑plane HA, etcd optimization, network and scheduler tuning, monitoring, and automation, with detailed configurations, code snippets, and a step‑by‑step case study of a large‑scale production environment.

CNI · Performance Tuning · cluster scaling
0 likes · 22 min read
Raymond Ops
Mar 4, 2026 · Operations

Build an Enterprise‑Grade DevOps CI/CD Pipeline in 7 Days with Ready‑to‑Use Scripts

This guide walks you through constructing a full‑stack, enterprise‑level DevOps pipeline—from environment preparation and tool installation to Jenkins pipeline scripting, Kubernetes deployment, monitoring, security hardening, and cost optimization—providing complete scripts and step‑by‑step instructions to achieve automated, reliable releases within a week.

Automation · CI/CD · DevOps
0 likes · 27 min read
Raymond Ops
Mar 3, 2026 · Operations

How I Turned a Firefighter Ops Engineer into a High‑Paid Tech Expert in 3 Years

This article chronicles a three‑year journey from a junior operations engineer blamed for outages to a senior technical specialist, detailing the four pivotal turning points, concrete learning plans, automation projects, cost‑optimization strategies, and actionable advice for anyone seeking to advance in modern operations.

Monitoring · career · cloud-native
0 likes · 27 min read
Raymond Ops
Mar 2, 2026 · Operations

Why Most Alerts Fail and How to Build a Night‑Quiet, High‑Signal Monitoring System

This article examines the root causes of alert fatigue—mis‑configured thresholds, noisy alerts, lack of context, and poor routing—then presents a step‑by‑step guide using golden signals, dynamic baselines, enriched alert payloads, severity‑based routing, and suppression techniques to create an effective, low‑noise monitoring system.

Alertmanager · Monitoring · Prometheus
0 likes · 24 min read
Raymond Ops
Mar 2, 2026 · Cloud Native

ELK vs EFK vs Loki: 2025’s Best Log Solution for Cost, Performance & Simplicity

This comprehensive 2025 guide compares ELK, EFK, and Loki across architecture, deployment complexity, storage cost, query performance, feature completeness, high‑availability, and real‑world case studies, helping teams of any size choose the most cost‑effective and operationally suitable log collection stack.

EFK · ELK · Loki
0 likes · 37 min read
Raymond Ops
Mar 1, 2026 · Operations

How I Transitioned from Traditional Ops to SRE/DevOps in 18 Months

This detailed guide shares a step‑by‑step 18‑month roadmap, covering self‑assessment, skill acquisition (Python, Kubernetes, monitoring), project execution, interview preparation, and real‑world outcomes for engineers moving from legacy operations to SRE/DevOps roles.

CI/CD · Career transition · Kubernetes
0 likes · 35 min read