Raymond Ops
Author

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.

624 articles · 0 likes · 3.1k views · 0 comments
Recent Articles

Latest from Raymond Ops

100 recent articles max
Apr 24, 2026 · Cloud Native

Multi‑Stage Docker Builds & SBOM: Shrink Images and Meet Security Compliance

This guide shows how to dramatically reduce container image size using Docker multi‑stage builds, choose minimal base images, automatically generate SPDX/CycloneDX SBOMs, sign images with Cosign, and integrate the whole process into CI/CD pipelines for secure, lightweight deployments.

Docker · Image optimization · SBOM
0 likes · 21 min read
Apr 23, 2026 · Operations

Advanced Nginx Load Balancing: How to Choose and Tune Layer 4 vs Layer 7

This guide walks through the differences between Layer 4 (TCP) and Layer 7 (HTTP) load balancing in Nginx, explains when to use each, and provides step‑by‑step configuration examples, health‑check setups, performance tuning, SSL handling, WebSocket support, and common pitfalls.

Layer 4 · Layer 7 · Nginx
0 likes · 25 min read
Apr 22, 2026 · Operations

How Prometheus Recording Rules Can Reduce Alert Noise by 70%

This guide explains how to use Prometheus Recording Rules to pre‑compute, aggregate, and smooth metrics in large‑scale microservice environments, cutting daily alert noise by up to 70% through hierarchical alert design, practical examples, and best‑practice recommendations.

Alert Noise Reduction · Kubernetes · Observability
0 likes · 22 min read
Apr 20, 2026 · Operations

How to Build a Standardized SRE On‑Call Process: From Alert Grading to Handoff Templates

This article presents a complete SRE on‑call handbook that defines alert severity levels, provides concrete Prometheus Alertmanager configurations, outlines a step‑by‑step response flow, details war‑room roles, escalation paths, handoff checklists, post‑mortem procedures, and dozens of ready‑to‑use templates to reduce MTTR and improve reliability.

Alert Management · On-Call · Runbook
0 likes · 27 min read
Apr 19, 2026 · Cloud Native

How to Double K8s Ingress Performance: Nginx vs Envoy Gateway Tuning Guide

This article walks through a real‑world performance bottleneck on a high‑traffic e‑commerce platform, explains step‑by‑step deep tuning of Nginx Ingress Controller, compares it with Envoy Gateway, and provides concrete configurations, benchmark results, monitoring rules, and best‑practice recommendations for Kubernetes Ingress optimization.

Envoy · Kubernetes · Nginx
0 likes · 27 min read
Apr 18, 2026 · Operations

How to Build a Lightweight Log Platform with Grafana and Loki in 3 Simple Steps

This guide walks you through replacing a heavyweight ELK stack with a minimal Grafana‑Loki logging solution, covering environment requirements, installation of Loki and Promtail, configuration details, best‑practice tips, troubleshooting, and backup strategies for reliable log aggregation.

Logging · Loki · Observability
0 likes · 25 min read
Apr 18, 2026 · Operations

Rapid CPU Spike Diagnosis: Resolve High CPU Usage in Under 5 Minutes

This guide presents a step‑by‑step, standardized process for detecting, analyzing, and fixing sudden CPU usage spikes on Linux servers, covering preparation, quick identification, deep thread‑level investigation, stack and system‑call analysis, flame‑graph generation, emergency mitigation, and best‑practice recommendations.

CPU · Linux · Performance
0 likes · 21 min read
Apr 16, 2026 · Operations

Mastering Nginx 502/504 Errors: A Complete Troubleshooting Guide with Scripts

This comprehensive guide explains the differences between Nginx 502 and 504 errors, provides step‑by‑step troubleshooting procedures, detailed configuration examples, one‑click diagnostic scripts, real‑world case studies, best‑practice optimizations, monitoring setups, and advanced learning paths to help you quickly resolve gateway issues and improve server reliability.

502 · 504 · Nginx
0 likes · 26 min read
Apr 11, 2026 · Operations

Why TCP’s Three‑Way Handshake Matters: Deep Dive into States, Tuning, and Real‑World Pitfalls

This article explains the TCP three‑way handshake in depth, covering the state machine, kernel‑level packet analysis, performance tuning, security hardening, real‑world case studies such as SYN‑Flood mitigation and TIME_WAIT overload, and provides complete C and Python examples, monitoring metrics, troubleshooting steps, and backup procedures for production environments.

Linux · Networking · Performance
0 likes · 28 min read
Mar 26, 2026 · Cloud Native

How to Shrink Docker Images by 70% and Harden Them with Trivy

This guide explains how to shrink Docker images by up to 70% using multi‑stage builds, Alpine or Distroless base images, layer merging, .dockerignore, and BuildKit, and how to harden them with Trivy security scanning, non‑root users, SUID removal, and CI/CD automation.

Alpine · CI/CD · Docker
0 likes · 29 min read