Tagged articles
928 articles
Ray's Galactic Tech
Apr 15, 2026 · Cloud Native

From Solo Demo to Cloud‑Native: Building a High‑Availability Real‑Time Translation Bot with AgentScope Java

This article walks through the complete engineering practice of turning a single‑machine demo into a cloud‑native, highly available real‑time translation robot using AgentScope Java, covering business requirements, architecture evolution, core AgentScope concepts, code examples, deployment, observability, performance tuning, and common pitfalls.

Agent Architecture · Microservices · cloud-native
0 likes · 29 min read
Ray's Galactic Tech
Apr 8, 2026 · Cloud Native

Go Full‑Stack Mastery: From High‑Concurrency Order Systems to Cloud‑Native Production

This comprehensive guide walks you through building a production‑grade Go order service—from understanding the high‑concurrency business scenario and Go’s runtime advantages, to designing microservice architecture, handling idempotency, outbox patterns, observability, Kubernetes deployment, incident response, and testing strategies.

Distributed Consistency · Microservices · cloud-native
0 likes · 54 min read
Ops Community
Apr 2, 2026 · Operations

Build a Production‑Ready Prometheus + Grafana Monitoring Stack in Minutes

Learn how to quickly set up a complete, production‑grade monitoring system using Prometheus 3.x and Grafana 11, covering installation, service discovery, PromQL queries, recording rules, Alertmanager routing, Grafana dashboards, best‑practice configurations, and troubleshooting for environments of any size.

Alerting · Grafana · cloud-native
0 likes · 55 min read
Alibaba Cloud Native
Mar 20, 2026 · Cloud Native

How a Gaming Platform Scaled to Millions with RocketMQ & Kafka: A Cloud‑Native Success Story

Facing explosive growth, the game‑service platform 悠悠有品 rebuilt its architecture on Alibaba Cloud, using RocketMQ for core transaction messaging and Kafka for data synchronization, achieving elastic scaling, high availability, cost reduction, and reliable high‑concurrency processing across its trading and analytics pipelines.

Kafka · Messaging · RocketMQ
0 likes · 8 min read
Alibaba Cloud Observability
Mar 16, 2026 · Artificial Intelligence

How LoongSuite Python Probe Simplifies AI Agent Observability

This article explains the observability challenges of modern AI agents—such as context drift, performance spikes, and opaque data semantics—and introduces the LoongSuite Python probe, an OpenTelemetry‑based, zero‑code‑change solution that automatically instruments AI workloads, provides unified GenAI semantics, and offers a three‑step quick‑start for full‑stack tracing.

AI Observability · GenAI · LoongSuite
0 likes · 14 min read
Alibaba Cloud Native
Mar 15, 2026 · Cloud Native

Build an AI‑Powered Game Support System with Alibaba Cloud SLS Smart Q&A Assistant

This guide explains how game operation teams can automate complaint handling by creating a SOP‑based knowledge base, deploying Alibaba Cloud Log Service (SLS) Smart Q&A Assistant as a digital employee, integrating it with Git repositories and messaging platforms like DingTalk, and testing end‑to‑end scenarios to dramatically reduce response time.

SLS · ai-assistant · cloud-native
0 likes · 11 min read
Tech Musings
Mar 5, 2026 · Cloud Native

Why Default Java GC Settings Kill Performance on Kubernetes (And How to Fix It)

Through a controlled experiment with four Spring Boot service groups on Kubernetes, this article shows that relying on Java’s default GC and heap settings can drastically reduce throughput and increase tail latency, especially under higher load, and demonstrates how explicit GC algorithm and Xms/Xmx tuning restores performance.

JVM · Java · Kubernetes
0 likes · 13 min read
Raymond Ops
Mar 3, 2026 · Operations

How I Turned a Firefighter Ops Engineer into a High‑Paid Tech Expert in 3 Years

This article chronicles a three‑year journey from a junior operations engineer blamed for outages to a senior technical specialist, detailing the four pivotal turning points, concrete learning plans, automation projects, cost‑optimization strategies, and actionable advice for anyone seeking to advance in modern operations.

career · cloud-native · monitoring
0 likes · 27 min read
Alibaba Cloud Native
Feb 13, 2026 · Cloud Native

How a Tea Chain Achieved Seamless Mega‑Promotions with Cloud‑Native Architecture

Facing massive traffic spikes from viral marketing events, the leading tea brand Guming transformed its digital foundation by adopting a cloud‑native micro‑service architecture, leveraging Alibaba Cloud MSE and RocketMQ Serverless to achieve elastic scaling, cost savings, strong consistency, and full‑stack observability for stable, high‑speed operations.

Digital Transformation · Messaging · Microservices
0 likes · 8 min read
Alibaba Cloud Native
Feb 5, 2026 · Cloud Native

How to Cut Cross‑Cloud Data Transfer Costs with CDN and LoongCollector

In multi‑cloud environments, enterprises face high outbound traffic fees for unified observability, but by routing logs through a CDN and using the high‑performance LoongCollector agent, they can reduce cross‑cloud transfer costs by up to 70%, improve throughput, and simplify deployment.

CDN · cloud-native · cost-optimization
0 likes · 10 min read
Raymond Ops
Feb 3, 2026 · Operations

Zabbix vs Prometheus: Which Monitoring System Wins in 2024?

This guide compares Zabbix and Prometheus across architecture, performance, features, operational costs, and real‑world scenarios, providing a detailed selection roadmap for traditional IT, cloud‑native microservices, and hybrid environments while offering optimization tips and future trends.

Prometheus · Zabbix · cloud-native
0 likes · 16 min read
Alibaba Cloud Native
Jan 29, 2026 · Cloud Native

How to Build a Production‑Ready AI Platform with Alibaba Cloud SAE & SLS

This article walks through the architectural bottlenecks of scaling Dify AI applications, explains how Alibaba Cloud Serverless Application Engine (SAE) and Log Service (SLS) jointly provide a fully managed, elastic compute base and storage‑separated logging layer, and offers step‑by‑step deployment, performance‑tuning, and analytics guidance for achieving up to 500 QPS with low cost.

Log Service · SAE · ai-infrastructure
0 likes · 19 min read
Linux Cloud Computing Practice
Jan 29, 2026 · Operations

174 Must‑Know Operations Engineer Interview Questions

This article compiles 174 essential interview questions covering Linux system administration, container orchestration, networking, high‑availability, storage, security, and cloud‑native concepts to help aspiring operations engineers prepare for technical interviews.

Operations · cloud-native
0 likes · 15 min read
Alibaba Cloud Observability
Jan 26, 2026 · Cloud Native

How LoongCollector Delivers 10× Throughput and 80% Resource Savings in Cloud‑Native Observability

LoongCollector, the open‑source cloud‑native collector behind Alibaba Cloud's Simple Log Service, achieves ten‑fold higher throughput, up to 80% lower CPU and memory usage, near‑linear scaling, zero‑copy processing, lock‑free event pools and adaptive concurrency, while guaranteeing enterprise‑grade reliability for petabyte‑scale log and metric ingestion.

High Throughput · LoongCollector · Observability
0 likes · 16 min read
Alibaba Cloud Observability
Jan 26, 2026 · Cloud Native

Solving Edge Observability: How LoongCollector Ensures Reliable Data Collection

This article explains the three major challenges of collecting observability data on edge devices—unstable networks, reliable delivery, and bandwidth limits—and shows how LoongCollector’s persistent‑asynchronous architecture, smart back‑pressure, and configurable flow control provide a low‑resource, high‑reliability solution with real‑world performance results.

Edge Computing · Observability · cloud-native
0 likes · 14 min read
Alibaba Cloud Observability
Jan 19, 2026 · Information Security

How AI Companies Can Overcome Global Compliance Hurdles with Cloud‑Native Log Auditing

The article explains the complex data‑sovereignty and privacy regulations that AI enterprises face when expanding overseas, analyzes the three‑tier "sandwich" data architecture and regional regulatory differences, and demonstrates how Alibaba Cloud Log Service (SLS) and Cloud Monitoring 2.0 provide unified log collection, cross‑domain correlation, risk tracing, and masking functions to achieve continuous, scalable compliance.

AI · cloud-native · compliance
0 likes · 16 min read
Mike Chen's Internet Architecture
Jan 15, 2026 · Cloud Native

Mastering Microservice Deployment: K8s, Service Mesh, Containerization & Serverless

This guide outlines four primary microservice deployment strategies—Kubernetes orchestration, service‑mesh architecture, containerization, and serverless functions—detailing their principles, core advantages, and ideal use cases for large‑scale distributed systems, and highlights self‑healing, auto‑scaling, zero‑ops, and observability features that help handle massive traffic spikes.

Microservices · Serverless · cloud-native
0 likes · 5 min read
Alibaba Cloud Observability
Jan 12, 2026 · Cloud Native

How Alibaba Cloud’s One‑Click I/O Diagnosis Detects and Resolves Storage Anomalies

This article explains how Alibaba Cloud CloudMonitor 2.0 integrates SysOM intelligent diagnostics to automatically detect, analyze, and remediate I/O performance issues in multi‑tenant, hybrid‑cloud environments by using dynamic thresholds, a monitor‑first on‑demand capture architecture, and automated root‑cause reporting.

Operations · cloud-native · dynamic-threshold
0 likes · 13 min read
Alibaba Cloud Native
Jan 8, 2026 · Cloud Native

Mastering Cloud‑Native Browser Automation: Advanced AgentRun Sandbox Integration & Production Best Practices

This guide walks you through advanced integration of AgentRun Browser Sandbox with BrowserUse, covering architecture, dependency setup, environment configuration, multi‑step task orchestration, VNC monitoring, sandbox lifecycle management patterns, security hardening, observability, cost‑optimization strategies, and production deployment with health checks and troubleshooting tips.

Browser Automation · ai-agent · cloud-native
0 likes · 26 min read
Architect Chen
Jan 3, 2026 · Cloud Native

Essential Docker Commands Every Cloud‑Native Engineer Should Know

This guide compiles the most frequently used Docker commands, covering version checks, image management, container lifecycle, network and volume operations, as well as cleanup techniques, providing a concise reference for developers working with cloud‑native container environments.

Container · DevOps · Image
0 likes · 4 min read
Alibaba Cloud Observability
Dec 29, 2025 · Cloud Native

How to Seamlessly Import Massive S3 Logs into Alibaba Cloud SLS with Real‑Time Analysis

This article explains how to centralize and analyze massive multi‑cloud log data stored in object storage by moving AWS S3 logs into Alibaba Cloud Log Service (SLS) using dual‑mode file discovery, SQS event‑driven import, elastic scaling, and pre‑ingestion processing to achieve low latency, high reliability, and cost efficiency.

AWS S3 · Real-time Processing · alibaba-sls
0 likes · 12 min read
Efficient Ops
Dec 28, 2025 · Cloud Native

Master Helm: Simplify Kubernetes Deployments with Charts and Releases

This guide explains what Helm is, its core concepts of charts and releases, how to install and upgrade Helm across platforms, use it in CI/CD pipelines, and leverage essential Helm commands and chart repositories to streamline Kubernetes application deployment and management.

cloud-native · helm · package management
0 likes · 7 min read
DataFunTalk
Dec 26, 2025 · Cloud Native

How Haier Built a Cloud‑Native Multi‑Modal Data Lake for AI‑Ready Manufacturing

Haier’s digital transformation leverages a cloud‑native, open‑source‑based multi‑modal data lake that unifies structured and unstructured industrial data, uses metadata models and knowledge graphs for governance, and provides AI‑ready services that balance performance, cost, and real‑time requirements.

AI · Data Lake · Multimodal Data
0 likes · 12 min read
Raymond Ops
Dec 24, 2025 · Cloud Native

Mastering Kubernetes Networking: How to Choose the Right CNI Plugin and Boost Performance

This comprehensive guide walks you through the Kubernetes network model, compares seven major CNI plugins with real‑world performance data, provides detailed configuration examples, offers a decision‑tree framework for production environments, and shares practical tuning, troubleshooting, and monitoring techniques for reliable cloud‑native networking.

CNI · Kubernetes · Networking
0 likes · 20 min read
DataFunSummit
Dec 19, 2025 · Cloud Native

How HiSilicon Uses Cloud‑Native Architecture to Build a Multi‑Modal Data Lake

Amid the AI wave, HiSilicon’s digital transformation tackles fragmented industrial data by adopting a cloud‑native, open‑source stack centered on Paimon, creating a unified metadata model, knowledge graph, and elastic scheduling that balances performance and cost while powering AI‑ready services across nine business domains.

AI · Knowledge Graph · big-data
0 likes · 12 min read
Alibaba Cloud Native
Dec 9, 2025 · Cloud Native

How UModel Simplifies Observability with Unified Entity Search and Table/Object Modes

This article explains how UModel abstracts observability data into unified table and object models, hides complex routing and field‑mapping logic, provides a single SPL‑based query language, supports metadata reflection for AI agents, and offers SDK and dry‑run examples to streamline metric, log, and trace queries across multiple storage backends.

AI Agent · API · Observability
0 likes · 15 min read
Ray's Galactic Tech
Dec 5, 2025 · Operations

How to Diagnose and Fix Expired Kubernetes Certificates with kubeadm

This guide walks SREs and DevOps engineers through the typical failures caused by expired kubeadm‑issued Kubernetes certificates, explains root causes, and provides a step‑by‑step, production‑ready process for checking expiration, backing up critical directories, renewing master and worker node certificates, and verifying cluster health, with long‑term maintenance recommendations.

certificates · cloud-native · kubeadm
0 likes · 7 min read
360 Zhihui Cloud Developer
Nov 28, 2025 · Operations

How the GC Agent System Enables Intelligent, Scalable Cloud‑Native Monitoring

The article details the design, core components, and implementation of the GC Agent System—a modular, cloud‑native monitoring platform that uses natural‑language interaction, dual‑mode execution, intent recognition, and secure multi‑tenant authentication to provide real‑time observability and automated fault diagnosis for enterprise IT environments.

Agent Architecture · LLM Orchestration · Security
0 likes · 19 min read
IT Architects Alliance
Nov 25, 2025 · Operations

Making Architecture Decisions Observable with DevOps Monitoring

The article explains how to integrate architecture decision tracking into DevOps monitoring, detailing tagging, multi‑layer metric design, time‑window analysis, automated alerts, reporting, and continuous optimization to turn architectural choices into measurable, data‑driven outcomes.

DevOps · Observability · cloud-native
0 likes · 9 min read
360 Smart Cloud
Nov 25, 2025 · Cloud Native

How PoleFS Achieves Microsecond I/O with Multi‑Layer Caching and CTO Consistency

PoleFS is a high‑performance, cloud‑native distributed file system that combines NVMe‑accelerated hot storage with S3‑based cold storage, offering multiple client access methods, multi‑level metadata and data caches, prefetch/warm‑up strategies, and a Close‑to‑Open consistency model to balance performance and data correctness.

Consistency · Distributed File System · caching
0 likes · 11 min read
Ray's Galactic Tech
Nov 24, 2025 · Cloud Native

Choosing the Right Service Registry: Deep Comparison of Nacos, Zookeeper, and Consul

This guide provides a comprehensive, dimension‑by‑dimension analysis of Nacos, Zookeeper, and Consul—covering architecture, consistency models, health‑check mechanisms, deployment patterns, client language support, performance, security, and practical recommendations—to help engineers select the most suitable service‑registry solution for their microservice ecosystem.

Consul · Microservices · Nacos
0 likes · 10 min read
Ray's Galactic Tech
Nov 21, 2025 · Cloud Native

Mastering Kubernetes HPA: How It Works, Real‑World Setup, and Troubleshooting

Horizontal Pod Autoscaler (HPA) in Kubernetes automatically scales pod replicas based on metrics like CPU, memory, or custom indicators, and this guide explains its core principles, configuration pitfalls, step‑by‑step troubleshooting commands, and advanced considerations such as API versions, stabilization windows, and integration with Cluster Autoscaler.

HPA · Kubernetes · autoscaling
0 likes · 9 min read
Architect Chen
Nov 12, 2025 · Cloud Native

Understanding Docker: Core Principles, Architecture, and Runtime Workflow

This article provides a comprehensive overview of Docker, explaining its lightweight container model, client‑server architecture, key Linux kernel features such as namespaces and cgroups, image layering, networking, and the three‑stage process of building, distributing, and running containers.

Docker · Linux Namespaces · cgroups
0 likes · 5 min read
Ops Development Stories
Nov 10, 2025 · Operations

Build a Low‑Cost Observability Platform with OpenObserve and Vector

This guide walks you through the architecture, deployment, and configuration of the Rust‑based OpenObserve observability platform together with the high‑performance Vector data pipeline, covering log, metric, and trace collection, Docker‑Compose setup, UI usage, and common FAQs for small teams.

Observability · Vector · cloud-native
0 likes · 11 min read
Baidu Geek Talk
Nov 10, 2025 · Cloud Native

How Polar‑TCP Breaks Kernel Network Bottlenecks for Cloud‑Native High‑Performance Services

This article explains how traditional kernel network stacks struggle with high‑concurrency, low‑latency cloud data‑center workloads and introduces Baidu Intelligent Cloud’s Polar solution—Polar‑TCP and Polar‑RDMA—which combine user‑space DPDK drivers, a lightweight TCP stack, and an industrial RPC framework to achieve near‑RDMA performance while preserving compatibility with existing TCP ecosystems.

DPDK · Network Stack · Performance Optimization
0 likes · 23 min read
Alibaba Cloud Observability
Nov 10, 2025 · Cloud Native

How a Next‑Gen Cloud‑Native Observability Platform Boosted Ticketing Stability by 80%

A leading digital‑entertainment group tackled severe stability and monitoring challenges in its high‑traffic ticketing system by building a cloud‑native, full‑link observability platform on Alibaba Cloud, achieving an 80% improvement in fault detection speed, a 40% reduction in operational costs, and establishing data‑driven operations as the digital foundation for product growth.

Observability · Operations · aiops
0 likes · 15 min read
Xiao Liu Lab
Nov 9, 2025 · Operations

50 Essential Docker Maintenance Commands for Daily Ops and Security

This guide compiles 50 practical Docker commands covering daily status checks, weekly resource cleanup, monthly security hardening, logging and monitoring, image management, high‑availability, and disaster‑recovery, helping operators maintain healthy containers across Rocky, CentOS, and Kylin environments.

Container · Docker · Operations
0 likes · 10 min read
Cloud Native Technology Community
Nov 7, 2025 · Cloud Native

How Platform Engineering Enables Self‑Service Cloud‑Native Development in the X‑as‑Code Era

With the rise of the X‑as‑Code paradigm, developers face increasingly complex toolchains and infrastructure; this article explains how platform engineering builds developer portals and capability platforms to deliver efficient, standardized, self‑service cloud‑native development, covering infrastructure evolution, challenges, and open‑source toolsets.

cloud-native · infrastructure-as-code · platform engineering
0 likes · 13 min read
Raymond Ops
Nov 6, 2025 · Cloud Native

Master Helm Repository Management: Add, Update, Search & Secure Charts

This guide explains Helm repositories—how they store chart packages and index files, the types of repositories (official, community, private), and provides step‑by‑step commands for adding, updating, listing, removing, searching, pulling charts, and managing private repo indexes.

Chart Repository · cloud-native · helm
0 likes · 7 min read
Open Source Linux
Nov 6, 2025 · Operations

How to Break the 20K Salary Ceiling in Operations: 4 Power Moves

This article reveals why many ops engineers are stuck below 20K, outlines four high‑impact practices—including coding automation, mastering cloud‑native, aligning with business performance, and shifting from firefighting to prevention—and presents concrete career paths and daily actions to boost expertise and salary.

Operations · career · cloud-native
0 likes · 7 min read
Linux Ops Smart Journey
Nov 5, 2025 · Cloud Native

Why Switch from Prometheus? Deploy a High‑Performance vmagent Cluster with VictoriaMetrics

This article explains the scalability limits of Prometheus, introduces vmagent as a lightweight, high‑performance collector compatible with Prometheus, and provides a step‑by‑step guide—including configuration, systemd service setup, and verification—to deploy a resilient vmagent cluster in production.

Deployment · Prometheus · VictoriaMetrics
0 likes · 5 min read
IT Services Circle
Nov 3, 2025 · Cloud Native

Why MinIO Dropped Official Docker Images and What It Means for Users

MinIO, the high‑performance distributed object storage system with over a billion downloads, stopped providing pre‑built Docker images after its October 2025 CVE‑compliant release, forcing users to build from source and sparking heated community debate over licensing, security, and the sustainability of free open‑source distribution.

Docker · Minio · cloud-native
0 likes · 11 min read
Linux Cloud Computing Practice
Oct 31, 2025 · Cloud Native

Essential Docker Commands Every Developer Should Master

This guide compiles the most frequently used Docker commands, organized into categories such as basic operations, image management, container handling, data and volume control, network configuration, security, and additional utilities, helping developers streamline deployment and resource management in containerized environments.

Container · DevOps · cloud-native
0 likes · 2 min read
Ray's Galactic Tech
Oct 30, 2025 · Operations

Master Kubernetes Troubleshooting: Common Issues and How to Fix Them

This guide walks you through the most frequent Kubernetes problems—from image pull failures and CrashLoopBackOff to DNS, storage, node readiness, and RBAC errors—providing clear diagnosis steps, essential kubectl commands, and concrete solutions to keep your clusters healthy.

DevOps · Kubernetes · cloud-native
0 likes · 11 min read
php Courses
Oct 27, 2025 · Backend Development

Why PHP-FPM Struggles in Cloud‑Native Environments and What Modern Runtimes Offer

This article examines PHP-FPM's historical strengths, its growing incompatibilities with containerized and micro‑service architectures, outlines three key bottlenecks, and presents modern alternatives such as Swoole, FrankenPHP, and serverless runtimes that enable higher performance, better resource efficiency, and improved observability.

Serverless · Swoole · cloud-native
0 likes · 7 min read
IT Architects Alliance
Oct 24, 2025 · Cloud Native

Balancing Flexibility and Complexity: Strategies for Modern Architecture

This article explores how architects can reconcile flexibility and complexity through layered design, progressive complexity management, adaptive architecture, and team‑capacity alignment, offering practical principles, decision‑making frameworks, and monitoring metrics to guide sustainable system evolution.

Design · architecture · cloud-native
0 likes · 11 min read
MaGe Linux Operations
Oct 21, 2025 · Operations

Mastering Prometheus: Proven Strategies to Optimize Monitoring Performance

This article shares real‑world experiences and step‑by‑step techniques—including metric pruning, sampling interval tuning, TSDB configuration, query rewriting, and federation—to dramatically improve Prometheus memory usage, query latency, and overall scalability for large‑scale cloud‑native environments.

Operations · Prometheus · cloud-native
0 likes · 11 min read
Alibaba Cloud Observability
Oct 20, 2025 · Cloud Native

How ‘泡姆泡姆’ Leverages Cloud‑Native Architecture for Global Low‑Latency Gaming

The multiplayer party game 泡姆泡姆 combines colorful shooting, match‑3, physics puzzles and arcade mini‑games, and uses a cloud‑native stack on Alibaba Cloud Container Service with OpenKruiseGame, KEDA‑driven auto‑scaling, multi‑region deployment, zero‑downtime updates and a three‑layer observability platform to deliver seamless low‑latency experiences worldwide.

Game Development · Observability · Scalability
0 likes · 10 min read
IT Architects Alliance
Oct 19, 2025 · Cloud Native

Mastering Cloud‑Native Autoscaling: HPA, VPA, CA, and Cost‑Aware Strategies

This article explores the challenges and best practices of cloud‑native scaling, covering Horizontal and Vertical Pod Autoscalers, Cluster Autoscaler cost optimization, event‑driven scaling with KEDA, traffic‑aware scaling in service meshes, and intelligent cost‑aware strategies backed by monitoring and future AI‑driven trends.

Cost Optimization · Kubernetes · Service Mesh
0 likes · 11 min read
MaGe Linux Operations
Oct 15, 2025 · Cloud Native

Master Kubernetes Troubleshooting: From CrashLoopBackOff to Network Failures

This comprehensive guide walks you through Kubernetes fault diagnosis, covering pod lifecycle issues, resource scheduling, network communication errors, storage mounting problems, and node failures, with step‑by‑step methodologies, essential kubectl commands, real‑world case studies, and best‑practice recommendations to quickly identify and resolve production incidents.

DevOps · Kubernetes · cloud-native
0 likes · 36 min read
Alibaba Cloud Native
Oct 15, 2025 · Cloud Native

What’s New in Higress 2.0? 30 Updates Including RAG MCP Server and Performance Fixes

The Higress 2.0 release introduces 30 changes—13 new features such as a RAG MCP server and ECDS‑based configuration refactor, 7 bug fixes, 5 refactorings, documentation updates and a test improvement—providing developers with enhanced knowledge‑management capabilities, more stable routing, and clearer documentation for cloud‑native service‑mesh environments.

MCP · RAG · Release Notes
0 likes · 20 min read
Ray's Galactic Tech
Oct 11, 2025 · Operations

Essential Kubernetes Ops Cheat Sheet: Quick Commands & Tips

A concise reference guide that outlines core Kubernetes concepts, categorizes essential kubectl commands for creation, troubleshooting, rollout, scaling, port‑forwarding, node management, and multi‑cluster contexts, and provides practical tips and a quick‑lookup command table for everyday operations.

Cheat Sheet · Kubernetes · cloud-native
0 likes · 6 min read
IT Architects Alliance
Oct 8, 2025 · R&D Management

How to Master Collaborative Architecture Changes Without Derailing Your Team

This article explores the common collaboration pitfalls during architecture redesigns—such as information asymmetry, skill‑stack gaps, and mismatched timelines—and presents practical solutions like ADR documentation, progressive migration patterns, structured RFC processes, visual communication, and risk‑controlled rollout strategies.

ADR · Collaboration · Microservices
0 likes · 11 min read
DataFunTalk
Oct 7, 2025 · Big Data

How ByteHouse Tackles Data Warehouse Cost and Efficiency Challenges

This article examines the exploding data volumes that pressure modern enterprises, outlines the explicit and hidden cost challenges of data warehouses, and presents ByteHouse’s cloud‑native architecture and features as a solution for reducing expenses while boosting analytical performance.

ByteHouse · Cost Optimization · OLAP
0 likes · 6 min read
MaGe Linux Operations
Oct 6, 2025 · Cloud Native

Prometheus vs Cloud Provider Monitoring: Which Is the Most Cost‑Effective Choice for 2025?

This article compares open‑source Prometheus + Grafana with managed cloud monitoring services, evaluating deployment complexity, functionality, scalability, security, and total cost of ownership across small, medium, and large workloads, and provides practical decision‑making guidance for teams of different sizes and requirements.

Observability · Prometheus · cloud-native
0 likes · 56 min read
ITPUB
Sep 11, 2025 · Operations

Beyond 35: Viable Career Paths for Operations Professionals

The article compiles diverse viewpoints on how operations engineers can sustain and advance their careers after age 35, highlighting cloud‑native/DevOps, security operations, automation engineering, ITIL/service management, and broader roles such as consulting, project management, or training, while also noting industry realities and personal limits.

Automation · DevOps · ITIL
0 likes · 10 min read
IT Architects Alliance
Sep 10, 2025 · Cloud Native

How AI, Cloud‑Native, and Platform Engineering Redefine System Architecture in 2024

Amid rapid AI breakthroughs, mature cloud‑native infrastructure, and rising edge computing, architects must adopt platform engineering, event‑driven and composable architectures, and AI‑native designs, while evolving technical and soft skills to meet escalating business complexity and guide technology selection over the next five years.

AI Architecture · Edge Computing · Event-driven
0 likes · 12 min read
php Courses
Sep 2, 2025 · Backend Development

Why Asynchronous PHP Will Transform Web Development by 2025

This article examines how asynchronous programming revitalizes PHP, detailing its core principles, emerging ecosystem, cloud‑native advantages, performance gains, improved developer experience, industry adoption, and the challenges ahead, forecasting that by 2025 async PHP will reshape modern web development.

Backend Performance · Microservices · Swoole
0 likes · 8 min read
Wukong Talks Architecture
Aug 19, 2025 · Backend Development

From Monolith to Microservices: A Real‑World Online Supermarket Migration Story

This article walks through the evolution of an online supermarket from a simple monolithic website to a fully‑featured microservice architecture, highlighting the challenges, design decisions, component choices, monitoring, tracing, testing, and the trade‑offs of service mesh versus custom frameworks.

Deployment · Microservices · architecture
0 likes · 22 min read
MaGe Linux Operations
Aug 12, 2025 · Cloud Native

Master kubectl: 15 Essential Tips to Supercharge Your Kubernetes Workflow

This guide presents fifteen practical kubectl techniques—from resource abbreviations and context switching to advanced JSONPath queries and custom output formats—empowering operators to manage Kubernetes clusters more efficiently, troubleshoot issues faster, and automate routine tasks with confidence.

Kubernetes · Operations · Tips
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Aug 7, 2025 · Cloud Native

How GitOps Powers Cloud‑Native Large‑Scale Cluster Management

This article details Alibaba Cloud's intelligent operations team’s challenges and solutions for managing thousands of cloud‑native clusters, covering their multi‑layered operation architecture, GitOps workflow, infrastructure‑as‑code integration, and the role of AI‑driven intelligent operations in large‑scale environments.

GitOps · Kubernetes · cloud-native
0 likes · 23 min read
Alibaba Cloud Infrastructure
Aug 2, 2025 · Cloud Native

How Tea Brand ChaBaiDao Scaled to 800M Cups with Cloud‑Native Transformation

This case study details how ChaBaiDao, a Chinese tea chain, leveraged cloud‑native technologies such as Alibaba Cloud ACK, ECI, MSE, and ARMS to overhaul its supply‑chain, marketing, and operations systems, achieving faster customer acquisition, lower costs, and higher reliability across 7,000+ stores.

Digital Transformation · Microservices · cloud-native
0 likes · 16 min read
Alibaba Cloud Native
Jul 29, 2025 · Cloud Native

How LoongCollector Redefines Cloud‑Native Observability for AI Workloads

LoongCollector, the core component of Alibaba Cloud's LoongSuite, delivers zero‑intrusion, multi‑tenant, high‑performance data collection and processing for AI services, integrating logs, metrics, traces, events, and profiles into a unified, programmable pipeline that scales elastically across heterogeneous GPU clusters.

AI · cloud-native · data collection
0 likes · 17 min read
dbaplus Community
Jul 28, 2025 · Cloud Native

Why Your Container Strategy Is Quietly Killing Performance—and How to Fix It

A former monolith‑to‑containers migration revealed hidden performance penalties—namespace conversion, network overhead, storage I/O, and resource contention—plus over‑decomposed microservices, memory overallocation, bloated images, and mis‑tuned orchestration, all of which can be diagnosed and remedied with systematic measurement, tracing, and configuration adjustments.

Resource Optimization · cloud-native · performance
0 likes · 16 min read
21CTO
Jul 24, 2025 · Artificial Intelligence

How AI and DevSecOps Will Transform Software Testing by 2025

The article outlines seven emerging software‑testing trends—including AI‑driven test case generation, shift‑left/right strategies, AI‑enhanced CI pipelines, security testing within DevSecOps, and cloud‑native testing—explaining how they will boost automation, reliability, and user‑centric quality for 2025 and beyond.

AI testing · Automation · DevSecOps
0 likes · 8 min read
DataFunTalk
Jul 14, 2025 · Artificial Intelligence

How AI4Data Is Revolutionizing Large‑Model Data Production

This article outlines how the Shanghai AI Lab’s Jiang Qian is tackling the efficiency and usability challenges of massive training‑data generation for large models by introducing the AI4Data paradigm, a cloud‑native, AI‑driven data‑production pipeline that transforms Data4AI into a smarter, faster process.

AI4Data · cloud-native
0 likes · 5 min read
Ops Development & AI Practice
Jul 12, 2025 · Cloud Native

Mastering Observability: A Deep Dive into OpenTelemetry’s Architecture

This article explains OpenTelemetry’s purpose, three‑layer architecture (instrumentation, collector, backend), practical Go instrumentation code, and how the collector processes and exports telemetry to both open‑source and SaaS backends, helping developers avoid vendor lock‑in and achieve unified observability.

Collector · Distributed Tracing · Instrumentation
0 likes · 9 min read
Alibaba Cloud Native
Jul 11, 2025 · Cloud Native

Zero‑Downtime Deployments with Alibaba Cloud Lightweight Message Queue

This article explains how Alibaba Cloud Lightweight Message Queue (formerly MNS) enables lossless, zero‑downtime service releases by redesigning the network entry layer, using load‑balancer draining, injecting HTTP close frames, and providing CI/CD scripts that work across ECS and Kubernetes environments.

Alibaba Cloud · MNS · Zero Downtime
0 likes · 12 min read
High Availability Architecture
Jul 11, 2025 · Databases

Inside TDSQL-C: Tencent’s Cloud‑Native Database Scaling Globally with High Performance

This article details Tencent Cloud's TDSQL‑C, a cloud‑native relational database that combines traditional MySQL compatibility with a storage‑compute separated architecture, global multi‑region replication, advanced log handling, and continuous performance upgrades to deliver ultra‑high throughput, massive storage, and strong data reliability.

Storage Engine · cloud-native · database
0 likes · 12 min read
IT Architects Alliance
Jul 10, 2025 · Cloud Native

Inside Alibaba’s Tech Stack: Cloud‑Native Architecture Behind Billions of Transactions

This article examines Alibaba's extensive cloud‑native technology stack—including distributed computing, storage, middleware, real‑time data processing, AI platforms, performance engineering, and security—revealing how its architects design systems that handle massive transaction volumes during events like Double 11.

Big Data · Distributed Systems · Microservices
0 likes · 12 min read
Instant Consumer Technology Team
Jul 9, 2025 · Cloud Native

Scaling a Financial Accounting System to 100k TPS with Cloud‑Native Microservices

This article examines how a ten‑year‑old financial accounting platform transformed from a monolithic design into a cloud‑native, micro‑service architecture that achieved massive scalability, high availability, and 24‑hour real‑time processing through distributed batch scheduling, elastic scaling, and intelligent fault‑tolerance.

Batch Processing · Scalability · cloud-native
0 likes · 14 min read
Alibaba Cloud Infrastructure
Jul 9, 2025 · Cloud Native

How We Transformed a FPS Game to Cloud‑Native with OpenKruiseGame in 2 Months

Facing tight deadlines, Yahaha Studios rebuilt the STRIDEN FPS game's server deployment from a traditional Auto Scaling Group to a cloud‑native architecture using OpenKruiseGame, achieving second‑level startup, automated global scaling, lossless scaling, and significant cost reductions while improving player experience.

Auto Scaling · Deployment · Kubernetes
0 likes · 18 min read
IT Architects Alliance
Jul 6, 2025 · Backend Development

What Core Skills Make Million‑Dollar Architects Stand Out Globally?

An in‑depth look at the five essential competencies—ultra‑fine performance tuning, cross‑cultural architectural thinking, deep domain modeling, forward‑looking technology judgment, and powerful team leadership—that enable top‑earning Chinese architects to earn international respect and drive high‑impact systems.

Domain Modeling · cloud-native · performance
0 likes · 12 min read
Linux Ops Smart Journey
Jun 19, 2025 · Cloud Native

How to Deploy JuiceFS: A Cloud‑Native Distributed File System Tutorial

This guide explains what JuiceFS is, its cloud‑native architecture separating data and metadata, and provides step‑by‑step instructions—including prerequisites, client installation, formatting, mounting, and verification—to help you deploy the high‑performance distributed file system on object storage and PostgreSQL.

Distributed File System · JuiceFS · cloud-native
0 likes · 7 min read
iQIYI Technical Product Team
Jun 12, 2025 · Operations

How iQIYI’s “Qijing” Platform Revolutionizes Testing Across Devices and Teams

This article explores iQIYI’s comprehensive testing ecosystem, detailing industry trends, the platform’s multi‑terminal challenges, fragmented legacy solutions, and the unified, cloud‑native “Qijing” environment that streamlines test access, zero‑trust security, and real‑world validation for rapid product delivery.

Software quality · Zero Trust · cloud-native
0 likes · 20 min read