Tagged articles
3116 articles
Page 5 of 32
MaGe Linux Operations
Apr 25, 2025 · Cloud Native

Essential Docker Commands Cheat Sheet: Quick Reference for Developers

This comprehensive guide presents over twenty essential Docker CLI commands, covering image management, container lifecycle, registry operations, and system cleanup, with clear syntax examples and practical use‑case snippets to help developers and DevOps engineers work efficiently with containers.

CLI · Cloud Native · Containers
0 likes · 11 min read
Baidu Geek Talk
Apr 23, 2025 · Operations

Baidu SRE Digital Immunity System: Construction, Evolution, and Practice

Baidu’s SRE digital‑immune system, evolved into an AI‑powered intelligent immunity platform, quantifies and mitigates risk across thousands of services by integrating data‑driven monitoring, rule‑based detection, and large‑model GraphRAG knowledge mining, cutting degradation cases by ~40% and shifting operations from reactive troubleshooting to proactive, data‑centric quality assurance.

AI · Cloud Native · Digital Immunity
0 likes · 14 min read
Go Programming World
Apr 22, 2025 · Artificial Intelligence

Design and Implementation of an Enterprise‑Grade LLMOPS Platform (EasyAI)

This article presents a comprehensive overview of building an enterprise‑level LLMOPS platform—including concept definitions, the relationship between LLMOPS, MLOps and intelligent agent platforms, four development tiers, architecture layers, core technical concerns, deployment options, and the benefits of cloud‑native AI development.

AI Platform · Cloud Native · DevOps
0 likes · 15 min read
IT Xianyu
Apr 21, 2025 · Cloud Native

Step-by-Step Guide to Setting Up a Kubernetes 1.19 Cluster on CentOS 7.9

This guide walks through preparing two CentOS 7.9 servers, installing Docker and Kubernetes 1.19 components, initializing a master node, joining a worker node, and validating the cluster with a sample Nginx deployment, including common troubleshooting tips.

Calico · CentOS · Cloud Native
0 likes · 10 min read
Pan Zhi's Tech Notes
Apr 21, 2025 · Cloud Native

Build a Clean Microservice Config Center with Nacos in One Step

This article walks through using Nacos as a centralized configuration center for Spring Cloud microservices, showing how to create configuration data, set up a Maven client, enable dynamic refresh with @RefreshScope, and manage multi‑environment and multi‑file configurations.

@RefreshScope · Cloud Native · Configuration Center
0 likes · 16 min read
Baidu Intelligent Cloud Tech Hub
Apr 18, 2025 · Operations

How Baidu’s AI‑Powered Digital Immune System Reinvents SRE Risk Management

This article explains why modern SRE teams need a digital immune system, describes Baidu’s data‑driven approach to improve system resilience, outlines the three‑phase evolution from digital transformation to AI‑enhanced risk mining, and shares concrete results and future directions for sustainable operations.

AI · Cloud Native · Digital Immune System
0 likes · 15 min read
Mike Chen's Internet Architecture
Apr 17, 2025 · Cloud Native

Kubernetes Architecture and Core Principles Explained

This article provides a comprehensive overview of Kubernetes, covering its cloud‑native architecture, core components such as API Server, Scheduler, Controller Manager, etcd, kubelet and kube‑proxy, and explains the workflow that enables automated deployment, scaling and management of containerized applications.

Cloud Native · DevOps · Kubernetes
0 likes · 6 min read
Alibaba Cloud Infrastructure
Apr 17, 2025 · Cloud Native

OpenKruise 1.8 Release Highlights: In‑Place VPA, StatefulSet Volume Expansion, AI WorkloadSpread, Serverless Probe, SidecarSet Gray‑Release, and Helm Pre‑Delete Hook

OpenKruise 1.8, the latest CNCF‑incubated cloud‑native automation suite, introduces in‑place vertical pod autoscaling, native StatefulSet volume expansion, AI‑aware WorkloadSpread, serverless probe support, sidecar gray‑release capabilities, and a Helm pre‑delete safety hook, all backed by detailed YAML examples and a future roadmap.

Cloud Native · InPlaceVPA · Kubernetes
0 likes · 13 min read
dbaplus Community
Apr 16, 2025 · Backend Development

How Ctrip’s Kafka Gatekeeper Boosts FinOps Data Quality and Automates Cost Governance

This article explains how Ctrip’s hybrid‑cloud FinOps billing system uses a custom Kafka Gatekeeper to detect, locate, and automatically remediate data‑quality issues across dozens of self‑built PaaS services, improving coverage, timeliness, and responsibility attribution while supporting high‑availability deployments.

Backend · Cloud Native · Data Quality
0 likes · 19 min read
Ops Development Stories
Apr 15, 2025 · Cloud Native

Boost Kubernetes Management with AI: Introducing the Lightweight k8m Console

This article introduces k8m, a lightweight AI‑enhanced console for Kubernetes that simplifies cluster management, installation, configuration, and daily operations, while offering features such as YAML auto‑translation, AI‑driven event and log diagnostics, command generation, multi‑cluster support, and role‑based access control.

AI · Cloud Native · DevOps
0 likes · 13 min read
Ops Development Stories
Apr 15, 2025 · Artificial Intelligence

Unlocking the AI USB‑C: Deep Dive into the Model Context Protocol (MCP)

This article explores the Model Context Protocol (MCP), the emerging “USB‑C” for AI, detailing its core advantages, implementation with Kubernetes, a six‑layer cloud‑native architecture, practical code examples, and developer guidelines for building AI‑powered, secure, and scalable services.

AI · Cloud Native · DevOps
0 likes · 8 min read
Linux Kernel Journey
Apr 15, 2025 · Operations

Efficiently Resolving Performance Bottlenecks and Jitter with Process Hotspot Tracing in Alibaba Cloud OS Console

The article explains how Alibaba Cloud's SysOM console uses low‑overhead process hotspot tracing, stack unwinding, symbol resolution, eBPF and AI diagnostics to pinpoint CPU, memory, lock and network issues, offering visual flame‑graph analysis and real‑world case studies for faster root‑cause identification.

AI diagnostics · Cloud Native · SysOM
0 likes · 15 min read
Ops Development & AI Practice
Apr 14, 2025 · Industry Insights

When a “Perfect” EKS Terraform Module Becomes a Debugging Nightmare

The author recounts the high hopes and subsequent frustrations of adopting the community‑maintained terraform‑aws‑eks module for AWS EKS, detailing hidden complexities, limited AI assistance, and practical lessons on embracing complexity, critical use of open‑source modules, and the importance of rest during tough debugging sessions.

AI Copilot · AWS · Cloud Native
0 likes · 9 min read
Alibaba Cloud Observability
Apr 14, 2025 · Cloud Native

How to Connect Grafana to Large Language Models with MCP (Model Context Protocol)

This guide shows how to use the Model Context Protocol (MCP) to build a lightweight server that links Grafana dashboards to large language models, covering MCP concepts, FastMCP setup, Python client implementation, environment preparation, and integration with Cherry Studio for seamless AI-driven data access.

AI integration · Cloud Native · Grafana
0 likes · 12 min read
Cloud Native Technology Community
Apr 11, 2025 · Cloud Native

How Kube-OVN Enables Seamless Live Migration for KubeVirt VMs

This article explains the challenges of live‑migrating KubeVirt virtual machines, how Kube‑OVN addresses network‑bridge limitations and IP changes, provides the required VM annotation, step‑by‑step migration commands, and details the multi‑stage migration process that keeps network interruption under 0.5 seconds with no TCP break.

Cloud Native · Kube-OVN · KubeVirt
0 likes · 7 min read
21CTO
Apr 9, 2025 · Operations

9 Must‑Have Container Monitoring Tools and Best Practices for Modern Cloud‑Native Environments

This article reviews nine practical container‑monitoring solutions—from Last9 and Prometheus to Dynatrace and Elastic Observability—detailing their key features, pricing, and why developers prefer them, and then offers comprehensive best‑practice guidance for metrics, tagging, alerts, and advanced observability strategies in Kubernetes‑driven cloud‑native deployments.

Alerting · Cloud Native · DevOps
0 likes · 25 min read
Alibaba Cloud Native
Apr 6, 2025 · Cloud Native

How ZEEK’s Cloud‑Native Architecture Boosted App Stability and Agility

This article details ZEEK's cloud‑native transformation, covering the strategic shift to open‑source standards, unified microservice architecture, high‑availability practices, upgraded traffic gateways, visual data analysis, car‑network data collection, and AI‑assisted development, illustrating how these steps enhanced system stability, scalability, and development efficiency.

AI · Cloud Native · Microservices
0 likes · 22 min read
php Courses
Mar 31, 2025 · Backend Development

PHP Ecosystem in 2025: New Language Features, Framework Trends, Design Patterns, and Emerging Applications

The 2025 PHP ecosystem overview details the language’s new features such as enhanced generics and fibers, performance improvements via JIT and OPcache, evolving best practices, the latest trends in major and micro frameworks, modern design pattern implementations, cloud‑native deployment, AI integration, and future directions.

Backend · Cloud Native · Design Patterns
0 likes · 17 min read
FunTester
Mar 30, 2025 · Cloud Native

Mastering Kubernetes Resources with Java: EndpointSlice, PVC, PV, NetworkPolicy & More

This guide shows how to use the Fabric8 Kubernetes Java client to load, create, apply, list, watch, and delete core Kubernetes objects such as EndpointSlice, PersistentVolumeClaim, PersistentVolume, NetworkPolicy, PodDisruptionBudget, and various RBAC resources, with complete code examples for each operation.

API · Cloud Native · DevOps
0 likes · 12 min read
Ops Development & AI Practice
Mar 27, 2025 · Cloud Native

Master Kustomize: Simplify Kubernetes Configs with Generators and Transformers

Kustomize, built into kubectl, lets you declaratively manage Kubernetes YAML by organizing base resources, dynamically generating ConfigMaps and Secrets, applying transformers for environment‑specific tweaks, and optionally validating output, enabling a clean Base + Overlay workflow that reduces duplication and simplifies multi‑environment configuration.

Cloud Native · Configuration Management · DevOps
0 likes · 8 min read
ITPUB
Mar 26, 2025 · Cloud Native

How KubeBlocks Enables Scalable, Automated Redis on Kubernetes at Kuaishou

This article details Kuaishou's migration of massive Redis clusters to Kubernetes using the KubeBlocks Operator, covering architecture, multi‑layer management requirements, federated cluster deployment, custom controllers, performance and stability considerations, and the resulting operational benefits.

Cloud Native · KubeBlocks · Kubernetes
0 likes · 15 min read
Huolala Tech
Mar 25, 2025 · Backend Development

How Huolala Built a Scalable Distributed Load‑Testing Platform with JMeter

This article details Huolala's performance testing platform architecture, covering background challenges, a JMeter‑based solution, distributed agent design, unified logging, plugin management, data collection via Kafka, and future enhancements such as AI integration and improved file distribution, illustrating a comprehensive backend development effort.

Cloud Native · JMeter · Performance Testing
0 likes · 25 min read
FunTester
Mar 25, 2025 · Operations

Integrating Chaos Engineering into Service Dependency Governance for Resilient Cloud‑Native Systems

This article explores how to embed chaos engineering practices into service dependency governance, detailing dynamic validation versus static analysis, fault injection techniques, multi‑point failure simulations, and data‑driven optimizations to build robust, self‑healing microservice architectures in cloud‑native environments.

Cloud Native · Microservices · Operations
0 likes · 18 min read
Ops Development Stories
Mar 19, 2025 · Cloud Native

Unified Multi‑Cluster Monitoring with KubeDoor 1.0: Alerts, Metrics & Best Practices

KubeDoor 1.0 introduces a new architecture for unified multi‑Kubernetes monitoring, offering components for master and agent, flexible deployment options, Helm‑based installation, configurable storage and alerting settings, and detailed guidance on integrating with existing Prometheus/VictoriaMetrics setups while providing automatic peak‑usage data collection.

Alerting · Cloud Native · Kubernetes
0 likes · 14 min read
Tencent Cloud Developer
Mar 19, 2025 · Cloud Native

Kubernetes Monitoring: Why It’s Needed, Core Components, and Metric Exposure

Monitoring Kubernetes is essential to detect resource contention, component failures, and network issues; it involves tracking core component metrics such as API server latency, etcd write times, scheduler delays, as well as node‑level CPU, memory, disk, and network statistics, pod health, and custom application metrics exposed via Prometheus exporters for comprehensive observability.

Cloud Native · Exporters · Kubernetes
0 likes · 23 min read
Su San Talks Tech
Mar 19, 2025 · Operations

10 Proven Strategies to Achieve 99.99% System Availability

This article presents ten practical techniques (redundant deployment, circuit breaking, traffic shaping, auto‑scaling, gray releases, downgrade switches, full‑link stress testing, data sharding, chaos engineering, and three‑layer monitoring) for raising system availability from 99% to 99.99% in production environments.

Backend · Cloud Native · Microservices
0 likes · 12 min read
Python Programming Learning Circle
Mar 18, 2025 · Cloud Native

Automating Kubernetes Operations with the Python Client

This article demonstrates how to use the Python Kubernetes client to programmatically restart deployments, scale them, execute commands inside pods, apply node taints, retrieve cluster metrics, and convert between YAML/JSON and client objects, providing practical code examples for cloud‑native automation.

API · Cloud Native · DevOps
0 likes · 8 min read
Alibaba Cloud Infrastructure
Mar 18, 2025 · Cloud Native

Gray Release of LoRA and Base Models Using ACK Gateway with AI Extension on Kubernetes

This guide explains how to deploy large language model inference services on a GPU-enabled Kubernetes cluster, configure ACK Gateway with AI Extension for intelligent routing and load balancing, and perform gray releases for both LoRA fine‑tuned models and base models such as QwQ‑32B and DeepSeek‑R1, including step‑by‑step commands and validation procedures.

ACK Gateway · AI inference · Cloud Native
0 likes · 25 min read
MaGe Linux Operations
Mar 18, 2025 · Cloud Native

How to Deploy a Kubernetes v1.28.8 Cluster with KubeKey on Ubuntu

This guide walks through configuring three Ubuntu servers, installing KubeKey, creating a Kubernetes v1.28.8 cluster with HAProxy load balancing, deploying a sample nginx workload, and verifying the installation using kubectl and curl, providing all necessary commands and configuration details for a successful deployment.

Cloud Native · Kubekey · Kubernetes
0 likes · 13 min read
Alibaba Cloud Observability
Mar 17, 2025 · Cloud Native

How to Master LLM Observability in Cloud‑Native Environments

This article explains the unique observability challenges of large language model (LLM) applications, outlines essential performance, cost, and safety metrics, and presents a comprehensive cloud‑native solution—including trace, metric, and log collection, domain‑specific dashboards, and step‑by‑step integration with Alibaba Cloud's Python Agent—to ensure reliable, efficient LLM deployments.

AI gateway · Cloud Native · LLM Observability
0 likes · 18 min read
Python Programming Learning Circle
Mar 17, 2025 · Cloud Native

Automating Kubernetes Tasks with the Python Client Library

This tutorial demonstrates how to set up a local KinD cluster, configure authentication, use raw curl commands, and employ the official Kubernetes Python client to list pods, create deployments, watch events, and manage RBAC, providing a complete guide for automating Kubernetes operations with Python.

API · Cloud Native · DevOps
0 likes · 11 min read
IT Architects Alliance
Mar 16, 2025 · Cloud Native

Why Does Scaling a Kubernetes Cluster Slow Down? Uncover the Hidden Bottlenecks

When a Kubernetes cluster grows, many teams expect better performance, yet scaling often slows things down because of hardware limits, network congestion, data‑sync overhead, load‑balancing misconfigurations, and component bottlenecks. This article explains each cause and offers concrete optimization strategies.

Cloud Native · Kubernetes · cluster scaling
0 likes · 27 min read
Ops Development & AI Practice
Mar 16, 2025 · Cloud Native

Why Quarkus Is Revolutionizing Cloud‑Native Java Development

Quarkus, a Kubernetes‑native Java framework built for GraalVM and HotSpot, delivers millisecond startup, low memory usage, developer‑friendly features, and seamless integration with cloud‑native platforms, making it ideal for microservices, serverless, and modern cloud applications.

Cloud Native · Fast Startup · Kubernetes
0 likes · 7 min read
MaGe Linux Operations
Mar 15, 2025 · Cloud Native

How MetalLB Transforms Load Balancing for Bare‑Metal Kubernetes Clusters

This guide explains Kubernetes Service types, the role of MetalLB in providing LoadBalancer functionality for bare‑metal clusters, step‑by‑step installation, configuration of address pools, testing with a sample service, integration with Ingress, and an overview of the Calico network plugin for pod isolation.

Calico · Cloud Native · Ingress
0 likes · 14 min read
JakartaEE China Community
Mar 15, 2025 · Backend Development

Key Jakarta EE Q&A: Naming, Governance, Roadmap, and How to Contribute

This article provides a comprehensive Q&A covering Jakarta EE’s definition, naming origin, platform scope, namespace shift, governance model, specification process, release cadence, future roadmap, relationship with EE4J, microservice and cloud‑native support, trademark usage, and step‑by‑step guidance on becoming a contributor or member.

Cloud Native · Eclipse Foundation · Enterprise Java
0 likes · 12 min read
Alibaba Cloud Developer
Mar 13, 2025 · Artificial Intelligence

How to Master LLM Observability: End-to-End Monitoring with Alibaba Cloud

This article outlines Alibaba Cloud’s comprehensive LLM observability solution, covering challenges, key metrics, component architecture, data collection, tracing, performance analysis, and practical integration steps—including Python agent setup and Dify demo—to help developers monitor and optimize large language model applications.

AI Monitoring · Cloud Native · LLM Observability
0 likes · 19 min read
Sohu Tech Products
Mar 12, 2025 · Cloud Native

Argo Workflows: Container-Native Workflow Engine for Kubernetes

Argo Workflows is an open‑source, container‑native engine that runs on Kubernetes via Custom Resource Definitions, letting users declaratively define complex step‑based or DAG‑based pipelines (including CI/CD, data processing, and machine‑learning jobs) through reusable templates, with a server UI, controller, and pod architecture monitored by Prometheus.

Argo Workflows · CNCF · Cloud Native
0 likes · 16 min read
Ops Development Stories
Mar 10, 2025 · Cloud Native

What Are Kubernetes Core Components and How Do They Work?

This article provides a comprehensive overview of Kubernetes fundamentals, covering core control‑plane and node components, key object differences such as Pod vs Deployment, Service types, ConfigMap vs Secret, scheduling, health checks, scaling, security, storage, and troubleshooting techniques.

Cloud Native · Containers · Deployment
0 likes · 19 min read
Ops Development & AI Practice
Mar 7, 2025 · Cloud Native

Mastering Kubernetes StatefulSets: How to Run Stateful Apps Reliably

This article explains Kubernetes StatefulSets, covering their core concepts, guarantees such as stable network IDs and persistent storage, the controller’s components, deployment workflow, typical use cases, best‑practice recommendations, and a detailed comparison with Deployments to help you manage stateful workloads effectively.

Cloud Native · Deployment · Kubernetes
0 likes · 8 min read
Alibaba Cloud Native
Mar 7, 2025 · Artificial Intelligence

8 Real-World AI Gateway Use Cases Every Enterprise Should Know

This article outlines eight practical AI gateway scenarios—from multi‑model services and consumer authentication to token rate limiting, content safety, semantic caching, and observability—explaining the business needs behind each and how Alibaba Cloud's cloud‑native API gateway provides concrete technical solutions.

AI gateway · Cloud Native · Content Safety
0 likes · 15 min read
ITPUB
Mar 6, 2025 · Cloud Native

Mastering Portainer: Simplify Docker and Kubernetes Management with Easy Deployment

This guide explains what Portainer is, compares its Community and Business editions, details its core architecture, provides step‑by‑step installation using Docker, Docker‑Compose, and Docker‑Stack, and demonstrates key features such as dashboards, container, image, service, volume, and user management for Docker and Kubernetes environments.

Cloud Native · Container Management · Docker
0 likes · 43 min read
Alibaba Cloud Infrastructure
Mar 6, 2025 · Big Data

Leveraging Apache Iceberg and AutoMQ for Real-Time Data Lake Ingestion: Architecture, Best Practices, and Cost Optimization

This article examines how Apache Iceberg’s snapshot‑based ACID transactions, logical‑physical partition evolution, and COW/MOR update modes enable efficient real‑time data lake ingestion, and demonstrates AutoMQ’s Kafka‑to‑Iceberg Table Topic solution that simplifies schema management, reduces latency, and cuts operational costs.

Apache Iceberg · AutoMQ · Big Data
0 likes · 14 min read
Alibaba Cloud Infrastructure
Mar 5, 2025 · Cloud Native

Using Fluid Cloud‑Native Data Caching to Boost Performance and Elasticity of a Quantitative Research Platform on Alibaba Cloud

This article describes how JoinQuant built a cloud‑native quantitative research platform on Alibaba Cloud, identified performance, cost, data‑management, and security challenges, and solved them with Fluid’s JindoRuntime data‑caching, elastic scaling, and Python‑driven workflows, achieving dramatic speed and cost improvements.

Cloud Native · Data Caching · Fluid
0 likes · 18 min read
Practical DevOps Architecture
Mar 5, 2025 · Cloud Native

Kubernetes DNS Resolution Issues and Troubleshooting Guide

This guide explains common Kubernetes DNS problems, including failures to resolve external domains or inter‑pod service discovery addresses and their impact on applications such as Nginx reverse proxies, and provides step‑by‑step troubleshooting procedures such as checking CoreDNS, inspecting resolv.conf, and customizing dnsPolicy and dnsConfig in pod specifications.

Cloud Native · CoreDNS · DNS
0 likes · 6 min read
Alibaba Cloud Infrastructure
Mar 4, 2025 · Cloud Native

Koordinator v1.6 Release: Advanced Heterogeneous Device Scheduling and GPU Management Features

The Koordinator v1.6 release introduces a suite of innovations—including GPU topology‑aware scheduling, end‑to‑end GPU & RDMA joint allocation, strong GPU isolation, differentiated GPU scoring, fine‑grained resource reservation, mixed‑workload QoS, and extensive scheduler and rescheduler optimizations—to efficiently manage heterogeneous resources in Kubernetes clusters for AI and high‑performance computing workloads.

Cloud Native · GPU scheduling · Heterogeneous Resources
0 likes · 24 min read
DataFunSummit
Mar 1, 2025 · Databases

Innovations and Breakthroughs of ClickHouse in Real‑Time OLAP

This article introduces ClickHouse as an open‑source column‑store OLAP database, outlines its core features, explains its distributed and cloud‑native architectures—including SharedMergeTree for serverless operation—presents benchmark results, compares community and enterprise editions, and answers common questions about its future direction.

Cloud Native · Real-time OLAP · clickhouse
0 likes · 15 min read
Pan Zhi's Tech Notes
Feb 28, 2025 · Cloud Native

Spring Cloud Quick‑Start Guide: Getting Started with Microservices

This article introduces Spring Cloud's background, core components (including first‑ and second‑generation modules derived from Netflix OSS), versioning scheme, compatibility with Spring Boot, and practical advice for selecting matching releases to avoid runtime issues in microservice projects.

Cloud Native · Microservices · Netflix OSS
0 likes · 12 min read
Ops Development & AI Practice
Feb 27, 2025 · Cloud Native

Boost Kubernetes Efficiency with Offline‑Online Hybrid Deployment

This article explains how to combine online services and offline tasks within a single Kubernetes cluster using offline‑online hybrid deployment, detailing its benefits such as cost savings and higher resource utilization, and walks through practical implementation methods like CronJobs, HPA, priority classes, node affinity, custom schedulers, and the open‑source Koordinator project, while also addressing associated challenges.

Cloud Native · Kubernetes · Offline Tasks
0 likes · 6 min read
dbaplus Community
Feb 25, 2025 · Cloud Native

Why We Dropped Kubernetes and Boosted DevOps Happiness by 89%

A DevOps team managing 47 Kubernetes clusters across three clouds faced burnout, high costs, and operational chaos, so they gradually replaced Kubernetes with simpler AWS services, cutting infrastructure spend by 58%, speeding deployments by 89%, and dramatically improving team morale and reliability.

Cloud Native · Cost Optimization · DevOps
0 likes · 9 min read
Cloud Native Technology Community
Feb 25, 2025 · Cloud Native

Understanding k8gb: A Kubernetes Global Load Balancer for Multi‑Cluster Deployments

This article explains the theory and practical usage of k8gb, a Kubernetes Global Balancer that provides DNS‑based load balancing, fault‑tolerant traffic routing, and seamless failover across multiple clusters to improve resilience, latency, and compliance in cloud‑native environments.

Cloud Native · Global Load Balancing · Kubernetes
0 likes · 8 min read
Tencent Cloud Developer
Feb 25, 2025 · Artificial Intelligence

Deploy DeepSeek AI: Cloud, Local, API – Full Step‑by‑Step Guide

This guide walks developers through the full lifecycle of using DeepSeek—choosing the right deployment method (API, local machine, or private cloud), selecting model sizes based on hardware, configuring Tencent Cloud services, building AI applications, and integrating the model into development tools and mini‑programs.

AI Model Deployment · AI application development · Cloud Native
0 likes · 12 min read
FunTester
Feb 24, 2025 · Cloud Native

Master Kubernetes with Fabric8 Java Client: Quick Guide & Advanced Tips

This article introduces the Fabric8 KubernetesClient for Java, explains why it outperforms the official client, shows how to add the Maven dependency, and provides step‑by‑step code examples for listing, creating, deleting, and watching Pods, as well as advanced operations on ConfigMaps, Deployments, and custom resources, illustrating real‑world use cases such as log collection, self‑healing, and dynamic scaling.

Cloud Native · DevOps · Fabric8
0 likes · 8 min read
Alibaba Cloud Native
Feb 22, 2025 · Artificial Intelligence

Boost Your Development with Alibaba Cloud’s Tongyi Lingma AI Coding Assistant – A Hands‑On Guide

This guide walks developers through installing the Tongyi Lingma AI coding assistant plugin, switching between large language models, using smart Q&A, terminal integration, code completion, bug‑fix suggestions, and multi‑file refactoring, showcasing how the tool streamlines everyday development tasks.

AI coding assistant · Cloud Native · IDE plugin
0 likes · 8 min read
Rare Earth Juejin Tech Community
Feb 20, 2025 · Cloud Native

TrafficRoute GTM: GEO‑Based Routing and Traffic Orchestration at ByteDance

This article explains how ByteDance’s TrafficRoute GTM, a DNS‑based global traffic routing service, uses GEO‑based routing, health‑check orchestration, and intelligent load‑balancing to achieve high stability, performance, and cost efficiency for ultra‑large‑scale traffic across multiple regions and CDN providers.

ByteDance · Cloud Native · DNS Load Balancing
0 likes · 11 min read
Architecture Development Notes
Feb 19, 2025 · Operations

Avoid Prometheus Label Pitfalls: Best Practices for Scalable Monitoring

This article examines common label misuse in Prometheus, explains why adding global labels to every metric can cause data bloat, configuration rigidity, and dimensional pollution, and provides concrete best‑practice patterns, dynamic injection techniques, and governance rules to keep monitoring systems efficient and maintainable.

Cloud Native · Labels · Prometheus
0 likes · 7 min read
Alibaba Cloud Infrastructure
Feb 17, 2025 · Cloud Native

Multi‑Cluster Delivery with ACK One GitOps: A Case Study at Wondershare Technology

Wondershare Technology adopted Alibaba Cloud's ACK One GitOps platform to automate and unify the deployment of dozens of Kubernetes clusters across multiple regions, addressing manual deployment inefficiencies, traceability, rollback challenges, and multi‑tenant permission management while achieving a 50% increase in release efficiency.

Argo CD · Cloud Native · GitOps
0 likes · 7 min read
360 Zhihui Cloud Developer
Feb 17, 2025 · Cloud Native

Optimizing Offline Pod Scheduling with Koordinator and Yarn-Operator

This article examines the challenges of running Hadoop YARN pods with Koordinator on Kubernetes and, to reduce resource contention and improve offline task reliability, proposes real‑time resource reporting and task‑level eviction strategies. It also details community and custom solutions and outlines future enhancements with Volcano.

Big Data · Cloud Native · Koordinator
0 likes · 9 min read
DataFunSummit
Feb 16, 2025 · Big Data

Bilibili Big Data Task Migration to Cloud‑Native Kubernetes Using Volcano Scheduler

This article shares Bilibili’s experience migrating its offline big‑data workloads to a cloud‑native Kubernetes environment using the Volcano scheduler, covering migration background, scheduler adaptation, hierarchical queue implementation, over‑commit framework (Amiyad), and future work to improve performance and resource utilization.

Cloud Native · Kubernetes · Resource Overcommit
0 likes · 15 min read
Alibaba Cloud Infrastructure
Feb 14, 2025 · Cloud Native

Blue‑Green Deployment with Kruise Rollouts: Concepts, Implementation, and Comparison

This article explains the blue‑green deployment strategy, introduces Kruise Rollouts’ blue‑green capabilities, provides a step‑by‑step Kubernetes example with YAML manifests and kubectl commands, compares it to Argo Rollouts and Flux Flagger, discusses resource considerations and serverless advantages, and concludes with best‑practice recommendations.

Blue‑Green deployment · Cloud Native · DevOps
0 likes · 16 min read
Ops Development Stories
Feb 13, 2025 · Cloud Native

KubeDoor: AI‑Driven Kubernetes Load‑Aware Scheduling & Capacity Management

KubeDoor is an open‑source platform built with Python and Vue that leverages Kubernetes admission control, AI recommendations, and expert experience to provide load‑aware scheduling, capacity governance, real‑time resource analytics, and automated scaling for microservices, featuring a web UI, Grafana dashboards, and extensible control mechanisms.

AI scheduling · Admission Controller · Cloud Native
0 likes · 11 min read
FunTester
Feb 13, 2025 · Operations

Why Fault Testing Is Critical for Modern Online Systems

In today's digital era, online services face increasing fault risks, and systematic fault testing—through chaos engineering, fault injection, stress testing, and disaster recovery drills—helps teams anticipate, evaluate, and improve system resilience, ultimately reducing downtime and protecting business continuity.

Cloud Native · Operations · automation
0 likes · 9 min read
Alibaba Cloud Observability
Feb 11, 2025 · Operations

Alibaba Cloud’s Compile‑Time Go Instrumentation: A New Era for Cloud‑Native Observability

Amid the surge of cloud‑native architectures, Alibaba Cloud showcases its open‑source, compile‑time Go instrumentation that delivers non‑intrusive monitoring, richer data, and cross‑vendor standards via OpenTelemetry, while highlighting extensive community contributions and collaborations that position it as a leading force in modern observability.

Alibaba Cloud · Cloud Native · Go
0 likes · 6 min read
Practical DevOps Architecture
Feb 11, 2025 · Operations

Kubernetes Operations and Cloud Native Architecture Training Course

This comprehensive training program for intermediate to advanced users covers Kubernetes high‑availability deployment, elastic scaling, Helm package management, Ceph distributed storage integration, microservice container migration, Jenkins‑based CI/CD pipelines, and Istio service‑mesh governance, providing hands‑on labs, detailed chapters, and practical resources for mastering modern cloud‑native operations.

Ceph · Cloud Native · DevOps
0 likes · 7 min read
Java Tech Enthusiast
Feb 8, 2025 · Cloud Native

Bun 1.2 Release: Enhanced Node.js Compatibility, Built-in Database & Cloud-Native Features

Bun 1.2 delivers its biggest upgrade yet, boosting Node.js compatibility above 90% for core modules, adding built‑in PostgreSQL and native S3 support that outperforms the AWS SDK, switching to a readable lock file for faster installs, enhancing testing tools, and improving HTTP/2, filesystem, JSON and Windows performance while targeting remaining compatibility gaps.

Bun · Cloud Native · JavaScript runtime
0 likes · 5 min read
Tencent Cloud Developer
Feb 7, 2025 · Artificial Intelligence

Launch DeepSeek Models in Seconds with One‑Click Cloud Development

This guide shows how to start DeepSeek large‑language models on cnb.cool in just 5‑10 seconds without downloading, using a simple three‑step process that includes forking the repository, selecting a model branch, and running Ollama or Docker commands, plus options for long‑term cloud deployment.

AI · Cloud Native · DeepSeek
0 likes · 3 min read
Alibaba Cloud Native
Feb 7, 2025 · Information Security

How DeepSeek’s Attack Highlights the Need for Robust Cloud‑Native Security Observability

The article examines DeepSeek’s rapid rise and the large‑scale malicious attacks it suffered, then provides a detailed cloud‑native security observability guide using Alibaba Cloud services such as DDoS protection, WAF, CLB, SAS, and SLS for logging, monitoring, anomaly detection, and alert response.

AI security · Alibaba Cloud · Cloud Native
0 likes · 15 min read
Efficient Ops
Feb 6, 2025 · Operations

Inside Alipay’s Full‑Ecosystem Availability Monitoring: Architecture and Practices

At the 2024 GOPS Global Operations Conference in Shanghai, Alipay’s monitoring lead Tang Liang presented the challenges, architecture, risk‑prevention practices, and implementation details of the company’s full‑ecosystem availability monitoring system, highlighting its role in DevOps, SRE, and AIOps initiatives.

Availability · Cloud Native · DevOps
0 likes · 4 min read
DataFunSummit
Feb 6, 2025 · Big Data

Migrating Big Data Workloads to Cloud‑Native Kubernetes: Challenges, Solutions, and Lessons from OPPO

This article describes how OPPO's big‑data team transitioned from traditional IDC and EMR environments to a cloud‑native Kubernetes architecture, detailing the motivations, design principles, elastic scaling challenges, custom solutions, and future directions for large‑scale data processing on the cloud.

Cloud Native · Kubernetes · elastic scaling
0 likes · 18 min read
21CTO
Jan 30, 2025 · Cloud Native

How ByteDance Uses eBPF netkit to Replace veth for Faster Container Networking

ByteDance engineers are adopting the Linux kernel's new netkit feature, an eBPF‑based container network device that bypasses veth's L2 bottlenecks, delivering up to 10% performance gains and lower CPU usage while maintaining compatibility with existing workloads.

Cloud Native · Veth · container networking
0 likes · 7 min read
FunTester
Jan 27, 2025 · Operations

Mastering Chaos Engineering: Build Resilient Systems with Proven Practices

In today's always‑on digital era, this article explains chaos engineering concepts, step‑by‑step experimental methods, best‑practice guidelines, and a comparison of leading fault‑injection tools to help organizations proactively strengthen system resilience and reduce downtime risk.

Cloud Native · DevOps · Fault Injection
0 likes · 11 min read
IT Architects Alliance
Jan 22, 2025 · Cloud Native

Kubernetes in the Cloud‑Native Era: Architecture, Core Components, and Practical Practices

This article introduces Kubernetes as the cornerstone of cloud‑native architecture, explains its control‑plane and node components, demonstrates practical tasks such as namespace isolation, custom scheduling, and persistent storage with code examples, and showcases real‑world success cases across industries.

Cloud Native · DevOps · Kubernetes
0 likes · 12 min read
Alibaba Cloud Infrastructure
Jan 21, 2025 · Cloud Native

OpenYurt v1.6 Release: Node-Level Traffic Multiplexing and Enhanced Edge Autonomy

The OpenYurt v1.6 release introduces node‑level traffic multiplexing that can cut cloud‑edge communication by about 50% and adds enhanced edge autonomy features such as configurable autonomy duration and a webhook to keep services stable during node failures, while also providing various community and product updates.

Cloud Native · Edge Autonomy · Edge Computing
0 likes · 7 min read
Efficient Ops
Jan 20, 2025 · Operations

Inside Qunar’s Pre‑Release Platform: Design, Practice, and Future Outlook

The article recaps Li Jingkang’s presentation at the 2024 GOPS Global Operations Conference, detailing the background, principles, design, and real‑world implementation of Qunar’s pre‑release platform, and outlines its future direction within DevOps, SRE, AIOps, and cloud‑native practices.

Cloud Native · DevOps · Operations
0 likes · 3 min read
IT Architects Alliance
Jan 20, 2025 · Industry Insights

How Cloud Native and Edge Computing Are Transforming Modern IT Architecture

This article examines the evolution of cloud computing, highlights latency, bandwidth, and security challenges, and explains how cloud‑native technologies and edge computing complement each other to deliver low‑latency, scalable, and secure solutions across industries such as gaming, manufacturing, smart cities, and healthcare.

Cloud Native · Edge Computing · IT Architecture
0 likes · 18 min read
IT Architects Alliance
Jan 19, 2025 · Cloud Native

Mastering Cloud‑Native CI/CD: Build, Deploy, and Scale Your Pipelines

This comprehensive guide explains cloud‑native architecture fundamentals, walks through CI/CD pipeline core components, provides step‑by‑step instructions for setting up Git, Jenkins, Docker, and Kubernetes, and demonstrates advanced Tekton pipelines, while discussing benefits, challenges, and future trends.

Cloud Native · Docker · Jenkins
0 likes · 20 min read
IT Architects Alliance
Jan 19, 2025 · Cloud Native

How Cloud‑Native Architecture Slashes Costs and Supercharges Enterprise Efficiency

The article examines how adopting a cloud‑native architecture—through precise resource monitoring, automation pipelines, pay‑as‑you‑go scaling, hybrid‑cloud strategies, and container‑based microservices—enables companies to dramatically reduce operational expenses, improve resource utilization, and accelerate innovation in competitive markets.

Cloud Native · Containers · Cost Optimization
0 likes · 9 min read