Tagged articles
928 articles
Alibaba Cloud Infrastructure
Jun 3, 2024 · Cloud Native

Fluid 1.0 Release: Cloud‑Native Data Orchestration for AI and Big Data

Fluid 1.0 introduces a cloud‑native data orchestration platform that abstracts dataset management, affinity scheduling, custom data processing, and data flow pipelines for AI and big‑data workloads on Kubernetes, backed by extensive production testing, open‑source contributions, and a roadmap for future enhancements.

AI · Data Orchestration · Kubernetes
0 likes · 13 min read
DataFunTalk
May 31, 2024 · Cloud Native

Optimizing I/O for Data‑Intensive Analytics in Cloud‑Native Environments: Insights from Uber Presto

This whitepaper examines the industry trend of moving data‑intensive analytics applications to cloud‑native environments, revealing how cloud storage cost models affect performance optimization, and presents case‑study findings from Uber’s Presto production workload that highlight fragmented I/O patterns and the financial impact of storage API calls.

I/O optimization · Presto · cloud-native
0 likes · 3 min read
Alibaba Cloud Native
May 30, 2024 · Cloud Native

Translate CS Textbooks Instantly with AI: A Hands‑On Higress Cloud‑Native Guide

This guide shows how to use free AI translation tools—Immersive Translate and OpenAI Translator—together with the Higress cloud‑native AI‑proxy plugin, configuring Docker, model mappings, and custom dictionaries to efficiently translate computer‑science textbooks like Rust and Crafting Interpreters, while comparing machine and human translations.

AI translation · Docker · Higress
0 likes · 11 min read
Alibaba Cloud Observability
May 29, 2024 · Cloud Native

Why iLogtail Needed a Complete Architecture Overhaul and How It Was Done

This article explains the evolution of iLogtail from a single‑file collector to a multi‑language, plugin‑based observability pipeline, outlines the motivations for refactoring, describes the new unified data model, plugin abstractions, pipeline design, configuration management, hot‑reload mechanisms, and the separation of enterprise and open‑source code, providing a comprehensive view of the architectural upgrade.

C · Configuration Management · Golang
0 likes · 43 min read
Open Source Linux
May 20, 2024 · Cloud Native

How to Debug Kubernetes Pods Without Root Using Ephemeral Containers

This article explains why traditional kubectl exec fails under common Kubernetes security best practices and demonstrates how to use kubectl debug to launch temporary containers with shared namespaces, install tools like htop, and explore alternative debugging methods such as kpexec and AI‑assisted Appilot.

Debugging · Ephemeral Containers · cloud-native
0 likes · 10 min read
Cloud Native Technology Community
May 15, 2024 · Cloud Native

Common kubectl and Docker Commands for Kubernetes and Container Management

This guide compiles a comprehensive set of kubectl and Docker command snippets for retrieving logs, sorting pods, managing secrets, cleaning up resources, performing port‑forwarding, patching storage classes, and other routine Kubernetes and container operations, helping administrators streamline cluster maintenance tasks.

Container · cloud-native · kubectl
0 likes · 14 min read
Java High-Performance Architecture
Apr 17, 2024 · Backend Development

Why Tech Giants Are Turning Away from Microservices in 2023

In 2023, major tech companies like Google, Amazon, and Uber publicly questioned the benefits of microservices, revealing new architectural approaches that promise lower latency, reduced costs, and simpler deployment, while highlighting the challenges and pitfalls that have led many teams to revert to monolithic designs.

Backend · Software Engineering · architecture
0 likes · 10 min read
Ops Development Stories
Apr 12, 2024 · Cloud Native

Mastering etcd: Architecture, Monitoring & Performance Tuning

This article provides a comprehensive overview of etcd—including its origins, role in Kubernetes, version evolution, layered architecture, key terminology, operational commands, monitoring metrics, benchmarking procedures, disk‑performance testing, and tuning recommendations—for building reliable cloud‑native clusters.

Benchmark · cloud-native · distributed storage
0 likes · 17 min read
Architecture & Thinking
Apr 7, 2024 · Cloud Native

Why Microservices Matter: Evolution, Benefits, and When to Adopt

Microservices have evolved from early SOA to container‑driven, cloud‑native architectures, offering fine‑grained, loosely coupled services with benefits like scalability, independent deployment, and fault isolation, while also presenting challenges such as distributed complexity, testing, and operational overhead, and are best adopted when traffic, team size, or rapid iteration demand it.

architecture · cloud-native
0 likes · 12 min read
Java Architect Essentials
Mar 27, 2024 · Cloud Native

Rethinking Microservices: Why Google, Amazon and Others Are Moving Away from Traditional Microservice Architectures

In 2023, major tech companies such as Google and Amazon publicly questioned the benefits of traditional microservice architectures, presenting new "microservice 2.0" concepts, monolithic alternatives, and cost‑performance analyses that highlight a broader industry shift toward more pragmatic, cloud‑native design approaches.

Cost Optimization · architecture · cloud-native
0 likes · 12 min read
DeWu Technology
Mar 25, 2024 · Cloud Native

Design and Implementation of Same‑City Dual‑Active Architecture for a Transaction Platform

The paper details a same‑city dual‑active architecture for a high‑traffic transaction platform, combining blue‑green and dual‑cluster deployment with zone‑aware routing, middleware transformations, and a gradual traffic‑coloring release process that achieved near‑50/50 traffic split, stable performance, minimal cost, and outlines remaining challenges.

Deployment · Dual-Active · cloud-native
0 likes · 20 min read
AntTech
Mar 22, 2024 · Cloud Native

LightPool: A Cloud‑Native NVMe‑oF Based High‑Performance Storage Pool Architecture for Distributed Databases

The article introduces LightPool, an open‑source, cloud‑native storage‑pool architecture presented at HPCA 2024, which leverages NVMe‑over‑Fabric, Kubernetes CSI integration, and a lightweight user‑space engine to deliver high‑performance, elastic, and highly available storage for large‑scale distributed databases while reducing cost and improving resource utilization.

NVMe-oF · cloud-native · distributed-database
0 likes · 12 min read
Alibaba Cloud Infrastructure
Mar 21, 2024 · Cloud Native

LightPool: An NVMe‑oF‑Based High‑Performance and Lightweight Storage Pool Architecture for Cloud‑Native Distributed Databases

The article presents LightPool, a cloud‑native storage‑pooling solution that leverages NVMe‑over‑Fabric, Kubernetes‑based scheduling, and a lightweight user‑space engine to deliver high‑performance, low‑cost, and highly available storage for large‑scale distributed databases while eliminating traditional bottlenecks.

Kubernetes · NVMe-oF · cloud-native
0 likes · 13 min read
AntData
Mar 21, 2024 · Cloud Computing

LightPool: A Cloud‑Native NVMe‑oF Based High‑Performance Storage Pool Architecture for Distributed Databases

The article introduces LightPool, an open‑source cloud‑native storage‑pool architecture built on NVMe‑over‑Fabric that delivers high performance, low cost, and high availability for large‑scale distributed databases, and explains its design, scheduling, storage engine, and hot‑upgrade/migration capabilities presented at the 30th IEEE HPCA conference.

LiteIO · NVMe-oF · cloud-native
0 likes · 13 min read
Tencent Cloud Developer
Mar 21, 2024 · Backend Development

Backend Refactoring and Architecture Design of Tencent Docs Collection Form Service

Tencent Docs transformed its high‑traffic Collection Form by refactoring a monolithic C++‑style service into 19 loosely‑coupled vertical services with light‑heavy separation, database isolation, async Kafka pipelines, and full observability via Tianji, achieving dramatically improved stability, millisecond‑level sync, reliable export, and faster incident resolution.

Backend · Microservices · Observability
0 likes · 21 min read
Practical DevOps Architecture
Mar 15, 2024 · Operations

Comprehensive Practical Guide to Prometheus Configuration, Optimization, and Source Code Development

This multi‑chapter guide provides in‑depth, hands‑on instruction for configuring and optimizing all Prometheus components, exploring Kubernetes monitoring, source‑code analysis, custom exporter development, high‑availability setups, service discovery, resource‑efficient scraping, and integrating Thanos for long‑term storage.

Kubernetes · Observability · Operations
0 likes · 4 min read
Liangxu Linux
Mar 13, 2024 · Cloud Native

From chroot to Kubernetes: How Containerization Evolved Over Decades

Tracing the evolution of container technology—from the 1979 Unix chroot command, through Linux namespaces and cgroups, to LXC, Docker, and Kubernetes—this article explains each milestone’s role in isolation, resource control, and cloud-native orchestration, highlighting the shift toward managed cloud container services.

Kubernetes · Linux · cloud-native
0 likes · 10 min read
DevOps Engineer
Feb 23, 2024 · Artificial Intelligence

GitHub Octoverse 2023: AI, Cloud‑Native, and Open‑Source Trends Shaping the Global Developer Experience

The 2023 GitHub Octoverse report reveals that generative AI, cloud‑native workflows, and open‑source contributions are rapidly becoming mainstream, with 92% of developers using AI‑assisted coding tools, a 38% rise in private repositories, and significant regional growth across the US, Asia‑Pacific, Africa, and Latin America.

AI · GitHub · cloud-native
0 likes · 21 min read
SQB Blog
Feb 23, 2024 · Cloud Native

Building External Plugins for APISIX: Deep Dive into Cloud‑Native Gateway Extensions

This article explains APISIX’s multi‑process architecture and request lifecycle, then explores various ways to develop external plugins—including SideCar‑based plugins, WASM modules, and LuaJIT FFI—detailing their implementation steps, advantages, and limitations to help developers choose the optimal approach for extending the cloud‑native gateway.

APISIX · FFI · Lua
0 likes · 14 min read
Alibaba Cloud Native
Feb 22, 2024 · Cloud Native

Achieving 50% Cost Cut with Cloud‑Native Architecture: A Flexible Workforce Platform Case

Facing poor observability, high resource waste, and unstable releases, QingTuan’s flexible‑workforce platform transformed its monolithic and SOA systems into a cloud‑native micro‑service architecture using Alibaba Cloud ACK, MSE, ARMS, and Prometheus, achieving higher availability, elastic scaling, and up to 50% infrastructure cost reduction.

Observability · architecture · cloud-native
0 likes · 22 min read
Alibaba Cloud Native
Feb 21, 2024 · Cloud Native

How Fluid & JindoCache Accelerate Large‑Scale AI Training in a Cloud‑Native Environment

This article examines the challenges of data‑intensive AI training on heterogeneous cloud‑native infrastructure and explains how the Fluid framework combined with JindoCache and KubeDL provides distributed caching, metadata acceleration, and seamless POSIX access to dramatically improve I/O performance, GPU utilization, and cost efficiency.

AI training · Data Caching · Fluid
0 likes · 18 min read
Alibaba Cloud Big Data AI Platform
Feb 20, 2024 · Big Data

Feishu ShenNuo's Real-Time Data Warehouse with Flink, Hudi, and Hologres

Feishu ShenNuo redesigned its data architecture by integrating Flink, Hudi, and Hologres to create a cloud‑native real‑time data warehouse that supports both millisecond‑level ad monitoring and minute‑level game operations, offering scalable storage, low‑latency queries, and comprehensive monitoring and capacity planning.

Flink · Hologres · Hudi
0 likes · 16 min read
MaGe Linux Operations
Feb 17, 2024 · Cloud Native

From chroot to Kubernetes: The Evolution of Containerization

Tracing the history of containerization, this article explores how early file isolation with chroot evolved through namespaces and cgroups, leading to LXC, Docker’s lightweight application packaging, Kubernetes orchestration, and finally cloud-native services like Huawei CCE, highlighting each stage’s impact on modern software deployment.

Docker · Kubernetes · Linux
0 likes · 11 min read
MaGe Linux Operations
Feb 10, 2024 · Cloud Native

Mastering Multi‑Instance NGINX‑Ingress: Deployment, Configuration, and Best Practices

This guide explains how NGINX‑Ingress works, outlines key considerations for running multiple instances, and provides step‑by‑step instructions—including helm installation, custom controller parameters, admission webhook scoping, and verification screenshots—to reliably deploy and manage several NGINX‑Ingress controllers in a Kubernetes cluster.

Multi-Instance · admission-webhook · cloud-native
0 likes · 10 min read
MaGe Linux Operations
Feb 4, 2024 · Cloud Native

From Monolith to Microservices: How Cloud‑Native Architecture Transforms Modern Apps

This article traces the evolution of software architecture, from early monolithic Java WAR packages through Service‑Oriented Architecture to modern microservices and cloud‑native designs, highlighting their structural differences, benefits, challenges, and the key principles that guide successful migration to distributed, scalable systems.

Backend · Microservices · SOA
0 likes · 10 min read
MaGe Linux Operations
Feb 1, 2024 · Cloud Native

Inside Kubernetes kube-scheduler: A Deep Dive into Its Code Structure and Scheduling Logic

This article dissects the internal architecture of Kubernetes' kube-scheduler, walking through its initialization with Cobra, the Setup function, the creation of scheduler instances, the priority queue mechanics, scheduling cycles, and binding processes, providing comprehensive code examples to illuminate each step of the scheduling workflow.

Go · Scheduler · cloud-native
0 likes · 19 min read
Alibaba Cloud Native
Jan 30, 2024 · Cloud Native

Detect Java Microservice Bottlenecks with ARMS Code Hotspots

During high‑traffic load tests, e‑commerce services often hit performance ceilings, leading to low success rates and high latency; by combining tracing data, CPU flame‑graphs, and Alibaba Cloud’s ARMS 3.x JavaAgent features such as Code Hotspots and Adaptive Overload Protection, teams can automatically locate bottlenecks, mitigate traffic spikes, and improve stability without code changes.

CPU FlameGraph · cloud-native · java-agent
0 likes · 18 min read
DaTaobao Tech
Jan 29, 2024 · Cloud Native

Observability: Logging, Metrics, and Tracing in Distributed Systems

Observability in distributed systems combines event logging, aggregated metrics, and request tracing—each offering distinct trade‑offs in detail, storage, and overhead—and while the ELK stack dominates log and metric handling, tracing solutions such as EagleEye and SkyWalking differ by protocol and language, prompting many teams to adopt unified, cloud‑native platforms like Alibaba Cloud’s Log Service for lower cost, real‑time analysis and simplified management.

ELK · Metrics · Observability
0 likes · 32 min read
Liangxu Linux
Jan 28, 2024 · Cloud Native

Master Kubernetes Troubleshooting: 100 Essential kubectl Commands

This guide compiles 100 practical kubectl commands that help you diagnose cluster information, pods, services, deployments, networking, storage, security, autoscaling, and many other Kubernetes components, providing a handy reference for effective cluster troubleshooting.

Cluster · Kubernetes · cloud-native
0 likes · 19 min read
Efficient Ops
Jan 22, 2024 · Operations

Mastering Monitoring: Black‑Box vs White‑Box, Metrics, and Prometheus in Practice

This guide explains monitoring fundamentals, clears common misconceptions, compares black‑box and white‑box approaches, outlines key metrics such as latency, traffic, errors and saturation, and provides a deep dive into Prometheus architecture, data model, query language, and practical examples for CPU, memory, and disk monitoring.

Prometheus · cloud-native · monitoring
0 likes · 15 min read
AsiaInfo Technology: New Tech Exploration
Jan 22, 2024 · Industry Insights

How Trustworthy Computing Power Measurement Can Transform Cloud‑Native Services

This article examines the urgent need for standardized, trustworthy computing power measurement, outlines narrow and broad measurement frameworks, and details a technical solution that integrates WASM virtual machines and blockchain with Kubernetes to achieve precise, tamper‑proof resource accounting for modern cloud‑native environments.

Kubernetes · Wasm · cloud-native
0 likes · 14 min read
NetEase Cloud Music Tech Team
Jan 20, 2024 · Backend Development

Tango Flow: A Low‑Code Workflow Orchestration Platform for Cloud Music Backend

Tango Flow is a low‑code workflow orchestration platform that unifies Cloud Music’s backend services—RPC, HTTP, FaaS and tool‑domain APIs—into visual, versioned workflows, offering drag‑and‑drop design, debugging, mock testing, multi‑tenant clustering, monitoring and continuous release to replace BFFs and accelerate full‑chain development.

BFF · Backend Development · Workflow Orchestration
0 likes · 18 min read
DaTaobao Tech
Jan 17, 2024 · Backend Development

Scaling and Performance Optimization of Taobao Shopping Cart

Taobao’s shopping cart was scaled and optimized by raising the item limit to 380, deploying the cloud‑native in‑memory read‑only replica tairSQL for read‑write separation, pre‑computing promotions, compressing payloads, caching data, redesigning the protocol, introducing response‑streaming APIs, and parallelizing per‑item processing with Java’s ForkJoinPool, dramatically cutting latency during traffic spikes.

Performance Optimization · Scalability · Shopping Cart
0 likes · 15 min read
IT Architects Alliance
Jan 8, 2024 · Fundamentals

Five Common Software Architecture Patterns and Their Ideal Use Cases

This article examines five prevalent software architecture styles—monolithic, microservices, client‑server, distributed, and cloud‑native—explaining their characteristics, advantages, and suitable scenarios to help developers choose the most appropriate design for their projects in modern software development.

Design Patterns · Software Architecture · cloud-native
0 likes · 5 min read
dbaplus Community
Jan 2, 2024 · Operations

How Xiaohongshu Scaled Its Metrics System Tenfold with Cloud‑Native Architecture

Facing exploding metric volumes, high resource consumption, and fragile operations, Xiaohongshu's observability team completely rebuilt its metrics pipeline using VictoriaMetrics, achieving ten‑fold performance gains, minute‑level scaling, high‑availability, cost reduction, and robust multi‑cloud active‑active deployment while preserving data safety and query speed.

Metrics · Observability · Prometheus
0 likes · 34 min read
Efficient Ops
Jan 1, 2024 · Cloud Native

Build a Mini Docker with Bash: Master Namespaces, Cgroups & OverlayFS

This article walks you through creating a lightweight Docker‑like container runtime using Bash, explaining Linux namespaces, cgroups, and overlayfs, showing how to inspect and manipulate them, and providing a complete 130‑line script that implements pull, build, run, exec, logs, and cleanup operations.

Bash · Namespaces · cgroups
0 likes · 32 min read
MaGe Linux Operations
Dec 30, 2023 · Operations

Why Every Developer Needs FRP: Fast Reverse Proxy for Secure Remote Access

This article explains what internal network penetration is, why developers should use FRP (Fast Reverse Proxy) to access internal services, perform remote debugging, receive webhooks, protect sensitive data, and provides a step‑by‑step guide to install and configure FRP on Linux servers and clients.

cloud-native · frp · network tunneling
0 likes · 13 min read
Alibaba Cloud Native
Dec 28, 2023 · Cloud Native

Mastering Elastic Scheduling in Alibaba Cloud ACK for Cost‑Effective Resource Management

This article explains how Alibaba Cloud Container Service (ACK) extends Kubernetes scheduling with custom elastic resource priority, reverse‑order scaling, and resource caps, providing step‑by‑step examples and YAML policies to help enterprises optimize cloud resource allocation and reduce costs.

alibaba-cloud · cloud-native · elastic-scheduling
0 likes · 12 min read
Efficient Ops
Dec 27, 2023 · Cloud Native

How Shenwan Hongyuan Built a Cloud‑Native Business Middle Platform: A Case Study

At the 21st GOPS Global Operations Conference, Shenwan Hongyuan’s senior product manager presented a detailed case study of their cloud‑native business middle platform, outlining challenges such as siloed systems, the goals of creating an open, flexible, high‑efficiency platform, and the stepwise evolution from version 1.0 to 3.0.

business platform · cloud-native
0 likes · 5 min read
21CTO
Dec 27, 2023 · Cloud Native

Why 2023 Marks the Decline of Microservices: Lessons from Google, Amazon, and DHH

2023 saw a growing backlash against microservices as major players like Google, Amazon, and Basecamp’s DHH highlighted performance, cost, and complexity issues and proposed monolithic or “microservices 2.0” approaches that promise lower latency, reduced expenses, and simpler deployment, sparking a re‑evaluation of cloud‑native architectures.

Cost · Microservices · architecture
0 likes · 12 min read
MaGe Linux Operations
Dec 27, 2023 · Cloud Native

Master Kubectl in 5 Minutes: Essential Commands for Kubernetes

This guide provides a concise, 5‑minute overview of the most frequently used kubectl commands—including autocomplete setup, context management, resource creation, querying, updating, patching, scaling, and troubleshooting—complete with ready‑to‑run code snippets for efficient Kubernetes cluster operations.

CLI · cloud-native · kubectl
0 likes · 17 min read
DeWu Technology
Dec 27, 2023 · Cloud Native

DeWu's Cloud-Native Container Management Practices

Since August 2021, DeWu App has built a cloud‑native, multi‑cluster Kubernetes platform that uses an OAM‑style CloneSet model, Helm‑generated resources, Karmada‑based federation, custom scheduler plugins for reservation and node‑balance, offline mixing for Flink, a unified KubeAutoScaler, and a self‑built KubeAI stack, achieving significant cost cuts and improved stability while planning further middleware containerization and multi‑cloud expansion.

AI · Cost Management · Kubernetes
0 likes · 22 min read
DevOps
Dec 26, 2023 · Cloud Native

Comprehensive Guide to Cloud‑Native DevOps: Architecture, Tools, and Practical Implementations

This document presents a thorough overview of cloud‑native DevOps, covering the evolution of related technologies, detailed analysis of virtualization, container orchestration, CI/CD pipelines, programming language choices, system architectures, database options, build tools, and five step‑by‑step practice cases that demonstrate end‑to‑end automation, monitoring, and release management in Kubernetes environments.

Automation · DevOps · Infrastructure
0 likes · 35 min read
Efficient Ops
Dec 26, 2023 · Cloud Native

Master kubectl: Essential Commands for Managing Kubernetes Clusters

This comprehensive guide covers kubectl basics, autocomplete setup, context configuration, creating, viewing, updating, patching, editing, scaling, and deleting resources, as well as interacting with pods, nodes, and using the kubectl set family for resources, selectors, and images.

DevOps · Kubernetes · cloud-native
0 likes · 16 min read
AntTech
Dec 25, 2023 · Databases

HoraeDB Joins Apache Incubator: Design Goals, Architecture, and Core Features of the Cloud‑Native Time‑Series Database

HoraeDB, the next‑generation cloud‑native time‑series database originally from Ant Group, has been accepted into the Apache Incubator, and this article outlines its design motivations, distributed architecture, key technical components, and core capabilities such as high performance, low cost, SQL‑based analytics, storage‑compute separation, high availability, and open‑source ecosystem compatibility.

Apache Incubator · SQL Analytics · cloud-native
0 likes · 6 min read
Bilibili Tech
Dec 22, 2023 · Cloud Native

Safe Change Management in Bilibili's Cloud‑Native Container Platform Caster

The paper describes Bilibili’s Caster platform, which implements standardized workflows, left‑shifted pre‑checks, tiered release checkpoints, and an emergency green‑channel to safely manage containerized application changes, providing real‑time observability, automated rollback, and capacity‑aware scaling that together cut change‑induced incidents and improve production stability.

ci/cd · cloud-native · container platform
0 likes · 17 min read
21CTO
Dec 15, 2023 · Backend Development

9 Essential Microservices Best Practices to Build Scalable, Secure Systems

This article outlines nine practical microservices best practices—from applying the Single Responsibility Principle and forming cross‑functional teams to using proper DevSecOps tools, asynchronous communication, independent data stores, and robust monitoring—to help developers design scalable, maintainable, and secure backend architectures.

Backend · DevOps · Microservices
0 likes · 12 min read
Xiaohongshu Tech REDtech
Dec 14, 2023 · Cloud Native

Evolution of Xiaohongshu Metrics System: Cloud‑Native Observability, High Availability, and Performance Optimizations

Xiaohongshu’s observability team rebuilt its Prometheus‑based metrics platform using vmagent, dual‑active HA clusters, query push‑down, high‑cardinality governance and multi‑cloud active‑active design, delivering ten‑fold collection speed, up to 70× query capacity, massive CPU‑memory‑storage savings and fully automated scaling.

Metrics · Time Series · VictoriaMetrics
0 likes · 35 min read
Alibaba Cloud Native
Dec 8, 2023 · Cloud Native

How Fluid’s Cloud‑Native Caching Supercharges AIGC Model Inference

The article examines the cost, performance, and efficiency challenges of large‑model inference, explains why Kubernetes is becoming the standard platform for AI workloads, and details how the Fluid project provides cloud‑native caching, elastic scaling, and automation to dramatically reduce startup latency and operating expenses.

AI · AIGC · Kubernetes
0 likes · 17 min read
dbaplus Community
Dec 7, 2023 · Backend Development

How to Merge Go Microservices into a Single Pod and Cut CPU Usage by 60%

This article explains how the team transformed a Go‑based microservice recommendation system into a single‑pod monolithic application using tRPC‑Go, detailing performance bottlenecks, code‑level mock‑proxy techniques, deployment adjustments, and the resulting dramatic reduction in CPU consumption.

Backend · Go · Microservices
0 likes · 13 min read
Xiaohongshu Tech REDtech
Nov 27, 2023 · Cloud Native

Mixed-Workload Scheduling and Resource Utilization Optimization in Xiaohongshu's Cloud-Native Platform

Xiaohongshu’s cloud‑native platform adopted a four‑stage mixed‑workload scheduling strategy (reusing idle nodes, whole‑machine time‑sharing, normal mixed pools, and a unified scheduler, Tusker, that coordinates CPU, GPU and memory across Kubernetes and YARN), boosting average cluster CPU utilization from under 20% to over 45% and delivering millions of low‑cost core‑hours while preserving QoS for latency‑sensitive, mid, and batch jobs.

Big Data · Kubernetes · QoS
0 likes · 19 min read
Baidu Intelligent Cloud Tech Hub
Nov 27, 2023 · Databases

How GaiaDB Redefines Cloud‑Native Databases with Fusion Architecture

GaiaDB, Baidu’s cloud‑native database, combines compute‑storage separation with a fused, log‑service architecture to boost performance, simplify consistency, and deliver multi‑level high availability across zones and regions, while supporting new features such as parallel query, HTAP replicas, and serverless scaling.

cloud-native · distributed-systems · high-availability
0 likes · 17 min read
Top Architect
Nov 24, 2023 · Cloud Native

Comprehensive Overview of Microservice Architecture Components

This article provides a detailed walkthrough of a typical microservice architecture, covering entry traffic with Nginx, gateway selection, business service design, service registry options, caching and distributed locks with Redis, data persistence strategies, structured data storage, messaging middleware, log collection, task scheduling, and distributed object storage, while also promoting related community resources.

Backend · Microservices · architecture
0 likes · 11 min read
Qunar Tech Salon
Nov 22, 2023 · Operations

Optimizing Qunar's Monitoring System for Faster Fault Detection and Root‑Cause Analysis

This article details Qunar's comprehensive overhaul of its monitoring platform—introducing second‑level metrics, redesigning storage with VictoriaMetrics, optimizing client and server data collection, and building a root‑cause analysis tool—to dramatically reduce order‑related fault discovery time from minutes to under one minute.

Microservices · Operations · TSDB
0 likes · 22 min read
MaGe Linux Operations
Nov 20, 2023 · Fundamentals

How Low-Code Platforms Accelerate Development and Cut Costs

Low-code development uses visual, drag‑and‑drop interfaces and minimal coding to let both professional and non‑technical users quickly build applications, boosting efficiency, reducing maintenance effort, and lowering overall development costs across various business scenarios.

Visual Programming · cloud-native · low-code
0 likes · 8 min read
ITPUB
Nov 17, 2023 · Operations

How Bilibili Overcame a Massive CDN Outage: Cloud‑Edge Incident Response Lessons

This article details the August 2023 Bilibili CDN failure, analyzes its root causes, describes the 1‑5‑10 emergency recovery framework, and presents cloud‑side SLB/BFS optimizations and edge‑side scheduling and fallback strategies that together restored service and improved future resilience.

CDN · Edge Computing · Operations
0 likes · 20 min read
Tencent Cloud Developer
Nov 15, 2023 · Game Development

Case Study: KMS Game Company’s Cloud‑Native Architecture and Elastic Microservice Deployment on Tencent Cloud

Japanese game developer KMS migrated from Azure to Tencent Cloud, adopting a cloud‑native architecture with Tencent’s Elastic Microservice platform that provides timed and metric‑based scaling, CI/CD pipelines, and batch upgrades, resulting in roughly 50% cost savings, 15% performance gains and 50% latency reduction.

CI/CD · Game Development · Microservices
0 likes · 9 min read
Big Data Technology Architecture
Nov 14, 2023 · Big Data

Open Source Big Data Platform 3.0: Streaming Lakehouse, Serverless Architecture, and AI Integration

The talk outlines the evolution of Alibaba Cloud's open‑source big data platform from Hadoop‑based EMR to a 3.0 architecture featuring a streaming lakehouse, full serverless compute and storage, AI‑driven operations, and upcoming vector search services, highlighting technical motivations, challenges, and product releases.

Big Data · Lakehouse · Serverless
0 likes · 14 min read
Tencent Cloud Developer
Nov 14, 2023 · Cloud Native

Monolithizing tRPC-Go Microservices: Architecture, Implementation, and Performance Gains

The article shows how to monolithize selected tRPC‑Go microservices by defining protobuf‑generated Go interfaces and swapping RPC proxies for in‑process implementations via a proxy API, cutting CPU usage by 61% while keeping microservice flexibility and offering best‑practice guidelines for Go service design.

Go · Microservices · cloud-native
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Nov 10, 2023 · Big Data

How We Transformed Big Data Workloads with Spark on Kubernetes and OSS‑HDFS

Facing rapid growth in offline data and compute demands, we migrated our big‑data platform to a cloud‑native architecture using Spark 3.2.3 on Kubernetes with OSS‑HDFS storage, achieving elastic scaling, cost reduction, and compute‑storage separation while detailing implementation, challenges, and operational insights.

Spark · cloud-native · elastic computing
0 likes · 25 min read
Code Ape Tech Column
Nov 6, 2023 · Backend Development

Best Practices for Building Efficient Microservice Architectures

This article outlines essential microservice best practices—including the single‑responsibility principle, clear team responsibilities, appropriate tooling, asynchronous communication, DevSecOps security, independent data stores, isolated deployments, orchestration, and effective monitoring—to help developers design scalable, maintainable, and high‑performance backend systems.

DevOps · architecture · best practices
0 likes · 12 min read
DataFunTalk
Nov 5, 2023 · Cloud Native

Cloud‑Native Storage Acceleration: Experience and Practices with CloudFS on Volcano Engine

This article presents the cloud‑native storage acceleration demands, evaluates what constitutes a good acceleration solution, and details the design, implementation, and real‑world practice of CloudFS—including metadata acceleration, data‑plane caching, FUSE enhancements, AI training and multi‑cloud data‑lake use cases—while outlining future roadmap plans.

AI · CloudFS · Kubernetes
0 likes · 15 min read

How Cloud‑Native Transforms Big Data Platforms: Challenges, Solutions, and Future Trends

This article analyzes the rise of cloud‑native technologies in big data ecosystems, identifies key pain points such as resource scheduling, service capabilities, performance, and operations, and presents detailed technical explorations—including Volcano batch scheduling, Kyuubi serverless, vectorized computing, remote shuffle services, and storage‑compute separation—while outlining future development directions.

Kubernetes · Serverless · cloud-native
0 likes · 23 min read
Sohu Tech Products
Nov 1, 2023 · Databases

Engineering Practices of Douyin's Vector Database: From Retrieval Challenges to Cloud‑Native Solutions

Douyin tackled vector‑retrieval challenges by optimizing HNSW and creating a high‑performance IVF algorithm, implementing custom scalar quantization, SIMD acceleration, and a DSL‑driven engine that merges filtering with search, then built a cloud‑native, storage‑compute‑separated vector database (VikingDB) delivering sub‑10 ms latency, real‑time updates, multi‑tenant support, and secure, scalable retrieval for LLM‑driven applications.

ANN · LLM integration · Storage Compute Separation
0 likes · 18 min read
Amap Tech
Nov 1, 2023 · Backend Development

Gaode Go Ecosystem Evolution, Cloud‑Native Serverless Practices, and Project Refactoring Experience

The article details Gaode’s journey of building a high‑performance Go ecosystem that scaled from zero to tens of millions of QPS, comparing Go with Java and Erlang, outlining cloud‑native serverless architecture, and sharing real‑world refactoring and optimization case studies such as a million‑QPS rendering gateway and a Go‑based sharding middleware.

Go · Microservices · Serverless
0 likes · 34 min read
Sohu Tech Products
Oct 25, 2023 · Cloud Native

Strategies for Rolling Restart of Pods During Istio Service Mesh Upgrade

To upgrade an Istio service mesh without overloading the cluster or causing downtime, the author recommends using Kubernetes’s built‑in kubectl rollout restart for each deployment—scaling replicas up then deleting old pods or simply invoking the command in a scripted loop—to safely perform a rolling restart of all sidecar‑proxied pods.

DevOps · Istio · Kubernetes
0 likes · 8 min read
DataFunSummit
Oct 24, 2023 · Databases

OushuDB: A Cloud‑Native Real‑Time Lakehouse Database – Architecture, Evolution and Practice

This article introduces OushuDB, a cloud‑native real‑time lakehouse database, tracing the evolution of cloud‑native lakehouse architectures, detailing OushuDB’s multi‑engine, multi‑storage design, and sharing practical insights on compute‑storage separation, high‑availability, and integration with Hadoop, Hive and Hudi.

Distributed Systems · cloud-native
0 likes · 20 min read
Alibaba Cloud Native
Oct 9, 2023 · Cloud Native

Designing a “Highway” Architecture for Hybrid Cloud‑Native Data Sync with Dubbo

This article explains how a government procurement platform built a hybrid cloud‑native “highway” solution using Dubbo to achieve reusable, tunnel‑based data synchronization across cloud and isolated island networks, detailing the background, challenges, design choices, implementation steps, and future roadmap.

Dubbo · Microservices · cloud-native
0 likes · 16 min read
MaGe Linux Operations
Sep 30, 2023 · Cloud Native

How DeWu Built a Scalable Cloud‑Native Trace2.0 Observability Platform

This article details DeWu's evolution from a sneaker marketplace to a full‑stack e‑commerce platform and explains how its cloud‑native monitoring system, based on OpenTelemetry, ClickHouse, and object storage, was architected, optimized, and scaled to handle billions of spans daily.

Observability · OpenTelemetry · cloud-native
0 likes · 16 min read
Liangxu Linux
Sep 24, 2023 · Operations

Understanding Prometheus Metric Types: Counters, Gauges, Histograms, and Summaries

This article explains the fundamentals of metrics, introduces dimensional metrics, compares Prometheus, OpenMetrics, and OpenTelemetry standards, and provides detailed guidance on the four Prometheus metric types—Counters, Gauges, Histograms, and Summaries—including their use‑cases, PromQL queries, and Python client examples.

Counters · Gauges · Histograms
0 likes · 18 min read
DevOps Coach
Sep 21, 2023 · Operations

What Is Observability (o11y) and Why It Matters for Modern Cloud‑Native Operations

The article explains the origins, common misconceptions, and a rigorous definition of observability (o11y), highlights its importance in cloud‑native environments, and describes how high‑cardinality, high‑dimensional telemetry enables effective debugging, troubleshooting, and performance analysis of modern distributed systems.

Debugging · cloud-native · monitoring
0 likes · 11 min read
Huawei Cloud Developer Alliance
Sep 20, 2023 · Databases

How Huawei’s Multi‑Master Cloud Database Beats Aurora and CockroachDB

The article explains Huawei's VLDB 2023 paper on a cloud‑native multi‑master database, detailing its architecture, novel VS‑clock and hybrid lock techniques, and presents extensive performance experiments that show near‑linear scalability and superior throughput and latency compared with Aurora and CockroachDB.

cloud-native · database · distributed-systems
0 likes · 19 min read
Cloud Native Technology Community
Sep 19, 2023 · Cloud Native

Understanding Kubernetes Validating Admission Policies with Practical Examples

This article explains how Kubernetes Admission Controllers work, introduces the new Validating Admission Policies feature that uses CEL for native policy enforcement, and provides a step‑by‑step demonstration with YAML and kubectl commands showing how to limit deployment replicas in a namespace.

Admission Controller · cloud-native · policy-management
0 likes · 9 min read
Efficient Ops
Sep 17, 2023 · Cloud Native

Top 9 Essential Kubernetes Tools to Streamline Your Cloud‑Native Workflows

Explore nine indispensable Kubernetes tools (Kubie, Kubespray, Helm, Minikube, K3s, Kustomize, kOps, Prometheus, and krew) that simplify cluster management, accelerate deployments, and enhance efficiency, helping you choose the right solution for smoother, more productive cloud‑native operations.

Cluster Management · Kubernetes · Prometheus
0 likes · 6 min read
DataFunTalk
Sep 17, 2023 · Cloud Native

REDck: A Cloud‑Native Real‑Time Data Warehouse Built on ClickHouse

REDck is a cloud‑native, storage‑compute separated real‑time OLAP data warehouse derived from ClickHouse that addresses scalability, operational cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, and two‑phase commit transactions.

ClickHouse · Distributed Transactions · Real-time OLAP
0 likes · 18 min read
Alibaba Cloud Native
Sep 17, 2023 · Cloud Native

Unlock Seamless Message Integration with Alibaba Cloud EventBridge

This article explains how Alibaba Cloud EventBridge provides unified event services, data pipelines, and ecosystem integration to address modern messaging challenges such as routing, processing, and cross‑region synchronization, offering step‑by‑step guidance and practical scenarios for developers.

EventBridge · Message Routing · alibaba-cloud
0 likes · 13 min read
Tencent Cloud Middleware
Sep 14, 2023 · Backend Development

How Tencent Cloud’s Unitized Architecture Boosts Microservice Scalability and High Availability

This article explains the concept of unitized architecture, its characteristics and types, the performance and reliability challenges it solves for large‑scale microservice systems, and how Tencent Cloud’s TSF platform implements unit routing, gray release, and disaster‑recovery to achieve efficient, cross‑region, high‑availability deployments.

Tencent Cloud · cloud-native · high-availability
0 likes · 15 min read
Tencent Cloud Developer
Sep 13, 2023 · Cloud Native

Designing and Implementing a Payment Fund Account System

The article details how to design and implement a cloud‑native payment fund account system on Tencent Cloud, covering account definitions, fund flow and multiple account types, TDSQL storage, separated fund and account services, robust security, distributed transactions, auditing, reconciliation, and high‑availability measures for high‑concurrency merchant payments.

Availability · Consistency · Security
0 likes · 35 min read
ITPUB
Sep 11, 2023 · Cloud Native

How REDck Transforms ClickHouse into a Scalable Cloud‑Native Real‑Time Data Warehouse

Xiaohongshu built REDck, a cloud‑native, storage‑compute separated real‑time OLAP warehouse on ClickHouse, addressing scaling, cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, bucketing, and exactly‑once transaction support.

ClickHouse · Distributed Transactions · Real-time OLAP
0 likes · 21 min read
Didi Tech
Sep 7, 2023 · Cloud Native

Service Management and Resource Abstraction in Cloud‑Native Environments Using OAM and KubeVela

To tackle the exploding number of microservices and heterogeneous infrastructure in cloud‑native enterprises, the article proposes a unified service‑and‑resource abstraction built on the Open Application Model and its implementation KubeVela, enabling declarative application definitions, cost attribution, automated lifecycle management, and cross‑region efficiency through component marketplaces, an application center, an operations platform, and a site‑building center.

KubeVela · OAM · Service Management
0 likes · 13 min read