Tagged articles
36 articles
Woodpecker Software Testing
Jan 5, 2026 · Operations

Three Core Dimensions of Performance Testing: Time Behavior, Resource Utilization, and Capacity

This article breaks down performance testing into three essential dimensions—time behavior, resource utilization, and capacity—explains their key metrics, demonstrates a detailed e‑commerce flash‑sale case study, and shows how systematic testing and optimization can dramatically improve response times, throughput, and scalability.

JMeter · Load Testing · Metrics
0 likes · 12 min read
JD Tech Talk
Aug 3, 2024 · Operations

Evolution of Load Balancing Strategies in JD Advertising Online Model System

This article examines the progression of load‑balancing techniques used in JD's advertising online model system, analyzing current challenges, outlining requirements, reviewing static and dynamic strategies, and presenting a multi‑objective, hierarchical approach that improves service availability, resource utilization, and overall system stability.

Dynamic Scheduling · load balancing · resource utilization
0 likes · 14 min read
JD Cloud Developers
Aug 2, 2024 · Operations

How JD’s Advertising Platform Optimizes Load Balancing for Heterogeneous Clusters

Exploring the evolution of JD’s advertising online model system, this article examines the challenges of load balancing across heterogeneous hardware, outlines static and dynamic strategies (including DNS, Nginx, LVS, Ribbon, and Dubbo), and presents a multi‑objective framework that improves service availability and resource utilization, with efficiency gains of more than 20%.

Distributed Systems · heterogeneous hardware · load balancing
0 likes · 17 min read
JD Retail Technology
Jul 24, 2024 · Operations

Load Balancing Strategies for Heterogeneous Hardware Clusters in JD Advertising Online Model System

This article examines the evolution, theory, and practical implementation of load balancing strategies for JD Advertising's online model system, focusing on heterogeneous hardware clusters, dual‑objective optimization of service availability and resource utilization, and the resulting performance improvements in large‑scale production environments.

heterogeneous clusters · load balancing · resource utilization
0 likes · 15 min read
FunTester
May 8, 2024 · Fundamentals

How Garbage Collection Impacts Performance and How to Optimize It

This article explains the fundamentals of garbage collection, compares manual and automatic memory management, outlines common GC algorithms, and provides practical guidance on performance analysis, tool selection, and optimization techniques to improve application responsiveness and resource utilization.

GC Algorithms · Garbage Collection · Memory Management
0 likes · 22 min read
360 Smart Cloud
Jan 24, 2024 · Cloud Native

Idle Compute Sharing in Dedicated Kubernetes Clusters Using Karmada

The article describes how a company implements an idle compute sharing feature for dedicated Kubernetes clusters, leveraging Karmada to allocate spare CPU and memory to offline workloads, thereby improving resource utilization, reducing costs, and outlining usage scenarios, configuration steps, technical architecture, and future plans.

Cloud Native · Idle Compute Sharing · Karmada
0 likes · 9 min read
Alibaba Cloud Native
Dec 11, 2023 · Cloud Native

Boosting Cluster Resource Utilization with Alibaba Cloud Native Elastic Solutions

This article explains how Alibaba Cloud's native elastic solutions—covering application‑level scaling, resource‑level scaling, and the new instant elastic controller—help enterprises improve Kubernetes cluster resource utilization, reduce costs, and simplify operations through advanced metrics, custom scaling policies, and event‑driven node management.

ACK · Cloud Native · Cluster Autoscaler
0 likes · 18 min read
NetEase Cloud Music Tech Team
Nov 17, 2023 · Cloud Native

Cloud Music FinOps Practice: Building Enterprise Cloud Cost Management Platform

NetEase Cloud Music’s self‑built FinOps platform tackles rising cloud spend by unifying cost data, visualizing and allocating expenses, rating resource utilization, and empowering platform providers, business units, and developers with data‑driven governance to curb the Andy‑Bill effect and enable scalable, long‑term cost control.

Cloud Cost Management · Cloud Native · Container Governance
0 likes · 8 min read
Tencent Architect
Sep 27, 2023 · Cloud Native

OpenCloudOS Cloud‑Native Practices, Resource Utilization Enhancements, and Testing Framework Overview

The article introduces OpenCloudOS’s cloud‑native initiatives—including a mixed‑workload CPU QoS scheduler, the RUE resource‑utilization enhancement, the eBPF‑based nettrace network‑diagnosis tool, and the TCase/TSuite testing platform—highlighting how these innovations improve CPU utilization, cut costs, and ensure high‑quality releases.

Linux · Open-source · eBPF
0 likes · 14 min read
dbaplus Community
Dec 26, 2022 · Cloud Native

How Bilibili Boosted Server Utilization with Kubernetes Co‑Location Strategies

This article explains how Bilibili’s large‑scale Kubernetes cloud platform reduces costs and improves machine utilization by applying co‑location (mixed‑tenant) techniques, including resource‑aware scheduling, dynamic isolation, and a dedicated management console across online, offline, and idle‑machine scenarios.

Cloud Native · Co-location · Kubernetes
0 likes · 17 min read
Cloud Native Technology Community
Aug 29, 2022 · Cloud Native

Cloud‑Native and Edge Computing: How Containers Empower Edge Applications

The article explains how the deep integration of cloud‑native technologies and edge computing, driven by digital transformation, improves resource utilization, unifies infrastructure management, reduces AI workload costs, simplifies device access, accelerates deployment, and enhances autonomy and ROI for enterprises.

AI · Containers · Edge Computing
0 likes · 10 min read
Cloud Native Technology Community
Jun 22, 2022 · Industry Insights

How to Slash Cloud‑Native Costs: Practical Steps for Better Resource Utilization

This article analyzes the low server utilization problem in modern cloud‑native environments, presents industry survey data, and outlines a four‑step framework—including observability, optimal public‑cloud usage, elasticity sharing, and remote deployment—to help enterprises dramatically reduce cloud costs while maintaining performance.

Cloud Native · Cost Optimization · Kubernetes
0 likes · 23 min read
Shopee Tech Team
May 26, 2022 · Cloud Computing

Shopee's Green Computing Practices: Optimizing Resource Utilization in Data Centers

Shopee reduces data‑center carbon emissions by over 40,000 tons annually through three 2021 green‑computing technologies—Overcommit resource oversubscription, mixed‑model Colocation of latency‑sensitive and batch workloads, and enhanced Auto Scaling that leverages global metrics to cut machine usage and improve resource efficiency.

Auto Scaling · Kubernetes · carbon emissions
0 likes · 15 min read
Tencent Cloud Developer
Dec 8, 2021 · Cloud Native

Using Tencent Cloud EKS Virtual Nodes to Solve CronJob Isolation and Scheduling Challenges

By offloading thousands of short‑lived CronJob pods to Tencent Cloud EKS serverless virtual nodes, Zuoyebang isolated them from online services, eliminated IP waste, achieved millisecond‑level parallel scheduling and sub‑3‑second startup, freed 10 % of cluster resources and cut scheduling costs by roughly 70 % while markedly improving cluster stability.

Cloud Native · CronJob · Kubernetes
0 likes · 10 min read
Alibaba Cloud Native
Dec 6, 2021 · Cloud Native

How Alibaba Cloud’s ECS‑Based FaaS Achieves High‑Density, Low‑Latency Serverless Scaling

This article explains the design of an ECS‑based Function‑as‑a‑Service platform, covering multi‑tenant deployment, rapid horizontal scaling, resource‑utilization optimization, avalanche‑prevention strategies, and high‑density deployment techniques that together enable fast, cost‑effective cloud‑native serverless workloads.

Cloud Native · ECS · Serverless
0 likes · 12 min read
Tencent Architect
Sep 10, 2021 · Cloud Native

BT Scheduler for Absolute Preemption: Boosting CPU Utilization and QoS in Cloud‑Native Environments

This article analyzes the limitations of the Linux Completely Fair Scheduler (CFS) for high‑priority workloads, introduces Tencent's custom offline BT scheduler that provides absolute preemption, and presents experimental results showing significant improvements in latency, CPU utilization, and carbon‑reduction for cloud‑native services.

BT scheduler · CFS · CPU scheduling
0 likes · 10 min read
Full-Stack Internet Architecture
Aug 31, 2021 · Operations

Facebook’s Shard Manager: Strategies for Large‑Scale System Sharding, Fault Tolerance, and Resource Utilization

The article explains how Facebook’s Shard Manager tackles large‑scale system sharding by combining stateful and stateless service deployment, consistent hashing versus sharding, fault‑as‑normal principles, replication, automated failover, load‑balancing, and elastic scaling to achieve high availability and efficient resource use.

Facebook · load balancing · resource utilization
0 likes · 9 min read
Aikesheng Open Source Community
Aug 17, 2021 · Databases

Design and Implementation of a Cloud‑Native MySQL Container Platform for High Availability and Resource Efficiency

The article describes how a bank built a Kubernetes‑based, containerized MySQL service platform (CDD) to improve database high availability, resource utilization, automated operations, and agile delivery by addressing network, storage, scheduling, and management challenges through custom networking, hybrid storage, scheduler extensions, and multi‑AZ deployment.

Cloud Native · Kubernetes · containerization
0 likes · 16 min read
Alibaba Cloud Infrastructure
Mar 24, 2021 · Cloud Computing

LIBRA and CARE: Memory Bandwidth Management and Fault‑Tolerance Innovations Presented at HPCA 2021

The article reviews two HPCA 2021 papers from Alibaba Cloud—LIBRA, a dynamic memory‑bandwidth management framework that boosts data‑center utilization, and CARE, a cache‑based fault‑tolerance architecture that delivers near‑Chipkill reliability with minimal overhead—while also highlighting future research directions in ML systems, quantum computing, and cache computing.

HPCA2021 · Memory Bandwidth · cloud computing
0 likes · 4 min read
Liulishuo Tech Team
Feb 4, 2021 · Cloud Computing

Improving Cloud Cost Allocation and Resource Utilization through Catalog, Tags, and Automated Monitoring

This article describes how a tech team built a catalog‑based cost‑allocation system, leveraged cloud tags and Kubernetes labels, used Prometheus data for scaling decisions, and combined reserved, spot, and on‑demand instances to boost cloud resource utilization while keeping services stable.

Cloud Cost · autoscaling · cloud-tagging
0 likes · 8 min read
DataFunTalk
Jun 20, 2020 · Cloud Native

Automated Elastic Scaling for Million‑Scale Core Services and Mixed Workloads on ByteDance's Private Cloud Platform

This article presents ByteDance's private cloud platform TCE architecture and explains how automated elastic scaling, dynamic over‑commit, and mixed‑workload deployment are used to improve resource utilization for millions of services, balancing online peak demand with offline batch tasks.

Cloud Native · Kubernetes · elastic scaling
0 likes · 25 min read
Didi Tech
Dec 2, 2019 · Operations

Capacity Estimation Methodology for Growing Services

The article presents a systematic capacity‑estimation methodology that links service traffic to order volume, uses CPU‑Idle as a primary metric, predicts traffic growth and upper‑bound limits, validates predictions with load‑testing, and provides scaling recommendations while noting limitations of the CPU‑Idle baseline.

Traffic Prediction · capacity planning · resource utilization
0 likes · 9 min read
Alibaba Cloud Native
Oct 11, 2019 · Cloud Native

Can Dynamic Cgroup Tweaks Boost Kubernetes Resource Utilization?

This article shares Alibaba Cloud Container Platform's practical experience in improving container resource utilization by dynamically adjusting cgroup limits, describing real‑world challenges, the design of a policy‑engine solution, experimental results, lessons learned, and future directions for cloud‑native workloads.

Dynamic Scheduling · Kubernetes · Policy Engine
0 likes · 25 min read
Alibaba Cloud Developer
Dec 20, 2018 · Big Data

Unlocking Alibaba’s Massive Cluster Data V2018: A Treasure Trove for Big‑Data Research

Alibaba has released the comprehensive Cluster Data V2018 dataset, detailing eight days of operation for 4,000 servers and their mixed online and offline workloads, including DAG information, enabling researchers to study large‑scale data‑center performance, resource utilization, scheduling algorithms, and derive new insights.

Big Data · DAG · Dataset
0 likes · 7 min read
Alibaba Cloud Developer
Mar 7, 2018 · Cloud Native

How Alibaba’s Sigma‑Cerebro Simulator Boosts Cluster Utilization for Double‑11

The article explains Alibaba’s Sigma container‑scheduling system and its Cerebro simulation platform, detailing how they improve resource utilization, reduce costs during large‑scale events like Double‑11, and address challenges such as fragmentation, rapid scaling, image distribution, and accurate workload forecasting.

Cloud Native · container scheduling · resource utilization
0 likes · 12 min read
21CTO
Dec 24, 2017 · Cloud Computing

Tencent’s Elastic Compute: Efficient Idle Resource Use Without Service Disruption

This article describes Tencent’s elastic computing platform built to harness idle on‑premise resources for massive image, video, AI, and log processing workloads, detailing the architectural layers, strategies for protecting online service capacity, latency, scheduling and fault rates, and the practical lessons learned from its deployment.

Performance Optimization · cloud infrastructure · container scheduling
0 likes · 15 min read
Ctrip Technology
Feb 16, 2017 · Operations

Application‑Based Automated Capacity Management and Utilization Evaluation

The article presents a comprehensive, application‑centric approach to automated capacity management that analyzes why server utilization is low, defines safe usage thresholds, describes a load‑balancer‑driven stress‑testing workflow with regression modeling, and explains how this practice improves resource efficiency, cost savings, and developer‑ops collaboration.

Automation · DevOps · Operations
0 likes · 14 min read
Qunar Tech Salon
Feb 14, 2017 · Operations

Application‑Based Automated Capacity Management and Utilization Evaluation

This article explains how to automate application‑centric capacity assessment, identify the safe utilization thresholds, use load‑balancer‑driven stress testing and regression modeling to pinpoint resource bottlenecks, and improve server usage while maintaining service reliability through close DevOps collaboration.

Automation · DevOps · Operations
0 likes · 15 min read
Efficient Ops
Feb 9, 2017 · Operations

Automating Application‑Based Capacity Management to Boost Resource Utilization

This article explains how to automate capacity management focused on application performance, identifies common causes of low resource utilization, proposes safe utilization thresholds, describes a testing framework that uses load‑balancer weighting and real‑time monitoring to pinpoint bottlenecks, and outlines how ops and developers can collaborate to improve efficiency.

Automation · Operations · Performance Testing
0 likes · 18 min read