Tagged articles
318 articles
Volcano Engine Developer Services
Aug 25, 2023 · Cloud Native

How ByteDance Scaled with Multi‑Cloud: Lessons from Their Cloud‑Native Journey

ByteDance’s multi‑cloud evolution, driven by rapid business growth, cost control, and compliance needs, showcases a distributed cloud‑native platform built on open‑source orchestration, unified resource management, and advanced data‑lake solutions, while addressing operational complexity, interoperability, and emerging AI‑driven challenges.

AI · Big Data · Kubernetes
0 likes · 14 min read
Bilibili Tech
Aug 25, 2023 · Game Development

Technical Implementation of the "Da Li Chu Qi Ji" Mini‑Game on Bilibili

The article details the end‑to‑end technical implementation of Bilibili’s “Da Li Chu Qi Ji” mini‑game, describing a GSAP‑driven 4‑second countdown, a PixiJS‑based core gameplay with custom resource loading and event handling, and the evaluation and integration of PAG, SVGA, and MP4 animation formats for reward effects.

GSAP · PixiJS · Resource Management
0 likes · 29 min read
Huawei Cloud Developer Alliance
Aug 25, 2023 · Databases

Unlock GaussDB(DWS) Performance: Expert Tips for Resource Management

This announcement introduces Huawei Cloud’s DTT live session on August 29, where expert Lv Pengbo will explain GaussDB(DWS) resource control principles and demonstrate practical techniques—such as CPU usage analysis, memory tuning, and queue issue resolution—to help developers efficiently manage data‑warehouse resources and boost performance.

Data Warehouse · Database Tuning · GaussDB
0 likes · 3 min read
Alibaba Cloud Native
Aug 21, 2023 · Cloud Native

Optimizing Multi‑Cluster Cloud Native Costs: ZEEKR’s ACK FinOps Journey

This article details how ZEEKR automotive tackled rapid growth challenges by redesigning its cloud‑native infrastructure, adopting Alibaba Cloud ACK FinOps and ACK One for multi‑cluster management, and implementing cost‑visibility, intelligent allocation, and configuration checks that yielded significant resource savings and operational stability.

Cost Optimization · FinOps · Kubernetes
0 likes · 18 min read
ByteDance Cloud Native
Aug 15, 2023 · Cloud Native

What’s New in Katalyst v0.3.0? Core Enhancements Explained

Katalyst v0.3.0 introduces major upgrades including enhanced KCNR API bandwidth isolation, a more extensible task and async execution framework, advanced mixed‑deployment controls, load‑aware resource prediction, and concurrent unit testing, all aimed at improving cloud‑native resource management efficiency.

Katalyst · Kubernetes · Resource Management
0 likes · 4 min read
Full-Stack DevOps & Kubernetes
Aug 10, 2023 · Operations

How Kubernetes Powers Modern DevOps Automation and Operations

By integrating Kubernetes with DevOps practices, teams can automate deployment pipelines, achieve dynamic resource allocation, centralize monitoring with tools like Prometheus and Grafana, and treat infrastructure as code, resulting in faster, higher-quality software delivery and improved collaboration between development and operations.

Automation · DevOps · Infrastructure as Code
0 likes · 7 min read
Test Development Learning Exchange
Jul 29, 2023 · Fundamentals

Understanding Python Context Managers: __enter__ and __exit__ Methods with Practical Examples

This article explains Python's context manager protocol, detailing the roles of __enter__() and __exit__() methods, their typical use cases, and provides ten concrete code examples ranging from file handling and database connections to custom managers for networking, temporary files, and exception handling.

Resource Management · code-examples · context manager
0 likes · 5 min read
Bilibili Tech
Jul 28, 2023 · Operations

How to Build an Efficient Public Resource Management System for Testing Teams

The article explains why testing teams need systematic public resource management, outlines the challenges of resource flow, asset control, and security, and details the evolution from a simple approval pipeline (v1.0) to a more autonomous, domain‑aware system (v2.0) with practical solutions.

Automation · Resource Management · Security
0 likes · 10 min read
DataFunTalk
Jul 22, 2023 · Big Data

Optimization Practices for Real-Time Data Warehouse Governance at NetEase Cloud Music

This article details the current challenges, governance motivations, architectural design, and technical optimizations—including Flink SQL tuning, Kafka batch improvements, partitioned stream tables, containerization, and automated governance—implemented to enhance the efficiency, stability, and cost-effectiveness of NetEase Cloud Music's real-time data warehouse platform.

Flink optimization · Kafka batch · Resource Management
0 likes · 23 min read
Open Source Linux
May 31, 2023 · Operations

Mastering Linux Cgroups: A Complete Guide to Resource Management and Containerization

This article provides a comprehensive overview of Linux cgroups, explaining their purpose, architecture, versions, subsystems, and practical usage with systemd, including installation, configuration files, command‑line tools, and methods to monitor and limit CPU, memory, and I/O resources for containers and services.

Containers · Resource Management · cgroups
0 likes · 25 min read
MaGe Linux Operations
May 20, 2023 · Cloud Native

Mastering Kubernetes QoS: Guaranteed, Burstable, and BestEffort Explained

This article explains Kubernetes QoS classes—Guaranteed, Burstable, and BestEffort—detailing their resource request and limit requirements, how to configure them with YAML examples, the eviction priority order, and practical best‑practice strategies for classifying and scheduling workloads across cluster nodes.

Kubernetes · Pod Scheduling · QoS
0 likes · 6 min read
System Architect Go
Mar 31, 2023 · Cloud Native

Understanding CPU Requests and Limits in Kubernetes

This article explains how Kubernetes uses CPU requests and limits to schedule pods, allocate CPU proportionally, calculate minimal request units, and provides practical guidelines for setting appropriate request and limit values based on workload characteristics and monitoring data.

Kubernetes · Limits · Resource Management
0 likes · 6 min read
DevOps Cloud Academy
Mar 30, 2023 · Cloud Computing

FinOps: A Personal Story of Cloud Cost Optimization and Practical Steps

This article introduces FinOps, explains its definition and cultural significance, illustrates the concept with a personal household billing story, and outlines five concrete steps—cost allocation, target setting, budget control, operational management, and intelligent automation—to achieve cloud cost optimization.

Cloud Cost Management · Cost Optimization · FinOps
0 likes · 10 min read
Efficient Ops
Mar 21, 2023 · Operations

How Hupu Scaled to Millions: Inside the Flex Auto‑Scaling Platform

This article details Hupu's massive sports‑traffic environment, the design and implementation of the Flex auto‑scaling platform, its architecture, core functions such as resource statistics, node and pod scaling, scenario scheduling, and the performance optimizations that enable rapid, cost‑effective scaling across multi‑cloud Kubernetes clusters.

Auto Scaling · Kubernetes · Performance Optimization
0 likes · 15 min read
Alibaba Cloud Developer
Mar 15, 2023 · Frontend Development

Breaking the Cycle of Frontend Resource‑Centric Work: Strategies to Lead and Add Business Value

This article examines why front‑end teams often become mere resources, outlines the emotional and operational challenges of resource‑centric work, and presents practical strategies—including goal setting, role clarification, and risk management—to transform front‑end into a value‑adding leader, illustrated by an experience‑driven pricing revamp case study.

Resource Management · business value · experience-driven
0 likes · 12 min read
AntTech
Mar 13, 2023 · Cloud Computing

Cougar: A General Framework for Jobs Optimization in Cloud

Cougar is a cloud‑native, multi‑objective optimization framework that unifies metadata and monitoring ingestion to improve resource efficiency and performance for large‑scale AI and big‑data jobs, demonstrating over 50% CPU‑memory savings and stable latency in production experiments.

Resource Management · artificial intelligence · cloud computing
0 likes · 10 min read
Baidu App Technology
Mar 10, 2023 · Mobile Development

How Baidu Cut Its iOS App Size by 50 MB: A Deep Dive into Package Optimization

This article examines why Baidu's super‑app needed a drastic reduction in its iOS package size, outlines the metrics that link bundle size to download conversion, compares the app‑size footprints of major domestic and overseas apps, and details the multi‑layered technical solutions—resource trimming, architecture safeguards, compiler tweaks, image compression, and code slimming—that together saved over 50 MB while preserving functionality.

Compiler Optimization · Mobile Development · Resource Management
0 likes · 16 min read
Zhuanzhuan Tech
Feb 20, 2023 · Operations

Evolution of Zhuanzhuan's Test Environments: From Monolithic Setups to Docker‑Based Dynamic and Stable Platforms

This article details how Zhuanzhuan transformed its testing infrastructure from a handful of monolithic servers to a Docker‑driven, tag‑routed dynamic and stable environment, addressing resource shortages, waste, and stability issues while achieving significant reductions in deployment time, resource consumption, and user‑reported problems.

DevOps · Docker · Kubernetes
0 likes · 14 min read
StarRing Big Data Open Lab
Feb 15, 2023 · Operations

How YARN and Kubernetes Solve Distributed Resource Management Challenges

This article explains how Apache YARN and Kubernetes address the three core problems of resource utilization, task responsiveness, and flexible scheduling in distributed environments, detailing their architectures, scheduling models, and practical implications for modern big‑data and cloud workloads.

Kubernetes · Resource Management · Scheduling
0 likes · 8 min read
JD Cloud Developers
Feb 14, 2023 · Cloud Native

Why Kubernetes Is the Backbone of Modern Cloud‑Native Architecture

This article explains the evolution from monolithic to microservice architectures, introduces Kubernetes as the core cloud‑native platform, and details its components, design principles, and resource management strategies for compute, networking, and storage within a cluster.

CSI · Ingress · Kubernetes
0 likes · 22 min read
NetEase Media Technology Team
Feb 13, 2023 · Game Development

How to Build a Dynamic Face‑Customization System on Mobile with Spine

This article explains how to use the Spine 2D skeletal animation framework to implement a flexible, runtime face‑customization and outfit‑changing feature on mobile platforms, covering basic concepts, code examples, resource handling, memory optimizations, and platform‑specific integration challenges.

Performance Optimization · Resource Management · Spine
0 likes · 22 min read
Architecture Digest
Feb 10, 2023 · Operations

Design and Implementation of Vivo Jenkins Scheduler for High Availability and Resource Scheduling

This article analyzes common Jenkins high‑availability challenges, reviews existing industry solutions, and presents Vivo's own Jenkins Scheduler architecture—including API‑gateway, event center, scheduling algorithms, flow‑control, and callback mechanisms—demonstrating its production deployment and future container‑based evolution.

DevOps · Jenkins · Resource Management
0 likes · 12 min read
vivo Internet Technology
Feb 8, 2023 · Operations

Design and Implementation of Vivo Jenkins Scheduler for High Availability and Resource Management

The paper presents Vivo’s Jenkins Scheduler, a master‑centric, high‑availability solution that replaces single‑master Jenkins by integrating an API gateway, event‑driven failure detection, label‑based multi‑dimensional scheduling, Redis/MySQL‑backed flow control, and callback monitoring, thereby balancing resources, enabling rapid failover, persisting queues, and improving build reliability, with plans to containerize Jenkins for Kubernetes workflows.

DevOps · Jenkins · Resource Management
0 likes · 10 min read
Architects' Tech Alliance
Jan 30, 2023 · Operations

Advanced Software Performance Optimization Techniques: From Resource Exhaustion to Parallelism

This article presents a comprehensive guide to software performance optimization, covering low‑level resource exhaustion, horizontal scaling, sharding, lock‑free techniques, and system‑wide strategies, while offering practical examples and references for developers seeking to improve efficiency and scalability.

Parallelism · Resource Management · Scalability
0 likes · 12 min read
FunTester
Jan 29, 2023 · Backend Development

How to Extend Commons‑Pool2 for Custom KeyedObjectPool Idle Management

This article explains why the default Apache Commons‑Pool2 APIs cannot limit idle objects per key, and demonstrates three practical techniques—scheduled cleanup, factory modification, and priority‑based eviction—to enforce per‑key idle limits in a GenericKeyedObjectPool implementation.

Backend · GenericKeyedObjectPool · Java
0 likes · 7 min read
StarRing Big Data Open Lab
Jan 18, 2023 · Big Data

How Distributed Technologies Power Modern Big Data Platforms

This article explains how distributed storage, computing, and resource‑management technologies have evolved—from early Google File System research to Hadoop, Spark, and Kubernetes—enabling enterprises to tackle the 4 Vs of big data while reducing cost, improving performance, and supporting real‑time analytics.

Resource Management · computing · storage
0 likes · 17 min read
Aikesheng Open Source Community
Jan 16, 2023 · Databases

Deploying OceanBase 4.X as a Minimal Single‑Node Distributed Database

This article demonstrates how to deploy OceanBase 4.X in a minimal single‑node configuration, explains the key resource parameters, provides the necessary YAML configuration and command‑line steps to start the server, create a MySQL‑compatible tenant, and verify resource usage and basic database operations.

Database Configuration · MySQL compatibility · OceanBase
0 likes · 7 min read
Huolala Tech
Jan 12, 2023 · Cloud Computing

How Huolala Cut Cloud Costs and Boosted Efficiency with FinOps

This article examines Huolala's comprehensive cloud cost governance strategy, detailing infrastructure distribution, cost‑trend analysis, optimization tactics such as storage, networking, logging, spot instances, containerization, reserved instances, and the role of the Lcloud platform in achieving measurable savings and operational efficiency.

Cloud Cost Optimization · FinOps · Resource Management
0 likes · 15 min read
Alibaba Cloud Native
Jan 9, 2023 · Cloud Native

CNStack 2.0: Cloud‑Native Design for Agile, Secure Multi‑Cluster Ops

CNStack 2.0 is a cloud‑native PaaS platform built on Kubernetes that unifies resource and workload management, offering agile, open, and secure multi‑cluster capabilities through modular cloud services, a unified API gateway, and integration with open‑source projects such as Sealer, Emissary‑Ingress, cert‑manager, Velero, and OCM.

Kubernetes · Multi-Cluster · Resource Management
0 likes · 24 min read
DataFunTalk
Jan 3, 2023 · Big Data

Tencent Unified Big Data Scheduling Platform – Architecture, Design, and Operations

The article presents an in‑depth overview of Tencent's self‑developed Unified Scheduling Platform, detailing its system architecture, design challenges, performance optimizations, resource‑fair scheduling mechanisms, operational metrics, future roadmap, and a Q&A session that together illustrate how the platform enables massive offline data processing at scale.

Big Data · Distributed Systems · Performance Optimization
0 likes · 18 min read
DataFunTalk
Dec 29, 2022 · Big Data

Design and Implementation of OPPO's Big Data Diagnostic Platform (Compass)

This article presents the background, requirements, architecture, key modules, and practical impact of OPPO's non‑intrusive big‑data diagnostic platform—named Compass—designed to quickly locate issues, provide optimization suggestions, and achieve cost‑saving and efficiency gains for large‑scale Spark and Hadoop workloads.

Big Data · Cost reduction · Hadoop
0 likes · 17 min read
Architecture Digest
Dec 2, 2022 · Big Data

Design and Implementation of Vivo's Bees Log Collection Agent

This article presents the design principles, core techniques, and practical solutions of Vivo's self‑developed Bees log collection agent, covering file discovery, unique identification, real‑time and offline ingestion, checkpointing, resource control, platform management, and a comparison with open‑source alternatives.

Agent Design · Java · Kafka
0 likes · 25 min read
High Availability Architecture
Nov 30, 2022 · Big Data

Design and Implementation of Vivo's Bees Log Collection Agent

This article presents the design principles, core features, and implementation details of Vivo's self‑developed Bees log collection agent, covering file discovery, unique identification, real‑time and offline ingestion, resource control, platform management, and comparisons with open‑source solutions.

HDFS · Java · Kafka
0 likes · 22 min read
dbaplus Community
Nov 28, 2022 · Operations

How Bilibili Guaranteed Seamless Live Streaming for the League of Legends S12 Finals

Bilibili’s S12 technical guarantee team coordinated dozens of engineering groups, performed resource estimation, built a shared resource pool, applied chaos engineering, high‑availability architecture, and systematic performance testing to ensure the League of Legends World Championship livestream remained stable and responsive under peak traffic.

Performance Testing · Resource Management · SRE
0 likes · 19 min read
Baidu Intelligent Cloud Tech Hub
Nov 28, 2022 · Cloud Computing

How Baidu’s ARIES Powers Exabyte-Scale Cloud Storage for Baidu Netdisk

This article presents a comprehensive overview of Baidu’s ARIES storage platform, detailing its design philosophy, architecture, key concepts, and engineering challenges, and explains how it underpins Baidu Netdisk’s massive data‑plane storage with high availability, cost‑performance trade‑offs, and robust monitoring.

Distributed Systems · Resource Management · cloud storage
0 likes · 36 min read
Tencent Cloud Developer
Nov 24, 2022 · Cloud Native

Large‑Scale Cost Optimization for Kubernetes/TKE: Data Collection, Measures, and Implementation

The article details a Tencent‑led, end‑to‑end cost‑optimization project for large‑scale Kubernetes/TKE clusters that collected extensive workload metrics, applied VPA/HPA enhancements, custom scheduling and node‑downscaling via the open‑source Crane platform, ultimately delivering up to 70% CPU and 50% memory savings with zero‑fault deployments.

HPA · Kubernetes · Resource Management
0 likes · 29 min read
MaGe Linux Operations
Nov 23, 2022 · Fundamentals

Mastering Linux Cgroups: The Core of Container Resource Management

Linux cgroups, a kernel mechanism for grouping and controlling processes, enable fine-grained resource allocation, monitoring, and isolation, forming the foundation of container technologies like Docker and Kubernetes; this guide explains their concepts, hierarchies, subsystems, versions, configuration, and practical usage on CentOS.

CPU · Containers · Linux
0 likes · 28 min read
AntTech
Nov 10, 2022 · Cloud Computing

DeepScaling: An Automated Capacity Evaluation System for Stable CPU Utilization in Large‑Scale Cloud Services

DeepScaling is a deep‑learning‑driven autoscaling framework that predicts workload, estimates CPU usage, and makes reinforcement‑learning‑based scaling decisions to keep microservice CPU utilization at a target level, thereby reducing resource waste while meeting SLOs in large‑scale cloud environments.

Resource Management · autoscaling · cloud computing
0 likes · 21 min read
DaTaobao Tech
Nov 4, 2022 · Backend Development

Designing Stateful and Resource‑Safe Reactive Stream Operators with statefulMap

The article shows how a unified abstraction called statefulMap, together with a resource‑aware mapWithResource primitive, lets developers implement a wide range of complex reactive‑stream operators—such as buffering, indexing, deduplication, and safe DB access—in a concise, composable, thread‑safe manner, dramatically reducing boilerplate code.

Akka · Java · Reactive Streams
0 likes · 15 min read
Architecture and Beyond
Oct 15, 2022 · Operations

Technical Cost Optimization and Fine‑Grained Operations: Strategies, Processes, and Best Practices

This article provides a comprehensive guide for technical leaders on reducing and managing technology costs through a two‑stage approach of cost optimization and fine‑grained operations, covering team formation, current‑state analysis, discount and storage tactics, project planning, communication, and long‑term process and system support.

Cost Optimization · Operations · Resource Management
0 likes · 27 min read
21CTO
Aug 22, 2022 · Operations

How Meituan Scaled Its Pipeline Engine to Power 100k Daily Jobs

This article explains how Meituan built a unified, highly available pipeline engine that supports nearly 100,000 daily executions across dozens of services with 99.99% success, detailing the challenges faced, the architectural decisions made, and the future roadmap for further scalability and cloud‑native improvements.

Distributed Scheduling · Meituan · Pipeline
0 likes · 24 min read
ITPUB
Aug 20, 2022 · Operations

How Meituan Scaled Its CI/CD Pipeline Engine to 100k Daily Jobs with 99.99% Success

This article details Meituan's three‑year journey building a self‑developed pipeline engine that now handles nearly 100,000 daily executions with over 99.99% reliability, covering background, challenges, architectural decisions, core scheduling and resource‑pool designs, component layering, and future cloud‑native plans.

Job Scheduling · Operations · Pipeline
0 likes · 25 min read
IT Architects Alliance
Aug 16, 2022 · Backend Development

Seven Key Directions for Java Code Performance Optimization

This article outlines, at a conceptual level, seven major Java performance optimization strategies—including reuse, computation, result set, resource conflict, algorithm, high‑efficiency implementation, and JVM tuning—explaining their principles, typical techniques, and how they collectively improve resource utilization and application speed.

Backend Development · JVM · Java
0 likes · 11 min read
Open Source Linux
Aug 9, 2022 · Fundamentals

How Docker Uses Linux cgroups to Allocate CPU Resources

This article explains how Docker containers rely on Linux cgroups and namespaces for resource isolation, details CPU share and quota scheduling, and shows practical commands to inspect cgroup assignments, helping developers optimize container performance on Kubernetes.

CPU scheduling · Docker · Kubernetes
0 likes · 16 min read
dbaplus Community
Aug 8, 2022 · Operations

How Meituan Scaled Its CI/CD Pipeline to 100K Daily Runs with 99.99% Success

This article details Meituan's three‑year journey building a self‑developed, distributed pipeline engine that now handles nearly 100,000 daily executions across dozens of services with over 99.99% reliability, covering the challenges faced, architectural decisions, scheduling and resource‑pool designs, and future cloud‑native plans.

Meituan · Pipeline · Resource Management
0 likes · 28 min read
DataFunTalk
Jul 29, 2022 · Artificial Intelligence

Tencent Music Cloud‑Native One‑Stop Machine Learning Platform: Features and Future Roadmap

This article introduces Tencent Music's cloud‑native, one‑stop machine learning platform, detailing its engineering workflow, distributed acceleration, inference closed‑loop, edge computing capabilities, and future plans, while highlighting challenges of traditional ML pipelines and the platform's solutions for resource orchestration, storage, scheduling, and GPU utilization.

AI Platform · Distributed Training · Pipeline
0 likes · 17 min read
Zuoyebang Tech Team
Jul 22, 2022 · Mobile Development

How to Slash Live‑Streaming App Memory & CPU Usage on Mobile Devices

This article analyzes the architecture and performance bottlenecks of a mobile live‑streaming classroom, defines measurable APM metrics, identifies root causes such as CPU, memory, GPU contention and signaling issues, and presents concrete optimization techniques—including independent processes, containerization, dedicated signaling channels, rendering and thread improvements—that dramatically reduce memory, CPU and frame‑rate problems.

APM · Resource Management · live streaming
0 likes · 14 min read
Cognitive Technology Team
Jul 2, 2022 · Fundamentals

Defensive Programming Principles and Practices

The article outlines defensive programming concepts, emphasizing input validation, error handling, resource management, isolation techniques, and design considerations such as thread safety, cache strategies, and interface versioning to build robust and resilient software systems.

Error Handling · Resource Management · Software Robustness
0 likes · 4 min read
DataFunSummit
Jul 1, 2022 · Big Data

Exploring and Implementing Elastic Scheduling for Xiaomi Hadoop YARN

Shilong Fei from Xiaomi Data Platform presents an in‑depth exploration of elastic scheduling for Hadoop YARN, covering background, design of resource pools, auto‑scaling architecture, challenges such as job stability and user transparency, achieved cost reductions, and future plans for further optimization.

Auto Scaling · Big Data · Hadoop
0 likes · 20 min read
Weimob Technology Center
Jun 17, 2022 · Backend Development

How an External Resource Manager Boosts Service Resilience with Multi‑Layer Caching

This article explains the design and operation of an external resource manager that improves overseas e‑commerce service stability by employing CDN, local memory, and disk caches, detailing access methods, cache mechanisms, CDN configurations, and practical use cases such as Google Fonts and translation integration.

Backend · CDN · Resource Management
0 likes · 12 min read
Alibaba Cloud Native
Jun 16, 2022 · Cloud Native

How Koordinator Improves Efficiency and Stability for Cloud‑Native Mixed Workloads

This article explains how Alibaba Cloud's open‑source Koordinator system tackles mixed‑workload challenges by introducing priority and QoS models, resource overcommit, load‑aware scheduling, fine‑grained CPU orchestration, and upcoming features such as GPU scheduling and resource recommendation, all illustrated with architecture diagrams and code examples.

Cloud Native · Koordinator · Kubernetes
0 likes · 24 min read
58 Tech
Jun 16, 2022 · Artificial Intelligence

Backend Architecture and Performance Optimization of an AI Interview Robot

This article details the backend architecture, dialogue engine design, resource estimation, and performance optimization techniques of an AI interview robot used in 58.com’s recruitment platform, illustrating how multi‑branch dialogue flows, RTP communication, session management, and monitoring enable scalable, stable, and efficient online interview services.

AI · Dialogue Engine · Interview Robot
0 likes · 17 min read
Big Data Technology Architecture
Jun 8, 2022 · Big Data

Bilibili Offline Computing Platform: Migration from Hive to Spark and Comprehensive Performance Optimizations

The article details Bilibili's evolution of its offline computing platform from Hadoop‑based Hive to Spark, describing migration tools, SQL conversion, result and resource comparison, shuffle stability, small‑file handling, runtime filters, data skipping, ZSTD support, Hive Metastore federation, traffic control, and future optimization directions.

Data Migration · Hive · Resource Management
0 likes · 29 min read
Alibaba Terminal Technology
May 31, 2022 · Mobile Development

How DX Achieves Near‑Native Performance: Pipeline Async, Async Rendering, and Resource Control

This article details how Alibaba’s DX (DinamicX) framework attains near‑native performance through server‑side XML compilation, lightweight virtual trees, pipeline asynchronous execution, async drawing, and off‑screen resource control, and presents measurable gains such as a 40% reduction in main‑thread work and 65% CPU savings on media‑heavy pages.

Async Rendering · DX · Mobile UI
0 likes · 18 min read
DataFunTalk
May 21, 2022 · Big Data

Exploring and Implementing Elastic Scheduling for Xiaomi Hadoop YARN

This talk presents Xiaomi's design and deployment of an elastic scheduling system for Hadoop YARN, covering background analysis, resource‑pool strategy, auto‑scaling architecture, stability challenges, label‑based resource isolation, Spark shuffle handling, cost‑saving results and future plans.

Big Data · Hadoop · Resource Management
0 likes · 16 min read
MaGe Linux Operations
May 5, 2022 · Fundamentals

Master Python’s with Statement: Simplify Resource Management

This article explains the purpose and mechanics of Python's with statement, showing how it leverages context managers to replace verbose try‑finally blocks, and demonstrates both class‑based and decorator‑based implementations with clear code examples.

Code Example · Decorator · Python
0 likes · 6 min read
Alibaba Cloud Native
Apr 27, 2022 · Cloud Native

How ACK’s Resource Profiling Optimizes Kubernetes CPU & Memory Requests

This article explains how Alibaba Cloud Container Service for Kubernetes (ACK) uses container‑level resource profiling with half‑life sliding windows and quantile algorithms to automatically recommend accurate CPU and memory requests, improving cluster utilization while maintaining application stability.

ACK · Cloud Native · Kubernetes
0 likes · 9 min read
HomeTech
Apr 27, 2022 · Big Data

AutoStream Real‑Time Computing Platform: Architecture, Resource Management, Scaling, Lakehouse Integration, and PyFlink Practices

This article details Autohome's AutoStream platform evolution from Storm to Flink‑based versions, covering real‑time application scenarios, strict budget‑controlled resource management, automatic scaling, lake‑house architecture with Iceberg, PyFlink integration, and future plans for resource optimisation and batch‑stream unification.

AutoStream · Flink · Lakehouse
0 likes · 15 min read
dbaplus Community
Apr 10, 2022 · Operations

How to Build a Practical SRE Operations Framework for Large‑Scale Systems

This article presents a hands‑on SRE framework covering the full product lifecycle—code development, resource planning, deployment, operational reliability, and decommissioning—derived from real‑world practices at Xiaomi and Sina to help teams manage massive internet services efficiently and cost‑effectively.

Resource Management · SRE · System Lifecycle
0 likes · 16 min read
ByteFE
Mar 23, 2022 · Frontend Development

Effects of Too Much Lazy Loading on Web Performance

While lazy loading can reduce initial page load time and save resources, overusing it can slow scrolling, cause layout shifts, hinder SEO, and ultimately degrade overall web performance, so developers need clear guidelines on when and how to apply it effectively.

Frontend Optimization · Resource Management · SEO
0 likes · 7 min read
Ops Development Stories
Mar 22, 2022 · Operations

Optimize Kubernetes Resource Use with Requests, Limits, and Scheduling

This article explains common causes of resource waste in Kubernetes clusters, such as over‑provisioned requests and fluctuating workloads, and provides practical methods—including proper request/limit settings, ResourceQuota and LimitRange policies, node affinity, taints and tolerations, and HPA—to improve overall resource utilization and cluster stability.

Kubernetes · LimitRange · Node Affinity
0 likes · 16 min read
Ops Development Stories
Mar 22, 2022 · Cloud Native

Mastering Kubernetes Pod Resource Requests, Limits, and QoS

This guide explains how to configure CPU and memory requests and limits for Kubernetes pods, implement QoS classes, use LimitRange and ResourceQuota, and monitor resource usage with Prometheus queries and Grafana dashboards to ensure stable cluster operations.

CPU · Kubernetes · Memory
0 likes · 11 min read
IT Services Circle
Mar 15, 2022 · Backend Development

Understanding Pooling Techniques: Thread Pools, Memory Pools, Database Connection Pools, and HttpClient Pools in Java

This article explains the concept of pooling technology, its advantages, and its practical applications in Java—including thread pools, memory pools, database connection pools, and HttpClient connection pools—while highlighting how these techniques improve performance, resource utilization, and system stability.

Connection Pool · Java · Resource Management
0 likes · 10 min read
iQIYI Technical Product Team
Feb 18, 2022 · Cloud Native

CPU Share Syncer: Enabling High‑Priority Task CPU Preemption in iQIYI Video Production Kubernetes Clusters

iQIYI’s cpu‑share‑syncer daemon runs on every node, reads a pod’s iqiyi.com/cpu‑share annotation, updates the pod’s cpu.shares after disabling the Kubernetes CPU CFS quota, and lets high‑priority video‑production pods pre‑empt CPU from lower‑priority pods, significantly speeding task execution.

CPU scheduling · DaemonSet · High priority tasks
0 likes · 13 min read
Tencent Cloud Developer
Jan 24, 2022 · Fundamentals

Understanding Move Semantics in C++11

Move semantics in C++11 let objects transfer ownership of resources via rvalue references and std::move, eliminating costly copies, improving performance, requiring explicit move constructors and assignment operators—preferably marked noexcept—to keep source objects valid, enable container optimizations, and work with NRVO when copying would otherwise occur.

C++ · C++11 · Resource Management
0 likes · 23 min read
Alibaba Cloud Native
Jan 18, 2022 · Cloud Native

How Alibaba Cloud’s Differential SLO Boosts Kubernetes Resource Utilization

This article explains Alibaba Cloud Container Service for Kubernetes's differential SLO approach, detailing the reclaimed‑resource model, CPU burst and topology‑aware scheduling, kernel group identity, memory watermark tiering, and real‑world case studies that demonstrate significant improvements in cluster efficiency and latency‑sensitive workload performance.

ACK · Alibaba Cloud · CPU Burst
0 likes · 16 min read
HomeTech
Jan 13, 2022 · Cloud Native

AutoKH: A Mixed‑Workload Resource Management Solution on Kubernetes and Hadoop

AutoKH is a cloud‑native mixed‑workload framework that integrates Kubernetes and Hadoop to dynamically schedule online and offline tasks, improve CPU and memory utilization, enforce priority classes, and ensure service stability through operators, CronHPA, and resource‑control components.

CPU Manager · Hadoop · Kubernetes
0 likes · 19 min read
Big Data Technology & Architecture
Jan 12, 2022 · Big Data

Common Production Issues and Troubleshooting Guide for Apache Flink

This article compiles a comprehensive list of common production problems encountered with Apache Flink, covering cluster sizing, checkpoint failures, backpressure analysis, resource allocation, deployment errors, UDF definitions, data skew, Kafka configurations, and provides detailed troubleshooting steps and best‑practice recommendations.

Apache Flink · Checkpoint · Kafka
0 likes · 39 min read
Bitu Technology
Jan 7, 2022 · Backend Development

Design and Implementation of Tubi Multimedia Processing Platform (TMPP)

The article details Tubi's Multimedia Processing Platform (TMPP), describing its architecture, processing stages, resource management, and distributed task scheduling for large‑scale video transcoding and delivery across multiple devices.

Distributed Systems · Resource Management · Video processing
0 likes · 8 min read
ByteDance Terminal Technology
Dec 17, 2021 · Mobile Development

Solving Android Plugin Resource ID Conflicts with a No‑Resource‑Fixation Approach

This article explains the problem of resource ID mismatches when Android plugins use host resources, reviews existing resource‑fixation solutions, and presents a novel “no‑resource‑fixation” method that dynamically maps host IDs at compile‑time and runtime to keep plugins stable across host updates.

AAPT2 · Android · Dynamic Loading
0 likes · 26 min read
Java Interview Crash Guide
Dec 17, 2021 · Backend Development

Designing Effective Rate Limiting and Circuit Breaking for Microservice APIs

This article explores the motivations, concepts, and practical implementation strategies for rate limiting and circuit breaking in microservice architectures, covering resource granularity, rule definition, sliding‑window calculations, and integration with API gateways to prevent cascading failures and resource exhaustion.

Circuit Breaking · Hystrix · Microservices
0 likes · 14 min read
Alibaba Cloud Developer
Dec 8, 2021 · Cloud Native

How Alibaba Cut Costs by 30% with Cloud‑Native Database Scheduling

This article explains how Alibaba leveraged cloud‑native Kubernetes scheduling, CPUSet/CPUShare mixed deployment, and a custom multi‑cluster scheduler to reduce database resource costs by over 30% while maintaining performance and stability during large‑scale sales events.

Kubernetes · Resource Management · cloud-native
0 likes · 16 min read
Java Architect Essentials
Sep 21, 2021 · Fundamentals

Coding Standards and Best Practices for Robust Software Development

This article presents a set of coding "military rules" covering topics such as avoiding magic numbers, limiting method parameters, proper resource release, specific exception handling, and precise arithmetic, followed by practical development efficiency tips and resource links for further learning.

Exception Handling · Resource Management · Software Engineering
0 likes · 9 min read
IT Architects Alliance
Sep 4, 2021 · Cloud Computing

Understanding Virtualization: How Abstracted Resources Power Modern IT

Virtualization is a resource management technique that abstracts physical servers, networks, memory, and storage into flexible virtual components, enabling higher utilization, isolation, and flexibility across software, hardware, memory, network, desktop, and service layers, with hypervisors like VMware ESXi, KVM, and Xen orchestrating multiple operating systems on a single physical host.

KVM · Resource Management · VMware
0 likes · 4 min read
OPPO Kernel Craftsman
Sep 3, 2021 · Operations

Understanding Linux cgroups: Mechanism, Data Structures, and Core Logic

Linux cgroups are a kernel mechanism that groups processes into hierarchical directories, each subsystem (e.g., freezer, CPU, memory, IO) exposing control files such as cgroup.procs and freezer.state, with core data structures like cgroup_subsys, cgroup, css_set linking threads to multiple subsystems and enabling resource policies, freezing, throttling, and allocation.

Kernel · Linux · Resource Management
0 likes · 6 min read
DataFunTalk
Aug 29, 2021 · Big Data

Building and Optimizing the Offline Computing Platform at Autohome: Challenges, Solutions, and Future Plans

This article details the evolution of Autohome's offline computing platform from a 50‑node cluster in 2013 to a multi‑thousand‑node Hadoop ecosystem, describing performance and stability challenges, multi‑tenant operational issues, low resource utilization, and the comprehensive technical solutions and future roadmap implemented to address them.

AI on Hadoop · MetaStore · Offline Computing
0 likes · 11 min read
Sohu Tech Products
Aug 4, 2021 · Mobile Development

Comprehensive Guide to Reducing iOS App Package Size

This article presents a step‑by‑step guide for shrinking iOS app IPA size by leveraging App Thinning, removing unused image assets, compressing media, and eliminating dead code through tools like FengNiao, LSUnusedResources, AppCode, and LinkMap analysis.

App Thinning · Mobile Development · Resource Management
0 likes · 20 min read
Alibaba Cloud Developer
Aug 4, 2021 · Cloud Computing

How Partitioned Synchronization Scales Alibaba’s Massive Cloud Clusters

At USENIX ATC 2021, Alibaba Cloud’s Fuxi 2.0 team presented research that won a best paper award, showing how a partitioned‑synchronization (ParSync) scheduling architecture dramatically reduces conflicts and latency in ultra‑large production clusters, balancing efficiency, quality, and fairness without adding resources.

Cluster Scheduling · Resource Management · cloud computing
0 likes · 17 min read
dbaplus Community
Jul 19, 2021 · Cloud Native

Avoid These 10 Common Kubernetes Mistakes to Boost Reliability and Cost Efficiency

This article shares a practical guide to the most frequent Kubernetes pitfalls—from misconfigured resource requests and limits to improper liveness/readiness probes, load‑balancer settings, IAM misuse, pod anti‑affinity, and disruption budgets—offering concrete YAML examples and remediation steps to help operators run more reliable and cost‑effective clusters.

Cloud Native · Kubernetes · Probes
0 likes · 18 min read
Big Data Technology & Architecture
Jun 17, 2021 · Big Data

Comprehensive Guide to Presto: Origins, Architecture, Optimization, and Real‑World Applications

This article provides an in‑depth overview of Presto, covering its history, core principles, architectural components, query optimization techniques, resource management, tuning tips, data model, and case studies from companies like Didi and Youzan, offering practical guidance for deploying and operating the distributed SQL engine at scale.

Presto · Query Engine · Resource Management
0 likes · 33 min read
IT Architects Alliance
Jun 3, 2021 · Cloud Computing

Cloud Computing Reference Architecture and Key Design Considerations

This article explains cloud computing reference architecture, illustrating private and hybrid cloud setups, and discusses essential design factors such as scalability, availability, manageability, feasibility, measurable resources, queue-based load balancing, error handling, and decoupling to build robust, cost‑effective cloud systems.

Resource Management · Scalability · architecture
0 likes · 5 min read
Sohu Tech Products
May 26, 2021 · Mobile Development

Comprehensive Guide to iOS App Package Size Optimization

This article systematically explains how to analyze, reduce, and monitor iOS IPA package size by examining Xcode build settings, resource files, and code, providing detailed step‑by‑step configurations, tables of component sizes, practical scripts, and best‑practice recommendations for sustainable bundle‑size management.

Build Settings · Mobile Development · Package Size
0 likes · 31 min read
High Availability Architecture
May 3, 2021 · Operations

Meituan Elastic Scaling System: Evolution, Challenges, and Business Enablement

This article introduces Meituan's elastic scaling platform, detailing its evolution from version 1.0 to 2.0, the technical and operational challenges faced, the strategies adopted for promotion and resource management, and several real‑world business scenarios where elastic scaling reduces cost and improves reliability.

Meituan · Operations · Resource Management
0 likes · 24 min read
MaGe Linux Operations
Apr 24, 2021 · Cloud Native

9 Proven Strategies to Slash Kubernetes Costs

Learn how to monitor, limit, and optimize Kubernetes expenses with nine practical techniques—including resource constraints, autoscaling, right-sized instances, Spot instances, sleep schedules, regular clean‑ups, and tagging—to dramatically reduce cloud costs while maintaining performance.

AWS · Cost Optimization · Kubernetes
0 likes · 8 min read