
How ByteDance Scaled Stateful Applications with Cloud‑Native Kubernetes

This article details ByteDance's journey of migrating stateful services to a cloud‑native Kubernetes platform, covering challenges in state management, infrastructure enhancements, storage solutions, monitoring, and automated operations that together improve efficiency and reduce costs at massive scale.

Volcano Engine Developer Services

Background

Stateful applications retain data and often require sharding, replication, and persistence. ByteDance migrated many such services to a cloud‑native environment built on Kubernetes.

Stateful Application Scenarios

Typical use cases include search recall (large models with long load times), push services (each instance handles a shard of users and needs a unique ID), and storage services such as an in‑house KV store, Druid, and Elasticsearch, which both depend on local storage and require fixed relationships between instances.

Challenges and Benefits of Cloud‑Native Migration

Before migration, services ran on physical machines, leading to complex architecture, inflexible operations, inconsistent environments, and resource fragmentation. Cloud‑native adoption aimed to improve efficiency and reduce cost.

Efficiency was achieved through standardized infrastructure APIs, business‑framework abstraction, automated processes, and unified delivery via containers or images.

Cost reductions came from faster container start‑up, on‑demand resource allocation, and streamlined application iteration.

State Management

State management for stateful apps is divided into version management, data management, and service discovery & routing.

Version management resembles Kubernetes Deployment/StatefulSet capabilities, handling upgrades and rollbacks.

Data management updates external data without changing the number of service replicas.

Service discovery routes requests to the appropriate shard instance.

ByteDance introduced the SolarService abstraction, combining an enhanced StatefulSet (StatefulSet Extension) with a Budset CRD for data versioning. A sidecar container synchronizes data according to Bud definitions.

Rolling Upgrade Example

Shards are upgraded in parallel; within each shard, the MaxUnavailable setting controls how many replicas can be updated concurrently.
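
The batching rule above can be sketched in a few lines of Python. This is a minimal illustration, not ByteDance's controller code: the function names, the pod names, and the lock-step "wave" model are assumptions made for the example (a real controller would let each shard advance independently as its pods become ready).

```python
from typing import Dict, List


def plan_shard_batches(replicas: List[str], max_unavailable: int) -> List[List[str]]:
    """Split one shard's replicas into sequential update batches.

    At most `max_unavailable` replicas of the shard are updated at once.
    """
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    return [
        replicas[i:i + max_unavailable]
        for i in range(0, len(replicas), max_unavailable)
    ]


def plan_rolling_upgrade(shards: Dict[int, List[str]], max_unavailable: int) -> List[List[str]]:
    """Return upgrade waves; wave k holds the k-th batch of every shard.

    Shards are upgraded in parallel, so a wave contains pods from all shards,
    but never more than `max_unavailable` pods from any single shard.
    """
    per_shard = {
        shard_id: plan_shard_batches(pods, max_unavailable)
        for shard_id, pods in shards.items()
    }
    wave_count = max((len(batches) for batches in per_shard.values()), default=0)
    waves: List[List[str]] = []
    for k in range(wave_count):
        wave: List[str] = []
        for shard_id in sorted(per_shard):
            if k < len(per_shard[shard_id]):
                wave.extend(per_shard[shard_id][k])
        waves.append(wave)
    return waves
```

With two shards of three replicas each and a MaxUnavailable of 1, the plan has three waves, and each wave touches exactly one replica per shard, so every shard keeps two replicas serving throughout the upgrade.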

Scaling Example

Scaling can increase replica count for a shard (simple) or expand the number of data shards (requires a two‑step process: enlarge StatefulSet, then split Budset data and update service discovery).
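
To see why expanding the shard count needs a data split, consider a hypothetical placement rule in which a key lives on shard `hash(key) mod total_shards`. The article does not say how ByteDance assigns keys to shards, so the hash function and both helper names below are assumptions. The sketch computes which keys would have to move when the shard count changes.

```python
import zlib
from typing import Dict, Iterable, Tuple


def shard_for(key: str, total_shards: int) -> int:
    """Assumed placement rule: stable CRC32 hash modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % total_shards


def split_plan(keys: Iterable[str], old_total: int, new_total: int) -> Dict[str, Tuple[int, int]]:
    """Map every key that changes shard to its (old_shard, new_shard) pair."""
    moves: Dict[str, Tuple[int, int]] = {}
    for key in keys:
        old = shard_for(key, old_total)
        new = shard_for(key, new_total)
        if old != new:
            moves[key] = (old, new)
    return moves
```

Under this rule, doubling the shard count from N to 2N moves a key from shard s only to shard s + N, so each old shard splits into exactly two. That is what makes the two-step order workable: the new pods must exist before the data is split, and service discovery should switch to the new total shard count only once the split data is in place.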

Service Discovery & Routing

A custom Proxy component distributes requests to the appropriate StatefulSet Extension pods. Additional routing logic uses per‑pod error rates to implement circuit‑breaking. Service discovery stores ShardID, ReplicaID, and total shard count in a KV store for higher‑level frameworks.
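
The routing behaviour described above can be approximated as follows. This is a simplified sketch rather than the Proxy's actual logic: the `PodStats` and `ShardRouter` names, the 50% error threshold, the minimum sample size, and the least-requests selection are all assumptions, and a production circuit breaker would also need a recovery (half-open) path.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class PodStats:
    """One service discovery record plus request counters for that pod."""
    name: str
    shard_id: int
    replica_id: int
    requests: int = 0
    errors: int = 0

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


class ShardRouter:
    def __init__(self, pods: Iterable[PodStats],
                 error_threshold: float = 0.5, min_requests: int = 10) -> None:
        self.error_threshold = error_threshold
        self.min_requests = min_requests
        self._by_shard: Dict[int, List[PodStats]] = {}
        for pod in pods:
            self._by_shard.setdefault(pod.shard_id, []).append(pod)

    def _is_open(self, pod: PodStats) -> bool:
        """Circuit is open once enough requests were seen and too many failed."""
        return (pod.requests >= self.min_requests
                and pod.error_rate >= self.error_threshold)

    def pick(self, shard_id: int) -> Optional[PodStats]:
        """Pick the least-loaded replica of the shard whose circuit is closed."""
        candidates = [p for p in self._by_shard.get(shard_id, [])
                      if not self._is_open(p)]
        if not candidates:
            return None
        return min(candidates, key=lambda p: (p.requests, p.replica_id))

    def record(self, pod: PodStats, ok: bool) -> None:
        """Update counters after a request to `pod` completes."""
        pod.requests += 1
        if not ok:
            pod.errors += 1
```

A caller would invoke `pick(shard_id)` for each request and `record(pod, ok)` with the outcome; once a replica crosses the threshold it stops receiving traffic and the remaining replicas of the same shard absorb the load.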

Infrastructure Enhancements

Scheduling: ByteDance extended the Kubernetes scheduler and Kubelet to be NUMA‑aware, adding custom predicates and priorities, and a CPU manager policy that binds pods to specific CPU sets and NUMA nodes.
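
As an illustration of what a NUMA‑aware predicate and priority might look like, the sketch below filters out machines where no single NUMA node can hold the whole CPU request, then prefers the tightest fit. The article does not describe the actual predicates or priorities, so the best-fit rule, the function names, and the data layout (free CPUs per NUMA node) are assumptions.

```python
from typing import Dict, List, Optional, Tuple


def numa_fit(free_cpus_per_numa: List[int], requested: int) -> Optional[int]:
    """Predicate: return the NUMA node index with the tightest fit, or None.

    A machine passes only if one NUMA node alone can satisfy the request,
    so the pod never has to span NUMA nodes.
    """
    fits = [
        (free - requested, index)
        for index, free in enumerate(free_cpus_per_numa)
        if free >= requested
    ]
    return min(fits)[1] if fits else None


def schedule(nodes: Dict[str, List[int]], requested: int) -> Optional[Tuple[str, int]]:
    """Priority: among passing machines, choose the one leaving the fewest idle CPUs.

    Returns (machine name, NUMA node index), or None if nothing fits.
    """
    best: Optional[Tuple[int, str, int]] = None
    for name in sorted(nodes):
        index = numa_fit(nodes[name], requested)
        if index is None:
            continue  # filtered out by the predicate
        leftover = nodes[name][index] - requested
        if best is None or leftover < best[0]:
            best = (leftover, name, index)
    return None if best is None else (best[1], best[2])
```

Note that a machine with 4 + 4 free CPUs is rejected for a 6-CPU request even though it has 8 free in total, which is exactly the case a NUMA‑unaware scheduler would get wrong.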

Storage: ByteDance implemented dynamic provisioning for both remote block storage (an NBD‑based CSI driver) and local disk storage (LPV). The platform supports multiple storage media, including tmpfs, LVM, full‑disk isolation, and Intel AEP non‑volatile memory, with topology‑aware allocation through an extended Topology Manager policy.

Monitoring & Automated Operations

ByteDance developed SysProbe, an eBPF‑based container‑level metrics collector that feeds over 100 metrics into a high‑availability Metrics Aggregation Server (MAS), which in turn backs the dashboards.

The team also extended Pod Disruption Budgets (PDBs) through a webhook to customize eviction strategies, enabling coordinated multi‑AZ pod eviction while respecting replica distribution.
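
One way to picture a shard-aware eviction check is shown below. It is a hypothetical policy, not the webhook's real rule set: the `Pod` record, the `eviction_allowed` function, and the "every shard keeps at least `min_available` healthy replicas" criterion are assumptions chosen to illustrate why replica distribution across AZs matters.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Pod(NamedTuple):
    name: str
    shard_id: int
    zone: str
    healthy: bool = True


def eviction_allowed(pods: Iterable[Pod], evict_names: Iterable[str],
                     min_available: int) -> bool:
    """Allow the eviction only if every shard keeps enough healthy replicas."""
    evict: Set[str] = set(evict_names)
    remaining: Counter = Counter()
    shards: Set[int] = set()
    for pod in pods:
        shards.add(pod.shard_id)
        if pod.healthy and pod.name not in evict:
            remaining[pod.shard_id] += 1
    return all(remaining[shard_id] >= min_available for shard_id in shards)
```

Draining a whole AZ is then safe only when every shard has replicas elsewhere; a shard whose replicas all sit in the AZ being drained causes the request to be rejected.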

CSI Race Conditions

The team identified and mitigated race conditions in the CSI volume unpublish/unstage sequence, along with residual mount points left behind by it, by adding cleanup logic to the Kubelet volume manager.

Conclusion

ByteDance’s cloud‑native transformation of stateful services delivered efficient, cost‑effective operations, enhanced performance through NUMA‑aware scheduling, richer storage capabilities, and higher automation levels, while supporting thousands of services across tens of thousands of nodes.
