
Best Data Path for Containerized Workloads: Storage Options and Performance Insights

Understanding the data path in Kubernetes systems reveals performance bottlenecks and guides the selection among private‑cloud storage, container‑native storage software, and software‑defined storage to achieve scalable, low‑latency, high‑performance container workloads.

Cloud Native Technology Community

Understanding the data path in a system can reveal the root causes of sub‑optimal performance and suggest solutions.

This article is translated from Running Containers At Scale – The Best Data Path to Success [1] by Kirill Shoikhet.

Kubernetes and other container orchestration platforms are rapidly becoming mainstream, and migrating traditional data‑center workloads to containers is relatively straightforward for many business applications. However, for higher‑performance core workloads such as databases or fast data analytics, the picture is more complex.

Containerizing applications raises the requirements on the underlying infrastructure—network, storage, and fault tolerance. Although Kubernetes has made great strides in these areas, performance degradation can still occur both on‑premises and in the cloud, and Kubernetes networking often cannot provide low, predictable latency even for medium‑scale applications.

We argue that a smoothly running IT system needs sufficient CPU, bandwidth, and storage capacity, and that understanding the data path helps uncover performance‑limiting factors and their remedies.

Three Ways to Provide Storage for Containerized Workloads

Private Cloud and Co‑located Devices/Storage Clusters

While on‑premises storage is typically the most feature‑rich and flexible option, it may not be ideal for native container deployments. In these deployments, storage lives alongside the Kubernetes cluster, and Kubernetes connects applications to it via a Container Storage Interface (CSI) plugin that attaches application containers directly to the external storage, bypassing the Kubernetes‑controlled network.
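As a sketch of how that CSI wiring typically looks, the fragment below defines a StorageClass backed by a vendor CSI driver and a claim that consumes it. All names, the provisioner string, and the `protocol` parameter are hypothetical placeholders, not taken from any specific product:

```yaml
# Hypothetical CSI wiring: names and the provisioner string are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-external
provisioner: csi.example-vendor.com   # the vendor's CSI driver
parameters:
  protocol: nvme-tcp                  # data path bypasses the pod network
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast-external
  resources:
    requests:
      storage: 100Gi
```

A pod referencing `db-data` would then be attached to the external array by the CSI driver, while its ordinary pod traffic still flows over the Kubernetes network.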

Container Storage Software

Solutions that are delivered and operated as containers bring advantages designed specifically for containers. They take a feature‑first approach, offering capabilities such as streamlined provisioning and deduplication. Performance, however, still depends on the data path: because the storage controller itself runs as a container, all data must traverse the Kubernetes network, which adds latency.
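Deduplication, one of the features mentioned above, is commonly built on content addressing: a block is stored once and referenced by the hash of its contents. The toy class below is an illustrative sketch of that idea, not any vendor's implementation:

```python
import hashlib

class DedupStore:
    """Toy content-addressed block store: identical blocks are stored once."""

    def __init__(self):
        self.blocks = {}  # digest -> block data

    def write(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # store only if not already present
        return digest

    def read(self, digest: str) -> bytes:
        return self.blocks[digest]

store = DedupStore()
a = store.write(b"x" * 4096)
b = store.write(b"x" * 4096)  # duplicate block: no new storage consumed
print(a == b, len(store.blocks))  # True 1
```

Real systems add reference counting, collision handling, and persistence, but the space saving comes from exactly this one-copy-per-unique-block property.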

Software‑Defined Storage Running Natively in Kubernetes

There are a few pure software‑defined storage options that run natively in Kubernetes, including standalone bare‑metal SDS products that have been ported to Kubernetes and support private‑cloud and hybrid‑cloud deployments.

Native Kubernetes software‑defined storage combines the advantages of the previous two methods. Depending on the implementation, some solutions isolate the data path from Kubernetes, delivering better performance than container‑storage‑software‑only approaches.

This enables data‑center architects to obtain the best of traditional on‑premises architectures and pure container storage. To ensure predictable latency, the data path sits beneath Kubernetes—between the container and the NVMe SSD—moving from the kernel to the client‑device driver, then to the target driver, and finally accessing the NVMe driver directly.

In this model, each client is fully independent and communicates directly with its targets, with no cross‑client communication. This reduces the number of network hops and connections, making the approach suitable for large‑scale environments, where the total number of connections remains a small multiple of the cluster size.
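A back-of-the-envelope comparison makes the scaling claim concrete. With independent clients, connection count grows as clients × targets; designs that require cross-node coordination approach a full mesh, which grows quadratically. The node counts below are illustrative assumptions, not figures from the article:

```python
def direct_connections(clients: int, targets: int) -> int:
    """Each client opens one connection per storage target it accesses."""
    return clients * targets

def full_mesh(nodes: int) -> int:
    """Designs with cross-node coordination approach a full mesh of n*(n-1)/2 links."""
    return nodes * (nodes - 1) // 2

# Hypothetical cluster: 1000 client nodes, 16 storage targets.
print(direct_connections(1000, 16))  # 16000 -- a small multiple of cluster size
print(full_mesh(1000 + 16))          # 515620 -- grows quadratically
```

At 1,000 nodes the direct model needs roughly 30× fewer connections than a mesh, and the gap widens as the cluster grows.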

Elasticsearch Application

Several use cases that run natively in Kubernetes illustrate the benefits of the software‑defined approach. For example, a major telecom provider in the EMEA region tested three storage methods for a large‑scale Elasticsearch deployment. An external iSCSI‑based SDS was scalable but incurred millisecond‑level latency, degrading indexing performance, while the native Kubernetes storage solution could not meet the scale of hundreds of nodes. The third method—NVMe‑based scalable SDS using NVMe drivers on Kubernetes nodes and native integration into the Kubernetes control and management plane—delivered markedly better performance and latency.

Figure 1: Kubernetes‑native NVMe shared storage architecture with bare‑metal performance

CI/CD Application

In another example, a top network company ran a native‑Kubernetes SDS in a data‑center with tens of thousands of nodes to support CI/CD workloads, providing a robust control environment for compilation, building, and local testing. Figure 1 shows how the NVMe‑based client and horizontally‑scaled architecture enable the transition of CI/CD workloads to Kubernetes while preserving bare‑metal performance.

When running under Kubernetes, this method uses privileged containers to control the deployment of the client and target device drivers, keeping the data path unaffected by the containerized environment while moving all control‑plane components to native container APIs. In this company's production environment, application performance was 15–20% higher than on bare metal, because the storage software aggregated multiple remote NVMe drives into a single virtual volume presented to the application containers.
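Aggregating several remote NVMe drives into one virtual volume is commonly done by striping: logical offsets are distributed round-robin across the drives so reads and writes fan out in parallel. The sketch below shows one plausible mapping; the stripe size and drive count are illustrative assumptions, not details from the deployment described above:

```python
STRIPE_SIZE = 128 * 1024   # 128 KiB stripe unit (illustrative)
NUM_DRIVES = 4             # remote NVMe drives backing the virtual volume

def map_logical(offset: int) -> tuple[int, int]:
    """Map a byte offset in the virtual volume to (drive index, offset on drive)."""
    stripe = offset // STRIPE_SIZE       # which stripe unit the offset falls in
    drive = stripe % NUM_DRIVES          # round-robin assignment across drives
    drive_stripe = stripe // NUM_DRIVES  # stripe index local to that drive
    return drive, drive_stripe * STRIPE_SIZE + offset % STRIPE_SIZE

print(map_logical(0))                   # (0, 0)
print(map_logical(128 * 1024))          # (1, 0)
print(map_logical(5 * 128 * 1024 + 7))  # (1, 131079)
```

Because consecutive stripe units land on different drives, a large sequential I/O is serviced by all four devices at once, which is one way aggregate throughput can exceed that of any single local drive.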

Best Data Path to Success

Choosing the right storage to meet scalability and performance requirements is not a one‑size‑fits‑all solution. When storage architects understand the implications of the data path, they can select storage that makes containerized hybrid deployments smoother, delivering scalable, high‑performance, and agile storage.

Reference Links

[1] Running Containers At Scale – The Best Data Path to Success: https://www.networkcomputing.com/cloud-infrastructure/running-containers-scale-%E2%80%93-best-data-path-success

Written by

Cloud Native Technology Community

The Cloud Native Technology Community, part of the CNBPA Cloud Native Technology Practice Alliance, focuses on evangelizing cutting‑edge cloud‑native technologies and practical implementations. It shares in‑depth content, case studies, and event/meetup information on containers, Kubernetes, DevOps, Service Mesh, and other cloud‑native tech, along with updates from the CNBPA alliance.
