What 3 Years of Running Kubernetes in Production Taught Us
After three years of operating multiple Kubernetes clusters across bare‑metal and cloud environments, we share hard‑won lessons on Java container compatibility, upgrade strategies, CI/CD redesign, probe tuning, conntrack limits, and evaluating whether Kubernetes truly fits your workload.
1 Java Application "Pitfalls"
Engineers often avoid running Java in containers because the JVM historically handled container memory limits poorly, but JVM improvements (e.g., the -XX:+UnlockExperimentalVMOptions and -XX:+UseCGroupMemoryLimitForHeap flags) have mitigated many of the issues. Our early Java 8 workloads crashed because the JVM could not recognize the limits imposed by Linux cgroups and namespaces, so it sized its heap from the host's memory rather than the container's. We now run Java 11+ and give each container 1 GB of Kubernetes memory beyond the JVM -Xmx heap size, which leaves headroom for non-heap memory.
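The "heap plus 1 GB" rule can be turned into a small helper. This is a minimal sketch: the function names are hypothetical, only the k/m/g suffixes of -Xmx are handled, and the 1 GB headroom is simply the figure quoted above, not a universal recommendation.

```python
import re

# Headroom on top of the heap, taken from the "extra 1 GB" rule in the text.
HEADROOM_MIB = 1024

# Conversion factors from the -Xmx unit suffix to MiB.
_UNITS_MIB = {"k": 1 / 1024, "m": 1, "g": 1024}


def xmx_to_mib(xmx_flag: str) -> int:
    """Parse a JVM flag such as '-Xmx4g' or '-Xmx512m' into MiB."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG])", xmx_flag)
    if match is None:
        # Assumption: a bare byte count (no suffix) is not supported here.
        raise ValueError(f"unsupported -Xmx value: {xmx_flag!r}")
    size, unit = int(match.group(1)), match.group(2).lower()
    return int(size * _UNITS_MIB[unit])


def container_memory_limit(xmx_flag: str, headroom_mib: int = HEADROOM_MIB) -> str:
    """Return a Kubernetes memory quantity string for the container."""
    return f"{xmx_to_mib(xmx_flag) + headroom_mib}Mi"
```

For a JVM started with -Xmx4g, the helper yields a memory value of 5120Mi (4096 MiB of heap plus 1024 MiB of headroom).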
2 Kubernetes Lifecycle Management: Upgrades
In‑place upgrades are cumbersome; the simplest approach is to provision a fresh cluster with the latest version and migrate workloads. Tools like Kubespray, KubeOne, kops, and kube-aws help, but often require stepping through every minor version. We built our own clusters on RHEL VMs with Kubespray, which provides playbooks for adding nodes, removing nodes, and upgrading, though its upgrade playbooks enforce sequential version jumps.
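To see what sequential version jumps mean in practice, the sketch below lists the minor versions an upgrade has to pass through. The function name is hypothetical and the logic only models the "one minor version at a time" constraint; it does not call Kubespray or any other tool.

```python
from typing import List, Tuple


def _major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like '1.15', '1.15.3' or 'v1.15.3'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def upgrade_path(current: str, target: str) -> List[str]:
    """Return every minor version to step through, ending at the target."""
    cur_major, cur_minor = _major_minor(current)
    tgt_major, tgt_minor = _major_minor(target)
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target is older than the current version")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]
```

Going from 1.15 to 1.18 therefore takes three separate upgrade runs (1.16, 1.17, 1.18), which is why a fresh cluster is often the cheaper route.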
3 Build and Deployment Redesign
We re‑architected our CI/CD pipeline, moving from monolithic Jenkins jobs to a Git‑centric workflow using Helm charts. Application code and its Helm chart live in separate Git repositories, enabling independent versioning. Each application release is paired with a chart release (e.g., app-1.2.0 with charts-1.1.0); a change that touches only Helm values bumps just the chart's patch number. Non‑code system services (Kafka, Redis) use the Docker tag as the sole version indicator, and the chart's major version is bumped whenever the Docker tag changes.
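The two chart bump rules described above can be sketched as a small function. The change labels "values" and "docker_tag" are hypothetical names chosen for this illustration, and resetting minor and patch to zero on a major bump is an assumption borrowed from semantic versioning rather than something the text specifies.

```python
def bump_chart_version(chart_version: str, change: str) -> str:
    """Return the next chart version for a given kind of change."""
    major, minor, patch = (int(part) for part in chart_version.split("."))
    if change == "values":
        # Only Helm values changed: bump the patch number.
        return f"{major}.{minor}.{patch + 1}"
    if change == "docker_tag":
        # The Docker tag of a system service changed: bump the major version.
        # Assumption: minor and patch reset to 0, as in semantic versioning.
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown change type: {change!r}")
```

With these rules, charts-1.1.0 becomes 1.1.1 after a values-only change and 2.0.0 after a Docker tag change.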
4 Liveness and Readiness Probes (Double‑Edged Sword)
Liveness probes restart failing containers automatically, but for stateful services like Kafka they can interfere with long start‑up procedures. Our 3‑broker Kafka cluster, with a replication factor of 3 and min.insync.replicas=2, sometimes needed 10‑30 minutes to rebuild indexes after a crash; aggressive liveness probes would repeatedly kill the pod before it finished. The workaround is to increase initialDelaySeconds so the application has enough time to start, at the cost of slower failure detection: the longer the delay, the longer a genuinely hung container goes unnoticed.
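The trade-off can be made concrete with a rough timing model. This sketch assumes the first liveness check runs at initialDelaySeconds and that the container is restarted after failureThreshold consecutive failures spaced periodSeconds apart; it ignores probe timeouts and scheduling jitter. The function name is hypothetical, and the defaults of 10 and 3 mirror the Kubernetes defaults for periodSeconds and failureThreshold.

```python
def liveness_timeline(worst_case_startup_seconds: int,
                      period_seconds: int = 10,
                      failure_threshold: int = 3) -> dict:
    """Estimate probe timing when initialDelaySeconds covers the whole start-up."""
    # Conservative choice: do not probe until the slowest expected start-up is over.
    initial_delay = worst_case_startup_seconds
    # Approximate earliest moment a container that never becomes healthy is restarted.
    earliest_restart = initial_delay + (failure_threshold - 1) * period_seconds
    return {
        "initialDelaySeconds": initial_delay,
        "earliestRestartSeconds": earliest_restart,
    }
```

Covering a 30-minute index rebuild means a container that hangs right after starting is not restarted for roughly the same 30 minutes, which is the cost of the workaround.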
Update: Newer Kubernetes releases (alpha in 1.16, beta in 1.18) introduce a "startup probe" that holds off liveness and readiness checks until the startup probe itself succeeds, preventing premature restarts during a slow start.
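A startup probe gives the container a start-up budget of failureThreshold × periodSeconds. The sketch below picks a failureThreshold large enough for a given worst-case start-up time; the function name is hypothetical, and the returned keys mirror the probe field names.

```python
import math


def startup_probe_settings(max_startup_seconds: int, period_seconds: int = 10) -> dict:
    """Choose failureThreshold so that threshold * period covers the start-up time."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    failure_threshold = math.ceil(max_startup_seconds / period_seconds)
    return {"periodSeconds": period_seconds, "failureThreshold": failure_threshold}
```

For the 30-minute Kafka recovery mentioned earlier, a 10-second period needs a failureThreshold of 180, while the liveness probe can stay aggressive once start-up has completed.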
5 Exposing Services via Static External IPs
Using static external IPs incurs heavy conntrack overhead. Our clusters run Calico CNI with BGP routing and iptables‑mode kube‑proxy. Each external connection is tracked via the kernel conntrack table; once the table reaches its limit (e.g., net.netfilter.nf_conntrack_max = 262144), new connections are dropped. Scaling the conntrack table or distributing inbound traffic across edge routers can mitigate this bottleneck.
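A node's distance from the limit can be estimated by comparing nf_conntrack_count with nf_conntrack_max. The sketch below parses text in the `key = value` format that sysctl prints; the function names are hypothetical, and any count value fed to it in an example is made up, since the text only gives the maximum.

```python
def parse_sysctl(output: str) -> dict:
    """Parse 'key = value' lines from sysctl output into a dict of integers."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


def conntrack_usage(output: str) -> float:
    """Return the fraction of the conntrack table currently in use."""
    values = parse_sysctl(output)
    count = values["net.netfilter.nf_conntrack_count"]
    maximum = values["net.netfilter.nf_conntrack_max"]
    return count / maximum
```

Alerting when this ratio passes a chosen threshold (the exact value is a judgment call) gives some warning before new connections start to be dropped.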
$ sysctl net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_max = 262144
6 Do You Really Need Kubernetes?
Kubernetes brings architectural shifts, operational overhead, and a steep learning curve. Managed services can reduce maintenance burden, but you must assess whether the platform’s benefits outweigh its costs for your specific use case. Adopt Kubernetes only when its features are essential to your workload.
