Why Prometheus Wins for Cloud‑Native Monitoring and G‑Bank’s Deployment Secrets
G‑Bank deploys Prometheus, a popular choice for cloud‑native monitoring, through the Prometheus Operator and its CRDs, which automate service discovery, rule management, and alerting. This article also covers how G‑Bank addresses performance limits and metric accuracy, and how its storage strategies and closed‑loop monitoring support scalable, distributed observability.
1. Why Choose Prometheus
Prometheus is an open‑source monitoring system designed for cloud‑native applications. It covers business, performance, container, micro‑service, and application monitoring, and it integrates with other systems. Compared with Zabbix and Open‑falcon, Prometheus is stronger in time‑series storage, multi‑dimensional (label‑based) data collection, and its query language, PromQL.
2. Prometheus Deployment in Cloud Monitoring
Monitoring dimensions
Monitoring in Kubernetes includes resource monitoring (node, cluster, pod utilization) and application monitoring (request volume, response time, etc.).
Key monitoring objects
Core components: apiserver, controller‑manager, etcd.
Static physical resources: node status, kernel events.
Dynamic scalable resources: containers, Deployments, Services, Pods.
Custom application metrics: JMX, response time, latency.
In G‑Bank’s container cloud, the Prometheus Operator uses CRDs to define dynamic entities. ServiceMonitor discovers Services with matching labels, and Prometheus pulls metrics from those endpoints.
Service discovery
Service objects carry labels (e.g., k8s-app), and a ServiceMonitor selects them with a label selector so that Prometheus can scrape the endpoints behind them.
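The label matching described above can be sketched in Python. The manifests are written as plain dictionaries; the Service name, the label value demo-app, and the port are illustrative assumptions, and the matching function is a simplification that ignores namespaces and matchExpressions.

```python
# Hypothetical Service; the name, label value, and port are illustrative.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "demo-app", "labels": {"k8s-app": "demo-app"}},
    "spec": {
        "selector": {"app": "demo-app"},
        "ports": [{"name": "metrics", "port": 8080}],
    },
}

# ServiceMonitor (a Prometheus Operator CRD) that selects Services by label.
service_monitor = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "ServiceMonitor",
    "metadata": {"name": "demo-app", "labels": {"k8s-app": "demo-app"}},
    "spec": {
        "selector": {"matchLabels": {"k8s-app": "demo-app"}},
        "endpoints": [{"port": "metrics", "interval": "30s"}],
    },
}


def selector_matches(match_labels, labels):
    """True when every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def scrape_ports(monitor, svc):
    """Return the Service port names the monitor would scrape, or [] if not selected."""
    match_labels = monitor["spec"]["selector"].get("matchLabels", {})
    if not selector_matches(match_labels, svc["metadata"].get("labels", {})):
        return []
    declared = {port["name"] for port in svc["spec"]["ports"]}
    return [ep["port"] for ep in monitor["spec"]["endpoints"] if ep["port"] in declared]


print(scrape_ports(service_monitor, service))  # ['metrics']
```

A Service whose k8s-app label has a different value is simply not selected, so adding or removing the label is what turns monitoring on or off.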
Rule discovery
The ruleSelector field of the Prometheus resource selects PrometheusRule objects by label (here prometheus: k8s and role: alert-rules), so alerting rules can be added, modified, or deleted dynamically.
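As a rough illustration of rule discovery, the sketch below builds two PrometheusRule objects as Python dictionaries and applies a ruleSelector with the two labels named above. The rule names, the alert DemoAppDown, and its expression are hypothetical; only the label pair comes from the text.

```python
# Fragment of a Prometheus resource spec; the two labels come from the article.
prometheus_spec = {
    "ruleSelector": {"matchLabels": {"prometheus": "k8s", "role": "alert-rules"}}
}


def make_rule(name, labels, alert, expr):
    """Build a minimal PrometheusRule manifest with one alerting rule."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "groups": [
                {
                    "name": f"{name}.rules",
                    "rules": [
                        {
                            "alert": alert,
                            "expr": expr,
                            "for": "5m",
                            "labels": {"severity": "warning"},
                        }
                    ],
                }
            ]
        },
    }


# Hypothetical rule objects: the second one lacks the prometheus label.
rule_objects = [
    make_rule("demo-app", {"prometheus": "k8s", "role": "alert-rules"},
              "DemoAppDown", 'up{job="demo-app"} == 0'),
    make_rule("unlabeled", {"role": "alert-rules"}, "Ignored", "vector(1)"),
]


def selected_alerts(selector, objects):
    """Names of the alerts whose PrometheusRule carries every selector label."""
    wanted = selector["matchLabels"]
    names = []
    for obj in objects:
        labels = obj["metadata"]["labels"]
        if all(labels.get(key) == value for key, value in wanted.items()):
            for group in obj["spec"]["groups"]:
                names.extend(rule["alert"] for rule in group["rules"])
    return names


print(selected_alerts(prometheus_spec["ruleSelector"], rule_objects))  # ['DemoAppDown']
```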
Alertmanager
Alertmanager's configuration is generated from a ConfigMap that contains a YAML file. The generated file defines alert routing and notification settings, which are applied whenever a rule defined in a PrometheusRule object fires an alert.
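The routing part of such a file might look like the structure below, shown as a Python dictionary with a simplified receiver lookup. The receiver names and webhook URLs are hypothetical, and the lookup handles only equality matchers and first-match descent, which is a reduced version of what Alertmanager actually does.

```python
# Hypothetical Alertmanager configuration; receiver names and URLs are made up.
alertmanager_config = {
    "route": {
        "receiver": "default-webhook",
        "group_by": ["alertname", "namespace"],
        "routes": [
            {"matchers": ['severity="critical"'], "receiver": "oncall-webhook"},
        ],
    },
    "receivers": [
        {"name": "default-webhook",
         "webhook_configs": [{"url": "http://alert-gateway.example/default"}]},
        {"name": "oncall-webhook",
         "webhook_configs": [{"url": "http://alert-gateway.example/oncall"}]},
    ],
}


def parse_equality(matcher):
    """Split a matcher such as severity="critical" into (key, value)."""
    key, value = matcher.split("=", 1)
    return key.strip(), value.strip().strip('"')


def pick_receiver(route, labels):
    """Descend into the first child route whose matchers all match the labels."""
    for child in route.get("routes", []):
        pairs = [parse_equality(m) for m in child.get("matchers", [])]
        if all(labels.get(key) == value for key, value in pairs):
            return pick_receiver(child, labels)
    return route["receiver"]


root = alertmanager_config["route"]
print(pick_receiver(root, {"alertname": "DemoAppDown", "severity": "critical"}))
print(pick_receiver(root, {"alertname": "DemoAppDown", "severity": "warning"}))
```

The first call resolves to oncall-webhook and the second falls back to default-webhook, which is the behavior the root route's receiver is meant to provide.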
3. Limitations of Prometheus
Performance drawbacks
Prometheus uses a pull model, so scraping a large number of targets can add latency and network load in large clusters. Its local storage has limited performance, and its single‑node architecture does not scale well.
Metric accuracy
Kubelet metrics are not real‑time, and timestamp loss can produce inaccurate utilization curves, making Prometheus less suitable for latency‑sensitive systems.
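A small worked example shows how lost timestamps distort a utilization curve. It assumes a cumulative CPU‑seconds counter sampled at a steady 50% load, and invented scrape times that replace the original sample times; the numbers are illustrative, not measurements.

```python
def utilization(samples):
    """Per-interval rate of a cumulative CPU-seconds counter: delta value / delta time."""
    return [
        (v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


# (timestamp in seconds, cumulative CPU seconds): a steady 50% load.
true_samples = [(0.0, 0.0), (10.0, 5.0), (20.0, 10.0)]

# Assumed scrape times that are stamped on the samples when the original
# timestamps are lost; the values stay the same, only the time axis shifts.
scrape_times = [2.0, 17.0, 23.0]
restamped = [(ts, value) for ts, (_, value) in zip(scrape_times, true_samples)]

print(utilization(true_samples))  # [0.5, 0.5]
print([round(u, 3) for u in utilization(restamped)])  # [0.333, 0.833]
```

The workload never changed, yet the restamped series suggests a dip followed by a spike, which is the kind of inaccurate curve the paragraph above describes.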
4. G‑Bank’s Advanced Prometheus Practices
Monitoring deployment
G‑Bank clones Service YAML files, adds custom labels, creates redundant Services, and links them with ServiceMonitors. This enables dynamic, label‑driven monitoring across the container cloud.
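The cloning step could be sketched as follows, assuming the Service manifest has already been loaded into a dictionary. The -monitor name suffix and the monitor-group label are hypothetical, and the list of server‑populated fields that gets stripped is a reasonable guess rather than G‑Bank's actual procedure.

```python
import copy


def clone_service_for_monitoring(service, extra_labels, suffix="-monitor"):
    """Return a renamed copy of a Service manifest with extra labels attached."""
    clone = copy.deepcopy(service)
    meta = clone["metadata"]
    meta["name"] = meta["name"] + suffix
    meta.setdefault("labels", {}).update(extra_labels)
    # Drop fields the API server fills in, so the copy can be created cleanly.
    for field in ("resourceVersion", "uid", "creationTimestamp"):
        meta.pop(field, None)
    clone.get("spec", {}).pop("clusterIP", None)
    clone.pop("status", None)
    return clone


# Hypothetical original Service as it might be read back from the cluster.
original = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "payments", "labels": {"app": "payments"}, "uid": "abc"},
    "spec": {
        "clusterIP": "10.0.0.12",
        "selector": {"app": "payments"},
        "ports": [{"name": "metrics", "port": 9100}],
    },
}

redundant = clone_service_for_monitoring(original, {"monitor-group": "payments"})
print(redundant["metadata"])
```

Because the copy keeps the original pod selector, it fronts the same pods, and a ServiceMonitor that selects the new label can pick it up without touching the original Service.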
Closed‑loop monitoring
Metrics such as monitoring coverage rate, reachability ratio, and standardization rate are calculated to evaluate and continuously improve monitoring quality.
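The source does not define these three ratios, so the sketch below uses assumed definitions: coverage is the share of services that have a ServiceMonitor, reachability is the share of scrape targets that are up among monitored services, and standardization is the share of services carrying a required label set. The required labels and the sample inventory are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceStatus:
    name: str
    has_service_monitor: bool
    targets_total: int = 0
    targets_up: int = 0
    labels: dict = field(default_factory=dict)


# Assumed labeling standard; "team" is a hypothetical required label.
REQUIRED_LABELS = {"k8s-app", "team"}


def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def quality_report(services):
    """Compute the three assumed monitoring-quality ratios."""
    monitored = [s for s in services if s.has_service_monitor]
    standardized = [s for s in services if REQUIRED_LABELS.issubset(s.labels)]
    return {
        "coverage": ratio(len(monitored), len(services)),
        "reachability": ratio(
            sum(s.targets_up for s in monitored),
            sum(s.targets_total for s in monitored),
        ),
        "standardization": ratio(len(standardized), len(services)),
    }


inventory = [
    ServiceStatus("pay", True, 2, 2, {"k8s-app": "pay", "team": "core"}),
    ServiceStatus("auth", True, 4, 3, {"k8s-app": "auth"}),
    ServiceStatus("ledger", True, 2, 2, {"k8s-app": "ledger", "team": "core"}),
    ServiceStatus("batch", False, 0, 0, {}),
]

print(quality_report(inventory))
# {'coverage': 0.75, 'reachability': 0.875, 'standardization': 0.5}
```

Tracking these numbers over time is what closes the loop: a drop in coverage points at unmonitored services, and a drop in reachability points at broken scrape targets.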
Storage strategies
Remote Write: Prometheus data from multiple clusters is written to an InfluxDB cluster for a global view.
Prometheus RPC: An RPC server aggregates metrics from external Prometheus instances, providing an approximate global view.
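Both strategies end with per‑cluster series being combined into one view. The sketch below shows that merge step only, with in‑memory data standing in for whatever the remote store or RPC server would return; the cluster names, the namespace label, and the values are invented for illustration.

```python
def tag_with_cluster(per_cluster):
    """Flatten per-cluster series and add a cluster label to keep them distinct."""
    merged = []
    for cluster, series in per_cluster.items():
        for labels, value in series:
            merged.append(({**labels, "cluster": cluster}, value))
    return merged


def sum_by(series, label):
    """Aggregate values by one label, similar in spirit to sum by (label) in PromQL."""
    totals = {}
    for labels, value in series:
        key = labels.get(label, "")
        totals[key] = totals.get(key, 0.0) + value
    return totals


# Hypothetical request rates reported by two clusters.
per_cluster = {
    "cluster-a": [({"namespace": "pay"}, 120.0), ({"namespace": "core"}, 30.0)],
    "cluster-b": [({"namespace": "pay"}, 80.0)],
}

merged = tag_with_cluster(per_cluster)
print(sum_by(merged, "namespace"))  # {'pay': 200.0, 'core': 30.0}
print(sum_by(merged, "cluster"))    # {'cluster-a': 150.0, 'cluster-b': 80.0}
```

The view is approximate in practice because the clusters are scraped at different moments, so the values being summed do not refer to exactly the same instant.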
Alerting enhancements
Externalizing alerts: Complex alert rules are moved to an external Prometheus cluster via a Service that encodes the logic.
Integration with Kafka and Flink: Alerts are streamed through Kafka and processed by Flink for real‑time notification.
Unified data: Data from Open‑falcon, Zabbix, etc., are collected into Kafka, normalized, and fed into the alerting pipeline.
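The normalization step in the last item might look like the sketch below, which maps alerts from three sources into one record and serializes it as the value of a Kafka message. The UnifiedAlert schema is an assumption. The Prometheus branch reads the labels and status fields of an Alertmanager webhook alert, while the Zabbix and Open‑falcon field names are hypothetical stand‑ins, since the source does not describe those payloads. No Kafka client is used; producing the bytes is left to whichever client the pipeline runs.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UnifiedAlert:
    """Assumed common schema for alerts from all monitoring systems."""
    source: str
    name: str
    severity: str
    target: str
    firing: bool


def from_alertmanager(alert):
    """Map one alert from an Alertmanager webhook payload."""
    labels = alert.get("labels", {})
    return UnifiedAlert(
        source="prometheus",
        name=labels.get("alertname", "unknown"),
        severity=labels.get("severity", "none"),
        target=labels.get("instance", ""),
        firing=alert.get("status") == "firing",
    )


def from_zabbix(event):
    """Map a Zabbix event; these field names are hypothetical."""
    return UnifiedAlert("zabbix", event["trigger"], event["severity"].lower(),
                        event["host"], event["status"] == "PROBLEM")


def from_open_falcon(event):
    """Map an Open-falcon event; these field names are hypothetical."""
    return UnifiedAlert("open-falcon", event["metric"], f"p{event['priority']}",
                        event["endpoint"], event["status"] == "PROBLEM")


NORMALIZERS = {
    "prometheus": from_alertmanager,
    "zabbix": from_zabbix,
    "open-falcon": from_open_falcon,
}


def to_kafka_value(source, payload):
    """Normalize a payload and encode it as UTF-8 JSON bytes for a Kafka topic."""
    alert = NORMALIZERS[source](payload)
    return json.dumps(asdict(alert), sort_keys=True).encode("utf-8")


sample = {
    "status": "firing",
    "labels": {"alertname": "HighLatency", "severity": "critical",
               "instance": "10.0.0.5:8080"},
}
print(to_kafka_value("prometheus", sample))
```

Once every source produces the same record shape, the downstream Flink job only has to understand one schema, regardless of which monitoring system raised the alert.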
5. Distributed Monitoring Impact at G‑Bank
The distributed monitoring platform improves fault detection, rapid localization, and self‑healing, leading to healthier service operation, better compliance, and faster performance optimization for micro‑services.
dbaplus Community
