Tagged articles
34 articles
MaGe Linux Operations
Mar 30, 2026 · Cloud Native

How to Scale Prometheus to Thousands of Nodes with Thanos: A Deep Dive

This article examines the storage, query performance, high‑availability, and high‑cardinality challenges of running Prometheus on a thousand‑node Kubernetes cluster and presents a complete, step‑by‑step Thanos‑based architecture, capacity‑planning models, configuration examples, and operational best practices for reliable horizontal scaling.

Kubernetes · Observability · Prometheus
0 likes · 34 min read
Raymond Ops
Dec 22, 2025 · Operations

Build a High‑Availability Prometheus Monitoring System from Scratch: Pitfalls & Performance Tuning

This guide walks you through constructing a production‑grade, highly available Prometheus monitoring stack. It covers architecture choices, sharding strategies, and common pitfalls such as memory bloat, query latency, and storage growth, and provides concrete tuning steps, Kubernetes deployment examples, and advanced optimisation techniques.

Alerting · Kubernetes · Prometheus
0 likes · 11 min read
Soul Technical Team
Jan 24, 2025 · Operations

Migration from Thanos to VictoriaMetrics: Architecture, Plan, Issues, and Benefits

This article details the end‑to‑end migration from Thanos to VictoriaMetrics, covering background analysis, architectural comparison, a phased migration plan, encountered configuration and performance issues, resolution strategies, and the resulting performance, cost, and scalability improvements for the monitoring system.

Thanos · Time Series · VictoriaMetrics
0 likes · 16 min read
Efficient Ops
Dec 11, 2024 · Operations

Thanos vs VictoriaMetrics: Which Prometheus Storage Solution Wins for Scale and Cost?

This article compares Thanos and VictoriaMetrics as long‑term storage solutions for Prometheus, evaluating their architecture, write and read paths, reliability, consistency, performance, scalability, high‑availability, and hosting costs to help you choose the most suitable option for your monitoring stack.

Long‑term Storage · Thanos · VictoriaMetrics
0 likes · 18 min read
Soul Technical Team
Sep 2, 2024 · Databases

Comparative Analysis of VictoriaMetrics and Thanos for Large‑Scale Metric Storage

This article examines the migration from Thanos to VictoriaMetrics for large‑scale metric storage, detailing background challenges, VictoriaMetrics architecture and storage engine, data write and read processes, and a comparative analysis of performance, scalability, and operational costs between the two systems.

Observability · Thanos · Time Series Database
0 likes · 15 min read
Efficient Ops
Aug 5, 2024 · Operations

Thanos vs VictoriaMetrics: Which Prometheus Long‑Term Storage Wins?

This article compares Thanos and VictoriaMetrics as Prometheus long‑term storage solutions, evaluating their architectures, write and read paths, reliability, data consistency, performance, scalability, high‑availability, and cost to help you choose the best fit for your monitoring stack.

Thanos · VictoriaMetrics · cloud
0 likes · 17 min read
Alibaba Cloud Native
Jul 10, 2024 · Cloud Native

Migrate Self‑Hosted Prometheus + Thanos to Alibaba Cloud Managed Service

This guide explains how to move from a self‑built open‑source Prometheus + Thanos monitoring stack to Alibaba Cloud's fully managed Prometheus service, covering typical deployment scenarios, migration requirements, step‑by‑step procedures for metric collection, visualization, and alerting, and key considerations for each environment.

Alibaba Cloud · Prometheus · Thanos
0 likes · 15 min read
DevOps Operations Practice
May 19, 2024 · Operations

High‑Availability Solutions for Prometheus Monitoring

Prometheus, a leading monitoring system, can achieve high availability through several common architectures—including dual-node with external storage, federated mode with external storage, and multi-node clusters combined with Thanos and object storage—each offering data persistence and load distribution to enhance system stability and performance.

External Storage · Prometheus · Thanos
0 likes · 3 min read
Alibaba Cloud Native
Apr 8, 2024 · Cloud Native

How to Build a Global View for Multiple Prometheus Instances – Community and Alibaba Cloud Solutions

This article explains why a global view is needed when Prometheus metrics are scattered across many instances, compares community approaches such as Federation, Thanos, and Remote Write, and details Alibaba Cloud's Global Aggregation Instance and Remote Write solutions with configuration examples and a real‑world case study.

Federation · Global View · Prometheus
0 likes · 25 min read
Practical DevOps Architecture
Mar 15, 2024 · Operations

Comprehensive Practical Guide to Prometheus Configuration, Optimization, and Source Code Development

This multi‑chapter guide provides in‑depth, hands‑on instruction for configuring and optimizing all Prometheus components, exploring Kubernetes monitoring, source‑code analysis, custom exporter development, high‑availability setups, service discovery, resource‑efficient scraping, and integrating Thanos for long‑term storage.

Kubernetes · Observability · Operations
0 likes · 4 min read
dbaplus Community
Jul 10, 2023 · Operations

Why Most Logging and Metrics Strategies Fail – and How to Fix Them

The author reflects on the shortcomings of current logging, metrics, and tracing practices, explains why they become costly and unscalable, and offers concrete recommendations—including log level discipline, structured logging, metric aggregation, and the use of tools like Prometheus, Cortex, and Thanos—to build a more efficient observability stack.

Metrics · Observability · Prometheus
0 likes · 18 min read
Efficient Ops
Apr 12, 2023 · Operations

Building Highly Available Prometheus Monitoring with Thanos: A Practical Guide

This article explains why native Prometheus HA solutions fall short for large, multi‑region clusters and shows how to use Thanos components—including sidecar, query, store gateway, and compactor—to achieve long‑term storage, unlimited scaling, a global view, and non‑intrusive integration with existing Prometheus deployments.

Kubernetes · Observability · Prometheus
0 likes · 22 min read
ITPUB
Nov 27, 2022 · Operations

Designing a Scalable, High‑Availability Monitoring System with Prometheus and Thanos

This article explores the challenges of building a fault‑tolerant monitoring platform, compares open‑source solutions, details why Prometheus is preferred, and shows how to achieve high availability and horizontal scaling using Thanos, remote‑write, hash‑ring sharding, and Kubernetes integration.

Thanos · cloud-native · high-availability
0 likes · 18 min read
MaGe Linux Operations
Jan 22, 2022 · Cloud Native

Boost Kubernetes Monitoring: Migrate from Prometheus to Thanos for Scalable Low‑Cost Metrics

This article examines the limitations of a standard Prometheus‑based monitoring stack on Kubernetes, explains how adopting Thanos improves metric retention and reduces infrastructure costs, and provides a detailed multi‑cluster deployment guide with Terraform, TLS configuration, and Grafana visualization.

Kubernetes · Observability · Prometheus
0 likes · 16 min read
Open Source Linux
Nov 21, 2021 · Operations

Building a Scalable Prometheus Monitoring Stack with Thanos on Kubernetes

This article explains how to design and deploy a robust monitoring solution using Prometheus, Thanos, Pushgateway, and Alertmanager on Kubernetes, covering metric collection, naming conventions, query language, high‑availability strategies, and practical YAML configurations for a production‑grade observability platform.

Alertmanager · Kubernetes · Prometheus
0 likes · 20 min read
Efficient Ops
Nov 16, 2021 · Operations

How to Build a Scalable Prometheus Monitoring System with Thanos on Kubernetes

This article explains why monitoring is essential for production stability, compares white‑box and black‑box approaches, and provides a step‑by‑step guide to deploying Prometheus, configuring scrape targets, using Pushgateway and Alertmanager, and scaling the solution with Thanos in a Kubernetes environment.

Alertmanager · Observability · Prometheus
0 likes · 21 min read
Open Source Linux
Aug 26, 2021 · Cloud Native

Why Switch from Prometheus to Thanos? Boost Metric Retention & Cut Costs

This article explains the limitations of a traditional Prometheus‑based monitoring stack for Kubernetes, demonstrates how integrating Thanos improves metric retention, scalability, and storage cost, and provides a complete multi‑cluster deployment example with Terraform and Helm configurations.

Cloud Native · Kubernetes · Observability
0 likes · 15 min read
MaGe Linux Operations
Jul 18, 2021 · Cloud Native

Boost Kubernetes Monitoring: Why Switch from Prometheus to Thanos

This article examines the limitations of a traditional Prometheus monitoring stack on Kubernetes, explains how adopting a Thanos‑based architecture improves metric retention and reduces infrastructure costs, and provides a detailed multi‑cluster deployment guide with Terraform, code snippets, and visualizations.

Kubernetes · Prometheus · Terraform
0 likes · 15 min read
Efficient Ops
Apr 18, 2021 · Operations

How to Build a Scalable Prometheus Monitoring System with Thanos on Kubernetes

This article explains why monitoring is essential for production stability, compares white‑box and black‑box approaches, details the advantages of Prometheus, walks through its architecture, metric types, query language, high‑availability strategies with Thanos, and provides practical Kubernetes deployment manifests and configuration tips.

DevOps · Kubernetes · Observability
0 likes · 21 min read
MaGe Linux Operations
Apr 3, 2021 · Operations

Designing a Scalable, High‑Availability Monitoring System with Prometheus & Thanos

This article explores the challenges of building a reliable monitoring platform, compares open‑source solutions such as Elasticsearch, Nagios, Zabbix and Prometheus, and details how to achieve high availability and horizontal scaling using Prometheus, Thanos, sharding, remote‑write, and Kubernetes orchestration.

Observability · Thanos · high availability
0 likes · 22 min read
dbaplus Community
Mar 30, 2021 · Operations

How to Build a Scalable Prometheus Monitoring Stack on Kubernetes with Thanos

This article explains why monitoring is essential for production stability, introduces Prometheus fundamentals, metric naming conventions, query types, and high‑availability solutions such as Thanos federation, then walks through a complete Kubernetes deployment including StatefulSets, RBAC, Pushgateway, Alertmanager, and Ingress configuration.

Alertmanager · DevOps · Kubernetes
0 likes · 20 min read
Efficient Ops
Nov 25, 2020 · Operations

How to Build a Scalable, Highly‑Available Prometheus Monitoring Stack with Thanos

This article explains why standard Prometheus HA solutions fall short for large, multi‑region deployments, and walks through using Thanos—its components, configuration, and best‑practice tips—to achieve long‑term storage, unlimited scaling, a global view, and non‑intrusive monitoring across 300+ clusters.

Kubernetes · Observability · Prometheus
0 likes · 24 min read
Efficient Ops
Nov 3, 2020 · Operations

How to Build a Scalable Prometheus Monitoring System with Thanos on Kubernetes

This article explains why monitoring is essential, compares white‑box and black‑box approaches, details Prometheus features, metric naming, query language, high‑availability challenges, and shows how to extend Prometheus with Thanos, Pushgateway, Alertmanager, and Kubernetes deployments for a robust observability stack.

Alertmanager · Kubernetes · Observability
0 likes · 20 min read
Cloud Native Technology Community
Apr 21, 2020 · Cloud Native

Deploying Thanos on Kubernetes: Architecture, Deployment Options, and Practical Guide

This article explains the Thanos architecture, compares Sidecar and Receiver deployment modes, walks through object‑storage configuration, and provides complete Kubernetes YAML examples for Prometheus, Thanos Sidecar, Query, Store Gateway, Ruler, Compact, and Receiver to build a large‑scale cloud‑native monitoring system.

Cloud Native · Deployment · Kubernetes
0 likes · 27 min read
Cloud Native Technology Community
Apr 8, 2020 · Operations

Decoding Thanos Architecture: From Query to Compact for Scalable Monitoring

This article provides a detailed analysis of Thanos' architecture, explaining each core component—Query, Sidecar, Store Gateway, Ruler, Compact, and the upcoming Receiver—how they enable global view, high availability, and long‑term storage for distributed Prometheus deployments, and discusses design trade‑offs and optimization strategies.

Cloud Native · Long‑term Storage · Observability
0 likes · 12 min read
Aikesheng Open Source Community
Dec 25, 2019 · Operations

Deploying Thanos for Unified Prometheus Monitoring and Long‑Term Storage

This guide explains the background, key features, architecture, and step‑by‑step deployment of Thanos—including Sidecar, Store, Query, Compact, Bucket, Rule, and Check components—to provide a unified, high‑availability Prometheus monitoring view with unlimited historical data storage using object storage.

Cloud Native · Deployment · Long‑term Storage
0 likes · 9 min read