
How Grafana Phlare Enables Scalable Continuous Profiling for Cloud‑Native Environments

Grafana Phlare is an open‑source, horizontally scalable continuous profiling database that integrates with Grafana, offering easy installation, multi‑tenant support, cheap object‑storage persistence, and both monolithic and microservice deployment modes, with detailed Helm‑based Kubernetes setup and usage instructions.

Ops Development Stories
Grafana Phlare is an open-source project for aggregating continuous profiling data. It integrates fully with Grafana, so profiles can be correlated with other observability signals.

What is continuous profiling?

Profiling shows how a program uses resources such as CPU and memory, so that performance and cost can be optimized. In distributed cloud-native architectures, continuous profiling automatically collects, compresses, and stores this resource-usage information as time-series data. You can then visualize it over time and zoom in on periods of interest, such as CPU usage at peak load.
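
To make the idea concrete, the sketch below polls a Go service's pprof endpoint in a loop and saves each CPU profile under a timestamped file name, which is roughly the collection step that a continuous profiler automates at scale. The base URL, the function names, and the file naming scheme are assumptions for illustration only; Phlare scrapes and stores profiles itself.

```shell
# Hypothetical helper: build a file name that encodes profile type and capture time.
# $1 = profile type (e.g. cpu), $2 = Unix timestamp
profile_filename() {
  printf '%s-%s.pb.gz\n' "$1" "$2"
}

# Hypothetical helper: capture a fixed number of CPU profiles from a Go pprof endpoint.
# $1 = base URL (assumed, e.g. http://localhost:6060)
# $2 = seconds sampled per profile
# $3 = number of profiles to capture
collect_profiles() {
  i=0
  while [ "$i" -lt "$3" ]; do
    out=$(profile_filename cpu "$(date +%s)")
    # /debug/pprof/profile?seconds=N is served by Go's net/http/pprof package.
    curl -fsS "$1/debug/pprof/profile?seconds=$2" -o "$out" || return 1
    i=$((i + 1))
  done
}
```

For example, `collect_profiles http://localhost:6060 10 6` would capture six consecutive 10-second CPU profiles, assuming a Go service exposes pprof on that address.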

Continuous profiling is often described as the fourth pillar of observability, alongside metrics, logs, and traces.

Grafana Labs uses continuous profiling to analyze the performance of its own services (Loki, Mimir, Tempo, Grafana), for example to identify slow queries in Mimir or to find memory-heavy objects before a crash.

Existing open-source projects did not meet Grafana Labs' requirements for scalability, reliability, and performance, so a dedicated project was started during a company-wide hackathon. That project demonstrated the value of profiling data when linked with metrics, logs, and traces.

Consequently, a profiling telemetry database was built using the same design principles as Loki, Tempo, and Mimir: horizontal scalability and object‑storage backing.

Core Features

Grafana Phlare offers horizontal scalability, high availability, long‑term storage, and query capabilities. Like Prometheus, it runs from a single binary without extra dependencies. Object storage provides cheap, durable history, and native multi‑tenant isolation lets multiple teams share a single database.

Easy installation: Grafana Phlare ships as a single binary that runs in monolithic mode; Helm charts enable other deployment modes on Kubernetes.

Horizontal scalability: run Grafana Phlare on many machines to handle the profiling load.

High availability: Grafana Phlare replicates incoming profiles to avoid data loss during node failures.

Cheap, durable profile storage: object storage (AWS S3, GCS, Azure Blob, OpenStack Swift, or any S3-compatible store) holds long-term data.

Native multi-tenancy: data and queries are isolated per team or business unit.

Architecture

Grafana Phlare follows a microservice architecture with multiple horizontally scalable components compiled into a single binary. The -target flag selects which components run, similar to Grafana Loki. In monolithic mode all components run in one process.

Most components are stateless; some are stateful and rely on durable storage. The main components form a cluster: Distributor, Ingester, and Querier.

Monolithic mode

All required components run in a single process. This is the default mode and corresponds to -target=all. To list the components that -target=all includes, run the binary with the -modules flag:

./phlare -modules

Microservice mode

Components are deployed as separate processes, which allows independent scaling and finer-grained fault domains. Start each required component with its own target (e.g., -target=ingester, -target=distributor). Kubernetes is recommended for production.

Deployment

The steps below deploy Phlare with Helm and assume a Kubernetes cluster with kubectl and helm already configured.

Create a namespace:

kubectl create namespace phlare-test

Add the Helm repository:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Install in monolithic mode (default):

helm -n phlare-test install phlare grafana/phlare

For microservice mode, fetch the default values file and adjust as needed:

# Download the default microservice values file (curl -O saves it under its remote name)
curl -LO https://raw.githubusercontent.com/grafana/phlare/main/operations/phlare/helm/phlare/values-micro-services.yaml

An excerpt of the values file:

phlare:
  components:
    querier:
      kind: Deployment
      replicaCount: 3
      resources:
        limits:
          memory: 1Gi
        requests:
          memory: 256Mi
          cpu: 100m
    # ... other components ...
minio:
  enabled: true

Install with the customized values:

helm -n phlare-test upgrade --install phlare grafana/phlare -f values-micro-services.yaml

Verify pods are running:

kubectl get pods -n phlare-test
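
Rather than re-running `kubectl get pods` by hand, the readiness check can be scripted. This is a minimal sketch built on `kubectl wait`; the function name, the `KUBECTL` override (handy for dry runs), and the 300-second default timeout are assumptions and do not come from the Phlare documentation. Pods that belong to completed jobs never report Ready, so the command may time out if the namespace contains any.

```shell
# Command to invoke; set KUBECTL=echo to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: block until every pod in the namespace reports Ready.
# $1 = namespace, $2 = timeout (optional, defaults to 300s)
wait_for_pods() {
  "$KUBECTL" wait --namespace "$1" --for=condition=Ready pods --all --timeout="${2:-300s}"
}
```

With the namespace used in this article, the call would be `wait_for_pods phlare-test`.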

Usage

Install Grafana in the same cluster and configure a Phlare datasource.

# Generate Grafana manifest with profiling enabled
helm template -n phlare-test grafana grafana/grafana \
  --set image.repository=aocenas/grafana \
  --set image.tag=profiling-ds-2 \
  --set env.GF_FEATURE_TOGGLES_ENABLE=flameGraph \
  --set env.GF_AUTH_ANONYMOUS_ENABLED=true \
  --set env.GF_AUTH_ANONYMOUS_ORG_ROLE=Admin \
  --set env.GF_DIAGNOSTICS_PROFILING_ENABLED=true \
  --set env.GF_DIAGNOSTICS_PROFILING_ADDR=0.0.0.0 \
  --set env.GF_DIAGNOSTICS_PROFILING_PORT=6060 \
  --set-string 'podAnnotations.phlare\.grafana\.com/scrape=true' \
  --set-string 'podAnnotations.phlare\.grafana\.com/port=6060' > grafana.yaml
kubectl apply -f grafana.yaml

Port-forward the Grafana service to your machine, then add a Phlare datasource with the URL http://phlare-querier.phlare-test.svc.cluster.local.:4100/. The datasource can then be queried in Grafana Explore in the same way as Loki or Prometheus, and the results can be shown in flame-graph panels.
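
The port-forward step and the datasource URL can be scripted along the following lines. The Service name `grafana` and Service port 80 are assumptions based on the release name used above and the Grafana chart's usual defaults; the Phlare service name and port 4100 are taken from the URL in this article. Both function names are hypothetical.

```shell
# Command to invoke; set KUBECTL=echo to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: build the in-cluster URL of a Phlare service.
# $1 = service name, $2 = namespace, $3 = port
phlare_url() {
  printf 'http://%s.%s.svc.cluster.local.:%s/\n' "$1" "$2" "$3"
}

# Hypothetical helper: expose Grafana on http://localhost:3000 until interrupted.
# $1 = namespace (the Service name and port 80 are assumed, see above)
forward_grafana() {
  "$KUBECTL" port-forward --namespace "$1" service/grafana 3000:80
}
```

Running `forward_grafana phlare-test` in one terminal and pasting the output of `phlare_url phlare-querier phlare-test 4100` into the datasource form reproduces the manual steps.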

By default, Phlare's Helm chart is configured (via kubernetes_sd_config and relabel_config) to scrape only pods that carry the following annotations:

metadata:
  annotations:
    phlare.grafana.com/scrape: "true"
    phlare.grafana.com/port: "8080"

Set the port to the one on which the pod serves its /debug/pprof/ endpoint. The Grafana manifest generated above already sets these annotations with port 6060, so Phlare continuously collects profiles from the Grafana application.
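
For a workload that is already deployed, the same annotations can be added to its pod template with a merge patch. The sketch below rests on several assumptions: the function names and the `KUBECTL` override are hypothetical, and the workload is assumed to be a Deployment. Only the two annotation keys come from this article.

```shell
# Command to invoke; set KUBECTL=echo to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: print a JSON merge patch that adds the scrape
# annotations to a workload's pod template.
# $1 = port on which the pod serves /debug/pprof/
scrape_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"phlare.grafana.com/scrape":"true","phlare.grafana.com/port":"%s"}}}}}\n' "$1"
}

# Hypothetical helper: apply the patch to a Deployment.
# $1 = namespace, $2 = deployment name, $3 = pprof port
annotate_deployment() {
  "$KUBECTL" patch deployment "$2" --namespace "$1" --type merge -p "$(scrape_patch "$3")"
}
```

A call such as `annotate_deployment default my-app 8080` (with `my-app` standing in for a real Deployment) changes the pod template, so Kubernetes rolls out new pods that carry the annotations.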

References

https://github.com/grafana/phlare

https://grafana.com/blog/2022/11/02/announcing-grafana-phlare-oss-continuous-profiling-database/
