How PromQL Copilot Turns Natural Language into Precise Monitoring Queries

PromQL Copilot leverages Alibaba Cloud's observability platform and AI techniques to convert ambiguous natural‑language monitoring requests into accurate PromQL statements, addressing challenges of ambiguity, domain knowledge, and metric coverage while providing generation, explanation, diagnosis, and recommendation features for cloud‑native environments.

Alibaba Cloud Observability

In modern cloud‑native and distributed system operations, Prometheus and its query language PromQL have become the de facto standard for monitoring and alerting. PromQL's steep learning curve and complex syntax, however, keep many users from using it efficiently.

Challenges of Natural‑Language to PromQL

Ambiguity: the same natural‑language query can map to different metrics depending on context.

Implicit intent: vague questions such as "how is system load?" lack concrete metric references.

Label mapping: terms like "instance" may correspond to different label keys (instance, instanceID).
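The label-mapping problem can be illustrated with a small sketch. It is not PromQL Copilot's implementation: the alias table, the function name resolve_label, and the extra aliases (instance_id, pod) are assumptions made for illustration. Only instance, instanceID, and pod_name appear in this article.

```python
# Hypothetical alias table: a word the user might type, mapped to the
# label keys it could refer to. The entries are illustrative assumptions.
LABEL_ALIASES = {
    "instance": ["instance", "instanceID", "instance_id"],
    "pod": ["pod", "pod_name"],
}


def resolve_label(term, actual_labels):
    """Return the first alias of `term` present in the metric's real labels.

    `actual_labels` is the set of label keys observed on the metric.
    Returns None when no alias matches, which is the case where a
    follow-up question to the user would be needed.
    """
    for candidate in LABEL_ALIASES.get(term.lower(), [term]):
        if candidate in actual_labels:
            return candidate
    return None


# The same user word resolves differently depending on the metric's labels.
print(resolve_label("instance", {"instanceID", "region"}))  # instanceID
print(resolve_label("instance", {"instance", "job"}))       # instance
```

The point of the sketch is that the user's word alone is not enough: the resolution depends on which label keys the target metric actually carries.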

Solutions Implemented in PromQL Copilot

Built a comprehensive metric knowledge base covering Alibaba Cloud products and open‑source components, enabling accurate metric recommendation.

Added query rewriting and multi‑turn interaction, which clarify user intent and enrich queries with domain tags (e.g., container, k8s).

Integrated SLS SPL to retrieve actual metric labels, ensuring generated PromQL uses correct label names.

Added pre‑run validation and automatic diagnosis, which catch and fix syntax or semantic errors before execution.
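As a rough illustration of what a pre-run check might look for, the sketch below tests two things: that brackets are balanced, and that every label used in a selector exists among the labels actually retrieved for the metric. It is a simplified assumption, not the product's validator. The function names are hypothetical, only double-quoted strings are handled, and a real check would use a full PromQL parser.

```python
import re

CLOSERS = {")": "(", "]": "[", "}": "{"}


def _strip_strings(query):
    # Blank out double-quoted label values so their contents cannot be
    # mistaken for brackets or label matchers.
    return re.sub(r'"(?:\\.|[^"\\])*"', '""', query)


def _brackets_balanced(query):
    stack = []
    for ch in query:
        if ch in "([{":
            stack.append(ch)
        elif ch in CLOSERS:
            if not stack or stack.pop() != CLOSERS[ch]:
                return False
    return not stack


def validate(query, known_labels):
    """Return a list of problems found in `query` (empty if none).

    `known_labels` is the set of label keys that really exist on the
    metric, for example as retrieved from the metric store.
    """
    cleaned = _strip_strings(query)
    problems = []
    if not _brackets_balanced(cleaned):
        problems.append("unbalanced brackets")
    for selector in re.findall(r"\{([^}]*)\}", cleaned):
        names = re.findall(r"([A-Za-z_][A-Za-z0-9_]*)\s*(?:=~|!~|!=|=)", selector)
        for name in names:
            if name not in known_labels:
                problems.append(f"unknown label: {name}")
    return problems


print(validate('rate(http_requests_total{instance="a"}[5m])', {"instanceID"}))
print(validate("rate(http_requests_total[5m]", set()))
```

A non-empty result is the kind of signal that could trigger an automatic rewrite, such as swapping instance for instanceID, before the query is executed.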

Core Features

PromQL generation from natural language.

PromQL explanation that breaks down query structure.

PromQL diagnosis with suggestions for improvement.

Metric recommendation tailored to the user's cloud product or scenario.

Architecture Overview

The system relies on Alibaba Cloud's SLS and CMS infrastructure and the Dify framework to form an end‑to‑end loop of natural‑language understanding, knowledge graph retrieval, query generation, and execution verification.

Usage in Cloud Console and MCP Service

In the Cloud Monitoring console, users can input everyday language and receive an executable PromQL query.

The Observability MCP service exposes the same capabilities via API and GitHub‑hosted tools.

Example Query

To find the pod with the highest outbound traffic:

topk(1, max by (pod_name)(rate(container_network_transmit_bytes_total{}[1m])))
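The query reads from the inside out, and the sketch below assembles it piece by piece to make that structure visible. The helper name top_pods_by_tx_rate and its parameters are hypothetical and not part of PromQL Copilot. With its default arguments it reproduces the expression above.

```python
def top_pods_by_tx_rate(k=1, window="1m", group_label="pod_name"):
    """Build the 'highest outbound traffic' query as a string."""
    # 1. rate(...[window]): per-second increase of the transmit byte
    #    counter, averaged over the window.
    inner = f"rate(container_network_transmit_bytes_total{{}}[{window}])"
    # 2. max by (label): keep the largest series within each group,
    #    here one value per pod.
    grouped = f"max by ({group_label})({inner})"
    # 3. topk(k, ...): return only the k highest groups.
    return f"topk({k}, {grouped})"


print(top_pods_by_tx_rate())
print(top_pods_by_tx_rate(k=5, window="5m"))
```

Changing k or the window yields variants such as the top five pods over five minutes. The grouping label has to match whatever label key the metric carries in a given cluster.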

Future Outlook

Ongoing work focuses on improving metric knowledge graph accuracy, enhancing prompt engineering for higher PromQL correctness, and optimizing the output format of MCP tools.
