How LoongSuite Python Probe Brings Full‑Stack Observability to GenAI Applications

This article explains three core challenges of AI-agent observability (data back-flow, inconsistent semantics, and missing end-to-end traces) and shows how the LoongSuite Python probe, built on OpenTelemetry, addresses them with automatic instrumentation, unified GenAI semantics, multi-dimensional coverage, and flexible OTLP export, making AI applications easier to monitor, debug, and optimize.

Alibaba Cloud Native

Problem Statement

Complex AI applications that combine multiple agents, tool calls, Retrieval‑Augmented Generation (RAG), and memory often suffer from hidden latency spikes, unpredictable token costs, and opaque context changes because runtime data is buried inside the model.

Traditional micro‑service observability focuses on performance and availability, but AI observability must also capture runtime context and behavior. The three core challenges are:

Capturing context changes both inside AI frameworks and user code without imposing heavy overhead.

Handling large multimodal inputs (images, audio, video) without slowing the processing pipeline.

Ensuring that collected data follows a unified semantic model so it can be reused across downstream tools.

The OpenTelemetry GenAI Special Interest Group (SIG) defined a common semantic specification for GenAI, and many platforms (e.g., Langfuse, Arize, Alibaba Cloud Observability) have adopted it. Implementing the spec, however, remains non‑trivial.

Solution – LoongSuite Python Probe

LoongSuite is Alibaba's open-source distribution of the OpenTelemetry Python probe. It is designed to provide agile and efficient AI observability while staying compatible with upstream standards.

Key capabilities:

Automatic instrumentation – detects installed libraries such as DashScope, LangChain, Flask, etc., and injects instrumentation without modifying business code.

Unified GenAI semantics – all spans and events conform to the OpenTelemetry GenAI convention, eliminating the need for adapters in downstream visualisation tools.

Multi‑dimensional coverage – traces AI calls (LLM, Agent, Tool, RAG, Memory) together with regular micro‑service calls (HTTP, gRPC, database) in a single trace.

Flexible export – data can be sent via OTLP to Jaeger, Langfuse, Alibaba Cloud Observability, or any OTLP‑compatible backend.
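To make "unified GenAI semantics" more concrete, here is a minimal sketch of the kind of attribute set a GenAI span carries. The key names follow the OpenTelemetry GenAI semantic conventions as published at the time of writing; the helper function `genai_span_attributes` and the model name are hypothetical and are not part of LoongSuite.

```python
from typing import Any, Dict, List, Optional


def genai_span_attributes(
    operation: str,
    request_model: str,
    input_tokens: int,
    output_tokens: int,
    finish_reasons: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a flat attribute dict keyed by GenAI semantic-convention names."""
    attrs: Dict[str, Any] = {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": request_model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }
    # Finish reasons are only recorded when the caller supplies them.
    if finish_reasons:
        attrs["gen_ai.response.finish_reasons"] = list(finish_reasons)
    return attrs


# "example-model" is a placeholder, not a real model identifier.
attrs = genai_span_attributes("chat", "example-model", 15, 20, ["stop"])
for key in sorted(attrs):
    print(f"{key} = {attrs[key]!r}")
```

Because every producer uses the same keys, a backend can read token usage or model names without a per-framework adapter.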

Quick Start (Three Steps)

1. Install the distribution:

pip install loongsuite-distro

2. Install the probe package. By default this installs all AI-related instrumentations; the optional flags --auto-detect or --whitelist can limit the set:

loongsuite-bootstrap -a install --version 0.1.0

3. Run the application with the wrapper (replace the endpoint with your OTLP collector address):

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental \
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_ONLY \
loongsuite-instrument python app.py

After these steps the AI application emits full observability data that can be viewed in Jaeger, Langfuse, or any OTLP‑compatible platform, showing complete call chains, latency, errors, and context changes.
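If no data shows up in the backend, one thing worth ruling out is that the environment variables never reached the process. The following is a small, hypothetical preflight check; it is not part of LoongSuite, and the probe may well apply its own defaults when a variable is absent. It only checks for the three variables used in the quick-start command.

```python
import os
from typing import List, Mapping

# The variables set in the quick-start command; treating them as a
# checklist is this sketch's assumption, not a LoongSuite requirement.
QUICK_START_VARS = (
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_SEMCONV_STABILITY_OPT_IN",
    "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT",
)


def missing_otel_vars(env: Mapping[str, str]) -> List[str]:
    """Return the quick-start variables that are unset or blank in `env`."""
    return [name for name in QUICK_START_VARS if not env.get(name, "").strip()]


missing = missing_otel_vars(os.environ)
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("All quick-start settings are present.")
```

Calling this at the top of app.py prints which settings are absent before the first request is handled.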

Manual Instrumentation for Custom Flows

When developers implement custom agent loops, RESTful LLM calls, or hand‑crafted ReAct processes, automatic instrumentation may miss those spans. LoongSuite recommends using the OpenTelemetry GenAI Util to create manual spans that match the automatic ones, preserving parent‑child relationships, error handling, and optional input/output recording.

Example code:

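from opentelemetry.util.genai.extended_handler import get_extended_telemetry_handler
from opentelemetry.util.genai.extended_types import InvokeAgentInvocation
from opentelemetry.util.genai.types import InputMessage, OutputMessage, Text

# Placeholder for the incoming request; substitute the model your application uses.
request = {"model": "your-model-name"}

handler = get_extended_telemetry_handler()

# Describe the agent call: provider, model, agent name, and input messages.
invocation = InvokeAgentInvocation(
    provider="dashscope",
    request_model=request["model"],
    agent_name="OrderAgent",
    input_messages=[
        InputMessage(role="user", parts=[Text(content="Please check the status of order number 101")]),
        InputMessage(role="system", parts=[Text(content="You are an order administrator responsible for calling tools to look up order information")]),
    ],
)

# The span for this agent call covers everything inside the with block.
with handler.invoke_agent(invocation) as inv:
    # ... invoke the agent ...
    # Record the agent's reply and token usage on the invocation.
    inv.output_messages = [
        OutputMessage(
            role="assistant",
            parts=[Text(content="Sure, let me look that up for you... Your order information could not be found; please confirm that your order number is correct.")],
            finish_reason="stop",
        )
    ]
    inv.input_tokens = 15
    inv.output_tokens = 20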

LoongSuite GenAI Util (Independent Package)

Because the upstream OpenTelemetry GenAI Util evolves slowly, LoongSuite provides the loongsuite-util-genai package with extended features:

Support for multimodal upload (Base64, Blob, URI) to OSS/SLS or local storage, keeping only a reference URI in the span.

Additional span types such as invoke_agent, create_agent, execute_tool, retrieve, rerank, embedding, memory.

Enriched semantic attributes like gen_ai.usage.total_tokens and gen_ai.response.time_to_first_token.

Configurable pre‑uploader and uploader entry points.
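The two enriched attributes named in the list above can be pictured with a short sketch. This is an illustration under stated assumptions, not LoongSuite's implementation: it assumes total tokens is the sum of input and output tokens, approximates the first token by the first streamed chunk, and reports time in seconds (the unit LoongSuite uses is not stated here). The functions `consume_stream` and `total_tokens` are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def consume_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, Dict[str, float]]:
    """Join streamed chunks and time the arrival of the first one."""
    start = clock()
    first_chunk_at = None
    parts: List[str] = []
    for chunk in chunks:
        if first_chunk_at is None:
            first_chunk_at = clock()
        parts.append(chunk)
    metrics: Dict[str, float] = {}
    # An empty stream yields no timing at all rather than a misleading zero.
    if first_chunk_at is not None:
        metrics["gen_ai.response.time_to_first_token"] = first_chunk_at - start
    return "".join(parts), metrics


def total_tokens(input_tokens: int, output_tokens: int) -> int:
    """Assumed definition of gen_ai.usage.total_tokens: input plus output."""
    return input_tokens + output_tokens


def demo_stream():
    time.sleep(0.01)  # simulated model latency before the first chunk
    yield "Order 101 "
    yield "has shipped."


text, metrics = consume_stream(demo_stream())
print(text)
print(metrics)
print("gen_ai.usage.total_tokens =", total_tokens(15, 20))
```

Passing the clock as a parameter keeps the timing logic testable without real delays.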

Installation and configuration example:

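# Base package
pip install loongsuite-util-genai
# Optional: add multimodal upload support (quoted so shells such as zsh do not expand the brackets)
pip install "loongsuite-util-genai[multimodal_upload]"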

export OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental
export OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=SPAN_AND_EVENT
export OTEL_INSTRUMENTATION_GENAI_EMIT_EVENT=true
export OTEL_INSTRUMENTATION_GENAI_MULTIMODAL_UPLOAD_MODE=both
export OTEL_INSTRUMENTATION_GENAI_MULTIMODAL_STORAGE_BASE_PATH=file:///var/log/genai/multimodal
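The idea behind the storage base path, keeping large payloads out of the span and recording only a reference URI, can be sketched as follows. This is a simplified illustration and not the loongsuite-util-genai uploader: the message-part dictionary shape and the function `offload_base64_part` are hypothetical, and a temporary directory stands in for the configured base path.

```python
import base64
import hashlib
import tempfile
from pathlib import Path
from typing import Dict


def offload_base64_part(part: Dict[str, str], base_dir: Path) -> Dict[str, str]:
    """Write a Base64 payload under `base_dir` and return a URI-only part."""
    data = base64.b64decode(part["content"])
    # Content-addressed file name: identical payloads are stored once.
    digest = hashlib.sha256(data).hexdigest()
    base_dir.mkdir(parents=True, exist_ok=True)
    target = base_dir / digest
    if not target.exists():
        target.write_bytes(data)
    return {
        "type": "uri",
        "mime_type": part["mime_type"],
        "uri": target.resolve().as_uri(),
    }


# Stand-in for a directory such as the one in the configuration above.
base_dir = Path(tempfile.mkdtemp()) / "multimodal"
image_part = {
    "type": "base64",
    "mime_type": "image/png",
    "content": base64.b64encode(b"\x89PNG fake image bytes").decode("ascii"),
}
ref_part = offload_base64_part(image_part, base_dir)
print(ref_part["uri"])
```

The span then carries a short file:// reference instead of the encoded image, so trace size stays bounded regardless of how large the inputs are.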

Release Notes and Roadmap

The LoongSuite Python probe and GenAI Util are released on GitHub. See the release page for version history and upcoming features:

https://github.com/alibaba/loongsuite-python-agent/releases

References

OpenTelemetry GenAI Util (opentelemetry-util-genai), maintained by the GenAI SIG – https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/util/opentelemetry-util-genai

GenAI Semantic Convention – https://opentelemetry.io/docs/specs/semconv/gen-ai/

LoongSuite Python probe repository – https://github.com/alibaba/loongsuite-python-agent
