How ADB MySQL Turns Agent Logs into Actionable Insights – A Step‑by‑Step Guide

This article examines why more than 40% of Agentic AI projects are projected to be cancelled, with inadequate observability a leading cause. It outlines three common pain points (trace blindness, uncontrolled token costs, and untraceable failures) and walks through a practical ADB MySQL solution that reconstructs execution chains from raw logs, classifies failures with built-in AI functions, quantifies token waste per failure mode, and generates prompt-optimization suggestions, all in a few SQL statements.

Alibaba Cloud Developer

According to Gartner, more than 40% of Agentic AI projects will be cancelled in the coming years because they cannot meet commercial goals, and a major cause is a misaligned evaluation system that relies on generic metrics like hallucination rate instead of root‑cause analysis.

Three Observability Pain Points

Trace blindness: Agent execution is non‑linear, with multiple tool calls and model inferences hidden in flat log rows, making it impossible to see the full task chain.

TokenOps cost explosion: Organizations cannot attribute token consumption to specific failure modes, so they do not know where money is being wasted.

Failure diagnosis opacity: When a context window is cleared, debugging relies on guesswork, and valuable troubleshooting knowledge is not captured.

ADB MySQL Solution Overview

The solution compresses the entire observability workflow into four SQL‑driven steps, eliminating the need for Python scripts.

Step 1 – Reconstruct Full Execution Chains

A single window function groups linear log rows into meaningful task chains, turning opaque traces into transparent sequences.

WITH TaskBoundaries AS (
    -- A new chain starts at every user message: the running count of
    -- 'user' rows within each session becomes the chain id.
    SELECT *, SUM(CASE WHEN role='user' THEN 1 ELSE 0 END) OVER (PARTITION BY session_id ORDER BY row_id) AS chain_id
    FROM openclaw_logs.openclaw_sessions
    WHERE role IS NOT NULL
),
TaskChains AS (
    -- Collapse each chain into one row: the ordered trace plus per-chain stats.
    SELECT CONCAT(session_id,'_',chain_id) AS unique_chain_id,
           GROUP_CONCAT(... ORDER BY row_id SEPARATOR ' >>> ') AS full_trace,
           COUNT(CASE WHEN tool_name IS NOT NULL THEN 1 END) AS tool_usage_count,
           ...
    FROM TaskBoundaries
    GROUP BY session_id, chain_id
)
SELECT unique_chain_id, tool_usage_count, LEFT(full_trace,200) AS trace_preview
FROM TaskChains LIMIT 5;
Result: 1484 log rows → 171 complete task chains, 292 tool calls.
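To make the window-function trick concrete, here is the same chain-segmentation logic sketched in plain Python, assuming rows arrive pre-sorted by `row_id` within each session. The field names mirror the `openclaw_sessions` columns, but the sample data is purely illustrative:

```python
# Sketch of the Step 1 chain-segmentation logic in plain Python.
# A new chain starts at each 'user' message, exactly like
# SUM(role='user') OVER (PARTITION BY session_id ORDER BY row_id).
from collections import defaultdict

def reconstruct_chains(rows):
    """Group flat log rows into task chains keyed by '<session>_<chain_id>'."""
    chains = defaultdict(list)
    counters = defaultdict(int)  # running user-message count per session
    for row in rows:
        if row["role"] == "user":
            counters[row["session_id"]] += 1
        chain_id = f'{row["session_id"]}_{counters[row["session_id"]]}'
        chains[chain_id].append(row)
    return chains

rows = [
    {"session_id": "s1", "row_id": 1, "role": "user",      "tool_name": None},
    {"session_id": "s1", "row_id": 2, "role": "assistant", "tool_name": "browser"},
    {"session_id": "s1", "row_id": 3, "role": "user",      "tool_name": None},
    {"session_id": "s1", "row_id": 4, "role": "assistant", "tool_name": None},
]
chains = reconstruct_chains(rows)
print(sorted(chains))       # ['s1_1', 's1_2']
print(len(chains["s1_1"]))  # 2 rows in the first chain
```

The running counter is the whole trick: because it only increments on user rows, every assistant message and tool call in between inherits the same chain id.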

Step 2 – Auto‑Label Failure Modes with AI Functions

Two built‑in AI functions, ai_classify and ai_generate, are invoked directly in SQL to classify failure types and generate root‑cause notes without any external code.

SELECT unique_chain_id,
       ai_classify('qwen_max_test', LEFT(full_trace,600), '["Loop", "Tool Param Hallucination", "Execution Refused", "Logic Break", "Success"]') AS failure_label,
       ai_generate('qwen_max_test', CONCAT('You are an OpenClaw AI diagnostician. Analyze the following task chain...', LEFT(full_trace,400))) AS root_cause_notes
FROM TaskChains
WHERE tool_usage_count>0 OR last_stop_reason IS NULL OR last_stop_reason!='stop';
Analysis: 15% of task chains carry failure risk, and 10.5% suffer from "tool-parameter hallucination". Every identified root cause points to tool-return data-quality issues (missing API keys, out-of-bounds paths) rather than model-inference flaws.
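In ADB MySQL the labelling is done by the model behind ai_classify. Purely as an illustration of the shape of this step, here is a local keyword-heuristic stand-in in Python; the rules below are assumptions for demonstration, not how the model actually decides:

```python
# Illustrative stand-in for the ai_classify step: map a task-chain trace
# to one of the article's five labels with simple keyword rules.
LABELS = ["Loop", "Tool Param Hallucination", "Execution Refused", "Logic Break", "Success"]

def classify_trace(full_trace: str) -> str:
    trace = full_trace.lower()
    if "invalid parameter" in trace or "no such file" in trace:
        return "Tool Param Hallucination"   # tool called with made-up arguments
    if trace.count(">>>") > 20:             # suspiciously long chain: likely looping
        return "Loop"
    if "i cannot" in trace or "refused" in trace:
        return "Execution Refused"
    return "Success"

print(classify_trace("call tool >>> Error: no such file /tmp/ghost.txt"))
# Tool Param Hallucination
```

The point of the SQL version is precisely that this brittle rule-writing disappears: the classifier is a model call, invoked inline, with the label taxonomy passed as an argument.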

Step 3 – Quantify Token Consumption per Failure Mode

After defining failure modes, token usage is aggregated per chain and per label to answer questions like "How many tokens did hallucination waste? Which prompt yields the biggest ROI when fixed?"

WITH TaskBoundaries AS (...),
ChainTokens AS (
    SELECT CONCAT(session_id,'_',chain_id) AS unique_chain_id,
           SUM(IFNULL(total_tokens,0)) AS chain_total_tokens
    FROM TaskBoundaries
    GROUP BY session_id, chain_id
)
SELECT a.failure_label,
       COUNT(*) AS task_count,
       ROUND(AVG(ct.chain_total_tokens)) AS avg_tokens,
       SUM(ct.chain_total_tokens) AS total_tokens_burned
FROM openclaw_logs.t_ai_audit_results a
JOIN ChainTokens ct ON a.unique_chain_id = ct.unique_chain_id
GROUP BY a.failure_label
ORDER BY total_tokens_burned DESC;
Insight: Tool‑parameter hallucination accounts for only 15% of tasks but burns 3,161,237 tokens—3.27× the total tokens of all successful tasks. The single highest‑consumption chain used 958,743 tokens.
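The Step 3 aggregation can be sketched outside the database as well. This minimal Python version groups per-chain token totals by failure label and ranks labels by total tokens burned; the numbers are illustrative, not the article's real data:

```python
# Sketch of the Step 3 aggregation: tokens burned per failure label,
# ranked by total, mirroring the GROUP BY / ORDER BY in the SQL.
from collections import defaultdict

def tokens_by_label(audit_rows):
    """audit_rows: (failure_label, chain_total_tokens) pairs, one per chain."""
    agg = defaultdict(lambda: {"task_count": 0, "total_tokens": 0})
    for label, tokens in audit_rows:
        agg[label]["task_count"] += 1
        agg[label]["total_tokens"] += tokens
    return sorted(agg.items(), key=lambda kv: kv[1]["total_tokens"], reverse=True)

report = tokens_by_label([
    ("Tool Param Hallucination", 958_743),   # illustrative chain totals
    ("Tool Param Hallucination", 120_000),
    ("Success", 40_000),
    ("Success", 35_000),
])
print(report[0][0])  # the costliest failure mode sorts first
```

Ranking by total tokens rather than by task count is what surfaces the ROI question: a rare failure mode can still dominate the bill.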

Step 4 – Generate Prompt‑Optimization Suggestions

Using the same AI functions, the workflow extracts the original user prompt and produces an optimized version based on the diagnosed root cause, all within a single SQL query.

WITH TaskBoundaries AS (...), ChainTokens AS (...), FirstUserMsg AS (
    SELECT CONCAT(session_id,'_',chain_id) AS unique_chain_id,
           SUBSTRING_INDEX(content_text, CONCAT('```',CHAR(10),CHAR(10)), -1) AS original_prompt,
           ROW_NUMBER() OVER (PARTITION BY session_id, chain_id ORDER BY row_id) AS rn
    FROM TaskBoundaries WHERE role='user'
),
FailedChains AS (
    SELECT unique_chain_id, failure_label, root_cause_notes,
           ROW_NUMBER() OVER (PARTITION BY unique_chain_id ORDER BY created_at DESC) AS rn
    FROM openclaw_logs.t_ai_audit_results
    WHERE failure_label != 'Success'
)
SELECT fc.unique_chain_id,
       LEFT(fu.original_prompt,200) AS original_prompt,
       fc.root_cause_notes AS root_cause,
       ai_generate('qwen_max_test', CONCAT('You are a Prompt‑optimization expert. Original: ', LEFT(fu.original_prompt,500), ' Failure cause: ', fc.root_cause_notes)) AS optimized_prompt
FROM FailedChains fc
JOIN ChainTokens ct ON ...
JOIN FirstUserMsg fu ON fu.unique_chain_id = fc.unique_chain_id AND fu.rn = 1
WHERE fc.rn=1
ORDER BY ct.chain_total_tokens DESC LIMIT 3;

Key Takeaways

Agent failures are probabilistic: the same prompt may succeed seven times out of ten, and each failure can take a different path, so statistical failure-mode analysis is essential.

Token anomalies are a powerful early‑warning signal—abnormal token spikes indicate model “looping” and are far more sensitive than rule‑based checks.
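The spike-detection idea can be sketched with a robust median/MAD score, which, unlike a plain mean/standard-deviation test, is not dragged off course by the very outliers it should catch. The 3.5 cutoff is a common rule of thumb, an assumption here rather than something from the article:

```python
# Sketch of token-spike early warning: flag chains whose token usage sits
# far outside the typical range, using a median/MAD robust z-score.
from statistics import median

def flag_token_spikes(chain_tokens, cutoff=3.5):
    med = median(chain_tokens)
    mad = median(abs(t - med) for t in chain_tokens)  # median absolute deviation
    if mad == 0:
        return []  # no spread at all: nothing to flag
    return [i for i, t in enumerate(chain_tokens)
            if 0.6745 * (t - med) / mad > cutoff]

usage = [1200, 1350, 1100, 1280, 958_743, 1400]  # one looping chain
print(flag_token_spikes(usage))  # [4]
```

A rule-based check would need to know in advance what "too many tokens" means for each workload; the statistical check only needs the chain's own history as a baseline.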

Closing the observability loop with a database‑backed “immune system” enables continuous, automated prompt refinement and experience replay for agents.

By embedding these SQL steps into routine workflows, enterprises gain a low‑cost, high‑ROI solution for turning raw Agent logs into actionable, repeatable improvements.

Tags: SQL, AI agents, Prompt Optimization, Trace Analysis, ADB MySQL, Agent Observability, Token Attribution
Written by Alibaba Cloud Developer, Alibaba's official tech channel, featuring all of its technology innovations.
