How AutoDev’s Agentic RAG Turns Docs into a Programmable Knowledge Base

This article explains how AutoDev builds an Agentic Retrieval‑Augmented Generation system with a Document Query Language (DocQL) that lets LLM agents navigate hierarchical code and documentation structures using JSONPath‑like queries, detailing implementation, multi‑level keyword expansion, and experimental findings.

phodal

Origin of the AutoDev Knowledge Agent

The concept originated from a client discussion about Retrieval‑Augmented Generation (RAG), where structured data such as JSON and JSONPath queries are preferred over unstructured text.

Building a New Knowledge Agent

Install the CLI:

npm install -g @autodev/cli

Run a sample query:

autodev document -p -q "How DocumentAgent works?"

Download AutoDev Desktop (compose‑0.3.2) from https://github.com/unit-mesh/auto-dev/releases.

Why Agentic RAG for Document Query

Agentic RAG lets an LLM act as an active participant, performing controlled, multi‑step, reflective retrieval.

Modern agents such as Cursor, Claude Code, and Antigravity already perform read‑file, grep, and code‑base operations after user input, which is essentially Agentic RAG.

Hierarchical Query: Human‑like Search

RAG consists of two stages: index construction and query execution. The query style often dictates how the index is built—by paragraphs, chapters, or code structure. The goal is to emulate an engineer’s iterative document‑search workflow rather than a single‑turn black‑box retrieval.

An engineer never searches only once; they iteratively narrow scope, jump across documents, compare context, and adjust the query strategy.

Code Query Path

Identify the relevant module or package (e.g., user-service).

Locate the class or interface (e.g., UserController).

Enter the specific method (e.g., createUser()).

Analyze the call chain and dependencies (e.g., HTTP handler → service → repository).

Optionally trace tests, configuration, or comments.

Document Query Path

Confirm document type or version (Installation, Architecture, API Reference, etc.).

Enter the relevant chapter or section (Authentication, Deployment Guide, Module Design).

Locate the subsection or heading (JWT Flow, Config Options, Error Handling).

Read the specific paragraph or example.

Jump to FAQ, design diagram, or supplemental docs if needed.
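
The two query paths above share one shape: look at what is available at the current level, pick one item, and descend. A minimal sketch of that drill-down, assuming a hypothetical nested index (the `INDEX` layout and the function names are illustrative, not AutoDev's API):

```python
from typing import Optional

# Hypothetical index: each node has a title and child nodes.
INDEX = {
    "title": "root",
    "children": [
        {"title": "API Reference", "children": [
            {"title": "Authentication", "children": [
                {"title": "JWT Flow", "children": []},
                {"title": "Error Handling", "children": []},
            ]},
        ]},
        {"title": "Installation", "children": []},
    ],
}


def child_titles(node: dict) -> list[str]:
    """What the agent sees before choosing the next step."""
    return [child["title"] for child in node["children"]]


def descend(node: dict, title: str) -> Optional[dict]:
    """Move one level down, or return None if no child has that title."""
    for child in node["children"]:
        if child["title"] == title:
            return child
    return None


def drill_down(root: dict, path: list[str]) -> Optional[dict]:
    """Follow a path of titles, one narrowing step per element."""
    node: Optional[dict] = root
    for title in path:
        node = descend(node, title)
        if node is None:
            return None
    return node


section = drill_down(INDEX, ["API Reference", "Authentication"])
print(child_titles(section))  # ['JWT Flow', 'Error Handling']
```

Returning `None` on a missing step gives the agent a clear signal to back up and try a different branch instead of guessing.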

Markdown Hierarchical Indexing

Converting documents to Markdown enables navigation via heading levels. In practice, converting to HTML first and only then to Markdown preserves richer semantics.

Uber’s EAg‑RAG moves knowledge sources to Google Docs (HTML), then to Markdown, improving parsing accuracy. Tables and nested structures still need LLM‑assisted rewriting.

When resources are limited, direct Markdown conversion is simpler. Adding metadata such as filename and timestamp aids LLM sorting.
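
As a rough illustration of heading-level indexing, the sketch below nests Markdown sections by heading depth and keeps the file name as root metadata. The `Section` type and `build_tree` function are assumptions made for this example, not AutoDev's parser; only ATX-style `#` headings are handled.

```python
import re
from dataclasses import dataclass, field

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


@dataclass
class Section:
    title: str
    level: int
    line: int  # 1-based line number of the heading (0 for the root)
    body: list[str] = field(default_factory=list)
    children: list["Section"] = field(default_factory=list)


def build_tree(markdown: str, filename: str) -> Section:
    """Nest sections by heading level; the file name becomes root metadata."""
    root = Section(title=filename, level=0, line=0)
    stack = [root]  # stack[-1] is the section currently being filled
    in_fence = False
    for number, text in enumerate(markdown.splitlines(), start=1):
        if text.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines inside a code fence are never treated as headings.
        match = None if in_fence else HEADING.match(text)
        if match is None:
            stack[-1].body.append(text)
            continue
        level = len(match.group(1))
        while stack[-1].level >= level:  # close deeper or sibling sections
            stack.pop()
        section = Section(match.group(2), level, number)
        stack[-1].children.append(section)
        stack.append(section)
    return root


doc = "# Guide\nintro\n## Install\nsteps\n## Deploy\n### Docker\nrun it\n"
guide = build_tree(doc, "guide.md").children[0]
print([child.title for child in guide.children])  # ['Install', 'Deploy']
```

Recording the heading's line number lets a later step return a precise location along with the text.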

Implementation of AutoDev DocQL

DocQL (Document Query Language) is a DSL that bridges unstructured text and structured queries, offering a deterministic, programmable retrieval interface for agents.

1. AI‑Friendly Query Syntax (JSONPath‑like)

LLMs handle JSON structures robustly, so DocQL adopts a JSONPath-like syntax with attribute access and filter conditions, which a model can pick up from a few in-prompt examples.

$.toc: access the document's table of contents.

$.entities: access extracted key entities (classes, methods, APIs).

$.content: access raw content blocks.

$.code: access source-code structure via AST.

Example queries:

// Scenario 1: Get top‑level sections
$.toc[?(@.level == 1)]

// Scenario 2: Find all API definitions containing "User"
$.entities[?(@.type == "API" && @.name =~ "User")]

// Scenario 3: Retrieve the "Authentication" chapter
$.content.heading("Authentication")

// Scenario 4: Get the source of class "AuthService"
$.code.class("AuthService")
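
To make the filter form in Scenarios 1 and 2 concrete, here is a toy evaluator. It is a sketch under stated assumptions, not DocQL's parser: the `INDEX` shape is hypothetical, only `$.<root>[?(...)]` queries with `==`, `=~`, and `&&` are supported, `=~` is treated as a substring test, and string literals containing `&&` are not handled. The function-style forms in Scenarios 3 and 4 are out of scope.

```python
import re

# Hypothetical index shape; the article does not show DocQL's data model.
INDEX = {
    "toc": [
        {"title": "Overview", "level": 1},
        {"title": "Authentication", "level": 1},
        {"title": "JWT Flow", "level": 2},
    ],
    "entities": [
        {"type": "API", "name": "UserController.createUser"},
        {"type": "API", "name": "OrderController.list"},
        {"type": "Class", "name": "UserService"},
    ],
}

QUERY = re.compile(r'^\$\.(\w+)\[\?\((.+)\)\]$')
CLAUSE = re.compile(r'^@\.(\w+)\s*(==|=~)\s*("[^"]*"|\d+)$')


def parse_clause(text: str) -> tuple:
    """Turn '@.field op literal' into a (field, op, value) triple."""
    match = CLAUSE.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported clause: {text!r}")
    name, op, raw = match.groups()
    value = raw[1:-1] if raw.startswith('"') else int(raw)
    return name, op, value


def matches(item: dict, clause: tuple) -> bool:
    name, op, value = clause
    actual = item.get(name)
    if op == "==":
        return actual == value
    # "=~" as a substring test is an assumption made for this sketch.
    return isinstance(actual, str) and str(value) in actual


def run_query(query: str, index: dict) -> list[dict]:
    """Evaluate a filter query against one top-level list of the index."""
    match = QUERY.match(query.strip())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    root, condition = match.groups()
    clauses = [parse_clause(part) for part in condition.split("&&")]
    return [item for item in index.get(root, [])
            if all(matches(item, clause) for clause in clauses)]


top = run_query('$.toc[?(@.level == 1)]', INDEX)
print([s["title"] for s in top])  # ['Overview', 'Authentication']
apis = run_query('$.entities[?(@.type == "API" && @.name =~ "User")]', INDEX)
print([e["name"] for e in apis])  # ['UserController.createUser']
```

Rejecting anything outside the supported grammar with an error, rather than returning an empty list, gives an agent something specific to correct on the next attempt.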

2. Unified Document Object Model (DOM for RAG)

Two families of parsers were built to produce a common hierarchical Document Tree:

Markdown/PDF: MarkdownDocumentParser and TikaDocumentParser extract text and structural information, recording line numbers, anchors, and parent-child relations.

Codebase: Tree-sitter powers CodeDocumentParser, turning Java/Kotlin/Python files into the same tree model, where classes become high-level sections and functions become subsections.

This abstraction lets an agent query a deployment guide the same way it queries a DeploymentService.java method.
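
A small sketch of the shared tree idea follows. It swaps in Python's standard `ast` module for Tree-sitter so that it runs without dependencies, and covers Python source only; `DocNode`, `parse_code`, and `find` are names invented for this example, and the guide tree is built by hand.

```python
import ast
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class DocNode:
    kind: str  # "file", "section", "class" or "function"
    name: str
    line: int
    children: list["DocNode"] = field(default_factory=list)


def parse_code(source: str, filename: str) -> DocNode:
    """Map classes to top-level nodes and their methods to child nodes."""
    functions = (ast.FunctionDef, ast.AsyncFunctionDef)
    root = DocNode("file", filename, 0)
    for item in ast.parse(source).body:
        if isinstance(item, ast.ClassDef):
            cls = DocNode("class", item.name, item.lineno)
            cls.children = [DocNode("function", sub.name, sub.lineno)
                            for sub in item.body if isinstance(sub, functions)]
            root.children.append(cls)
        elif isinstance(item, functions):
            root.children.append(DocNode("function", item.name, item.lineno))
    return root


def walk(node: DocNode) -> Iterator[DocNode]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find(root: DocNode, name: str) -> list[DocNode]:
    """The same lookup works for prose trees and code trees."""
    return [node for node in walk(root) if node.name == name]


code_tree = parse_code(
    "class DeploymentService:\n"
    "    def deploy(self):\n"
    "        pass\n",
    "deployment_service.py",
)
guide_tree = DocNode("file", "deployment-guide.md", 0, [
    DocNode("section", "Deployment Guide", 1, [DocNode("section", "deploy", 5)]),
])

for tree in (code_tree, guide_tree):
    print([(node.kind, node.line) for node in find(tree, "deploy")])
# [('function', 2)]
# [('section', 5)]
```

Because both trees use one node type, any query or scoring logic written against `DocNode` applies to documentation and code alike.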

3. Agentic Retrieval: Multi‑Level Expansion & Scoring

DocQLTool implements a search pipeline:

Multi‑Level Keyword Expansion

Level 1 (Primary): generate exact variants (e.g., "Auth", "Authentication").

Level 2 (Secondary): decompose components (e.g., "AuthService" → "Service").

Level 3 (Tertiary): produce stems and synonyms (e.g., "authorize", "access").

The expansion adapts to the recall count: it widens to the next level when too few results come back and contracts when too many do.
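
The three levels and the recall-based adjustment could look roughly like the sketch below. Everything here is an assumption for illustration: the lookup tables stand in for a real stemmer or thesaurus, the thresholds are arbitrary, and `search` is any caller-supplied function that maps a keyword list to hits.

```python
import re
from typing import Callable

# Toy lookup tables standing in for a real stemmer or thesaurus.
VARIANTS = {"auth": ["authentication"]}
SYNONYMS = {"auth": ["authorize", "access"]}


def split_identifier(term: str) -> list[str]:
    """Split camelCase, PascalCase, or snake_case into component words."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", term)


def expand(term: str) -> dict[int, list[str]]:
    """Level 1: exact variants. Level 2: components. Level 3: synonyms."""
    parts = [p for p in split_identifier(term) if p != term]
    words = [p.lower() for p in (parts or [term])]
    return {
        1: [term] + VARIANTS.get(term.lower(), []),
        2: parts,
        3: [s for w in words for s in SYNONYMS.get(w, [])],
    }


def adaptive_search(term: str, search: Callable[[list[str]], list[str]],
                    min_hits: int = 3, max_hits: int = 20):
    """Widen level by level until recall is sufficient; back off if it explodes."""
    levels = expand(term)
    keywords: list[str] = []
    hits: list[str] = []
    for level in (1, 2, 3):
        candidate = keywords + levels[level]
        found = search(candidate)
        if hits and len(found) > max_hits:
            break  # contraction: keep the narrower previous result
        keywords, hits = candidate, found
        if len(hits) >= min_hits:
            break
    return keywords, hits


CORPUS = ["AuthService.login", "AuthFilter", "TokenService",
          "authorize()", "access log", "README"]


def toy_search(keywords: list[str]) -> list[str]:
    lowered = [k.lower() for k in keywords]
    return [doc for doc in CORPUS if any(k in doc.lower() for k in lowered)]


print(expand("AuthService"))
# {1: ['AuthService'], 2: ['Auth', 'Service'], 3: ['authorize', 'access']}
print(adaptive_search("AuthService", toy_search)[0])
# ['AuthService', 'Auth', 'Service']
```

In this run the exact term finds only one document, so the search widens to Level 2 and stops there, since four hits already meet the `min_hits` threshold.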

Parallel Multi-Channel Retrieval dispatches queries to different channels, for example:

$.code.classes   // class name match
$.code.functions // function name match
$.content.heading // document title match
$.entities        // term definition match
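
One way to picture the dispatch step is a thread pool that runs one search per channel and keeps a separate ranked list for each. The in-memory `CHANNELS` table and both function names are hypothetical; a real channel would query the index rather than a Python list.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory channels, named after the DocQL paths above.
CHANNELS = {
    "$.code.classes": ["AuthService", "UserController"],
    "$.code.functions": ["createUser", "authenticate"],
    "$.content.heading": ["Authentication", "Deployment Guide"],
    "$.entities": ["AuthService", "JWT"],
}


def search_channel(channel: str, keyword: str) -> list[str]:
    """Return the channel's items containing the keyword, in stored order."""
    needle = keyword.lower()
    return [item for item in CHANNELS[channel] if needle in item.lower()]


def retrieve(keyword: str) -> dict[str, list[str]]:
    """Query every channel concurrently and keep one result list per channel."""
    with ThreadPoolExecutor(max_workers=len(CHANNELS)) as pool:
        futures = {name: pool.submit(search_channel, name, keyword)
                   for name in CHANNELS}
        return {name: future.result() for name, future in futures.items()}


for channel, items in retrieve("auth").items():
    print(channel, items)
# $.code.classes ['AuthService']
# $.code.functions ['authenticate']
# $.content.heading ['Authentication']
# $.entities ['AuthService']
```

Keeping the per-channel lists separate, instead of concatenating them, preserves the rank information that a later fusion step needs.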

RRF Fusion & Re‑ranking combines results using a Composite Scorer with weights:

Type Priority – code entities outrank plain text.

Name Match – titles/names outrank content.

Reciprocal Rank Fusion (RRF) – merges multiple recall streams.
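
The fusion step can be sketched as standard RRF plus additive bonuses for the two priorities listed above. The constant `k = 60` is the value commonly used with RRF; the bonus values and the additive form are assumptions for this example, since the article names the factors but not their weights.

```python
from collections import defaultdict

RRF_K = 60  # commonly used RRF constant; the article does not specify one

# Illustrative weights only.
TYPE_BONUS = {"code": 0.02, "text": 0.0}
NAME_MATCH_BONUS = 0.01


def rrf(rankings: list[list[str]], k: int = RRF_K) -> dict[str, float]:
    """Sum 1 / (k + rank) over every list an item appears in (rank starts at 1)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return dict(scores)


def rerank(rankings: list[list[str]], kinds: dict[str, str], query: str) -> list[str]:
    """Order items by RRF score plus type-priority and name-match bonuses."""
    fused = rrf(rankings)

    def score(item: str) -> float:
        bonus = TYPE_BONUS.get(kinds.get(item, "text"), 0.0)
        if query.lower() in item.lower():
            bonus += NAME_MATCH_BONUS
        return fused[item] + bonus

    return sorted(fused, key=score, reverse=True)


rankings = [
    ["AuthService", "login"],                 # e.g. a code channel
    ["Authentication guide", "AuthService"],  # e.g. a heading channel
]
kinds = {"AuthService": "code", "login": "code", "Authentication guide": "text"}

print(sorted(rrf(rankings), key=rrf(rankings).get, reverse=True))
# ['AuthService', 'Authentication guide', 'login']
print(rerank(rankings, kinds, "auth"))
# ['AuthService', 'login', 'Authentication guide']
```

With plain RRF the guide outranks `login`; once the type bonus is added, the code entity moves ahead, which is the "code entities outrank plain text" behavior described above.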

Experiment Results & Conclusions

The first version, tested on DeepSeek, fell short of expectations in two ways:

The LLM halted after a few hits, which suggests prompt-engineering problems.

JSONPath remained error-prone for complex cases and often failed to produce the desired multi-level results.

Subsequent refinements via @DocumentCli.kt aimed to iteratively improve the process.

The key takeaway is that in RAG scenarios for software engineering, Query (deterministic, structure‑aware lookup) is more effective than Search (probabilistic, relevance‑based). DocQL transforms documentation and codebases into a programmable “database” that agents can explore like senior engineers—first scanning the table of contents, then locating definitions, and finally reading source code to answer complex knowledge questions.
