
How DeepSearch Redefines AI-Powered Search: Architecture, Iterative Reasoning, and Lessons Learned

This article analyzes the emergence of DeepSearch and DeepResearch as next‑generation AI search frameworks, detailing their iterative search‑read‑reason loop, design decisions, implementation challenges, code architecture, and practical insights for building robust LLM‑driven agents in 2025.

Architect

Mar 13, 2025

DeepSearch Overview

DeepSearch implements a continuous search‑read‑reason cycle. The search stage issues web queries, the reading stage extracts detailed content from each page (e.g., via the Jina Reader API), and the reasoning stage decides whether to answer, decompose the problem, or perform another search. The loop repeats until a stopping condition is met, such as a token‑budget limit or a maximum number of failed attempts.

DeepResearch Framework

DeepResearch builds on DeepSearch to generate long‑form research reports. It first creates a table of contents, then applies DeepSearch iteratively to each required section (introduction, related work, methodology, conclusion, etc.). After all sections are generated, they are stitched together to form a coherent narrative.
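
The outline-then-sections flow described above can be sketched roughly as follows. This is an illustration only: `makeOutline` and `deepSearch` are hypothetical injected functions (the source does not show how DeepResearch invokes either step), and the wording of the per-section question is an assumption.

```javascript
// Sketch of the DeepResearch flow: outline first, then one DeepSearch run per section.
// `makeOutline(topic)` and `deepSearch(question)` are hypothetical async helpers
// supplied by the caller; they stand in for the LLM-backed steps.
async function writeReport(topic, makeOutline, deepSearch) {
  const outline = await makeOutline(topic); // e.g. ["Introduction", "Related Work", ...]
  const sections = [];
  for (const title of outline) {
    // Sections are generated one after another, in outline order.
    const body = await deepSearch(`Write the "${title}" section of a report on: ${topic}`);
    sections.push({ title, body });
  }
  // Stitch the sections together into a single document.
  return sections.map((s) => `${s.title}\n\n${s.body}`).join("\n\n");
}
```

Running the sections sequentially keeps the sketch simple; whether the real system parallelizes them or passes earlier sections as context is not stated in the source.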

Key Design Decisions

Prompt engineering: System prompts are assembled from XML‑like sections (<knowledge>, <context>, <bad-attempts>, <learned-strategy>, <actions>) and end with a strict JSON‑schema instruction.

Knowledge‑gap detection: Before answering, the agent identifies missing knowledge and turns it into sub‑questions. These are appended to a FIFO queue, followed by the original question, so the sub‑questions are processed first and the original question is revisited afterwards.

FIFO queue vs. recursion: A rotating FIFO queue balances depth and breadth, avoiding uncontrolled recursion while still allowing deep exploration of sub‑questions.

Query rewriting & de‑duplication: Queries are normalized, de‑duplicated with jina-embeddings-v3, and rewritten into more effective search expressions.

Web crawling: Pages are fetched via the Jina Reader API, stored as knowledge entries, and limited per step to control memory usage.

Memory management: Instead of a vector database, three in‑memory collections (knowledge, visited URLs, failure logs) are kept inside the LLM context.

Answer evaluation: A separate evaluation phase uses predefined criteria and few‑shot examples to score answers before acceptance.

Budget control & "Beast Mode": Token usage and bad‑attempt counters are tracked; when the budget is near exhaustion, a forced answer generation step ensures a final response.
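
To make the rotating FIFO queue concrete, here is a minimal simulation of the visiting order it produces. It is a sketch under assumptions: `subQuestionsFor` is a hypothetical stand-in for the LLM's reflect step, the queue is seeded with the original question, and each question is decomposed at most once so the demo terminates.

```javascript
// Simulate the rotating FIFO queue: sub-questions are appended, then the
// original question is re-queued behind them so it gets revisited later.
function runQueue(original, subQuestionsFor, maxSteps) {
  const gaps = [original];
  const expanded = new Set(); // questions already decomposed (demo-only guard)
  const visited = [];
  for (let i = 0; i < maxSteps && gaps.length > 0; i++) {
    const current = gaps.shift(); // take from the head of the queue
    visited.push(current);
    if (expanded.has(current)) continue; // already decomposed: just a retry
    expanded.add(current);
    const subs = subQuestionsFor(current);
    if (subs.length > 0) {
      gaps.push(...subs);
      gaps.push(original); // keep the original question at the tail
    }
  }
  return visited;
}

// Hypothetical decomposition: Q needs Q1 and Q2, and Q1 in turn needs Q1a.
const plan = { Q: ["Q1", "Q2"], Q1: ["Q1a"] };
const order = runQueue("Q", (q) => plan[q] || [], 10);
console.log(order.join(" -> ")); // Q -> Q1 -> Q2 -> Q -> Q1a -> Q
```

Note how Q2 is handled before Q1a: the queue explores siblings before descending further, which is the breadth and depth balance that plain recursion would not give.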

Main Inference Loop (simplified)

while (tokenUsage < tokenBudget && badAttempts <= maxBadAttempts) {
  const currentQuestion = gaps.length > 0 ? gaps.shift() : question;
  const system = getPrompt(...);
  const result = await LLM.generateStructuredResponse(system, messages, schema);
  const thisStep = result.object;
  if (thisStep.action === 'answer') {
    // handle answer
  } else if (thisStep.action === 'reflect') {
    // handle reflection, possibly add new gaps
  }
  // other actions: search, read, visit, coding, etc.
}

Prompt Construction Example

function getPrompt(params) {
  // Pull out the pieces of state that decide which sections are included
  const { knowledge, context, badContext } = params;
  const sections = [];
  sections.push("You are a senior AI research assistant skilled in multi‑step reasoning...");
  if (knowledge?.length) sections.push("<knowledge>[entries]</knowledge>");
  if (context?.length) sections.push("<context>[history]</context>");
  if (badContext?.length) {
    sections.push("<bad-attempts>[fails]</bad-attempts>");
    sections.push("<learned-strategy>[improvements]</learned-strategy>");
  }
  sections.push("<actions>[available actions]</actions>");
  sections.push("Respond in valid JSON matching the provided schema.");
  // Separate sections with a blank line
  return sections.join("\n\n");
}

Budget‑Aware "Beast Mode" Trigger

if (!thisStep.isFinal && badAttempts >= maxBadAttempts) {
  console.log('Enter Beast mode!!!');
  system = getPrompt(..., false, false, false, false, false, true);
  const result = await LLM.generateStructuredResponse(system, messages, answerOnlySchema);
  thisStep = result.object;
  thisStep.isFinal = true;
}
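
The two counters behind this trigger can be bundled into a small tracker. The class below is a sketch, not the project's actual implementation: the name `BudgetTracker` and its methods are hypothetical, and `canContinue` simply mirrors the loop condition shown in the simplified main loop.

```javascript
// Hypothetical helper that tracks token usage and failed attempts.
class BudgetTracker {
  constructor(tokenBudget, maxBadAttempts) {
    this.tokenBudget = tokenBudget;
    this.maxBadAttempts = maxBadAttempts;
    this.tokenUsage = 0;
    this.badAttempts = 0;
  }

  // Add the tokens consumed by one LLM call.
  recordUsage(tokens) {
    this.tokenUsage += tokens;
  }

  // Count one answer that failed evaluation.
  recordBadAttempt() {
    this.badAttempts += 1;
  }

  // Same condition as the main loop: keep going while both limits hold.
  canContinue() {
    return this.tokenUsage < this.tokenBudget && this.badAttempts <= this.maxBadAttempts;
  }

  // Once the loop can no longer continue, a forced final answer is due.
  shouldForceAnswer() {
    return !this.canContinue();
  }
}
```

A caller would check `shouldForceAnswer()` after the loop exits and, if no accepted answer exists yet, issue the answer-only request.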

Implementation Details

Query Rewriting

if (thisStep.action === 'search') {
  const uniqueRequests = await dedupQueries(thisStep.searchRequests, existingQueries);
  const optimizedQueries = await rewriteQuery(uniqueRequests);
  const newQueries = await dedupQueries(optimizedQueries, allKeywords);
  for (const query of newQueries) {
    const results = await searchEngine(query);
    if (results.length > 0) {
      storeResults(results);
      allKeywords.push(query);
    }
  }
}
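
The snippet above does not show how `dedupQueries` works internally. One plausible approach, given that de-duplication relies on embeddings, is a cosine-similarity filter like the sketch below. The function name `dedupQueriesSketch`, the injected `embed` callback (texts in, vectors out), and the 0.9 threshold are all assumptions made for illustration.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only queries that are not too similar to existing ones or to
// queries already kept in this batch. `embed` is a hypothetical async
// function that maps an array of strings to an array of vectors.
async function dedupQueriesSketch(newQueries, existingQueries, embed, threshold = 0.9) {
  if (newQueries.length === 0) return [];
  const [newVecs, oldVecs] = await Promise.all([embed(newQueries), embed(existingQueries)]);
  const kept = [];
  const keptVecs = [...oldVecs];
  newQueries.forEach((query, i) => {
    const isDuplicate = keptVecs.some((v) => cosine(v, newVecs[i]) >= threshold);
    if (!isDuplicate) {
      kept.push(query);
      keptVecs.push(newVecs[i]); // later queries are compared against this one too
    }
  });
  return kept;
}
```

Comparing each candidate against the already-kept candidates as well as the history is what removes near-duplicates within a single batch.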

Web Crawling

async function handleVisitAction(URLs) {
  const uniqueURLs = normalizeAndFilterURLs(URLs);
  const results = await Promise.all(uniqueURLs.map(async url => {
    try {
      const content = await readUrl(url);
      addToKnowledge(`What is in ${url}?`, content, [url], 'url');
      return {url, success: true};
    } catch (error) {
      return {url, success: false};
    } finally {
      visitedURLs.push(url);
    }
  }));
  updateDiaryWithVisitResults(results);
}
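
`normalizeAndFilterURLs` is not shown in the source. A minimal sketch of what such a helper might do, using the standard `URL` class, is given below; the name `normalizeAndFilterURLsSketch`, the extra `visited` parameter, and the specific rules (HTTP(S) only, fragment stripped) are assumptions.

```javascript
// Normalize URLs, drop invalid or non-HTTP(S) ones, and skip anything
// already visited. `visited` is expected to hold normalized URLs.
function normalizeAndFilterURLsSketch(urls, visited = []) {
  const seen = new Set(visited);
  const result = [];
  for (const raw of urls) {
    let parsed;
    try {
      parsed = new URL(raw); // throws on malformed input
    } catch {
      continue;
    }
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;
    parsed.hash = ""; // fragments do not change the fetched page
    const key = parsed.toString();
    if (!seen.has(key)) {
      seen.add(key);
      result.push(key);
    }
  }
  return result;
}
```

Because the `URL` parser lowercases the host, `https://Example.com/a#x` and `https://example.com/a` collapse to the same entry.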

Memory Management

function addToKnowledge(question, answer, references, type) {
  allKnowledge.push({question, answer, references, type, updated: new Date().toISOString()});
}

function addToDiary(step, action, question, result, evaluation) {
  diaryContext.push(`Step ${step}: on question "${question}" performed **${action}**. Details: ${result}. Evaluation: ${evaluation}`);
}
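
Since the knowledge collection lives in the prompt rather than in a vector database, it has to be serialized on every step. The sketch below shows one way the entries stored by `addToKnowledge` could be rendered into the `<knowledge>` section; the function name `renderKnowledge` and the inner tag layout are assumptions, not the project's actual format.

```javascript
// Render knowledge entries as an XML-like block for the system prompt.
// Returns an empty string when there is nothing to include.
function renderKnowledge(allKnowledge) {
  if (allKnowledge.length === 0) return "";
  const entries = allKnowledge.map((k, i) =>
    [
      `<knowledge-${i + 1}>`,
      `<question>${k.question}</question>`,
      `<answer>${k.answer}</answer>`,
      `<references>${k.references.join(", ")}</references>`,
      `</knowledge-${i + 1}>`,
    ].join("\n")
  );
  return `<knowledge>\n${entries.join("\n")}\n</knowledge>`;
}
```

Every entry added this way costs prompt tokens on each subsequent step, which is one reason the number of pages read per step is capped.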

Answer Evaluation

async function evaluateAnswer(question, answer, metrics, context) {
  const criteria = await determineEvaluationCriteria(question);
  const results = [];
  for (const criterion of criteria) {
    const result = await evaluateSingleCriterion(criterion, question, answer, context);
    results.push(result);
  }
  return {
    pass: results.every(r => r.pass),
    think: results.map(r => r.reasoning).join('\n')
  };
}
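
What happens with the evaluation result is not shown here. The sketch below illustrates one way the outcome could feed back into the agent state: a pass finalizes the answer, while a failure increments the bad-attempt counter, records the attempt for the `<bad-attempts>` prompt section, and disables the answer action for the next step. The function name `applyEvaluation` and the shape of `state` are assumptions.

```javascript
// Fold an evaluation result into a (hypothetical) agent state object.
// Returns a new state instead of mutating the input.
function applyEvaluation(state, question, answer, evaluation) {
  if (evaluation.pass) {
    return { ...state, finalAnswer: answer };
  }
  return {
    ...state,
    badAttempts: state.badAttempts + 1,
    // Remember what went wrong so the next prompt can include it.
    badContext: [...state.badContext, { question, answer, reason: evaluation.think }],
    // Force a search or reflect step before answering again.
    allowAnswer: false,
  };
}
```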

Knowledge‑Gap Queue and Answer Gating

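if (thisStep.action === 'reflect' && thisStep.questionsToAnswer) {
  // Sub-questions proposed by the reflect step
  const newGapQuestions = thisStep.questionsToAnswer;
  gaps.push(...newGapQuestions);
  gaps.push(question); // re-queue the original question behind its sub-questions
}

// Disable answer after a failure to force further search or reflection
allowAnswer = false;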

Open‑source Repository

GitHub: https://github.com/jina-ai/node-DeepResearch
