How DeepSearch Redefines AI-Powered Search: Architecture, Iterative Reasoning, and Lessons Learned
This article analyzes the emergence of DeepSearch and DeepResearch as next‑generation AI search frameworks, detailing their iterative search‑read‑reason loop, design decisions, implementation challenges, code architecture, and practical insights for building robust LLM‑driven agents in 2025.
DeepSearch Overview
DeepSearch implements a continuous search‑read‑reason cycle. The search stage issues web queries, the reading stage extracts detailed content from each page (e.g., via the Jina Reader API), and the reasoning stage decides whether to answer, decompose the problem, or perform another search. The loop repeats until a stopping condition is met, such as a token‑budget limit or a maximum number of failed attempts.
DeepResearch Framework
DeepResearch builds on DeepSearch to generate long‑form research reports. It first creates a table of contents, then applies DeepSearch iteratively to each required section (introduction, related work, methodology, conclusion, etc.). After all sections are generated, they are stitched together to form a coherent narrative.
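The outline-then-fill process described above can be sketched as follows. This is a minimal illustration, not the project's code: `deepResearch`, `planSections`, and `deepSearch` are hypothetical names, both helpers are injected so the sketch stays self-contained, and the stitching here is plain concatenation where a real system would likely add an editing pass for coherence.

```javascript
// Hypothetical sketch of the DeepResearch flow: outline first, then one
// DeepSearch run per section, then stitch the sections together.
// planSections(topic) -> Promise<string[]> and deepSearch(question) -> Promise<string>
// are injected stand-ins, not functions from the original project.
async function deepResearch(topic, { planSections, deepSearch }) {
  const sectionTitles = await planSections(topic); // table of contents
  const sections = [];
  for (const title of sectionTitles) {
    // Sections are processed one at a time, in outline order.
    const body = await deepSearch(`${topic}: ${title}`);
    sections.push(`${title}\n${body}`);
  }
  // Naive stitching: title, body, blank line between sections.
  return sections.join('\n\n');
}
```

Injecting the two helpers keeps the orchestration testable without a search engine or an LLM behind it.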
Key Design Decisions
Prompt engineering: System prompts are assembled from XML‑like sections (<knowledge>, <context>, <bad-attempts>, <learned-strategy>, <actions>) and end with a strict JSON‑schema instruction.
Knowledge‑gap detection: Before answering, the agent identifies missing knowledge, turns it into sub‑questions, and appends them to a FIFO queue ahead of the original question, which is re‑queued at the tail.
FIFO queue vs. recursion: A rotating FIFO queue balances depth and breadth, avoiding uncontrolled recursion while still allowing deep exploration of sub‑questions.
Query rewriting & de‑duplication: Queries are normalized, de‑duplicated with jina-embeddings-v3, and rewritten into more effective search expressions.
Web crawling: Pages are fetched via the Jina Reader API and stored as knowledge entries; the number of pages per step is capped to control memory usage.
Memory management: Instead of a vector database, three in‑memory collections (knowledge, visited URLs, failure logs) are kept and passed into the LLM context.
Answer evaluation: A separate evaluation phase uses predefined criteria and few‑shot examples to score answers before acceptance.
Budget control & "Beast Mode": Token usage and bad‑attempt counters are tracked; when the token budget is nearly exhausted or the bad‑attempt limit is reached, a forced answer‑generation step guarantees a final response.
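The queue rotation behind knowledge‑gap detection can be illustrated with a small sketch. The function name `enqueueGaps` and the exact de‑duplication rule are assumptions made for illustration; the article's own snippets only show that sub‑questions are pushed first and the original question is pushed after them.

```javascript
// Hypothetical helper: append new sub-questions to the FIFO queue, then
// re-queue the original question behind them.
function enqueueGaps(queue, originalQuestion, subQuestions) {
  for (const q of subQuestions) {
    // Skip sub-questions that are already queued or that repeat the original.
    if (q !== originalQuestion && !queue.includes(q)) queue.push(q);
  }
  queue.push(originalQuestion); // original question goes to the tail
  return queue;
}

const queue = ['What is X?'];
const current = queue.shift(); // take the question at the head
enqueueGaps(queue, current, ['What is A?', 'What is B?']);
// queue is now ['What is A?', 'What is B?', 'What is X?']
```

Because the original question always returns to the tail, the agent revisits it only after every pending sub‑question has had a turn, which is what keeps exploration bounded without recursion.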
Main Inference Loop (simplified)
while (tokenUsage < tokenBudget && badAttempts <= maxBadAttempts) {
  const currentQuestion = gaps.length > 0 ? gaps.shift() : question;
  const system = getPrompt(...);
  const result = await LLM.generateStructuredResponse(system, messages, schema);
  const thisStep = result.object;
  if (thisStep.action === 'answer') {
    // handle answer
  } else if (thisStep.action === 'reflect') {
    // handle reflection, possibly add new gaps
  }
  // other actions: search, read, visit, coding, etc.
}
Prompt Construction Example
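The prompt built in this section ends by asking for JSON that matches a schema, and the agent can disable individual actions (for example, `allowAnswer = false` after a failed attempt). One way these two ideas might fit together is to derive the schema from the currently allowed actions. The sketch below is an assumption, not the project's schema: `buildActionSchema`, `isAllowedStep`, the flag names, and the `think` field are all hypothetical.

```javascript
// Hypothetical sketch: derive the response schema from the actions that are
// currently allowed. Flag and field names are illustrative assumptions.
function buildActionSchema({
  allowSearch = true,
  allowVisit = true,
  allowReflect = true,
  allowAnswer = true,
} = {}) {
  const actions = [];
  if (allowSearch) actions.push('search');
  if (allowVisit) actions.push('visit');
  if (allowReflect) actions.push('reflect');
  if (allowAnswer) actions.push('answer');
  return {
    type: 'object',
    properties: {
      action: { type: 'string', enum: actions },
      think: { type: 'string', description: 'Brief reasoning for the chosen action' },
    },
    required: ['action', 'think'],
    additionalProperties: false,
  };
}

// Check that a parsed model response picked one of the allowed actions.
function isAllowedStep(step, schema) {
  return typeof step === 'object' && step !== null &&
    schema.properties.action.enum.includes(step.action);
}
```

Restricting the `enum` means a disabled action cannot appear in a schema‑valid response, rather than relying on prompt wording alone.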
function getPrompt(params = {}) {
  // Optional inputs; each section is added only when its input is non-empty
  const { knowledge, context, badContext } = params;
  const sections = [];
  sections.push("You are a senior AI research assistant skilled in multi‑step reasoning...");
  if (knowledge?.length) sections.push("<knowledge>[entries]</knowledge>");
  if (context?.length) sections.push("<context>[history]</context>");
  if (badContext?.length) {
    sections.push("<bad-attempts>[fails]</bad-attempts>");
    sections.push("<learned-strategy>[improvements]</learned-strategy>");
  }
  sections.push("<actions>[available actions]</actions>");
  sections.push("Respond in valid JSON matching the provided schema.");
  return sections.join("\n");
}
Budget‑Aware "Beast Mode" Trigger
if (!thisStep.isFinal && badAttempts >= maxBadAttempts) {
  console.log('Enter Beast mode!!!');
  system = getPrompt(..., false, false, false, false, false, true);
  const result = await LLM.generateStructuredResponse(system, messages, answerOnlySchema);
  thisStep = result.object;
  thisStep.isFinal = true;
}
Implementation Details
Query Rewriting
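The search handler in this section calls `dedupQueries` without showing it. The following is a minimal sketch of embedding‑based de‑duplication under stated assumptions: `dedupBySimilarity` is a hypothetical name, `embed` is an injected async function that maps an array of strings to an array of vectors (for instance, a wrapper around an embedding model such as jina-embeddings-v3), and the 0.86 similarity threshold is an arbitrary illustrative value.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical de-duplication: keep a candidate only if it is not too similar
// to an existing query or to a candidate that was already kept.
// `embed` and the default threshold are assumptions, not the project's values.
async function dedupBySimilarity(candidates, existing, embed, threshold = 0.86) {
  const [candidateVecs, existingVecs] = await Promise.all([
    embed(candidates),
    embed(existing),
  ]);
  const kept = [];
  const comparisonVecs = [...existingVecs];
  candidates.forEach((query, i) => {
    const isDuplicate = comparisonVecs.some(
      (vec) => cosine(vec, candidateVecs[i]) >= threshold
    );
    if (!isDuplicate) {
      kept.push(query);
      comparisonVecs.push(candidateVecs[i]); // later candidates compare against it too
    }
  });
  return kept;
}
```

Comparing each candidate against both earlier queries and candidates already kept removes near‑duplicates within a single batch as well as across steps.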
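if (thisStep.action === 'search') {
  // Drop requests that duplicate queries already issued
  const uniqueRequests = await dedupQueries(thisStep.searchRequests, existingQueries);
  // Rewrite the remaining requests into more effective search expressions
  const optimizedQueries = await rewriteQuery(uniqueRequests);
  // De-duplicate again, since rewriting can yield keywords that were already searched
  const newQueries = await dedupQueries(optimizedQueries, allKeywords);
  for (const query of newQueries) {
    const results = await searchEngine(query);
    if (results.length > 0) {
      storeResults(results);
      allKeywords.push(query);
    }
  }
}
Web Crawling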
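The crawling handler in this section relies on a URL normalization step whose implementation is not shown. A minimal sketch of what such a step might do follows, using the standard `URL` class available in Node.js and browsers. The function name `normalizeUrls`, the optional `visited` parameter, and the specific rules (http/https only, fragments stripped) are assumptions for illustration.

```javascript
// Hypothetical normalization step: parse each URL, keep only http(s),
// strip the fragment, and drop duplicates and already-visited pages.
// `visited` is expected to hold URLs in the same normalized form.
function normalizeUrls(urls, visited = []) {
  const seen = new Set(visited);
  const unique = [];
  for (const raw of urls) {
    let parsed;
    try {
      parsed = new URL(raw);
    } catch {
      continue; // not an absolute URL, skip it
    }
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') continue;
    parsed.hash = ''; // the fragment does not change which page is fetched
    const normalized = parsed.href; // host is lowercased by the URL parser
    if (!seen.has(normalized)) {
      seen.add(normalized);
      unique.push(normalized);
    }
  }
  return unique;
}
```

Normalizing before the visited check avoids fetching the same page twice when links differ only in host casing or fragment.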
async function handleVisitAction(URLs) {
  const uniqueURLs = normalizeAndFilterURLs(URLs);
  // Fetch all pages concurrently; a failed fetch is reported, not thrown
  const results = await Promise.all(uniqueURLs.map(async url => {
    try {
      const content = await readUrl(url);
      addToKnowledge(`What is in ${url}?`, content, [url], 'url');
      return {url, success: true};
    } catch (error) {
      return {url, success: false};
    } finally {
      // Mark the URL as visited whether or not the fetch succeeded
      visitedURLs.push(url);
    }
  }));
  updateDiaryWithVisitResults(results);
}
Memory Management
function addToKnowledge(question, answer, references, type) {
  allKnowledge.push({question, answer, references, type, updated: new Date().toISOString()});
}
function addToDiary(step, action, question, result, evaluation) {
  diaryContext.push(`Step ${step}: on question "${question}" performed **${action}**. Details: ${result}. Evaluation: ${evaluation}`);
}
Answer Evaluation
async function evaluateAnswer(question, answer, metrics, context) {
  const criteria = await determineEvaluationCriteria(question);
  const results = [];
  // Evaluate criteria one at a time; the answer passes only if every criterion passes
  for (const criterion of criteria) {
    const result = await evaluateSingleCriterion(criterion, question, answer, context);
    results.push(result);
  }
  return {
    pass: results.every(r => r.pass),
    think: results.map(r => r.reasoning).join('\n')
  };
}
Budget Control
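Token accounting itself is not shown in the article's snippets. A minimal sketch of a budget tracker follows; the class name `TokenBudget`, its members, and the 85% "near exhaustion" ratio are all hypothetical choices made for illustration.

```javascript
// Hypothetical token budget tracker. The 0.85 ratio that marks the budget as
// "nearly exhausted" is an illustrative assumption.
class TokenBudget {
  constructor(limit, forceAnswerRatio = 0.85) {
    this.limit = limit;
    this.forceAnswerRatio = forceAnswerRatio;
    this.used = 0;
  }

  // Record the tokens consumed by one LLM call.
  track(tokens) {
    this.used += tokens;
  }

  // True once the whole budget has been spent.
  get exhausted() {
    return this.used >= this.limit;
  }

  // True once usage crosses the ratio, signalling that a final answer
  // should be forced while some budget remains to generate it.
  get shouldForceAnswer() {
    return this.used >= this.limit * this.forceAnswerRatio;
  }
}

const budget = new TokenBudget(1000);
budget.track(500); // shouldForceAnswer is false at 500 of 1000
budget.track(400); // shouldForceAnswer is true at 900 of 1000, exhausted is still false
```

Forcing the answer slightly before the limit leaves room for the final generation call, which itself consumes tokens.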
if (thisStep.action === 'reflect' && thisStep.questionsToAnswer) {
  // newGapQuestions: the sub-questions taken from thisStep.questionsToAnswer (derivation not shown)
  gaps.push(...newGapQuestions);
  gaps.push(question); // keep the original question at the tail of the queue
}
// Disable answer after a failure to force further search or reflection
allowAnswer = false;
Open‑source Repository
GitHub: https://github.com/jina-ai/node-DeepResearch
