Deep Dive into DeerFlow’s 14‑Layer Middleware: An Onion‑Style Chain Architecture Case Study
This article provides a detailed technical analysis of DeerFlow 2.0’s 14‑layer middleware stack, explaining how it extends LangChain’s runnable middleware with an onion‑style responsibility‑chain, compares the design to MyBatis interceptors, and breaks down each middleware’s purpose, implementation details, execution order, and engineering benefits for AI agent frameworks.
DeerFlow 2.0 is a super‑agent framework that rebuilds the Harness AI platform on top of LangChain and LangGraph. Its middleware layer adopts the classic “onion model” responsibility‑chain, where each middleware wraps a Runnable and provides before / after hooks to inject cross‑cutting logic.
LangChain middleware fundamentals: The core abstraction is langchain_core.runnables.middleware.Middleware, an abstract class that defines the __call__(self, request, call_next) method. Concrete middlewares implement this method to process the request, invoke call_next(request) (the next layer), and optionally handle the response on the way back. This mirrors MyBatis’s Interceptor interface, but LangChain adds asynchronous execution and flexible composition.
Key components of the LangChain design:
Abstract handler – Middleware class.
Concrete handlers – e.g., LoggingMiddleware (class‑based) and @middleware‑decorated functions such as log_middleware.
Chain assembly – MiddlewareChain builds a nested hierarchy so that requests penetrate from the outermost middleware to the core runnable and responses backtrack in reverse order.
Execution flow – request → outer middleware → … → core runnable → … → outer middleware (response).
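The chain assembly and execution flow above can be sketched in plain, self‑contained Python. This is a minimal illustration of the onion pattern, not DeerFlow's or LangChain's actual classes: TraceMiddleware, the MiddlewareChain.invoke shape, and the dict‑based request are assumptions made for the demo.

```python
import asyncio


class Middleware:
    """Abstract onion-chain handler: inspect the request, call the next
    layer, and optionally post-process the response on the way back."""

    async def __call__(self, request, call_next):
        raise NotImplementedError


class TraceMiddleware(Middleware):
    """Records enter/exit events to show the onion-style traversal."""

    def __init__(self, name, trace):
        self.name, self.trace = name, trace

    async def __call__(self, request, call_next):
        self.trace.append(f"enter {self.name}")   # request phase (way in)
        response = await call_next(request)
        self.trace.append(f"exit {self.name}")    # response phase (way back)
        return response


def _wrap(mw, nxt):
    """Bind one middleware around the next handler."""
    async def handler(request):
        return await mw(request, nxt)
    return handler


class MiddlewareChain:
    """Nests middlewares around a core coroutine; the first-listed
    middleware ends up outermost."""

    def __init__(self, middlewares, core):
        self.middlewares, self.core = middlewares, core

    async def invoke(self, request):
        handler = self.core
        for mw in reversed(self.middlewares):  # build the nesting inside-out
            handler = _wrap(mw, handler)
        return await handler(request)


async def core(request):
    return {"output": request["input"].upper()}


trace = []
chain = MiddlewareChain(
    [TraceMiddleware("outer", trace), TraceMiddleware("inner", trace)], core
)
result = asyncio.run(chain.invoke({"input": "hi"}))
print(trace)  # ['enter outer', 'enter inner', 'exit inner', 'exit outer']
```

The trace makes the key property visible: the request penetrates outer → inner → core, and the response backtracks inner → outer, exactly the flow described above.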
DeerFlow’s 14‑layer middleware stack extends this model with a strict, hard‑coded order (defined in _build_middlewares()) to satisfy the specific needs of a Super‑Agent system. The layers are:
1. ThreadDataMiddleware: extracts thread_id, creates per‑thread work directories, and provides the foundational state for downstream middlewares.
2. SandboxMiddleware: creates an isolated sandbox instance per thread, injects SandboxState, and guarantees code‑execution safety.
3. UploadsMiddleware: scans the thread‑specific upload folder, injects newly uploaded files into the conversation context, and updates the workspace.
4. TokenUsageMiddleware: intercepts model calls to record prompt_tokens, completion_tokens, and total token consumption for cost monitoring.
5. MemoryMiddleware: collects valid Human, AI, and Tool messages, queues them for asynchronous persistence, and builds long‑term memory.
6. TitleMiddleware: after the first human‑AI exchange, calls an LLM to generate a conversation title and stores it in ConversationState.
7. ViewImageMiddleware: detects image data in the dialogue, converts it (e.g., Base64) for visual‑model consumption, and injects it into the context.
8. DanglingToolCallMiddleware: fixes “dangling” tool calls caused by user interruption or network errors by injecting placeholder ToolMessage objects.
9. SummarizationMiddleware: monitors token usage; when the context approaches the model’s limit, it triggers an LLM summarization step and replaces the original context with a concise summary.
10. TodoMiddleware: in Plan mode, injects a write_todos tool to record multi‑step task progress.
11. DeferredToolFilterMiddleware: hides a large number of tools behind a tool_search interface, exposing only the most relevant ones to reduce token waste.
12. SubagentLimitMiddleware: caps the number of concurrent Sub‑Agent task() calls (default 3) to prevent resource exhaustion.
13. LoopDetectionMiddleware: uses a sliding‑window hash of recent tool calls to detect and break dead‑loops, forcing the LLM to produce a final answer.
14. ClarificationMiddleware: the final layer that intercepts ask_clarification tool calls, aborts the remaining chain, and prompts the user for clarification.
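The sliding‑window hashing that LoopDetectionMiddleware is described as using could look something like the following minimal sketch. The LoopDetector class, the window size, and the hashing scheme are assumptions for illustration, not DeerFlow's actual implementation.

```python
import hashlib
import json
from collections import deque


class LoopDetector:
    """Flags a dead-loop when one identical tool-call hash fills a
    sliding window of the most recent calls."""

    def __init__(self, window=4):
        self.recent = deque(maxlen=window)  # only the last `window` calls survive

    def record(self, tool_name, tool_args):
        # Canonicalise name + args so identical calls hash identically.
        payload = json.dumps({"tool": tool_name, "args": tool_args}, sort_keys=True)
        self.recent.append(hashlib.sha256(payload.encode()).hexdigest())

    def looping(self):
        # Window full and every slot identical -> the agent is repeating itself.
        return len(self.recent) == self.recent.maxlen and len(set(self.recent)) == 1


detector = LoopDetector(window=4)
detector.record("web_search", {"query": "deerflow"})        # one different call first
for _ in range(4):
    detector.record("web_search", {"query": "same query"})  # four identical calls
print(detector.looping())  # True -> break the loop, force a final answer
```

Once looping() fires, a middleware at this position can stop forwarding tool calls and instead instruct the LLM to produce its final answer.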
Engineering advantages of this design include:
Improved process control – explicit handling of workflow issues such as missing tool responses, token overflow, and infinite loops.
System stability – sandbox isolation, sub‑agent concurrency limits, and tool‑filtering protect the runtime from abuse or resource spikes.
Maintainability – each middleware follows the single‑responsibility principle; new functionality can be added by implementing a new middleware class without touching the core logic.
Extensibility – the underlying LangChain Middleware API allows developers to reuse existing LangChain middlewares or create custom ones that automatically integrate into the chain.
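As one concrete example of these stability guards, the sub‑agent concurrency cap could be enforced with an asyncio.Semaphore. This is a hypothetical sketch: the SubagentLimitMiddleware internals, the peak counter, and fake_subagent are all assumptions, not DeerFlow's real code.

```python
import asyncio


class SubagentLimitMiddleware:
    """Hypothetical sketch: cap concurrent sub-agent task() calls."""

    def __init__(self, max_concurrent=3):
        self.sem = asyncio.Semaphore(max_concurrent)
        self.active = 0
        self.peak = 0  # illustration only: highest concurrency observed

    async def __call__(self, request, call_next):
        async with self.sem:  # the 4th+ caller waits here until a slot frees
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await call_next(request)
            finally:
                self.active -= 1


async def fake_subagent(request):
    await asyncio.sleep(0.01)  # simulate sub-agent work
    return request["task"]


async def main():
    mw = SubagentLimitMiddleware(max_concurrent=3)
    coros = [mw({"task": i}, fake_subagent) for i in range(8)]
    results = await asyncio.gather(*coros)  # input order is preserved
    return mw.peak, results

peak, results = asyncio.run(main())
print(peak, results)  # peak never exceeds 3
```

The semaphore gives back-pressure rather than failure: excess task() calls simply queue until a running sub‑agent finishes, which matches the "prevent resource exhaustion" goal.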
The complete execution order is:
ThreadDataMiddleware → SandboxMiddleware → UploadsMiddleware → TokenUsageMiddleware → MemoryMiddleware → TitleMiddleware → ViewImageMiddleware → DanglingToolCallMiddleware → SummarizationMiddleware → TodoMiddleware → DeferredToolFilterMiddleware → SubagentLimitMiddleware → LoopDetectionMiddleware → ClarificationMiddleware
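A hard‑coded order like this is typically just an ordered list inside the factory function. The sketch below uses the class names from the article, but the shape of _build_middlewares() and its registry parameter are assumptions.

```python
# Hypothetical sketch of a hard-coded, ordered middleware factory.
# The 14 names come from the article; everything else is illustrative.
MIDDLEWARE_ORDER = [
    "ThreadDataMiddleware",
    "SandboxMiddleware",
    "UploadsMiddleware",
    "TokenUsageMiddleware",
    "MemoryMiddleware",
    "TitleMiddleware",
    "ViewImageMiddleware",
    "DanglingToolCallMiddleware",
    "SummarizationMiddleware",
    "TodoMiddleware",
    "DeferredToolFilterMiddleware",
    "SubagentLimitMiddleware",
    "LoopDetectionMiddleware",
    "ClarificationMiddleware",
]


def _build_middlewares(registry):
    """Instantiate registered middlewares in the fixed order; the dict's
    own insertion order is deliberately ignored."""
    return [registry[name]() for name in MIDDLEWARE_ORDER if name in registry]


# Even if the registry lists Clarification first, ThreadData is built first:
registry = {"ClarificationMiddleware": dict, "ThreadDataMiddleware": dict}
built = _build_middlewares(registry)
print(len(built))  # 2
```

Iterating over the fixed list rather than the registry is what makes the order deterministic regardless of how middlewares are registered.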
By combining LangChain’s flexible runnable middleware with DeerFlow’s strict ordering, the framework achieves both the composability needed for diverse AI agent scenarios and the deterministic control required for production‑grade super‑agent deployments.
For reference, the abstract handler and a concrete LoggingMiddleware from the LangChain design:

```python
from abc import ABC, abstractmethod

from langchain_core.runnables.middleware import Request


class Middleware(ABC):
    @abstractmethod
    async def __call__(self, request: Request, call_next):
        """Core interface for onion-chain handlers.

        :param request: request object containing input and config
        :param call_next: callable to invoke the next layer
        :return: processed response
        """


class LoggingMiddleware(Middleware):
    async def __call__(self, request, call_next):
        print(f"Request started: {request['input']}")
        response = await call_next(request)
        print(f"Request finished: response is {response}")
        return response
```
