Building a Smart Web AI Agent with FastAPI, LangGraph, and MCP

This article walks through the design and implementation of a production‑ready Web AI agent that uses FastAPI as the HTTP layer, LangGraph to orchestrate multi‑step reasoning, and MCP to expose external tools. Along the way, it shows how to manage state, integrate multiple LLM providers, and extend the system with persistence, rate limiting, and monitoring.


What is an AI Agent and Why It Matters

An AI agent goes beyond a single‑turn chatbot by supporting multi‑step reasoning, tool invocation, and state persistence, enabling tasks such as scheduling meetings or performing calculations.

System Architecture Overview

The architecture consists of four layers:

FastAPI: Handles HTTP requests and provides a RESTful API.

LangGraph: Defines the agent’s workflow and decision logic.

MCP tools: Offer external capabilities like calculation and time queries.

LLM service: Supplies the core reasoning capability and can be swapped between OpenAI and Anthropic.

A layered design ensures clear separation of concerns.

FastAPI – Web Entry Point

The entry point is defined in app/main.py with structured logging, CORS configuration, and health‑check endpoints.

# app/main.py
""" FastAPI application entry point """
import structlog
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from app.api.v1.router import api_router
from app.core.config import settings
from app.core.logging import setup_logging

setup_logging()
logger = structlog.get_logger(__name__)

@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("application_starting", environment=settings.ENVIRONMENT)
    yield
    logger.info("application_shutting_down")

app = FastAPI(
    title=settings.PROJECT_NAME,
    description="AI Agent with FastAPI, LangGraph, and MCP",
    version="0.1.0",
    lifespan=lifespan,
)
app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.ALLOWED_ORIGINS,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
app.include_router(api_router, prefix="/api/v1")

@app.get("/health")
async def health_check():
    return JSONResponse(content={"status": "healthy", "environment": settings.ENVIRONMENT})

@app.get("/")
async def root():
    return {"message": "AI Agent API", "docs": "/docs", "version": "0.1.0"}

Agent Core – LangGraph Workflow

The agent’s brain is built with LangGraph. State is defined as a TypedDict whose messages field is annotated with the add_messages reducer, so each node’s output is appended to the accumulated conversation history automatically.

# app/agents/state.py
""" Agent state definition """
from typing import Annotated, TypedDict
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]

The workflow graph creates a reasoning node (call_model) and a tool node (ToolNode), connects them with conditional edges, and compiles the graph.

# app/agents/graph.py
""" LangGraph agent definition """
import structlog
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode
from app.agents.state import AgentState
from app.agents.tools import get_tools
from app.services.llm import get_llm

logger = structlog.get_logger(__name__)

def create_agent_graph():
    llm = get_llm()
    tools = get_tools()
    llm_with_tools = llm.bind_tools(tools)
    workflow = StateGraph(AgentState)

    async def call_model(state: AgentState) -> dict:
        logger.info("calling_model")
        response = await llm_with_tools.ainvoke(state["messages"])
        return {"messages": [response]}

    tool_node = ToolNode(tools)
    workflow.add_node("agent", call_model)
    workflow.add_node("tools", tool_node)
    workflow.add_edge(START, "agent")

    def should_continue(state: AgentState) -> str:
        last_message = state["messages"][-1]
        if hasattr(last_message, "tool_calls") and last_message.tool_calls:
            return "tools"
        return END

    workflow.add_conditional_edges("agent", should_continue, ["tools", END])
    workflow.add_edge("tools", "agent")
    return workflow.compile()

_agent = None

def get_agent():
    global _agent
    if _agent is None:
        _agent = create_agent_graph()
    return _agent
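
The API layer shown later imports an invoke_agent helper from this module, but the article never lists it. A minimal sketch, assuming the response shape the endpoint expects:

# app/agents/graph.py (continued) - hypothetical helper the article omits
async def invoke_agent(query: str, session_id: str | None = None) -> dict:
    """Run the compiled graph on one query and summarize the result."""
    agent = get_agent()
    result = await agent.ainvoke({"messages": [HumanMessage(content=query)]})
    return {
        "response": result["messages"][-1].content,
        "message_count": len(result["messages"]),
        "session_id": session_id,
    }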

Tool System – MCP Integration

Two example tools are provided: a calculator that evaluates arithmetic safely (parsing with Python’s ast module instead of calling eval on model‑supplied input) and a current‑time getter. The @tool decorator generates a schema that the LLM can understand.

# app/agents/tools.py
""" Custom tools for the agent """
import ast
import operator
import structlog
from typing import List
from langchain_core.tools import tool
from datetime import datetime

logger = structlog.get_logger(__name__)

# Whitelist of arithmetic operators. Parsing with ast and rejecting anything
# outside this table avoids the code-execution risk of eval(), which remains
# exploitable even with empty __builtins__.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def _safe_eval(node: ast.AST) -> float:
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_safe_eval(node.left), _safe_eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_safe_eval(node.operand))
    raise ValueError("unsupported expression")

@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression, e.g. '2 + 3 * 4'"""
    try:
        result = _safe_eval(ast.parse(expression, mode="eval").body)
        logger.info("calculator_used", expression=expression, result=result)
        return str(result)
    except Exception as e:
        return f"Error: {str(e)}"

@tool
def get_current_time() -> str:
    """Get the current time in ISO format"""
    return datetime.now().isoformat()

def get_tools() -> List:
    """Return all available tools"""
    return [calculator, get_current_time]

An optional MCP server can host more complex tools, showing how the protocol decouples tool hosting from the agent process.

# mcp_servers/math_server.py
""" Example MCP server for math operations """
from fastmcp import FastMCP

mcp = FastMCP("Math Server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

if __name__ == "__main__":
    mcp.run(transport="stdio")
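
The article does not show how the agent consumes this server. One hypothetical wiring uses the langchain-mcp-adapters package to spawn the server over stdio and convert its tools into LangChain-compatible ones:

# app/agents/mcp_tools.py (hypothetical; module name and config are assumptions)
from langchain_mcp_adapters.client import MultiServerMCPClient

async def get_mcp_tools():
    """Spawn the math server over stdio and load its tools."""
    client = MultiServerMCPClient(
        {
            "math": {
                "command": "python",
                "args": ["mcp_servers/math_server.py"],
                "transport": "stdio",
            }
        }
    )
    return await client.get_tools()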

Multi‑LLM Support

The get_llm function reads configuration from settings and returns either an OpenAI or Anthropic client, making the agent provider‑agnostic.

# app/services/llm.py
""" LLM service - multi‑provider support """
import structlog
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from app.core.config import settings

logger = structlog.get_logger(__name__)

def get_llm():
    if settings.LLM_PROVIDER == "openai":
        logger.info("initializing_openai", model=settings.DEFAULT_LLM_MODEL)
        return ChatOpenAI(
            api_key=settings.OPENAI_API_KEY,
            model=settings.DEFAULT_LLM_MODEL,
            temperature=settings.DEFAULT_TEMPERATURE,
        )
    elif settings.LLM_PROVIDER == "anthropic":
        logger.info("initializing_anthropic", model=settings.DEFAULT_LLM_MODEL)
        return ChatAnthropic(
            api_key=settings.ANTHROPIC_API_KEY,
            model=settings.DEFAULT_LLM_MODEL,
            temperature=settings.DEFAULT_TEMPERATURE,
        )
    else:
        raise ValueError(f"Unsupported LLM provider: {settings.LLM_PROVIDER}")

API Endpoints – Connecting Users to the Agent

Pydantic models validate incoming requests and outgoing responses. The /invoke endpoint builds the initial state from the query, runs the compiled graph, and returns the final reply together with the message count and an optional session ID.

# app/api/v1/router.py
""" Main API router """
from fastapi import APIRouter
from app.api.v1 import agent

api_router = APIRouter()
api_router.include_router(agent.router, prefix="/agent", tags=["Agent"])

# app/api/v1/agent.py
""" Agent API endpoints """
import structlog
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from app.agents.graph import invoke_agent

logger = structlog.get_logger(__name__)
router = APIRouter()

class AgentRequest(BaseModel):
    query: str = Field(..., min_length=1, description="User query")
    session_id: str | None = Field(None, description="Session ID for multi‑turn dialogue")

class AgentResponse(BaseModel):
    response: str = Field(..., description="Agent reply")
    message_count: int = Field(..., description="Total message rounds")
    session_id: str | None = Field(None, description="Session ID")

@router.post("/invoke", response_model=AgentResponse)
async def invoke_agent_endpoint(request: AgentRequest) -> AgentResponse:
    try:
        logger.info("agent_invoked", query=request.query)
        result = await invoke_agent(request.query, request.session_id)
        return AgentResponse(**result)
    except Exception as e:
        logger.error("agent_error", error=str(e))
        raise HTTPException(status_code=500, detail=str(e))

@router.get("/status")
async def agent_status():
    from app.core.config import settings
    from app.agents.tools import get_tools
    tools = get_tools()
    return {
        "status": "operational",
        "llm_provider": settings.LLM_PROVIDER,
        "tool_count": len(tools),
        "available_tools": [tool.name for tool in tools],
    }

Demo Run and Tests

Start the service with uvicorn app.main:app --reload, then use curl to invoke simple calculations or composite tasks, as in the sample request below. A test suite under tests/ validates the /invoke and /status endpoints.
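
A hypothetical invocation, assuming the server listens on the default port 8000:

curl -X POST http://localhost:8000/api/v1/agent/invoke \
  -H "Content-Type: application/json" \
  -d '{"query": "What is 25 * 17, and what time is it now?"}'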

# tests/test_agent.py
"""Agent API tests."""
import pytest

@pytest.mark.asyncio
async def test_agent_invoke(client):
    response = await client.post(
        "/api/v1/agent/invoke",
        json={"query": "What is 2+2?"},
    )
    assert response.status_code == 200
    data = response.json()
    assert "response" in data
    assert "message_count" in data

@pytest.mark.asyncio
async def test_agent_status(client):
    response = await client.get("/api/v1/agent/status")
    assert response.status_code == 200
    data = response.json()
    assert data["status"] == "operational"
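
Both tests assume a client fixture that the article never defines. A minimal conftest sketch, using httpx’s ASGI transport so no live server is needed:

# tests/conftest.py (hypothetical; the original omits this file)
import pytest_asyncio
from httpx import ASGITransport, AsyncClient
from app.main import app

@pytest_asyncio.fixture
async def client():
    """Serve the FastAPI app in-process for the async test client."""
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as c:
        yield c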

Production‑Level Enhancements

Persistence: Use PostgresChatMessageHistory to store conversation history across sessions.
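
A hypothetical sketch using the community integration; the connection string and session handling are assumptions:

# Hypothetical persistence sketch (connection string is a placeholder)
from langchain_community.chat_message_histories import PostgresChatMessageHistory

history = PostgresChatMessageHistory(
    connection_string="postgresql://user:pass@localhost/agent",
    session_id="session-123",
)
history.add_user_message("What is 2+2?")
history.add_ai_message("4")
# history.messages can seed the agent state on the next turn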

Rate Limiting: Apply slowapi middleware (e.g., 10/minute) to protect the endpoint.
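
A hedged wiring sketch with slowapi; the 10/minute figure comes from the article, the rest is an assumption (slowapi requires the route to accept a Request argument named request):

# Hypothetical rate-limit wiring across app/main.py and the router
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address
from fastapi import Request

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@router.post("/invoke", response_model=AgentResponse)
@limiter.limit("10/minute")
async def invoke_agent_endpoint(request: Request, body: AgentRequest) -> AgentResponse:
    ...  # same handler body as before, reading the query from `body`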

Monitoring & Tracing : Integrate LangSmith via environment variables for observability.

Common Issues & Debugging Tips

Q1: Agent gets stuck in a loop

Add a maximum iteration counter in create_agent_graph and a check node that aborts with a friendly message when the limit is reached.
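
One hedged way to implement this: track an iteration count in the state and have should_continue bail out once the budget is exhausted. The names below are assumptions, not from the article, and AgentState would need an extra iterations: int field:

# Hypothetical loop guard added to app/agents/graph.py
MAX_ITERATIONS = 8

async def call_model(state: AgentState) -> dict:
    response = await llm_with_tools.ainvoke(state["messages"])
    # Count reasoning rounds alongside the accumulated messages
    return {"messages": [response], "iterations": state.get("iterations", 0) + 1}

def should_continue(state: AgentState) -> str:
    if state.get("iterations", 0) >= MAX_ITERATIONS:
        return END  # stop here; the caller can return a friendly apology
    last_message = state["messages"][-1]
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return END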

Q2: How to log the reasoning process

Implement a custom BaseCallbackHandler that logs LLM start, tool start, and tool end events.
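
A minimal sketch; the hook names follow langchain_core’s BaseCallbackHandler interface:

# Hypothetical reasoning logger
from langchain_core.callbacks import BaseCallbackHandler

class ReasoningLogger(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        logger.info("llm_start", prompt_count=len(prompts))

    def on_tool_start(self, serialized, input_str, **kwargs):
        logger.info("tool_start", tool=serialized.get("name"), input=input_str)

    def on_tool_end(self, output, **kwargs):
        logger.info("tool_end", output=str(output))

# Attach it per invocation:
# await agent.ainvoke(state, config={"callbacks": [ReasoningLogger()]})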

Q3: Improving tool call accuracy

Provide richer docstrings for each tool, describe every argument with an explicit args_schema, and include few‑shot examples of correct tool calls in the system prompt.
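
For illustration, a hypothetical tool with a verbose schema (the tool itself is not part of the article):

# Hypothetical example of a richly described tool
from pydantic import BaseModel, Field
from langchain_core.tools import tool

class WeatherInput(BaseModel):
    city: str = Field(..., description="City name, e.g. 'Berlin'")
    unit: str = Field("celsius", description="'celsius' or 'fahrenheit'")

@tool(args_schema=WeatherInput)
def get_weather(city: str, unit: str = "celsius") -> str:
    """Look up the current weather for a city. Use whenever the user asks
    about weather, temperature, or outdoor conditions."""
    ...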

Conclusion

The demo showcases the essential components of a modern AI agent: stateful management, tool calling, conditional loops, and modular design. By swapping out the LLM, adding persistent storage, or extending the MCP tool set, developers can evolve this prototype into a production‑grade digital assistant.
