Understanding AIGC, RAG, Function Calling, and the MCP Protocol: A Practical AI Guide
This article explains the fundamentals of AI‑generated content (AIGC), the Retrieval‑Augmented Generation (RAG) technique, Function Calling, autonomous agents, and the Model Context Protocol (MCP), highlighting their evolution, technical workflows, limitations, and real‑world examples for developers.
AIGC Overview
Artificial‑Intelligence‑Generated Content (AIGC) is the automatic creation of text, images, video, or other media using generative AI models such as large language models (LLMs) and diffusion models. Early AIGC systems were single‑modal (e.g., GPT‑3 only generated text, Stable Diffusion only generated images). Modern models are multi‑modal, supporting text‑to‑image, image‑to‑text, text‑to‑video, etc.
Two inherent limitations of pure AIGC:
No real‑time knowledge: models are trained on a fixed corpus and cannot answer questions about events after the training cutoff.
Cannot use external tools: they cannot query live APIs or invoke functions.
Retrieval‑Augmented Generation (RAG)
RAG mitigates the knowledge‑staleness problem by retrieving relevant passages from an external knowledge base at query time and feeding them together with the original prompt to the LLM.
Typical RAG workflow (a minimal code sketch follows the list):
Retrieval: Match the user query (via keywords or, more commonly, vector embeddings) against a large corpus (document store, database, or web index) and fetch the most relevant text fragments.
Augmentation: Append the retrieved passages to the original prompt, providing up‑to‑date context.
Generation: The LLM generates an answer grounded in the retrieved facts, reducing hallucinations.
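To make the three steps concrete, here is a minimal sketch in Python. The retrieval step uses naive keyword overlap for brevity (production systems typically use vector embeddings and an index), and call_llm is a hypothetical stand‑in for any LLM client:

```python
# Minimal RAG sketch: retrieve → augment → generate.
# CORPUS and the overlap scoring are illustrative; call_llm is a stub.

CORPUS = [
    "MCP was released by Anthropic in November 2024.",
    "RAG retrieves passages from an external knowledge base at query time.",
    "Diffusion models generate images by iteratively removing noise.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Score each document by word overlap with the query; keep the top k."""
    q = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def augment(query: str, passages: list[str]) -> str:
    """Prepend the retrieved passages to the user's question."""
    context = "\n".join(passages)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

def call_llm(prompt: str) -> str:
    return f"[LLM answer grounded in a {len(prompt)}-char prompt]"  # stub

def rag_answer(query: str) -> str:
    return call_llm(augment(query, retrieve(query)))

print(rag_answer("When was MCP released and by whom?"))
```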
Because RAG still relies on static knowledge sources, it cannot answer truly real‑time queries such as current weather without additional tool support.
Function Calling
Function Calling equips LLMs with the ability to invoke external APIs or executable functions when a request requires up‑to‑date information or side‑effects.
Core capabilities:
Detect whether a user request needs a tool.
Automatically generate a structured JSON payload with the required parameters.
Execute the function, capture the result, and feed it back to the model for the final response.
Example interaction:
"I’m traveling to Hangzhou tomorrow, please check the weather."
Traditional LLM response:
"I can only provide information up to October 2025."
Model with Function Calling:
It determines that the request requires a get_weather function, generates {"city":"Hangzhou"}, calls the weather API, receives the forecast, and replies: "Tomorrow Hangzhou 24 °C, light rain, bring an umbrella."
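The loop below sketches this behavior in Python. The tool schema follows the JSON‑Schema convention shared by most function‑calling APIs; call_llm is a hypothetical stub standing in for a vendor SDK, and get_weather returns canned data:

```python
import json

def get_weather(city: str) -> dict:
    """Hypothetical weather lookup; a real version would call a live API."""
    return {"city": city, "temp_c": 24, "forecast": "light rain"}

# Tool schema advertised to the model (JSON-Schema style).
TOOLS = [{
    "name": "get_weather",
    "description": "Get tomorrow's weather forecast for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]
REGISTRY = {"get_weather": get_weather}

def call_llm(messages, tools):
    """Stub model: the first pass emits a structured tool call; once a tool
    result has been appended, it phrases the final answer from that result."""
    if messages[-1]["role"] == "tool":
        data = json.loads(messages[-1]["content"])
        return {"content": f"Tomorrow {data['city']} {data['temp_c']} °C, "
                           f"{data['forecast']}, bring an umbrella."}
    return {"tool_call": {"name": "get_weather",
                          "arguments": json.dumps({"city": "Hangzhou"})}}

def answer(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    reply = call_llm(messages, TOOLS)
    while "tool_call" in reply:                   # model chose a tool
        call = reply["tool_call"]
        args = json.loads(call["arguments"])      # structured JSON payload
        result = REGISTRY[call["name"]](**args)   # execute the function
        messages.append({"role": "tool", "content": json.dumps(result)})
        reply = call_llm(messages, TOOLS)         # feed the result back
    return reply["content"]

print(answer("I'm traveling to Hangzhou tomorrow, please check the weather."))
```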
Agents
Agents combine RAG and Function Calling to perform multi‑step tasks autonomously. An agent can plan, decide, and execute a sequence of tool calls without explicit step‑by‑step user instructions.
Typical travel‑planning agent workflow (user wants a Shanghai→Shenzhen itinerary for a national holiday):
Query weather for the travel date.
Check highway traffic conditions.
Locate fuel stations and service areas.
Suggest mid‑trip accommodation.
Output a complete travel itinerary.
The agent iterates through a "think → plan → decide → act" loop, producing the final plan after all sub‑tasks are resolved.
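The sketch below compresses that loop into plain Python. Every tool and the fixed PLAN are hypothetical stubs; in a real agent the plan is produced step by step by an LLM that sees earlier observations:

```python
# Minimal agent sketch: decide → act → observe, repeated until done.
# All tools return canned data; a real agent would call live services.

def check_weather(date):  return "sunny, 26 °C"
def check_traffic(route): return "heavy near Hangzhou, clears after 10:00"
def find_services(route): return "fuel @ km 180, rest area @ km 310"
def suggest_hotel(spot):  return "mid-trip stop: hotel near the midpoint"

TOOLS = {"weather": check_weather, "traffic": check_traffic,
         "services": find_services, "hotel": suggest_hotel}

# A fixed plan stands in for LLM-driven planning.
PLAN = [("weather", "holiday date"), ("traffic", "Shanghai→Shenzhen"),
        ("services", "Shanghai→Shenzhen"), ("hotel", "midpoint")]

def run_agent(goal: str) -> str:
    observations = []
    for tool_name, arg in PLAN:                        # decide which tool
        result = TOOLS[tool_name](arg)                 # act: execute it
        observations.append(f"{tool_name}: {result}")  # observe the result
    # Final synthesis (an LLM call in a real agent).
    return f"Itinerary for {goal}:\n" + "\n".join(observations)

print(run_agent("Shanghai→Shenzhen national-holiday trip"))
```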
Model Context Protocol (MCP)
MCP (Model Context Protocol) is an open, universal protocol released by Anthropic in November 2024 to standardize how LLMs interact with external data sources, tools, and prompts. It acts as a "USB‑C" for AI applications, allowing plug‑and‑play integration of diverse resources.
Architecture
MCP Host: The application that launches the LLM (e.g., Cursor, Claude Desktop).
MCP Client: Maintains a 1:1 connection to an MCP Server; a host can run multiple clients.
MCP Server: Lightweight process that exposes context, tools, and resources to the client via the protocol.
Local resources: Files, databases, or APIs on the client side.
Remote services: External APIs or data stores.
Communication
MCP uses the JSON‑RPC 2.0 message format with three message types: request, response, and notification. Two primary transport modes are supported:
STDIO: Standard input/output for local processes.
SSE + HTTP POST: Server‑Sent Events (server→client) plus HTTP POST (client→server) for remote services.
Streamable HTTP: An optional, newer transport that carries both directions, including streamed responses, over a single HTTP endpoint.
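On the wire, the three message types look like this. The snippet frames them the way the STDIO transport does (one JSON object per line); tools/call and notifications/progress are methods from the MCP specification, while the id and payload values are illustrative:

```python
import json, sys

request = {              # client → server: invoke a tool, expects a reply
    "jsonrpc": "2.0", "id": 1,
    "method": "tools/call",
    "params": {"name": "fetch_weather", "arguments": {"city": "Hangzhou"}},
}
response = {             # server → client: result matched to the request id
    "jsonrpc": "2.0", "id": 1,
    "result": {"content": [{"type": "text", "text": "24 °C, light rain"}]},
}
notification = {         # no id: fire-and-forget, no response expected
    "jsonrpc": "2.0",
    "method": "notifications/progress",
    "params": {"progressToken": "t1", "progress": 50},
}

for msg in (request, response, notification):
    sys.stdout.write(json.dumps(msg) + "\n")  # newline-delimited over STDIO
```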
Primitives
Servers can expose three primitive types:
Prompts: Pre‑written prompt templates that can be inserted into the LLM’s input.
Resources: Read‑only data objects (e.g., database rows, documents) that the model can consume as context.
Tools: Callable functions (e.g., send_email, fetch_weather) that require user approval before execution.
Clients can provide two auxiliary primitives to assist servers:
Roots: Authorized file‑system entry points allowing the server to read client‑side files securely.
Sampling: A reverse call where the server asks the client’s LLM to generate text, enabling multi‑turn reasoning within an agent.
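A minimal server exposing all three server primitives can be sketched with the FastMCP class from the official Python SDK (pip install mcp); the decorator names below match the SDK’s documented API, but verify against the current release:

```python
# Minimal MCP server: one tool, one resource, one prompt, served over STDIO.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather-demo")

@mcp.tool()
def fetch_weather(city: str) -> str:
    """Tool: callable function (stubbed; a real server would hit an API)."""
    return f"{city}: 24 °C, light rain"

@mcp.resource("cities://supported")
def supported_cities() -> str:
    """Resource: read-only data the model can consume as context."""
    return "Hangzhou, Shanghai, Shenzhen"

@mcp.prompt()
def trip_prompt(city: str) -> str:
    """Prompt: pre-written template the client can insert into the input."""
    return f"Plan a one-day trip to {city}, taking the weather into account."

if __name__ == "__main__":
    mcp.run()  # STDIO transport by default
```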
Current Challenges
Manually configured MCP services advertise every registered tool’s description to the model, which wastes tokens when many tools are exposed.
There is no built‑in tool‑availability detection, so unreachable services can silently degrade the user experience.
The language ecosystem is still limited: most examples are in JavaScript or Python, and Java support remains immature.