Beyond LLM Limits: Function Calling, MCP, and A2A Compared

The article examines the inherent knowledge cutoff of large language models, introduces function calling, Model Context Protocol (MCP), and Agent‑to‑Agent (A2A) as solutions for real‑time data access, compares their architectures, communication patterns, and use cases, and discusses their respective strengths and drawbacks.


LLM Limitations

Large language models (LLMs) excel at reasoning thanks to advances in algorithms, data scale, and hardware, but they are limited to the knowledge present in their training data. They cannot retrieve up‑to‑date information or act on external systems without additional mechanisms.

Function Calling

Function calling, introduced by OpenAI in 2023 for its Chat Completions API, lets a model decide, based on the user prompt, whether to invoke a caller‑provided function, returning a structured tool_calls object instead of plain text. This bridges LLMs to external services via well‑defined JSON schemas.

The API request includes a tools array that describes each callable function. Example (Python, using DeepSeek's endpoint; DEEPSEEK_API_KEY must be set in the environment, and question.question holds the user's query):

import os
import requests

headers = {
    "Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}",
    "Content-Type": "application/json",
}

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Get the current time",
            "parameters": {"type": "object", "properties": {}, "required": []}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get today's weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City or country to get the weather for, e.g. Beijing, Tokyo, Singapore"}
                },
                "required": ["location"]
            }
        }
    }
]

response = requests.post(
    "https://api.deepseek.com/v1/chat/completions",
    headers=headers,
    json={
        "messages": [{"role": "user", "content": question.question}],
        "model": "deepseek-chat",
        "tools": tools,
        "tool_choice": "auto"
    }
)

The tool_choice field controls the behavior: "auto" (the default) lets the model decide whether to call a tool; {"type": "function", "function": {"name": "your_function"}} forces a specific function; "none" disables tool calls.
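As a minimal sketch, the three accepted values can be written as Python literals; the forced form follows the OpenAI‑style schema, and get_current_time refers to the function declared in the tools array above:

```python
# Let the model decide whether to call a tool (the default).
auto_choice = "auto"

# Force a specific function; the name is wrapped in a {"type": "function"} object.
forced_choice = {"type": "function", "function": {"name": "get_current_time"}}

# Disable tool calls entirely; the model must answer in plain text.
no_tools = "none"
```

Any of these values can be passed as the "tool_choice" field of the request body shown earlier.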

If the model decides a tool is needed, it returns a tool_calls object. The server then executes the indicated function, appends the result as a tool message, and sends a follow‑up request to obtain the final answer. Example response fragment:

{
    "id": "a48ab60a-48a7-4f7b-a09e-dd93eead3c4d",
    "object": "chat.completion",
    "model": "deepseek-chat",
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "",
            "tool_calls": [{
                "id": "call_0_3b76f546-f8c3-4f67-93cd-9bffb62dc1bf",
                "type": "function",
                "function": {"name": "get_current_time", "arguments": "{}"}
            }]
        },
        "finish_reason": "tool_calls"
    }]
}

Python code that processes the response, invokes the function, and sends the second request:

# Define the function that will be called
import datetime

def get_current_time():
    return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

if response.status_code == 200:
    result = response.json()
    message = result["choices"][0]["message"]
    if "tool_calls" in message:
        tool_call = message["tool_calls"][0]
        if tool_call["function"]["name"] == "get_current_time":
            current_time = get_current_time()
            new_data = {
                "messages": [
                    {"role": "user", "content": question.question},
                    message,
                    {"role": "tool", "tool_call_id": tool_call["id"], "content": current_time}
                ],
                "model": "deepseek-chat",
                "tools": tools
            }
            new_response = requests.post(
                "https://api.deepseek.com/v1/chat/completions",
                headers=headers,
                json=new_data
            )
            if new_response.status_code == 200:
                final = new_response.json()
                answer = final["choices"][0]["message"]["content"]
                print({"answer": answer})
            else:
                raise Exception("Second API call failed")
    else:
        print({"answer": message["content"]})
else:
    raise Exception("Initial API call failed")

Limitations of raw function calling include the lack of a unified standard across providers, inconsistent context handling, and growing complexity when many tools or dependencies are involved.

Model Context Protocol (MCP)

MCP, proposed by Anthropic, standardizes how applications provide context and tools to LLMs. It acts like a "USB‑C" for AI, enabling dynamic, multi‑tool integration while keeping data handling local when needed.

Four core components:

Host – the AI application that needs external data (e.g., Claude Desktop, IDE).

Client – a bridge that maintains a 1:1 connection to the server, handling routing, capability negotiation, and subscription management.

Server – supplies external data, APIs, or services and can recursively call other servers to build pipelines.

Base Protocol – defines message formats, lifecycle, and transport (JSON‑RPC 2.0 over stdio or HTTP, with Server‑Sent Events for streaming).
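Since the base protocol is JSON‑RPC 2.0, the client‑to‑server traffic can be sketched as plain JSON messages. The method names below (tools/list, tools/call) follow the MCP specification; the tool name and arguments are illustrative placeholders reusing the weather example from earlier:

```python
import json

# A client asking an MCP server which tools it exposes.
list_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# Invoking one of the advertised tools; "get_weather" and its
# arguments are placeholders for whatever the server declared.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "get_weather",
        "arguments": {"location": "Beijing"},
    },
}

# Both messages are serialized and sent over the chosen transport.
wire_payload = json.dumps(call_request)
```

The server replies with a matching JSON‑RPC result containing the tool output, which the host then feeds back to the model.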

Key advantages:

Reusable server implementations; a single server can serve many clients.

Local‑first data handling satisfies compliance requirements.

Native support for multi‑step workflows via recursive server calls.

Growing open‑source ecosystem (numerous community‑maintained servers).

[Figure: Function calling workflow diagram]
[Figure: MCP architecture diagram]
[Figure: MCP component diagram]

Agent‑to‑Agent (A2A)

A2A, announced by Google in April 2025, is an open protocol that enables AI agents from different frameworks and vendors to discover each other, advertise capabilities, coordinate tasks, and cooperate securely. It complements MCP: MCP solves model‑to‑tool communication, while A2A solves agent‑to‑agent interaction.

Core Architecture

Two agent roles are defined:

Client Agent – receives user requests, formulates concrete tasks, and invokes remote agents.

Remote Agent – executes the assigned task and returns results.

Key components of each agent:

Agent Card – a JSON document (typically hosted at /.well-known/agent.json) that describes the agent’s capabilities, authentication requirements, and endpoint URLs.

Standard Message & Task Structures – based on JSON‑RPC 2.0, defining Task, Message, Part, and Artifact objects.

Multi‑Content Support – TextPart, FilePart, DataPart allow agents to exchange plain text, files, or structured JSON.

Streaming via Server‑Sent Events (SSE) – asynchronous endpoints tasks/sendSubscribe and tasks/resubscribe push status updates and artifacts in real time.

Push Notification (Webhook) – optional tasks/pushNotification/set lets an agent proactively POST updates to a client‑provided URL.

Task Lifecycle Management – well‑defined states (submitted, working, input‑required, completed, failed, canceled) and transitions.

Authentication & Security – agents declare required auth mechanisms in the Agent Card and in push‑notification configs.

Reference Implementations – libraries and examples for Python and JavaScript/TypeScript, plus integrations with ADK, CrewAI, LangGraph, Genkit, LlamaIndex, Marvin, Semantic Kernel.
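To make the Agent Card concrete, here is an illustrative example as a Python dict. The field names follow the A2A specification, but the concrete values (agent name, URL, skill id, auth scheme) are made up for this sketch:

```python
import json

# An illustrative Agent Card; a real one is served as JSON at
# https://<agent-host>/.well-known/agent.json for discovery.
agent_card = {
    "name": "weather-agent",
    "description": "Answers weather questions for a given city.",
    "url": "https://example.com/a2a",
    "version": "1.0.0",
    "capabilities": {"streaming": True, "pushNotifications": False},
    "authentication": {"schemes": ["bearer"]},
    "skills": [
        {
            "id": "get-weather",
            "name": "Get weather",
            "description": "Current conditions for a city.",
        }
    ],
}

card_json = json.dumps(agent_card, indent=2)
```

A client agent fetches this document, checks the capabilities and authentication requirements, and then sends tasks to the advertised url.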

[Figure: A2A architecture diagram]

A2A Workflow

[Figure: A2A workflow diagram]

Discovery – the client fetches the remote agent’s Agent Card (e.g., GET https://example.com/.well-known/agent.json) to learn its capabilities.

Initiation – the client sends a task via tasks/send (synchronous) or tasks/sendSubscribe (asynchronous streaming).

Processing – the remote agent transitions the task state from submitted to working. If additional input is required, the state becomes input‑required and the client can resend tasks/send with the missing data.

During streaming, status updates (TaskStatusUpdateEvent) and partial results (TaskArtifactUpdateEvent) are pushed via SSE.

Eventually the task reaches a terminal state: completed, failed, or canceled.

Query & Management – the client may call tasks/get to query status, tasks/cancel to abort, or tasks/resubscribe to re‑attach to a streaming task.

Push Notification (optional) – if a webhook URL was configured, the remote agent can POST status or artifact updates without the client polling.

Artifacts – upon completion, the task returns one or more artifacts (text, files, JSON) that the client can consume.
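The initiation step above can be sketched as the JSON‑RPC payload a client agent would POST to the remote agent's endpoint. The method name and the Message/Part shapes follow the A2A specification; the task id and the question text are illustrative:

```python
import json
import uuid

# Client-generated task id; A2A tasks are addressed by id across
# send, get, cancel, and resubscribe calls.
task_id = str(uuid.uuid4())

# A tasks/send request carrying a single TextPart from the user.
send_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "id": task_id,
        "message": {
            "role": "user",
            "parts": [
                {"type": "text", "text": "What's the weather in Tokyo?"}
            ],
        },
    },
}

# In a real exchange this body is POSTed to the URL from the remote
# agent's Agent Card; here we only build the payload.
body = json.dumps(send_request)
```

Swapping the method for tasks/sendSubscribe turns this into the streaming variant, with the response delivered as SSE events instead of a single JSON reply.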

Comparison of Function Calling, MCP, and A2A

Focus : Function Calling – model ↔ single tool; MCP – model ↔ multiple tools; A2A – agent ↔ agent.

Communication Mode : Function Calling uses a one‑shot request/response within a chat completion; MCP uses bidirectional JSON‑RPC sessions (stdio or HTTP); A2A uses JSON‑RPC over HTTP with SSE streaming.

Scalability : Function Calling scales as M × N (models × tools); MCP as M + N (single server mediates many tools); A2A as K agents (each agent can be a client or server).

Invocation Style : Function Calling is explicit at the application layer; MCP allows recursive calls inside the server; A2A relies on asynchronous task subscription.

Typical Use‑Case : Function Calling – add a single capability to a model; MCP – complex enterprise apps needing many data sources/tools; A2A – multi‑agent pipelines for sophisticated task decomposition.
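The scalability claims above reduce to simple counting. With hypothetical figures of 5 models and 20 tools:

```python
# Back-of-the-envelope integration counts for the scaling comparison.
models, num_tools = 5, 20

# Raw function calling: every model needs its own adapter for every tool.
function_calling_integrations = models * num_tools  # M x N

# MCP: each model speaks the protocol once, each tool is wrapped once.
mcp_integrations = models + num_tools  # M + N

print(function_calling_integrations, mcp_integrations)  # 100 25
```

The gap widens as either side grows, which is the core argument for a shared protocol layer.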

These protocols are complementary rather than mutually exclusive:

Function Calling enables a model to actively invoke a tool.

MCP provides a unified plug‑in interface so many tools can be accessed consistently.

A2A orchestrates multiple agents, allowing them to cooperate on large, multi‑step problems.

Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
