Why Streamable HTTP Is Replacing SSE in AI Communication: An MCP Protocol Deep Dive
This article explains how the Model Context Protocol (MCP) standardizes AI‑assistant communication, compares the traditional Server‑Sent Events (SSE) transport with the newer Streamable HTTP mechanism, and provides step‑by‑step code examples for building both MCP servers and clients that leverage Streamable HTTP for bidirectional, session‑aware data exchange.
MCP Protocol Overview
MCP, an open standard driven by Anthropic, defines a uniform interface between large language models (LLMs) and external data sources or tools, enabling LLMs to fetch context dynamically and extend their capabilities. Its core architecture consists of an MCP client (implemented on the LLM side), an MCP server (the external system), and a bidirectional context‑information exchange layer.
MCP Client : builds requests on the LLM side and sends them to the MCP server.
MCP Server : receives client requests, interacts with actual data sources or tools, formats the response according to the MCP specification, and returns it.
Context Information Exchange : facilitates two‑way context sharing between LLMs and external systems.
Using MCP, developers can integrate AI assistants with various applications and data sources more easily, achieving efficient AI communication.
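On the wire, this exchange is JSON-RPC 2.0: the client asks the server to run a tool, and the server returns the tool's output as context for the model. The sketch below shows the shape of a `tools/call` round trip (the tool name and arguments are placeholders, not from a real server):

```python
import json

# A tools/call request as the MCP client would send it
request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "wuhan"}},
}

# ...and the shape of the server's reply: tool output wrapped as content blocks
response = {
    "jsonrpc": "2.0",
    "id": 7,
    "result": {"content": [{"type": "text", "text": "City: Wuhan ..."}]},
}

print(json.dumps(request, indent=2))
```

Whatever transport carries them, these messages stay the same; SSE and Streamable HTTP differ only in how they move over HTTP.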
Limitations of Traditional SSE
Server‑Sent Events (SSE) is a one‑way HTTP protocol that pushes real‑time updates from server to client. Its drawbacks for modern AI use cases include:
One‑way communication : data flows only from server to client, so interactive dialogues are impossible.
Missing session management : no built‑in mechanism for maintaining complex conversational state.
Weak connection recovery : limited ability to resume after network interruptions, risking data loss.
Limited data formats : events carry UTF‑8 text only, so binary or structured payloads need extra encoding.
These constraints make SSE increasingly unsuitable for contemporary AI applications.
Innovations of Streamable HTTP
Streamable HTTP is the transport mechanism recommended by the MCP framework. It leverages standard HTTP to provide efficient, bidirectional data streaming with the following key features:
Single‑endpoint communication : a single HTTP endpoint handles all MCP traffic, simplifying network architecture.
Multiple response modes : supports both batch JSON responses and streaming (SSE‑style) responses to meet different needs.
Built‑in session management : uses the Mcp-Session-Id header to maintain state automatically.
Connection recoverability : can resume streams after interruptions, improving reliability.
Flexible authentication : supports various auth schemes for enhanced security.
CORS configuration : provides flexible cross‑origin settings for easy web integration.
These capabilities give Streamable HTTP a clear advantage over traditional SSE in AI communication scenarios.
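The session mechanics can be sketched at the HTTP level (the endpoint path, `Mcp-Session-Id` header, and `2025-03-26` protocol revision follow the MCP specification; the two helper functions are illustrative, not part of any SDK). The client POSTs a JSON-RPC `initialize` request, the server assigns a session id in a response header, and the client echoes that header on every later request to the same endpoint:

```python
import json

MCP_ENDPOINT = "/mcp"  # a single endpoint handles all MCP traffic

def initialize_request() -> tuple[dict, bytes]:
    """First request of a session: no session header yet."""
    headers = {
        "Content-Type": "application/json",
        # the client accepts either a plain JSON reply or an SSE stream
        "Accept": "application/json, text/event-stream",
    }
    body = json.dumps({
        "jsonrpc": "2.0", "id": 1, "method": "initialize",
        "params": {"protocolVersion": "2025-03-26", "capabilities": {},
                   "clientInfo": {"name": "demo", "version": "0.1"}},
    }).encode()
    return headers, body

def follow_up_request(session_id: str, method: str, req_id: int) -> tuple[dict, bytes]:
    """Every later request echoes the server-assigned session id."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "Mcp-Session-Id": session_id,  # this header is what keeps the session alive
    }
    body = json.dumps({"jsonrpc": "2.0", "id": req_id, "method": method}).encode()
    return headers, body

headers, _ = follow_up_request("abc123", "tools/list", 2)
print(headers["Mcp-Session-Id"])  # → abc123
```

In practice the SDK builds these requests for you; the point is that session state rides on a plain HTTP header, so any proxy or load balancer that passes headers through works unchanged.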
Technical Comparison: SSE vs. Streamable HTTP
Compared side‑by‑side, Streamable HTTP offers bidirectional communication, built‑in session handling, multiple response modes, and reliable reconnection, whereas SSE provides only one‑way text streaming. Because it runs over plain HTTP, Streamable HTTP keeps all of this compatible with existing infrastructure such as proxies, load balancers, and CORS policies.
Implementation and Demo
Based on Streamable HTTP – MCP Server
A weather‑query server was previously built using the SSE transport. Switching it to Streamable HTTP requires only one change: pass transport='streamable-http' to mcp.run(). The full server code is shown below.
from typing import Any
import httpx
from mcp.server.fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP(
    name="weather",
    host="0.0.0.0",
    port=8002,
    description="Get weather information by city name (pinyin) or by latitude/longitude",
    sse_path="/mcp",
)

# Constants
NWS_API_BASE = "https://api.openweathermap.org/data/2.5/weather"
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
API_KEY = "24ecadbe4bb3d55cb1f06ea48a41ac51"  # in production, read this from an environment variable

def kelvin_to_celsius(kelvin: float) -> float:
    return kelvin - 273.15

async def get_weather_from_cityname(cityname: str) -> dict[str, Any] | None:
    """Send a request to openweathermap with proper error handling."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/geo+json"}
    params = {"q": cityname, "appid": API_KEY}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(NWS_API_BASE, headers=headers, params=params)
            response.raise_for_status()
            return response.json()
        except Exception:
            return None

async def get_weather_from_latitude_longitude(latitude: float, longitude: float) -> dict[str, Any] | None:
    """Send a request to openweathermap with proper error handling."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/geo+json"}
    params = {"lat": latitude, "lon": longitude, "appid": API_KEY}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(NWS_API_BASE, headers=headers, params=params)
            response.raise_for_status()
            return response.json()
        except Exception:
            return None

def format_alert(feature: dict | None) -> str:
    """Format the weather payload returned by the API as readable text."""
    if feature is None:  # the HTTP request itself failed
        return "Request failed, please try again later."
    if feature["cod"] == 404:
        return "Invalid parameter; please check that the city name is correct."
    elif feature["cod"] == 401:
        return "API key error; please check that the API key is correct."
    elif feature["cod"] == 200:
        return f"""
City: {feature.get('name', 'Unknown')}
Weather: {feature.get('weather', [{}])[0].get('description', 'Unknown')}
Temperature: {kelvin_to_celsius(feature.get('main', {}).get('temp', 0)):.2f}°C
Humidity: {feature.get('main', {}).get('humidity', 0)}%
Wind Speed: {feature.get('wind', {}).get('speed', 0):.2f} m/s
"""
    else:
        return "Unknown error, please try again later."

@mcp.tool()
async def get_weather_from_cityname_tool(city: str) -> str:
    """Get weather information for a city.

    Args:
        city: City name (e.g., "wuhan"). For Chinese cities, please use pinyin
    """
    data = await get_weather_from_cityname(city)
    return format_alert(data)

@mcp.tool()
async def get_weather_from_latitude_longitude_tool(latitude: float, longitude: float) -> str:
    """Get weather information for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    data = await get_weather_from_latitude_longitude(latitude, longitude)
    return format_alert(data)

if __name__ == "__main__":
    print("Starting server...")
    mcp.run(transport='streamable-http')
After updating mcp[cli] to version 1.8.0, start the server and test it with the Cherry Studio client (or any compatible UI). In the UI, add a new MCP server, configure the endpoint URL (e.g., http://127.0.0.1:8002/mcp), select an LLM (such as Moonshot), and send a query like "What's the weather in Wuhan and Beijing?" to observe concurrent responses.
Based on Streamable HTTP – MCP Client
The client code mirrors the earlier SSE implementation but adds support for the streamable_http transport. It loads a JSON configuration file that lists active MCP servers, establishes connections, and provides an interactive chat loop that can invoke server‑side tools.
import asyncio
from typing import Optional
from contextlib import AsyncExitStack
import json
import os
import re

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp.client.sse import sse_client
from mcp.client.streamable_http import streamablehttp_client
from dotenv import load_dotenv
from openai import OpenAI
from lxml import etree

load_dotenv()

class MCPClient:
    def __init__(self):
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.API_KEY = os.getenv("API_KEY")
        self.BASE_URL = os.getenv("BASE_URL")
        self.MODEL = os.getenv("MODEL")
        self.client = OpenAI(api_key=self.API_KEY, base_url=self.BASE_URL)
        self.sessions = {}
        self.messages = []
        with open("./MCP_Prompt.txt", "r", encoding="utf-8") as file:
            self.system_prompt = file.read()

    async def mcp_json_config(self, mcp_json_file):
        try:
            with open(mcp_json_file, 'r') as f:
                mcp_config: dict = json.load(f)
        except json.JSONDecodeError:
            raise ValueError("Invalid MCP config")
        servers_config = mcp_config.get('mcpServers', {})
        for k, v in servers_config.items():
            try:
                if not v.get('isActive', False):
                    continue
                print('-' * 50)
                mcp_name = v.get('name', k)
                mcp_type: str = v.get('type', 'stdio')
                if mcp_type.lower() == 'stdio':
                    command = v.get('command')
                    args = v.get('args', [])
                    env = v.get('env', {})
                    if command is None:
                        raise ValueError(f"{mcp_name} command is empty.")
                    if args == []:
                        raise ValueError(f"{mcp_name} args is empty.")
                    await self.connect_to_stdio_server(mcp_name, command, args, env)
                elif mcp_type.lower() == 'sse':
                    server_url = v.get('url')
                    if server_url is None:
                        raise ValueError(f"{mcp_name} server_url is empty.")
                    await self.connect_to_sse_server(mcp_name, server_url)
                elif mcp_type.lower() == 'streamable_http':
                    server_url = v.get('url')
                    if server_url is None:
                        raise ValueError(f"{mcp_name} server_url is empty.")
                    await self.connect_to_streamable_http_server(mcp_name, server_url)
                else:
                    raise ValueError(f"{mcp_name} mcp type must be one of [stdio, sse, streamable_http].")
            except Exception as e:
                print(f"Error connecting to {mcp_name}: {e}")

    async def _register_session(self, mcp_name, session: ClientSession):
        """Shared post-connection setup: initialize the session, list the
        server's tools, and advertise them in the system prompt."""
        self.session = session
        self.sessions[mcp_name] = session
        await session.initialize()
        response = await session.list_tools()
        available_tools = [
            '##' + mcp_name + '\n### Available Tools\n- ' + tool.name + "\n"
            + tool.description + "\n" + json.dumps(tool.inputSchema)
            for tool in response.tools
        ]
        self.system_prompt = self.system_prompt.replace(
            "<$MCP_INFO$>", "\n".join(available_tools) + "\n<$MCP_INFO$>"
        )
        print(f"Successfully connected to {mcp_name} server with tools:",
              [tool.name for tool in response.tools])

    async def connect_to_stdio_server(self, mcp_name, command: str, args: list[str], env: dict[str, str] = {}):
        server_params = StdioServerParameters(command=command, args=args, env=env)
        read, write = await self.exit_stack.enter_async_context(stdio_client(server_params))
        session = await self.exit_stack.enter_async_context(ClientSession(read, write))
        await self._register_session(mcp_name, session)

    async def connect_to_sse_server(self, mcp_name, server_url: str):
        read, write = await self.exit_stack.enter_async_context(sse_client(server_url))
        session = await self.exit_stack.enter_async_context(ClientSession(read, write))
        await self._register_session(mcp_name, session)

    async def connect_to_streamable_http_server(self, mcp_name, server_url: str):
        # streamablehttp_client also yields a callable for reading the session id
        read, write, _ = await self.exit_stack.enter_async_context(streamablehttp_client(server_url))
        session = await self.exit_stack.enter_async_context(ClientSession(read, write))
        await self._register_session(mcp_name, session)

    async def process_query(self, query: str) -> str:
        if not self.messages:  # add the system prompt only once per conversation
            self.messages.append({"role": "system", "content": self.system_prompt})
        self.messages.append({"role": "user", "content": query})
        response = self.client.chat.completions.create(model=self.MODEL, max_tokens=1024, messages=self.messages)
        final_text = []
        content = response.choices[0].message.content
        if '<use_mcp_tool>' not in content:
            final_text.append(content)
        else:
            server_name, tool_name, tool_args = self.parse_tool_string(content)
            result = await self.sessions[server_name].call_tool(tool_name, tool_args)
            self.messages.append({"role": "assistant", "content": content})
            self.messages.append({"role": "user", "content": f"[Tool {tool_name} returned: {result}]"})
            response = self.client.chat.completions.create(model=self.MODEL, max_tokens=1024, messages=self.messages)
            final_text.append(response.choices[0].message.content)
        return "\n".join(final_text)

    def parse_tool_string(self, tool_string: str) -> tuple[str, str, dict]:
        tool_string = re.findall("(<use_mcp_tool>.*?</use_mcp_tool>)", tool_string, re.S)[0]
        root = etree.fromstring(tool_string)
        server_name = root.find('server_name').text
        tool_name = root.find('tool_name').text
        try:
            tool_args = json.loads(root.find('arguments').text)
        except json.JSONDecodeError:
            raise ValueError("Invalid tool arguments")
        return server_name, tool_name, tool_args

    async def chat_loop(self):
        print("\nMCP Client Started!")
        print("Type your queries or 'quit' to exit.")
        self.messages = []
        while True:
            try:
                query = input("\nQuery: ").strip()
                if query.lower() == 'quit':
                    break
                if query == '':
                    print("Please enter a query.")
                    continue
                response = await self.process_query(query)
                print(response)
            except Exception as e:
                print(f"\nError: {str(e)}")

    async def cleanup(self):
        await self.exit_stack.aclose()

async def main():
    client = MCPClient()
    mcp_config_file = './mcp.json'
    await client.mcp_json_config(mcp_config_file)
    await client.chat_loop()
    await client.cleanup()

if __name__ == "__main__":
    asyncio.run(main())
Two example mcp.json configurations are provided: one defining a single weather‑http server, and another defining both a time‑http and a weather‑http server. After loading the configuration, the client can invoke the defined tools (e.g., weather queries) and receive responses via Streamable HTTP.
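The client's parser expects a shape like the following single‑server sketch (the keys are taken from the code above; the server name is an assumption, and the URL matches the weather server from the previous section):

```json
{
  "mcpServers": {
    "weather-http": {
      "isActive": true,
      "name": "weather-http",
      "type": "streamable_http",
      "url": "http://127.0.0.1:8002/mcp"
    }
  }
}
```

Adding a second entry under "mcpServers" (e.g., a time‑http server) registers its tools alongside the weather tools in the same system prompt.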
Conclusion
Streamable HTTP, as the recommended transport in the MCP protocol, combines the broad compatibility of standard HTTP with the real‑time push capabilities of SSE, delivering an efficient and flexible communication method for AI assistants. As AI‑assistant‑to‑application communication demands grow, Streamable HTTP is poised to become the new standard for AI communication.
Further resources:
MCP official documentation: https://modelcontextprotocol.io/
MCP framework documentation: https://mcp-framework.com/docs/Transports/http-stream-transport/
MCP GitHub repository: https://github.com/modelcontextprotocol/python-sdk