How AI Gateway Redefines AI Application Infrastructure with Serverless Flexibility
The article provides a comprehensive overview of the AI Gateway product, detailing its evolution, core capabilities across model, tool, and agent access, security features, the open‑source HiMarket platform, and the new Serverless edition that dramatically lowers entry costs for AI workloads.
AI Gateway Evolution and Core Capabilities
AI Gateway has been available as a cloud product for six months, and in that time it has changed substantially, from its core to its user‑facing features. It grew out of the need to support AI‑native architectures: AI applications need a single entry point through which they can reach models, tools, and other agents quickly, stably, and securely.
Model Access Challenges ("Three‑many, Two‑high")
Multiple models: Different vendors expose incompatible APIs, making seamless switching difficult.
Multiple modalities: Various transport protocols (SSE, WebSocket, WebRTC) and request/response patterns increase infrastructure complexity.
Multiple scenarios: Real‑time, high‑stability, and other use‑cases require distinct throttling strategies.
High security: Model calls risk data leakage and compliance violations.
High stability: Model services have low rate limits and unstable response times, affecting overall AI application availability.
The AI Gateway addresses these issues by offering configurable routing strategies, OpenAI‑compatible interfaces, support for HTTP and WebSocket protocols, per‑consumer authentication, KMS‑backed secret management, comprehensive observability, and built‑in security guardrails.
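One routing strategy a gateway of this kind could apply is priority-ordered fallback: try the preferred model first and move to the next one when it fails. The sketch below is only an illustration of that idea, not the product's implementation. `Backend`, `route_with_fallback`, and the two stand-in model functions are hypothetical names, and the article does not say which strategies the gateway ships with.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Backend:
    """A hypothetical model backend behind one unified interface."""
    name: str
    call: Callable[[str], str]  # takes a prompt, returns a completion


class AllBackendsFailed(Exception):
    """Raised when every configured backend has failed."""


def route_with_fallback(prompt: str, backends: List[Backend]) -> Tuple[str, str]:
    """Try each backend in priority order; return (backend_name, response)."""
    errors = []
    for backend in backends:
        try:
            return backend.name, backend.call(prompt)
        except Exception as exc:  # a real gateway would match specific errors
            errors.append(f"{backend.name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-ins for real model calls (assumptions for the demo only).
def flaky_model(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def stable_model(prompt: str) -> str:
    return f"echo: {prompt}"


backends = [
    Backend("primary-model", flaky_model),
    Backend("fallback-model", stable_model),
]
result = route_with_fallback("hello", backends)
print(result)
```

Because every backend sits behind the same call signature, the application code does not change when the order of `backends` changes, which is the practical benefit of a unified, OpenAI‑compatible entry point.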
Tool Access Challenges
Precision: The gateway supports both legacy HTTP services and hosted MCP servers, allowing dynamic tool composition and intelligent routing to return only relevant tools.
Security: Consumer‑level authentication can be applied per tool, and AI‑security modules intercept injection attacks and enforce data sanitization before model invocation.
Agent Access Challenges
Stability: The gateway integrates health checks, gray‑release mechanisms, and multi‑level rate limiting to protect downstream agents.
Flexibility: Service discovery and REST‑to‑A2A conversion enable heterogeneous AI agents to be exposed uniformly, with optional secondary authentication for low‑code agents.
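Multi-level rate limiting can be pictured as several limiters that must all agree before a request is forwarded. The sketch below uses token buckets at two levels (one global, one per consumer). The token-bucket algorithm, the class names, and the two chosen levels are assumptions made for illustration; the article does not describe how the gateway implements its limits.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def has_token(self) -> bool:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        return self.tokens >= 1.0

    def take(self) -> None:
        self.tokens -= 1.0


class MultiLevelLimiter:
    """Admit a request only if both the global and the consumer bucket allow it."""

    def __init__(self, global_bucket: TokenBucket,
                 make_consumer_bucket: Callable[[], TokenBucket]):
        self.global_bucket = global_bucket
        self.make_consumer_bucket = make_consumer_bucket
        self.consumers: Dict[str, TokenBucket] = {}

    def allow(self, consumer: str) -> bool:
        if consumer not in self.consumers:
            self.consumers[consumer] = self.make_consumer_bucket()
        bucket = self.consumers[consumer]
        # Check every level first so a rejection never drains another level.
        if not (bucket.has_token() and self.global_bucket.has_token()):
            return False
        bucket.take()
        self.global_bucket.take()
        return True


# A fake clock keeps the demo deterministic.
now = [0.0]
limiter = MultiLevelLimiter(
    global_bucket=TokenBucket(rate=1.0, capacity=3, clock=lambda: now[0]),
    make_consumer_bucket=lambda: TokenBucket(rate=1.0, capacity=2, clock=lambda: now[0]),
)
decisions = [limiter.allow("app-a") for _ in range(3)]
print(decisions)
```

In this demo the third call from `app-a` is rejected by its own bucket while the global bucket still has capacity left for other consumers, which is the point of limiting at more than one level.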
Security Defense Layers
Network security: SSL certificates, WAF integration, and IP black/white lists protect the gateway entry point.
Data security: Backend service authentication and API‑KEY management, with optional KMS storage, prevent data leakage.
Content security: Deep integration with AI security guardrails provides protection against sensitive‑word leakage, compliance violations, prompt injection, and brute‑force attacks.
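The layering described above can be read as a chain of checks in which the cheapest runs first and any failure stops the request. The following sketch illustrates only that ordering. The deny list, the API key table, and the single regular expression are hypothetical, and the regex is a deliberately crude stand-in for the AI security guardrails. The middle check is also a simplification: it validates a caller's key, whereas the article's data-security layer covers backend service authentication and key management.

```python
import ipaddress
import re
from typing import Tuple

# Hypothetical configuration, for illustration only.
DENIED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation range
VALID_API_KEYS = {"demo-key-123": "app-a"}
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]


def check_request(client_ip: str, api_key: str, prompt: str) -> Tuple[bool, str]:
    """Run the layers in order; return (allowed, name_of_deciding_layer)."""
    # Layer 1: network (IP deny list).
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in DENIED_NETWORKS):
        return False, "network"
    # Layer 2: credentials (API key lookup).
    if api_key not in VALID_API_KEYS:
        return False, "credentials"
    # Layer 3: content (very rough prompt-injection screen).
    if any(pattern.search(prompt) for pattern in INJECTION_PATTERNS):
        return False, "content"
    return True, "ok"


print(check_request("198.51.100.5", "demo-key-123", "Summarize this document"))
```

Returning the name of the layer that made the decision makes rejections easy to attribute in logs, which fits the observability goals mentioned earlier.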
AI Open Platform (HiMarket) and Ecosystem
HiMarket is an open‑source AI developer portal that helps enterprises manage developers, MCP servers, and agents. It consists of three layers:
Developer portal – customizable per agent or MCP server.
Admin backend – manages portals, products (MCP servers and agents), and developer accounts.
Infrastructure layer – AI Gateway and Nacos, typically visible only to administrators.
The platform’s source code is available at https://github.com/higress-group/himarket.
AI Gateway Serverless Edition
The Serverless version, launched on Alibaba Cloud, reduces costs by eliminating instance fees and charging only for usage. Its three main characteristics are:
All core functions of the dedicated‑instance edition are retained, except built‑in WAF and plugin capabilities.
Pay‑per‑call pricing, cutting costs by up to 90% for low‑traffic workloads.
Zero‑maintenance operation with automatic upgrades and elastic scaling; however, sudden traffic spikes may be throttled to protect the service.
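The arithmetic behind the pay-per-call claim is easiest to see with a small comparison. Every number in this sketch is invented to keep the calculation easy to follow; none of them are Alibaba Cloud prices. The function names are hypothetical too. The point is only the shape of the trade-off: a flat instance fee versus a cost that grows with call volume.

```python
def serverless_cost(calls: int, price_per_million: float) -> float:
    """Monthly cost when billed purely per call."""
    return calls / 1_000_000 * price_per_million


def savings_ratio(calls: int, instance_fee: float, price_per_million: float) -> float:
    """Fraction saved versus a flat monthly instance fee (negative = more expensive)."""
    return 1 - serverless_cost(calls, price_per_million) / instance_fee


def break_even_calls(instance_fee: float, price_per_million: float) -> float:
    """Monthly call volume at which both pricing models cost the same."""
    return instance_fee / price_per_million * 1_000_000


# Hypothetical figures, NOT real pricing.
INSTANCE_FEE = 500.0       # flat fee per month for a dedicated instance
PRICE_PER_MILLION = 10.0   # pay-per-call price per million calls

low_traffic = 5_000_000
print(serverless_cost(low_traffic, PRICE_PER_MILLION))
print(savings_ratio(low_traffic, INSTANCE_FEE, PRICE_PER_MILLION))
print(break_even_calls(INSTANCE_FEE, PRICE_PER_MILLION))
```

Under these made-up figures, a workload far below the break-even volume saves most of the flat fee, while a workload above it would be cheaper on a dedicated instance. The official pricing pages are the place to find the real break-even point.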
For detailed comparisons and pricing, refer to the official Alibaba Cloud documentation links provided in the original article.
Conclusion
Over the past six months, AI Gateway has progressed from a single‑model proxy to a unified AI application gateway, supporting model, MCP, and agent APIs, offering end‑to‑end routing, protocol conversion, rate limiting, caching, observability, and security. Combined with the open‑source HiMarket platform and the cost‑effective Serverless edition, it now serves as a foundational, pluggable, and extensible infrastructure for AI workloads.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.
