How AI Gateway Redefines AI Application Infrastructure with Serverless Flexibility

The article provides a comprehensive overview of the AI Gateway product, detailing its evolution, core capabilities across model, tool, and agent access, security features, the open‑source HiMarket platform, and the new Serverless edition that dramatically lowers entry costs for AI workloads.

Alibaba Cloud Native
Alibaba Cloud Native
Alibaba Cloud Native
How AI Gateway Redefines AI Application Infrastructure with Serverless Flexibility

AI Gateway Evolution and Core Capabilities

AI Gateway has been offered as a cloud product for half a year and has undergone extensive evolution from its kernel to its external features. It originated from the need to support AI‑native architectures, providing a unified entry point for AI applications that must quickly, stably, and securely access models, tools, and other agents.

Model Access Challenges ("Three‑many, Two‑high")

Multiple models: Different vendors expose incompatible APIs, making seamless switching difficult.

Multiple modalities: Various transport protocols (SSE, WebSocket, WebRTC) and request/response patterns increase infrastructure complexity.

Multiple scenarios: Real‑time, high‑stability, and other use‑cases require distinct throttling strategies.

High security: Model calls risk data leakage and compliance violations.

High stability: Model services have low rate limits and unstable response times, affecting overall AI application availability.

The AI Gateway addresses these issues by offering configurable routing strategies, OpenAI‑compatible interfaces, support for HTTP and WebSocket protocols, per‑consumer authentication, KMS‑backed secret management, comprehensive observability, and built‑in security guardrails.

Tool Access Challenges

Precision: The gateway supports both legacy HTTP services and hosted MCP servers, allowing dynamic tool composition and intelligent routing to return only relevant tools.

Security: Consumer‑level authentication can be applied per tool, and AI‑security modules intercept injection attacks and enforce data sanitization before model invocation.

Agent Access Challenges

Stability: The gateway integrates health checks, gray‑release mechanisms, and multi‑level rate limiting to protect downstream agents.

Flexibility: Service discovery and REST‑to‑A2A conversion enable heterogeneous AI agents to be exposed uniformly, with optional secondary authentication for low‑code agents.

Security Defense Layers

Network security: SSL certificates, WAF integration, and IP black/white lists protect the gateway entry point.

Data security: Backend service authentication and API‑KEY management, with optional KMS storage, prevent data leakage.

Content security: Deep integration with AI security guardrails provides protection against sensitive‑word leakage, compliance violations, prompt injection, and brute‑force attacks.

AI Open Platform (HiMarket) and Ecosystem

HiMarket is an open‑source AI developer portal that helps enterprises manage developers, MCP servers, and agents. It consists of three layers:

Developer portal – customizable per agent or MCP server.

Admin backend – manages portals, products (MCP servers and agents), and developer accounts.

Infrastructure layer – AI Gateway and Nacos, typically visible only to administrators.

The platform’s source code is available at https://github.com/higress-group/himarket.

AI Gateway Serverless Edition

The Serverless version, launched on Alibaba Cloud, reduces costs by eliminating instance fees and charging only for usage. Its three main characteristics are:

All core functions of the dedicated‑instance edition are retained, except built‑in WAF and plugin capabilities.

Pay‑per‑call pricing, cutting costs by up to 90 % for low‑traffic workloads.

Zero‑maintenance operation with automatic upgrades and elastic scaling; however, sudden traffic spikes may be throttled to protect the service.

For detailed comparisons and pricing, refer to the official Alibaba Cloud documentation links provided in the original article.

Conclusion

Over the past six months, AI Gateway has progressed from a single‑model proxy to a unified AI application gateway, supporting model, MCP, and agent APIs, offering end‑to‑end routing, protocol conversion, rate limiting, caching, observability, and security. Combined with the open‑source HiMarket platform and the cost‑effective Serverless edition, it now serves as a foundational, pluggable, and extensible infrastructure for AI workloads.

serverlessOpen PlatformAI infrastructure
Alibaba Cloud Native
Written by

Alibaba Cloud Native

We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.