How Do Kong, Gloo, and Higress Stack Up as AI‑Ready API Gateways?
This article compares Kong, Gloo, and Higress on their AI‑related extensions—covering tech stacks, logging, proxy normalization, API‑key handling, caching, request/response rewriting, and RAG support—to help developers choose the most suitable gateway for emerging LLM workloads.
Overview
The three products—Kong, Gloo, and Higress—originated as traditional API gateways and later added AI capabilities through plugins. This comparison draws primarily on each product's documentation, so some details may differ from real-world behavior.
Technology Stack
Kong: Nginx + Lua
Gloo: Envoy + Go
Higress: Envoy + WASM
Logging & Monitoring
Kong: AI plugins add model name and token cost to the audit log, but lack custom metadata fields.
Gloo: No AI‑specific log enhancements; uses generic monitoring.
Higress: Similar to Kong, logs token usage without custom metadata support.
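To make the logging distinction concrete, here is a minimal Python sketch of what "AI-aware" access logging amounts to: enriching an ordinary gateway log entry with the model name and token counts pulled from an OpenAI-style response body. The field names follow OpenAI's `usage` object; the log-entry shape itself is an assumption for illustration.

```python
def enrich_access_log(entry: dict, llm_response: dict) -> dict:
    """Attach AI-specific fields (model name, token usage) to a
    plain gateway access-log entry, OpenAI-style response assumed."""
    usage = llm_response.get("usage", {})
    entry.update({
        "model": llm_response.get("model"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    })
    return entry

# Hypothetical log entry and upstream response for illustration:
log = enrich_access_log(
    {"ts": 1700000000, "route": "/v1/chat/completions", "status": 200},
    {"model": "gpt-4o",
     "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}},
)
```

The "custom metadata" gap both Kong and Higress share would correspond to letting users add their own keys to `entry` beyond this fixed set.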
Proxy Normalization
Kong: Provides a unified API that abstracts different LLM endpoints, allowing developers to switch models without code changes.
Gloo: Acts only as a reverse proxy to upstream LLM APIs, offering no normalization.
Higress: Translates various LLM APIs into OpenAI‑compatible calls, easing integration with the dominant OpenAI format.
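As a rough illustration of what this normalization involves, the sketch below maps a provider-specific request body onto the OpenAI chat-completions shape. The `qwen` payload structure used here is a hypothetical example, not any vendor's actual wire format:

```python
def to_openai_chat(provider: str, payload: dict) -> dict:
    """Translate a provider-specific request body into an
    OpenAI-compatible chat-completions body."""
    if provider == "openai":
        return payload  # already in the target format
    if provider == "qwen":
        # Hypothetical nested shape, for illustration only.
        return {
            "model": payload["model"],
            "messages": payload["input"]["messages"],
        }
    raise ValueError(f"unsupported provider: {provider}")
```

With such a layer in the gateway, clients always speak the OpenAI format and switching the upstream model becomes a routing decision rather than a code change.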
API‑Key Management
Kong: Can transform client keys via additional plugins.
Gloo: Allows the client‑side key to differ from the upstream key, adding a protective layer.
Higress: Passes the client key directly to the upstream service.
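The protective layer Gloo offers (and Kong can approximate with extra plugins) boils down to key indirection: the client authenticates with a gateway-issued key, and the gateway swaps in the real upstream key before forwarding. A minimal sketch, with hypothetical key values:

```python
# Hypothetical mapping from gateway-issued client keys to real upstream keys.
KEY_MAP = {"client-key-123": "sk-upstream-abc"}

def rewrite_auth_header(headers: dict) -> dict:
    """Replace the client's bearer token with the upstream provider's key,
    so the real key never leaves the gateway."""
    client_key = headers.get("Authorization", "").removeprefix("Bearer ")
    upstream = KEY_MAP.get(client_key)
    if upstream is None:
        raise PermissionError("unknown client key")
    headers["Authorization"] = f"Bearer {upstream}"
    return headers
```

Higress's pass-through behavior corresponds to skipping this rewrite entirely, which means clients must hold the real provider key.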
Cache Capabilities
Kong: No LLM‑specific caching.
Gloo: Offers semantic cache using OpenAI embeddings and Redis vectors, though fine‑grained TTL or similarity settings are unclear.
Higress: Provides text‑match cache via Redis, selecting the cache key with JSONPath, but no semantic cache.
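The difference between the two caching approaches is worth spelling out: a text-match cache (Higress) hits only on identical prompts, while a semantic cache (Gloo) embeds the prompt and hits when a stored prompt is similar enough. Below is a minimal in-memory sketch of the semantic variant; the embedding function is injected, and the `0.92` similarity threshold is an arbitrary assumption (the docs do not state Gloo's actual tunables):

```python
import math

def cosine(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class SemanticCache:
    """Toy semantic cache: store (embedding, response) pairs and
    return a cached response when a new prompt is similar enough."""
    def __init__(self, embed, threshold: float = 0.92):
        self.embed = embed          # e.g. a call to an embeddings API
        self.threshold = threshold  # assumed value, not from Gloo's docs
        self.entries = []

    def get(self, prompt: str):
        v = self.embed(prompt)
        best = max(self.entries, key=lambda e: cosine(v, e[0]), default=None)
        if best and cosine(v, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt: str, response: str):
        self.entries.append((self.embed(prompt), response))
```

A production version (as Gloo describes) would keep the vectors in Redis and call OpenAI for the embeddings, but the lookup logic is the same.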
Request/Response Rewriting
Kong: Supports prompt‑based rewriting at both request and response stages, enabling simple workflows.
Gloo: Only supports prepend‑system‑prompt, offering limited flexibility.
Higress: Similar to Kong but currently limited to Alibaba Cloud’s Tongyi Qianwen model.
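Gloo's prepend‑system‑prompt behavior is the simplest form of request rewriting and is easy to sketch: inject a system message at the front of an OpenAI-style `messages` array before the request reaches the model. This is an illustrative sketch, not Gloo's implementation:

```python
def prepend_system_prompt(body: dict, system_text: str) -> dict:
    """Insert a system message at the head of an OpenAI-style
    messages array, unless one is already present."""
    messages = body.get("messages", [])
    if not messages or messages[0].get("role") != "system":
        messages = [{"role": "system", "content": system_text}] + messages
    body["messages"] = messages
    return body
```

What Kong (and, for Tongyi Qianwen, Higress) adds on top is the same idea applied more generally: templated rewrites of the request body, and transformation of the response on the way back.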
RAG (Retrieval‑Augmented Generation)
Kong: No RAG‑related plugins.
Gloo: Can connect a Postgres store and OpenAI embeddings to build custom RAG pipelines.
Higress: Mirrors Gloo’s approach but is restricted to Alibaba Cloud’s vector service and Tongyi Qianwen, reducing openness.
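Both RAG offerings follow the same basic pipeline: embed the question, retrieve the most similar stored documents, and stuff them into the prompt. The sketch below shows that pipeline over a toy in-memory store; Gloo would back it with Postgres and OpenAI embeddings, Higress with Alibaba Cloud's vector service. All names here are illustrative:

```python
import math

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec: list, store: list, k: int = 2) -> list:
    """store: list of (embedding, text) pairs; return the k most similar texts."""
    ranked = sorted(store, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(question: str, contexts: list) -> str:
    """Stuff the retrieved passages into the prompt sent to the model."""
    ctx = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using the context below.\nContext:\n{ctx}\nQuestion: {question}"
```

The gateway's role in either product is orchestrating these steps around the proxied LLM call; the lock-in question is simply which embedding and vector services each step may talk to.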
Conclusions
All three gateways were built before the LLM boom and were not originally designed for AI use cases, so their extensions require deep knowledge of gateway internals and languages like Lua or WASM. Higress appears to have the most complete AI feature set, though it is tightly coupled to specific cloud services. Gloo places most AI functions behind a commercial license, while Kong offers only exploratory capabilities. Future AI architecture changes—such as evolving RAG, caching, or orchestration patterns—may challenge the adaptability of these gateways, especially given their reliance on non‑mainstream extension languages.
Cloud Native Technology Community
The Cloud Native Technology Community, part of the CNBPA Cloud Native Technology Practice Alliance, focuses on evangelizing cutting‑edge cloud‑native technologies and practical implementations. It shares in‑depth content, case studies, and event/meetup information on containers, Kubernetes, DevOps, Service Mesh, and other cloud‑native tech, along with updates from the CNBPA alliance.