Seamlessly Migrate from OpenAI to DeepSeek with Higress AI Gateway
This guide explains how to install the Higress AI gateway, configure provider API keys, set up gray‑release routing between OpenAI and DeepSeek, use a Python client to call DeepSeek, and enable content security and observability features for safe, cost‑effective large‑model deployments.
On January 20, DeepSeek released the DeepSeek‑R1 inference model with open weights and a free API that rivals the $200‑per‑month OpenAI o1 tier, prompting many users to consider switching from OpenAI to save costs.
Higress, an open‑source AI gateway, offers a smooth migration path by supporting gray‑release routing and built‑in observability, making it easy to transition workloads.
Quick Installation of Higress
With Docker installed, run a single command to deploy the gateway locally:
curl -sS https://higress.cn/ai-gateway/install.sh | bash

The installer launches an interactive prompt where you can enter your LLM provider API key or skip it.
After configuration, the AI gateway starts and provides a web console URL for further setup.
Configuring the DeepSeek Provider
In the console, add a new provider by entering the DeepSeek API key; the gateway then routes requests to DeepSeek.
Calling DeepSeek via Python
from openai import OpenAI

client = OpenAI(
    api_key="none",  # the gateway holds the real provider key, so any value works here
    base_url="http://localhost:8080/v1",
    default_headers={"Accept-Encoding": "identity"},
)

completion = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello!"},
    ],
    stream=False,
)

print(completion.choices[0].message.content)

This script sends a request to the local Higress gateway, which forwards it to DeepSeek.
Gray‑Release Routing (90% OpenAI, 10% DeepSeek)
Higress supports proportional routing, allowing a gradual shift of traffic between models. The configuration routes 90% of requests to OpenAI and 10% to DeepSeek, enabling performance and cost comparison.
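The proportional split can be sketched as a weighted random choice. This is an illustrative model of what the gateway does internally, not Higress's actual routing code; the weights mirror the 90/10 example above.

```python
import random

# Illustrative route table: 90% of traffic to OpenAI, 10% to DeepSeek.
ROUTES = [("openai", 90), ("deepseek", 10)]

def pick_provider(routes):
    """Pick a provider with probability proportional to its weight."""
    providers, weights = zip(*routes)
    return random.choices(providers, weights=weights, k=1)[0]

# Simulate 10,000 requests to see the split converge toward 90/10.
counts = {"openai": 0, "deepseek": 0}
for _ in range(10_000):
    counts[pick_provider(ROUTES)] += 1
print(counts)  # roughly 9,000 openai vs 1,000 deepseek
```

Gradually shifting the weights (e.g. 90/10, then 50/50, then 10/90) lets you compare quality and cost on live traffic before cutting over completely.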
API‑Key Re‑issuance
The gateway can issue its own API keys to downstream users, masking the original provider keys, controlling usage quotas, and collecting token‑level metrics per consumer.
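From a consumer's point of view, the re-issued key simply replaces the provider key in the Authorization header. A minimal sketch, assuming a hypothetical gateway-issued key `sk-consumer-alice` and the local gateway address from the earlier example:

```python
import json
import urllib.request

GATEWAY = "http://localhost:8080/v1/chat/completions"
CONSUMER_KEY = "sk-consumer-alice"  # hypothetical key issued by the gateway, not by DeepSeek

payload = {
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    GATEWAY,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {CONSUMER_KEY}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send it; the gateway swaps in the real
# provider key and attributes token usage to this consumer before forwarding.
print(req.get_header("Authorization"))  # Bearer sk-consumer-alice
```

Because the provider key never leaves the gateway, revoking or rate-limiting a single consumer does not disturb the others.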
Observability
Higress provides out‑of‑the‑box metrics for token consumption and latency at global, provider, model, and consumer dimensions, helping operators monitor the impact of gray‑release migrations.
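Gateways in this class typically expose such metrics in Prometheus text format, which is straightforward to aggregate. The metric names below are illustrative placeholders, not Higress's exact metric names:

```python
# Hypothetical sample of Prometheus-style token metrics from an AI gateway.
SAMPLE = """\
gateway_model_input_token{provider="deepseek",model="deepseek-chat"} 523
gateway_model_output_token{provider="deepseek",model="deepseek-chat"} 1412
"""

def parse_metrics(text):
    """Parse 'name{labels} value' lines into a {series: value} dict."""
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(" ", 1)
        out[series] = float(value)
    return out

metrics = parse_metrics(SAMPLE)
total_tokens = sum(metrics.values())
print(total_tokens)  # 1935.0
```

Tracking input and output tokens per provider and per consumer is what makes a cost comparison between OpenAI and DeepSeek concrete during the gray release.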
Content Security Integration
DeepSeek’s terms note that inputs/outputs may still contain unsafe content. Higress integrates Alibaba Cloud Content Security, which filters and blocks prohibited content in real time. A sample blocked response is:
{
"id": "chatcmpl-E45zRLc5hUCxhsda4ODEhjvkEycC9",
"object": "chat.completion",
"model": "from-security-guard",
"choices": [{
"index": 0,
"message": {"role": "assistant", "content": "我不能处理隐私信息"},
"logprobs": null,
"finish_reason": "stop"
}],
"usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}

Audit logs for each request are viewable in the content-security console.
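A client can recognize a blocked response by its sentinel model name. The value "from-security-guard" comes from the sample response above; the helper function is a hypothetical sketch:

```python
import json

# Sample blocked response, as returned when content security intercepts a request.
# The assistant content translates to "I cannot process private information."
blocked = json.loads("""{
  "id": "chatcmpl-E45zRLc5hUCxhsda4ODEhjvkEycC9",
  "object": "chat.completion",
  "model": "from-security-guard",
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "我不能处理隐私信息"},
               "logprobs": null,
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}""")

def was_blocked(resp: dict) -> bool:
    """A response attributed to the security guard signals a blocked request."""
    return resp.get("model") == "from-security-guard"

print(was_blocked(blocked))  # True
```

Note that a blocked response reports zero tokens in "usage", so filtered requests do not inflate consumption metrics.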
Alibaba Cloud Native API Gateway
The cloud‑hosted version of Higress adds richer observability, one‑click policy configuration, and built‑in content safety, rate limiting, and caching without manual YAML files. It also offers semantic vector indexing for topic clustering, intent detection, sentiment analysis, and quality assessment.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.