Deploy a Serverless Stable Diffusion API for Scalable AI Image Generation
This guide explains how to overcome GPU cost, high‑concurrency, and model‑switching challenges by using Alibaba Cloud's Serverless Stable Diffusion API, detailing deployment steps, supported use cases, performance advantages, and the full set of RESTful endpoints for AI image creation.
Background
Stable Diffusion is widely used for AI‑generated image creation. Native Stable Diffusion APIs encounter three major limitations in enterprise scenarios: high GPU cost and complex pool management, limited concurrency of a single inference instance, and expensive model‑switch overhead under high load.
Serverless Solution
The Function Compute team provides a serverless Stable Diffusion API solution that runs inference on Alibaba Cloud's serverless infrastructure. It removes the need for dedicated GPU hardware, bills on a pay‑as‑you‑go basis, and scales resources automatically.
Required Cloud Services
Function Compute (FC) – CPU + GPU compute.
Object Storage Service (OSS) – stores generated images and intermediate data.
Table Store (OTS) – records inference results and function metadata.
NAS – shared storage for multi‑node deployments.
Deployment Steps
Open the Function Compute console and create a new application.
Select the fc-stable-diffusion-v3 template under the “Artificial Intelligence” category.
Configure region, namespace, and drawing type (e.g., artistic text).
Grant the required permissions when prompted.
Confirm creation and wait ~1 minute for the service to become ready.
Generate the WebUI domain (keep it private) and switch to the “Serverless API” tab.
Initialize the Serverless API, ensuring FC, OSS, and Table Store are enabled.
Authorize the role, enable the Serverless API, and let the OTS instance be created automatically.
API Overview
The Serverless API exposes two groups of endpoints:
Non‑inference APIs – model management, result queries, application restart, etc.
Inference APIs – text‑to‑image (txt2img), image‑to‑image (img2img), image upscaling, and related operations.
All API definitions are available in the OpenAPI spec: https://github.com/devsapp/serverless-stable-diffusion-api/blob/main/api/api.yaml
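As a rough sketch of how a client might address the two endpoint groups, the helper below builds (but does not send) HTTP requests with Python's standard library. The base URL `https://sd-api.example.com` and the helper name `build_request` are placeholders rather than part of the Serverless API; the paths and field names come from this guide, and api.yaml remains the authoritative definition.

```python
import json
import urllib.request
from typing import Any, Optional

# Hypothetical base URL; replace it with the domain of your own deployment.
BASE_URL = "https://sd-api.example.com"


def build_request(
    path: str, payload: Optional[dict[str, Any]] = None
) -> urllib.request.Request:
    """Build a GET request (no payload) or a JSON POST request (with payload)."""
    url = BASE_URL.rstrip("/") + "/" + path.lstrip("/")
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Non-inference call: list registered models.
list_models = build_request("/models")

# Inference call: a minimal txt2img payload.
txt2img = build_request(
    "/txt2img", {"prompt": "Mountain landscape during sunset", "steps": 20}
)
```

Sending either request is then a matter of passing it to `urllib.request.urlopen`. Authentication is left out here because it depends on how the deployment is configured.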
Model Management
GET /models returns a list of registered models. Example response:
[
  {
    "type": "stableDiffusion",
    "name": "model_v1",
    "ossPath": "/path/to/oss/model_v1",
    "etag": "3f786850e387550fdab836ed7e6dc881de23001b",
    "status": "loaded",
    "registeredTime": "2023-01-01T12:00:00Z",
    "lastModificationTime": "2023-01-10T12:00:00Z"
  }
]
Inference – txt2img
POST /txt2img accepts a JSON payload compatible with the native WebUI, with two additional fields: stable_diffusion_model and sd_vae. Images can be supplied as base64 strings or OSS paths. Example request:
{
  "stable_diffusion_model": "diffusion_v1",
  "sd_vae": "vae_v1",
  "enable_hr": true,
  "denoising_strength": 0.5,
  "firstphase_width": 640,
  "firstphase_height": 480,
  "hr_scale": 2,
  "hr_upscaler": "upscale_method_v1",
  "prompt": "Mountain landscape during sunset",
  "seed": 123456,
  "batch_size": 32,
  "steps": 100,
  "cfg_scale": 1,
  "width": 640,
  "height": 480,
  "negative_prompt": "Avoid mountains",
  "alwayson_scripts": {
    "controlnet": {
      "args": [
        {
          "image": "base64srcimg|image/default/xxxx.png",
          "enabled": true,
          "module": "canny",
          "model": "control_v11p_sd15_scribble",
          "weight": 1,
          "resize_mode": "Crop and Resize",
          "low_vram": false,
          "processor_res": 512,
          "threshold_a": 100,
          "threshold_b": 200,
          "guidance_start": 0,
          "guidance_end": 1,
          "pixel_perfect": true,
          "control_mode": "Balanced",
          "input_mode": "simple"
        }
      ]
    }
  }
}
A successful response (synchronous mode) returns a task ID and temporary OSS URLs:
{
  "status": "succeeded",
  "taskId": "1HmyrbhBJD",
  "ossUrl": ["xxxxx"]
}
Inference – img2img
POST /img2img works similarly, with init_images supporting base64 strings or OSS paths. Example request:
{
  "stable_diffusion_model": "diffusion_v2",
  "sd_vae": "vae_v2",
  "init_images": ["Base64SrcImg|ossPath"],
  "prompt": "Forest landscape",
  "seed": 654321,
  "batch_size": 64,
  "steps": 50,
  "width": 1280,
  "height": 960,
  "negative_prompt": "Avoid forests"
}
Image Upscaling
POST /extra_images upscales a single image. The image field accepts a base64 string or an OSS path. Example request:
{
  "upscaler_1": "Lanczos",
  "upscaling_resize": 4,
  "image": "base64|ossPath"
}
Result Retrieval
GET /tasks/{taskId}/result returns final image URLs, parameters, and metadata. Progress can be queried via /tasks/{taskId}/progress, and a running task can be cancelled with POST /tasks/{taskId}/cancellation.
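A client-side polling loop over the task APIs might look like the sketch below. It assumes the result JSON carries a `status` field whose success value is `succeeded`, as in the synchronous response shown earlier. The `running` and `failed` values, the `fetch_result` callable, and the timing defaults are illustrative assumptions rather than documented behavior, so check api.yaml for the actual schema.

```python
import time
from typing import Any, Callable

# "succeeded" appears in the guide's sample response; "failed" is an assumed
# terminal status used only for illustration.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_result(
    task_id: str,
    fetch_result: Callable[[str], dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the task reaches a terminal status or the timeout elapses.

    fetch_result is a caller-supplied function that performs
    GET /tasks/{taskId}/result and returns the parsed JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_result(task_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        sleep(interval_s)


# Usage with a stand-in fetcher that succeeds on the third poll.
# The "running" status is a made-up intermediate value.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "succeeded", "taskId": "1HmyrbhBJD", "ossUrl": ["xxxxx"]},
])
final = wait_for_result(
    "1HmyrbhBJD", lambda _task_id: next(responses), sleep=lambda _seconds: None
)
```

Injecting `fetch_result` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise without a live deployment.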
Dynamic Resource Management
APIs are provided to list, update, and delete dynamically created Stable Diffusion functions, enabling batch updates of CPU, GPU memory, container image, environment variables, and VPC settings. Example endpoints:
GET /list/sdapi/functions – list dynamic functions.
POST /batch_update_sd_resource – batch update resources (e.g., cpu, memorySize, gpuMemorySize, instanceType, VPC and NAS configs).
POST /del/sd/functions – delete specified functions.
Best Practices
Use the Serverless API for cost‑effective scaling; keep the generated WebUI link private.
Store inference results in OSS and OTS for later retrieval.
When image payloads exceed Function Compute request size limits, prefer OSS paths over base64.
Leverage asynchronous mode (header {"Request-Type":"async"}) for long‑running tasks and query results via the task APIs.
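The last two practices can be combined in a small client-side helper, sketched below. The 1 MiB threshold is a placeholder and not the actual Function Compute request size limit (consult the FC documentation for the limit that applies to your invocation type), and `image_field` and `request_headers` are hypothetical helper names.

```python
import base64
from typing import Optional

# Placeholder threshold; NOT the real Function Compute request size limit.
MAX_INLINE_BYTES = 1 * 1024 * 1024


def image_field(image_bytes: bytes, oss_path: Optional[str] = None) -> str:
    """Return an image field value: inline base64 if small, else an OSS path."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) <= MAX_INLINE_BYTES:
        return encoded
    if oss_path is None:
        raise ValueError("image too large to inline; upload it to OSS and pass oss_path")
    return oss_path


def request_headers(asynchronous: bool) -> dict[str, str]:
    """Build request headers, adding Request-Type: async for long-running tasks."""
    headers = {"Content-Type": "application/json"}
    if asynchronous:
        headers["Request-Type"] = "async"
    return headers
```

The size check is applied to the encoded string rather than the raw bytes, because base64 inflates the payload by roughly one third.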
Alibaba Cloud Native