Deploy a Serverless Stable Diffusion API for Scalable AI Image Generation

This guide explains how to overcome GPU cost, high‑concurrency, and model‑switching challenges by using Alibaba Cloud's Serverless Stable Diffusion API, detailing deployment steps, supported use cases, performance advantages, and the full set of RESTful endpoints for AI image creation.

Alibaba Cloud Native

Background

Stable Diffusion is widely used for AI image generation. In enterprise scenarios, the native Stable Diffusion API runs into three major limitations: high GPU cost with complex resource‑pool management, limited concurrency on a single inference instance, and expensive model switching under high load.

Serverless Solution

The Function Compute team provides a serverless Stable Diffusion API solution that runs inference on Alibaba Cloud's serverless infrastructure. It removes the need to manage dedicated GPU hardware, bills on a pay‑as‑you‑go basis, and scales resources automatically.

Required Cloud Services

Function Compute (FC) – CPU + GPU compute.

Object Storage Service (OSS) – stores generated images and intermediate data.

Table Store – records inference results and function metadata.

NAS – shared storage for multi‑node deployments.

Deployment Steps

Open the Function Compute console and create a new application.

Select the fc-stable-diffusion-v3 template under the “Artificial Intelligence” category.

Configure region, namespace, and drawing type (e.g., artistic text).

Grant the required permissions when prompted.

Confirm creation and wait ~1 minute for the service to become ready.

Generate the WebUI domain (keep it private) and switch to the “Serverless API” tab.

Initialize the Serverless API, ensuring FC, OSS, and Table Store are enabled.

Authorize the role, enable the Serverless API, and let the Table Store instance be created automatically.

API Overview

The Serverless API exposes two groups of endpoints:

Non‑inference APIs – model management, result queries, application restart, etc.

Inference APIs – text‑to‑image (txt2img), image‑to‑image (img2img), image upscaling, and related operations.

All API definitions are available in the OpenAPI spec at https://github.com/devsapp/serverless-stable-diffusion-api/blob/main/api/api.yaml.

Model Management

GET /models returns a list of registered models. Example response:

[
  {
    "type": "stableDiffusion",
    "name": "model_v1",
    "ossPath": "/path/to/oss/model_v1",
    "etag": "3f786850e387550fdab836ed7e6dc881de23001b",
    "status": "loaded",
    "registeredTime": "2023-01-01T12:00:00Z",
    "lastModificationTime": "2023-01-10T12:00:00Z"
  }
]
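
As a rough sketch of how a client might consume this response, the snippet below filters the list down to models that are ready to serve. It assumes the response shape shown above. The second sample entry and its "loading" status are hypothetical values added for illustration, and loaded_model_names is an illustrative helper, not part of the API.

```python
import json
from typing import List

# Sample body modeled on the response above. The second entry and its
# "loading" status are hypothetical, added only to show the filter working.
SAMPLE_MODELS = """
[
  {"type": "stableDiffusion", "name": "model_v1", "status": "loaded"},
  {"type": "stableDiffusion", "name": "model_v2", "status": "loading"}
]
"""


def loaded_model_names(models_json: str, model_type: str = "stableDiffusion") -> List[str]:
    """Return the names of models of the given type whose status is "loaded"."""
    models = json.loads(models_json)
    return [
        m["name"]
        for m in models
        if m.get("type") == model_type and m.get("status") == "loaded"
    ]


print(loaded_model_names(SAMPLE_MODELS))
```

A check like this could run before an inference call so that requests only name models the service reports as loaded.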

Inference – txt2img

POST /txt2img accepts a JSON payload compatible with the native WebUI API, plus two additional fields: stable_diffusion_model and sd_vae. Input images, such as the ControlNet image in the example, can be supplied as base64 strings or OSS paths. Example request:

{
  "stable_diffusion_model": "diffusion_v1",
  "sd_vae": "vae_v1",
  "enable_hr": true,
  "denoising_strength": 0.5,
  "firstphase_width": 640,
  "firstphase_height": 480,
  "hr_scale": 2,
  "hr_upscaler": "upscale_method_v1",
  "prompt": "Mountain landscape during sunset",
  "seed": 123456,
  "batch_size": 32,
  "steps": 100,
  "cfg_scale": 1,
  "width": 640,
  "height": 480,
  "negative_prompt": "Avoid mountains",
  "alwayson_scripts": {
    "controlnet": {
      "args": [
        {
          "image": "base64srcimg|image/default/xxxx.png",
          "enabled": true,
          "module": "canny",
          "model": "control_v11p_sd15_scribble",
          "weight": 1,
          "resize_mode": "Crop and Resize",
          "low_vram": false,
          "processor_res": 512,
          "threshold_a": 100,
          "threshold_b": 200,
          "guidance_start": 0,
          "guidance_end": 1,
          "pixel_perfect": true,
          "control_mode": "Balanced",
          "input_mode": "simple"
        }
      ]
    }
  }
}
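
For illustration, a request like the one above could be assembled with Python's standard library as follows. This is a minimal sketch: BASE_URL is a placeholder host, build_txt2img_request is a hypothetical helper, and any authentication your deployment requires is not covered by this article and is omitted here.

```python
import json
import urllib.request

# Placeholder host (assumption); use the Serverless API domain of your deployment.
BASE_URL = "https://sd-api.example.invalid"


def build_txt2img_request(base_url, prompt, model, vae, **extra):
    """Build (but do not send) a POST /txt2img request.

    `extra` carries any further WebUI-compatible fields such as steps or seed.
    """
    payload = {"stable_diffusion_model": model, "sd_vae": vae, "prompt": prompt}
    payload.update(extra)
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_txt2img_request(
    BASE_URL,
    "Mountain landscape during sunset",
    "diffusion_v1",
    "vae_v1",
    seed=123456,
    steps=20,
    width=640,
    height=480,
)
print(request.get_method(), request.full_url)

# Sending is left commented out because the host above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any GPU time is spent.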

A successful response in synchronous mode returns a task ID and temporary OSS URLs:

{
  "status": "succeeded",
  "taskId": "1HmyrbhBJD",
  "ossUrl": ["xxxxx"]
}
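
A client would typically check the status before using the URLs. The sketch below assumes only the three fields shown in the sample response; extract_image_urls is a hypothetical helper, and any status other than "succeeded" is treated as an error because this article does not list the other possible values.

```python
import json
from typing import List


def extract_image_urls(response_body: str) -> List[str]:
    """Return the OSS URLs from a synchronous response, or raise if it did not succeed."""
    result = json.loads(response_body)
    status = result.get("status")
    if status != "succeeded":
        raise RuntimeError(
            f"task {result.get('taskId')} did not succeed: status={status!r}"
        )
    return list(result.get("ossUrl", []))


body = '{"status": "succeeded", "taskId": "1HmyrbhBJD", "ossUrl": ["xxxxx"]}'
print(extract_image_urls(body))
```

Because the URLs are temporary, it is reasonable to download or copy the images soon after the call returns.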

Inference – img2img

POST /img2img works similarly, with init_images supporting base64 strings or OSS paths. Example request:

{
  "stable_diffusion_model": "diffusion_v2",
  "sd_vae": "vae_v2",
  "init_images": ["Base64SrcImg|ossPath"],
  "prompt": "Forest landscape",
  "seed": 654321,
  "batch_size": 64,
  "steps": 50,
  "width": 1280,
  "height": 960,
  "negative_prompt": "Avoid forests"
}
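
Since init_images accepts either form, a small helper can normalize both into the string the payload expects. This is a sketch under assumptions: to_image_field and build_img2img_payload are hypothetical names, the byte string stands in for real image data, and the base64 value is sent as a plain string without a data-URI prefix, which this article does not confirm.

```python
import base64
from typing import List, Union


def to_image_field(source: Union[bytes, str]) -> str:
    """Encode raw image bytes as base64; pass an OSS path string through unchanged."""
    if isinstance(source, (bytes, bytearray)):
        return base64.b64encode(source).decode("ascii")
    return source


def build_img2img_payload(prompt: str, model: str, vae: str,
                          sources: List[Union[bytes, str]]) -> dict:
    """Assemble a minimal img2img payload from mixed image sources."""
    return {
        "stable_diffusion_model": model,
        "sd_vae": vae,
        "init_images": [to_image_field(s) for s in sources],
        "prompt": prompt,
    }


payload = build_img2img_payload(
    "Forest landscape",
    "diffusion_v2",
    "vae_v2",
    [b"\x89PNG placeholder bytes", "image/default/xxxx.png"],
)
print(payload["init_images"])
```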

Image Upscaling

POST /extra_images upscales a single image. The image field accepts a base64 string or an OSS path.

{
  "upscaler_1": "Lanczos",
  "upscaling_resize": 4,
  "image": "base64|ossPath"
}

Result Retrieval

GET /tasks/{taskId}/result returns final image URLs, parameters, and metadata. Progress can be queried via /tasks/{taskId}/progress, and a running task can be cancelled with POST /tasks/{taskId}/cancellation.
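
The task endpoints lend themselves to a simple polling loop. The sketch below builds the three task URLs and polls the result endpoint through an injected fetch_json callable, so it runs without network access. It assumes the result body carries a status field like the synchronous response does; "running" and "failed" are assumed status values, and task_urls and wait_for_result are hypothetical helpers.

```python
import time
from urllib.parse import quote

# "succeeded" appears in the sample response; "failed" is an assumed value.
TERMINAL_STATUSES = {"succeeded", "failed"}


def task_urls(base_url: str, task_id: str) -> dict:
    """Return the result, progress, and cancellation URLs for a task."""
    root = f"{base_url.rstrip('/')}/tasks/{quote(task_id, safe='')}"
    return {name: f"{root}/{name}" for name in ("result", "progress", "cancellation")}


def wait_for_result(fetch_json, base_url, task_id,
                    interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll the result endpoint until the task reaches a terminal status."""
    url = task_urls(base_url, task_id)["result"]
    for _ in range(max_polls):
        result = fetch_json(url)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")


# Demo with a fake fetcher standing in for real HTTP GET calls.
_responses = iter([{"status": "running"}, {"status": "succeeded", "ossUrl": ["xxxxx"]}])
demo = wait_for_result(
    lambda url: next(_responses),
    "https://sd-api.example.invalid",  # placeholder host (assumption)
    "1HmyrbhBJD",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(demo["status"])
```

Injecting the fetch function keeps the loop testable and lets the caller decide how requests are authenticated and retried.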

Dynamic Resource Management

APIs are provided to list, update, and delete dynamically created Stable Diffusion functions, enabling batch updates of CPU, GPU memory, container image, environment variables, and VPC settings. Example endpoints:

GET /list/sdapi/functions – list dynamic functions.

POST /batch_update_sd_resource – batch update resources (e.g., cpu, memorySize, gpuMemorySize, instanceType, VPC and NAS configs).

POST /del/sd/functions – delete specified functions.

Best Practices

Use the Serverless API for cost‑effective scaling; keep the generated WebUI link private.

Store inference results in OSS and Table Store for later retrieval.

When image payloads exceed Function Compute request size limits, prefer OSS paths over base64.

Leverage asynchronous mode (header {"Request-Type":"async"}) for long‑running tasks and query results via the task APIs.
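
The last two practices can be sketched in a few lines. The example below chooses between inline base64 and an OSS path based on a caller-supplied size budget, and builds a request carrying the async header. The budget is an assumption you would set from the Function Compute limits for your deployment (this article does not state the exact figure), and pick_image_field and build_async_request are hypothetical helpers.

```python
import base64
import json
import urllib.request


def pick_image_field(image_bytes: bytes, oss_path: str, max_inline_bytes: int) -> str:
    """Return base64 if the encoded image fits the budget, otherwise the OSS path."""
    # Base64 output length is 4 characters per 3 input bytes, rounded up.
    encoded_len = 4 * ((len(image_bytes) + 2) // 3)
    if encoded_len <= max_inline_bytes:
        return base64.b64encode(image_bytes).decode("ascii")
    return oss_path


def build_async_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request flagged for asynchronous mode."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Request-Type": "async"},
        method="POST",
    )


small = pick_image_field(b"abc", "image/default/xxxx.png", max_inline_bytes=4)
large = pick_image_field(b"abcd", "image/default/xxxx.png", max_inline_bytes=4)
print(small, large)

async_request = build_async_request(
    "https://sd-api.example.invalid/txt2img",  # placeholder host (assumption)
    {"prompt": "Mountain landscape during sunset"},
)
# urllib stores header names in capitalized form; HTTP header names are case-insensitive.
print(async_request.get_header("Request-type"))
```

Computing the encoded length arithmetically avoids encoding a large image only to discard the result.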
