Deploy Stable Diffusion on Volcengine Cloud: A Step‑by‑Step Guide
Learn how to deploy your own Stable Diffusion text‑to‑image model on Volcengine Cloud: set up a VKE Kubernetes cluster, configure storage, GPU resources, and container images, expose the service via ALB or API Gateway, and control costs with mGPU sharing and serverless GPU options.
This article demonstrates how to deploy a Stable Diffusion text‑to‑image model on Volcengine Cloud using typical enterprise AI engineering practices.
Stable Diffusion Environment Dependencies
Stable Diffusion is a latent diffusion model that generates high‑quality images from arbitrary text prompts. Deploying it on the cloud requires several Volcengine services:
- Container Service VKE (Kubernetes v1.24)
- Image Registry CR
- Elastic Container Instance VCI
- Object Storage TOS
- GPU server: ecs.gni2.3xlarge (NVIDIA A10)
- Application Load Balancer ALB
- API Gateway APIG
- GPU sharing technology: mGPU
- Stable Diffusion model: huggingface.co/CompVis/stable-diffusion-v1-4
- Stable Diffusion WebUI: github.com/AUTOMATIC1111/stable-diffusion-webui
Step 1: Prepare VKE Cluster Environment
Log in to the Volcengine console and create a VKE cluster: select Kubernetes v1.24, use the VPC‑CNI network model, and provision GPU‑enabled nodes (ecs.gni2.3xlarge, NVIDIA A10) with the nvidia‑device‑plugin component installed.
Enable TOS and create a bucket to hold the Stable Diffusion model files; the upload itself happens in Step 2.
Install the required Python packages:

pip install --upgrade diffusers
pip install transformers
# Install PyTorch according to the official guide: https://pytorch.org/get-started/locally/

Log in to Hugging Face:

huggingface-cli login

Download the model using snapshot_download:

from huggingface_hub import snapshot_download
snapshot_download(repo_id="CompVis/stable-diffusion-v1-4", local_dir="/root/")

Step 2: Upload Model to TOS
Use rclone to copy the downloaded model files to the TOS bucket:
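rclone first needs a remote pointing at the bucket. A minimal sketch of an S3‑compatible remote in ~/.config/rclone/rclone.conf — the section name tos is what ${rclone_config_name} refers to; the keys are placeholders, and the endpoint follows TOS's per‑region S3‑compatible pattern and should be verified in the TOS console:

```ini
# Section name = the rclone remote name; all values below are placeholders.
[tos]
type = s3
provider = Other
access_key_id = <your-access-key>
secret_access_key = <your-secret-key>
endpoint = https://tos-s3-cn-beijing.volces.com
region = cn-beijing
```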
rclone copy diffusers/ ${rclone_config_name}:${bucketname}/diffusers --copy-links

Step 3: Deploy Stable Diffusion Service
Push a prepared container image (e.g.,
cr-demo-cn-beijing.cr.volces.com/diffusers/stable-diffusion:taiyi-0.1) to the CR repository.
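The tag-and-push step can be sketched as follows. The namespace diffusers and registry host are taken from the example image name above; substitute your own, and run docker login against the CR instance first (the docker commands are left commented as a sketch):

```shell
# Assemble the fully qualified CR image reference (names are placeholders).
REGISTRY="cr-demo-cn-beijing.cr.volces.com"
NAMESPACE="diffusers"
IMAGE="${REGISTRY}/${NAMESPACE}/stable-diffusion:taiyi-0.1"
echo "${IMAGE}"
# After `docker login ${REGISTRY}`:
# docker tag stable-diffusion:taiyi-0.1 "${IMAGE}"
# docker push "${IMAGE}"
```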
Create a PersistentVolumeClaim (PVC) backed by the TOS bucket and mount it at
/stable-diffusion-webui/models/Taiyi-Stable-Diffusion-1B-Chinese-v0.1 inside the container. Expose port 7860.
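The Deployment references the claim as sd-tos-pvc. A minimal PVC sketch — access mode and size are assumptions, and storageClassName is omitted here because VKE's TOS CSI integration determines the actual class; consult the VKE storage documentation:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sd-tos-pvc          # must match claimName in the Deployment
  namespace: default
spec:
  accessModes:
    - ReadOnlyMany          # model files are read-only at inference time
  resources:
    requests:
      storage: 20Gi         # illustrative; size to fit the model files
```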
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sd-a10
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 0
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: sd-a10
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: sd-a10
    spec:
      containers:
        - image: cr-demo-cn-beijing.cr.volces.com/${namespace}/stable-diffusion:taiyi-0.1
          imagePullPolicy: IfNotPresent
          name: sd
          resources:
            limits:
              vke.volcengine.com/mgpu-core: "30"
              vke.volcengine.com/mgpu-memory: "10240"
            requests:
              vke.volcengine.com/mgpu-core: "30"
              vke.volcengine.com/mgpu-memory: "10240"
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /stable-diffusion-webui/models/Taiyi-Stable-Diffusion-1B-Chinese-v0.1
              name: data
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: sd-tos-pvc

Expose the Service
Option 1: Use ALB – Create an ALB‑type Ingress to route traffic to the Stable Diffusion WebUI.
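For Option 1, a sketch of the Service and Ingress. The ClusterIP Service is an assumption (the original shows no Service manifest), as is ingressClassName: alb; the Volcengine ALB Ingress controller may also require its own annotations, so check the VKE ALB documentation:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sd-a10
  namespace: default
spec:
  selector:
    app: sd-a10           # matches the Deployment's pod label
  ports:
    - port: 7860
      targetPort: 7860    # WebUI listen port from the Deployment
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sd-a10
  namespace: default
spec:
  ingressClassName: alb   # assumption; use the class your ALB controller registers
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sd-a10
                port:
                  number: 7860
```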
Option 2: Use API Gateway – Create an APIG instance, configure an upstream pointing to the VKE cluster, and expose the service via a generated domain name.
Large‑Model Engineering Practices
Beyond basic deployment, enterprise‑grade large‑model serving requires training/inference acceleration, resource‑utilization optimization, and cost control. Volcengine provides mGPU for GPU sharing and Serverless GPU (VCI) for elastic scaling.
GPU Sharing with mGPU
mGPU allows containers to claim fractional GPU cores (e.g., 1% of a GPU) and memory, improving overall utilization by over 50%.
Install the mGPU component via the VKE console.
Enable Prometheus monitoring for GPU metrics.
Add the label vke.volcengine.com/mgpu-enabled=true to node pools or individual nodes to activate mGPU.
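As a sketch, the label from the step above appears on the node (or node‑pool template) metadata like this; apply it via the console or kubectl label:

```yaml
# Node metadata fragment enabling mGPU scheduling on that node.
metadata:
  labels:
    vke.volcengine.com/mgpu-enabled: "true"
```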
Serverless GPU Deployment (VCI)
VCI provides a serverless, container‑based compute service that integrates with VKE. Deploy the same Stable Diffusion image using VCI, specifying the GPU instance type via annotations.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sd-vci
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 0
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: sd-vci
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        vci.vke.volcengine.com/preferred-instance-types: vci.ini2.26c-243gi
        vci.volcengine.com/tls-enable: "false"
        vke.volcengine.com/burst-to-vci: enforce
      creationTimestamp: null
      labels:
        app: sd-vci
    spec:
      containers:
        - image: cr-demo-cn-beijing.cr.volces.com/${namespace}/stable-diffusion:taiyi-0.1
          imagePullPolicy: IfNotPresent
          name: sd-vci
          resources:
            limits:
              nvidia.com/gpu: "1"
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /stable-diffusion-webui/models/Taiyi-Stable-Diffusion-1B-Chinese-v0.1
              name: sd
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - name: sd
          persistentVolumeClaim:
            claimName: sd-tos-pvc

Conclusion
AIGC applications focus on delivering multimodal content that solves real problems. Volcengine’s cloud‑native AI infrastructure, including VKE, CR, ALB, APIG, mGPU, and VCI, helps lower the barrier for building and scaling such services.
Related Links
Volcengine: https://www.volcengine.com
Container Service (VKE): https://www.volcengine.com/product/vke
Image Registry (CR): https://www.volcengine.com/product/cr
API Gateway (APIG): https://www.volcengine.com/product/apig
Managed Prometheus: https://www.volcengine.com/product/vmp