Deploying Stable Diffusion on Tencent Cloud: A Step‑by‑Step Guide
Deploy Stable Diffusion on Tencent Cloud by building a Docker image, pushing it to TCR, creating a GPU‑enabled TKE cluster with CFS storage, configuring qGPU sharing, exposing the service via Cloud Native API Gateway, optimizing inference with TACO Kit, storing results in COS, and applying content moderation.
Stable Diffusion is a deep‑learning text‑to‑image model originally released in 2022, with subsequent versions including v1.5, v2, and v2.1. It can generate detailed images from text prompts and is also used for image‑to‑image translation, inpainting, and other tasks. This guide shows how to deploy Stable Diffusion on Tencent Cloud using cloud‑native services.
Typical application scenarios include illustration, game UI assets, packaging design, fashion design, architectural renderings, and other industries that benefit from AI‑generated graphics.
Deployment workflow
1. Prepare a container image: Build a Docker image that contains the Stable Diffusion Web UI (e.g., AUTOMATIC1111) and push it to Tencent Container Registry (TCR).
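As a starting point, the image can be a thin layer over an NVIDIA CUDA base that pulls in the AUTOMATIC1111 Web UI. This is a sketch only: the base image tag, clone URL, and launch flags are common defaults, not the exact image used in this guide.

```dockerfile
# Sketch of a Stable Diffusion Web UI image; tags and paths are illustrative.
FROM nvidia/cuda:11.4.3-cudnn8-runtime-ubuntu20.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        git python3 python3-pip python3-venv wget \
    && rm -rf /var/lib/apt/lists/*

# Fetch the AUTOMATIC1111 Web UI
RUN git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git /app
WORKDIR /app

EXPOSE 7860
# --listen binds 0.0.0.0 so the Kubernetes Service can reach the Pod;
# --api enables the /sdapi REST endpoints used later in this guide.
CMD ["python3", "launch.py", "--listen", "--api"]
```

After building, tag the image with your TCR instance endpoint and `docker push` it so the TKE cluster can pull it.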
2. Create a TKE (Tencent Kubernetes Engine) cluster running Kubernetes 1.26.1 with GPU‑type worker nodes, and install the GPU driver (CUDA 11.4.3, cuDNN 8.2.4).
3. Set up Cloud File Storage (CFS) for shared model files. Create a mount point /models/Stable-diffusion and a directory for the UNet‑optimized model.
4. Create static PV & PVC for the CFS mount points and bind them to the Deployment.
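A static PV/PVC pair for the CFS mount point might look like the following. This is a sketch: the CSI driver name follows TKE's CFS add‑on, and the names, capacity, and mount‑target IP are placeholders to replace with your own values.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sd-models-pv
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteMany"]
  storageClassName: ""            # static binding, no provisioner
  csi:
    driver: com.tencent.cloud.csi.cfs
    volumeHandle: sd-models-pv
    volumeAttributes:
      host: 10.0.0.10             # CFS mount target IP (placeholder)
      path: /models/Stable-diffusion
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sd-models-pvc
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: ""
  volumeName: sd-models-pv        # bind explicitly to the static PV
  resources:
    requests:
      storage: 100Gi
```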
5. Deploy the Stable Diffusion Web UI as a Kubernetes Deployment, expose it via a Service (port 7860), and configure the container arguments (e.g., --listen, --api).
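The Deployment and Service might be wired up as follows. This is a sketch under assumptions: the image path, labels, and claim name are placeholders, and the mount path assumes the AUTOMATIC1111 directory layout.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stable-diffusion-webui
spec:
  replicas: 1
  selector:
    matchLabels: {app: sd-webui}
  template:
    metadata:
      labels: {app: sd-webui}
    spec:
      containers:
      - name: webui
        image: xxx.tencentcloudcr.com/sd/webui:latest  # TCR image (placeholder)
        args: ["--listen", "--api"]
        ports:
        - containerPort: 7860
        volumeMounts:
        - name: models
          mountPath: /app/models/Stable-diffusion
      volumes:
      - name: models
        persistentVolumeClaim:
          claimName: sd-models-pvc   # the PVC bound in step 4 (placeholder name)
---
apiVersion: v1
kind: Service
metadata:
  name: sd-webui
spec:
  selector: {app: sd-webui}
  ports:
  - port: 7860
    targetPort: 7860
```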
6. Enable qGPU to share a single A10 GPU among multiple Pods. Adjust the Deployment YAML limits as follows:
resources:
  limits:
    cpu: "20"
    memory: 50Gi
    tke.cloud.tencent.com/qgpu-core: "50"
    tke.cloud.tencent.com/qgpu-memory: "10"
7. Expose the service with Cloud Native API Gateway (CNGW) to provide a public endpoint, and configure routing, rate‑limiting, and session affinity.
8. Optimize performance with TACO Kit (TencentCloud Accelerated Computing Optimization Kit): deploy the sd_taco:v3 image and load an A10‑optimized UNet model, which reduces inference time from ~2 s to ~1 s per image.
docker run -it --gpus=all --network=host -v /[diffusers_model_directory]:/[custom_container_directory] sd_taco:v3 bash
9. Convert single‑file checkpoints to Diffusers format when needed:
python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path [single_file_model_name] --dump_path [diffusers_model_directory] --from_safetensors
10. Store generated images in COS (Cloud Object Storage). Create a bucket with an /images prefix, mount it via COS‑CSI, and configure content‑moderation policies.
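Mounting the bucket into Pods via COS‑CSI can follow the same static‑PV pattern as CFS. This is a sketch only: the driver name, region endpoint, bucket name, and secret are assumptions to verify against the TKE COS add‑on documentation for your cluster version.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sd-images-pv
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteMany"]
  csi:
    driver: com.tencent.cloud.csi.cosfs
    volumeHandle: sd-images-pv
    volumeAttributes:
      url: http://cos.ap-guangzhou.myqcloud.com  # region endpoint (placeholder)
      bucket: sd-output-1250000000               # bucket name (placeholder)
      path: /images
    nodePublishSecretRef:
      name: cos-secret       # SecretId/SecretKey for the bucket (placeholder)
      namespace: kube-system
```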
11. Example consumer service that calls the Stable Diffusion API and saves the Base64‑encoded PNG to COS:
import json
import base64
import requests
url = 'http://x.x.x.x:7860/sdapi/v1/txt2img'
data = {'prompt': 'cat', 'steps': 20}
response = requests.post(url, data=json.dumps(data))
with open('cat.png', 'wb') as f:
    f.write(base64.b64decode(response.json()['images'][0]))
12. Apply content moderation to the stored images using COS and Cloud Infinite (CI) to ensure compliance.
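The decode‑and‑save logic in the consumer above can be wrapped in a small standard‑library helper that handles multi‑image responses. A minimal sketch: the 'images' key matches the AUTOMATIC1111 txt2img response shape, while the file‑naming scheme is an assumption for illustration.

```python
import base64
import pathlib


def save_txt2img_images(payload, out_dir="."):
    """Decode each Base64 PNG in a txt2img-style response and write it to disk.

    payload: parsed JSON dict containing an 'images' list of Base64 strings.
    out_dir: directory to write files into (assumed to exist).
    Returns the list of file paths written, in response order.
    """
    paths = []
    for i, b64_png in enumerate(payload.get("images", [])):
        # File name pattern is illustrative, not part of the API.
        path = pathlib.Path(out_dir) / f"sd_output_{i}.png"
        path.write_bytes(base64.b64decode(b64_png))
        paths.append(str(path))
    return paths
```

In the consumer above, the `with open(...)` block would then become a single call: `save_txt2img_images(response.json())`.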
Conclusion: By leveraging Tencent Cloud’s native services (TKE, TCR, CFS, CNGW, TACO, COS, and CI), Stable Diffusion can be deployed in a highly available, scalable, and cost‑effective manner, with support for GPU sharing, performance tuning, and automated content review.
Tencent Cloud Developer