
Deploying Dify on Alibaba Cloud ACK for High Availability and Scalability

This guide explains how to deploy the Dify LLMOps platform on Alibaba Cloud Container Service for Kubernetes (ACK), configuring cloud databases, enabling high‑availability replicas, setting up elastic scaling, and exposing the service via Ingress to create a production‑grade, scalable AI application environment.

Alibaba Cloud Infrastructure

Large language models (LLMs) are central to AI applications, and managing their lifecycle (LLMOps) requires a robust deployment platform. Dify provides an all-in-one LLMOps solution, but its default Docker Compose and source-code deployments lack the high availability and scalability that production workloads need.

Alibaba Cloud Container Service for Kubernetes (ACK) offers enterprise-grade Kubernetes with SLA guarantees and integration with other Alibaba Cloud services, which makes it possible to run Dify with high availability, elasticity, and better performance.

The deployment on an ACK cluster proceeds step by step: provision the cloud products (Redis, RDS PostgreSQL, AnalyticDB), disable the chart's default in-cluster component installations, and point Dify at the external services. Replace the masked hostnames and example credentials with your own in values such as:

externalRedis:
  enabled: true
  host: "r-***********.redis.rds.aliyuncs.com"
  port: 6379
  username: "default"
  password: "Dify123456"
  useSSL: false
externalPostgres:
  enabled: true
  username: "postgres"
  password: "Dify123456"
  address: "pgm-*********.pg.rds.aliyuncs.com"
  port: 5432
  dbName: dify
  maxOpenConns: 20
  maxIdleConns: 5
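
The text mentions disabling the default component installations but does not show the corresponding values. A minimal sketch follows, assuming the chart toggles its bundled Redis, PostgreSQL, and vector store through `enabled` flags and that AnalyticDB takes over the vector store role. The key names `redis`, `postgresql`, and `weaviate` are assumptions based on a common Helm subchart convention, not confirmed by the source, so check them against the chart's own values file before use.

```yaml
# Hypothetical keys: verify each one against the chart's values.yaml.
redis:
  enabled: false        # skip the in-cluster Redis; externalRedis is used instead
postgresql:
  enabled: false        # skip the in-cluster PostgreSQL; externalPostgres (RDS) is used instead
weaviate:
  enabled: false        # skip the bundled vector store; assumed to be replaced by AnalyticDB
```

Leaving a bundled component enabled alongside its external counterpart would typically just run an unused pod, but it is worth confirming which of the two the application is actually configured to use.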

High availability is achieved by increasing the replica counts of the core components (api, worker, web, sandbox) and applying pod anti-affinity rules so that each component's pods are spread across nodes, e.g.:

api:
  replicas: 2
worker:
  replicas: 2
web:
  replicas: 2
sandbox:
  replicas: 2
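
The anti-affinity rules are described but not shown. The sketch below illustrates one way to express them with the standard Kubernetes `podAntiAffinity` fields. Two things are assumptions here: that the chart accepts an `affinity` block under each component (shown for `api`), and the `component: api` label, which is hypothetical and should be replaced with whatever labels the chart actually sets on its pods.

```yaml
api:
  replicas: 2
  affinity:                        # assumed values key; the fields below are standard Kubernetes
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname   # at most one matching pod per node, when possible
            labelSelector:
              matchLabels:
                component: api     # hypothetical label; match the chart's real pod labels
```

The `preferred...` form is a soft rule: the scheduler tries to separate the pods but still places them if only one node has capacity. Switching to `requiredDuringSchedulingIgnoredDuringExecution` makes the rule strict, at the cost of pods staying Pending when there are fewer eligible nodes than replicas. Using `topology.kubernetes.io/zone` as the `topologyKey` spreads pods across zones instead of nodes.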

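Elastic scaling is configured for resource-intensive components such as the worker through Horizontal Pod Autoscaler (HPA) settings based on memory utilization. Once autoscaling is enabled, the HPA controls the replica count, so minReplicas: 1 allows the worker to drop to a single pod; use minReplicas: 2 to keep the high-availability floor described above.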

worker:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 5
    metrics:
      - type: Resource
        resource:
          name: memory
          target:
            averageUtilization: 80
            type: Utilization
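
A `Utilization` target is measured as a percentage of each container's memory request, so the HPA above only works if the worker pods declare one. The sketch below shows what that could look like. The `worker.resources` key is an assumption about the chart's values layout, and the sizes are illustrative rather than recommendations from the source.

```yaml
worker:
  resources:                 # assumed values key; standard Kubernetes resource fields below
    requests:
      memory: 1Gi            # illustrative; averageUtilization: 80 is computed against this value
      cpu: 500m              # illustrative
    limits:
      memory: 2Gi            # illustrative; keep above the request so the pod can exceed 80% before scaling
```

The HPA computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization). With the values above, two worker pods averaging 1.2Gi of memory are at 120% of their request, which gives ceil(2 × 120 / 80) = 3 replicas.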

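Service exposure is handled via Ingress. The values below show an Nginx Ingress layout with TLS settings. As written, enabled is false and className is empty, so switch enabled to true and set className to the cluster's ingress class (for example, nginx) before expecting the Ingress to be created.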

ingress:
  enabled: false
  className: ""
  hosts:
    - host: dify-example.local
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: ack-dify
              port: 80
  tls:
    - secretName: chart-example-tls
      hosts:
        - dify-example.local
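
The `tls` section refers to a Secret named `chart-example-tls` that must already exist. A minimal sketch of such a Secret is shown here, using the standard `kubernetes.io/tls` type. The two data values are placeholders, not real material, and the Secret is assumed to be created in the same namespace as the Dify release, since an Ingress can only reference Secrets in its own namespace.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: chart-example-tls              # must match ingress.tls[].secretName
type: kubernetes.io/tls
data:
  tls.crt: REPLACE_WITH_BASE64_CERT    # placeholder: base64-encoded certificate (chain) for dify-example.local
  tls.key: REPLACE_WITH_BASE64_KEY     # placeholder: base64-encoded private key
```

The certificate should cover every host listed under `tls.hosts`; otherwise clients will see a hostname mismatch even though the Ingress itself is accepted.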

The deployment results in a highly available, scalable Dify environment on ACK, suitable for enterprise LLM infrastructure, rapid MVP creation, and integration of LLM capabilities into existing applications.

Written by

Alibaba Cloud Infrastructure
