Deploying Dify on Alibaba Cloud ACK for High Availability and Scalability

This guide explains how to deploy the Dify LLMOps platform on Alibaba Cloud Container Service for Kubernetes (ACK), configuring cloud databases, enabling high‑availability replicas, setting up elastic scaling, and exposing the service via Ingress to create a production‑grade, scalable AI application environment.

Alibaba Cloud Infrastructure

Oct 17, 2024

Large language models (LLMs) are central to AI applications, and managing their lifecycle (LLMOps) requires a robust deployment platform. Dify provides an all-in-one LLMOps solution, but its default Docker Compose and source-code deployments lack the high availability and scalability that production workloads need.

Alibaba Cloud Container Service for Kubernetes (ACK) offers enterprise-grade Kubernetes backed by SLA guarantees and integrates with other Alibaba Cloud services, which makes it possible to run Dify with high availability, elastic scaling, and better performance.

The guide walks through deploying Dify on an ACK cluster step by step: provision the cloud products (Redis, RDS PostgreSQL, AnalyticDB), disable the default component installations, and point Dify at the external services through YAML parameters such as:

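# External Redis instance (host is masked in the source)
externalRedis:
  enabled: true
  host: "r-***********.redis.rds.aliyuncs.com"
  port: 6379
  username: "default"
  password: "Dify123456"   # example value; replace with your instance password
  useSSL: false

# External RDS PostgreSQL instance (address is masked in the source)
externalPostgres:
  enabled: true
  username: "postgres"
  password: "Dify123456"   # example value; replace with your instance password
  address: "pgm-*********.pg.rds.aliyuncs.com"
  port: 5432
  dbName: dify
  maxOpenConns: 20         # upper bound on open connections
  maxIdleConns: 5          # idle connections kept in the pool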
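
Disabling the default component installations typically means switching off the bundled Redis, PostgreSQL, and vector store so that only the external services are used. The fragment below is a minimal sketch of that idea. The key names `redis.enabled`, `postgresql.enabled`, and `weaviate.enabled` are assumptions modeled on common Helm chart conventions and are not confirmed by this article, so check them against the parameters of the ack-dify version you install.

```yaml
# Assumed key names (not taken from the article); verify against your chart version.
redis:
  enabled: false        # rely on externalRedis instead of a bundled Redis
postgresql:
  enabled: false        # rely on externalPostgres instead of a bundled PostgreSQL
weaviate:
  enabled: false        # rely on an external vector store such as AnalyticDB
```

If a bundled component stays enabled alongside its external counterpart, the cluster may run an unused in-cluster database, so it is worth confirming after deployment that no such pods were created.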

High availability is achieved by raising the replica count of the core components (api, worker, web, sandbox) and applying pod anti-affinity rules so that replicas of the same component are spread across nodes, e.g.:

api.replicas: 2
worker.replicas: 2
web.replicas: 2
sandbox.replicas: 2
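
The replica parameters above cover only the first half of the high-availability setup. The anti-affinity half can be sketched as follows for the api component. The `podAntiAffinity` structure is the standard Kubernetes affinity schema, but its placement under `api.affinity` and the `component: api` pod label are assumptions made for illustration; use the labels that the deployed pods actually carry.

```yaml
api:
  replicas: 2
  # Assumed location for the affinity block; the schema below is standard Kubernetes.
  affinity:
    podAntiAffinity:
      # "preferred" is a soft rule: the scheduler tries to separate replicas,
      # but still schedules them if only one node has capacity.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname   # spread across distinct nodes
            labelSelector:
              matchLabels:
                component: api                    # hypothetical pod label
```

A soft rule is used here so that pods are not left pending on a small cluster. Switching to `requiredDuringSchedulingIgnoredDuringExecution` enforces strict separation, at the cost of unschedulable replicas when there are fewer nodes than replicas.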

Elastic scaling is configured for resource-intensive components such as the worker through Horizontal Pod Autoscaler (HPA) settings. The example below scales the worker between 1 and 5 replicas, targeting 80% average memory utilization:

worker:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 5
    metrics:
      - type: Resource
        resource:
          name: memory
          target:
            averageUtilization: 80
            type: Utilization
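
For reference, these settings correspond to a standard `autoscaling/v2` HorizontalPodAutoscaler object roughly like the one sketched below. The object name, namespace, and target Deployment name are hypothetical, since the article does not state what the deployment actually generates. Kubernetes computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization), so two workers averaging 120% of their memory request against an 80% target would scale to ceil(2 × 120 / 80) = 3 replicas.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ack-dify-worker          # hypothetical name
  namespace: dify-system         # hypothetical namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ack-dify-worker        # hypothetical worker Deployment name
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # percent of the containers' memory requests
```

Utilization is measured against the pods' memory requests, so the worker containers need `resources.requests.memory` set for this metric to work. Note also that a `minReplicas` of 1 allows the worker to scale down to a single pod; a value of 2 would keep the redundancy described in the high-availability step.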

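Service exposure is handled via Ingress, for example with the Nginx Ingress controller and TLS. The block below shows the Ingress parameters; note that `enabled` is `false` and `className` is empty here, so both need to be set before an Ingress is actually created.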

ingress:
  enabled: false
  className: ""
  hosts:
    - host: dify-example.local
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: ack-dify
              port: 80
  tls:
    - secretName: chart-example-tls
      hosts:
        - dify-example.local
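
A version of the same block with Ingress switched on might look like the sketch below. The class name `nginx` assumes that an Nginx Ingress controller is installed in the cluster under that class name, and the host and secret names are the placeholders from the example above rather than real values.

```yaml
ingress:
  enabled: true
  className: "nginx"             # assumes an IngressClass named "nginx" exists
  hosts:
    - host: dify-example.local   # placeholder; replace with your domain
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: ack-dify
              port: 80
  tls:
    - secretName: chart-example-tls   # must be an existing kubernetes.io/tls Secret
      hosts:
        - dify-example.local
```

The TLS secret has to exist in the same namespace as the Ingress and contain the `tls.crt` and `tls.key` entries for the listed host; otherwise the controller typically falls back to its default certificate.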

The deployment results in a highly available, scalable Dify environment on ACK, suitable for enterprise LLM infrastructure, rapid MVP creation, and integration of LLM capabilities into existing applications.
