Mastering Alibaba Cloud SLB: Build High‑Availability Load Balancing with Terraform

This guide covers Alibaba Cloud SLB’s architecture, product variants, and environment prerequisites, then walks step by step through Terraform provisioning for CLB, ALB, and NLB. It also covers health checks, HTTPS setup, traffic routing, performance testing, best practices, security hardening, monitoring, and disaster‑recovery procedures.

MaGe Linux Operations

Overview

When traffic exceeds what a single server can handle, or when high availability is required, load balancing becomes essential. Alibaba Cloud Server Load Balancer (SLB) is one of the most widely used cloud load balancers in China and handles massive internet traffic.

Example: an e‑commerce platform handled 500 k requests per second during Double‑11 2024 by scaling from 20 to 200 ECS instances with SLB, achieving 99.99 % availability.

SLB offers Layer 4 (TCP/UDP) and Layer 7 (HTTP/HTTPS) modes. Layer 4 suits performance‑critical scenarios; Layer 7 provides richer traffic management such as URL‑based routing, cookie session persistence, and HTTPS offloading.

Product Variants

CLB (Classic Load Balancer) : classic, supports both layers, mature and stable.

ALB (Application Load Balancer) : application‑focused, Layer 7 only, richer routing rules.

NLB (Network Load Balancer) : network‑focused, Layer 4 only, ultra‑high performance.

2025 recommendation: prefer ALB or NLB for new services, and keep CLB for legacy workloads.

Key Features

Elastic scaling : automatic scaling without manual capacity changes. CLB billed by specification; ALB/NLB billed by actual usage.

Multi‑AZ disaster recovery : cross‑AZ deployment with automatic failover (primary‑backup or active‑active modes).

Health checks : Layer 4 (TCP/UDP) and Layer 7 (HTTP/HTTPS) checks, automatic isolation of unhealthy servers.

Applicable Scenarios

Web applications – ALB with HTTPS listener + URL routing.

API gateways – ALB with multi‑domain, forwarding rules, rate limiting.

Game services – NLB with UDP listener and session persistence.

Database proxy – NLB with TCP listener and backend server group.

Hybrid cloud – CLB with VPN gateway integration.

Micro‑services – ALB with gRPC support and service discovery.

Environment Requirements

VPC must be created.

At least two availability zones for high‑availability.

ECS instances running normally as backend servers.

Security groups must allow SLB health‑check traffic (100.64.0.0/10).

Alibaba Cloud account must be real‑name verified and SLB service enabled.

RAM permissions for SLB operations.

Detailed Steps

1. Preparation

VPC network planning

VPC: 10.0.0.0/8
├── Zone A
│   ├── Public subnet: 10.0.1.0/24 (SLB, NAT)
│   └── Private subnet: 10.0.10.0/24 (ECS)
├── Zone B
│   ├── Public subnet: 10.0.2.0/24 (SLB backup)
│   └── Private subnet: 10.0.20.0/24 (ECS)
└── Zone C
    └── Private subnet: 10.0.30.0/24 (ECS expansion)
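
Before creating the vSwitches, it can help to confirm that every planned subnet actually falls inside the VPC range. The following is a minimal sketch in POSIX shell; the helper names `ip_to_int` and `cidr_contains` are illustrative and not part of any Alibaba Cloud tooling.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if CIDR $2 lies entirely inside CIDR $1.
cidr_contains() (
  outer_ip=${1%/*}; outer_len=${1#*/}
  inner_ip=${2%/*}; inner_len=${2#*/}
  # A shorter inner prefix means a larger block, which cannot fit.
  [ "$inner_len" -ge "$outer_len" ] || exit 1
  shift_bits=$(( 32 - outer_len ))
  outer=$(( $(ip_to_int "$outer_ip") >> shift_bits ))
  inner=$(( $(ip_to_int "$inner_ip") >> shift_bits ))
  [ "$outer" -eq "$inner" ]
)

# Check the subnets from the plan above against the VPC range.
for subnet in 10.0.1.0/24 10.0.10.0/24 10.0.2.0/24 10.0.20.0/24 10.0.30.0/24; do
  if cidr_contains 10.0.0.0/8 "$subnet"; then
    echo "$subnet: inside VPC"
  else
    echo "$subnet: OUTSIDE VPC"
  fi
done
```

The check only covers containment; it does not detect two subnets overlapping each other.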

Backend server preparation

# Check web service status
systemctl status nginx

# Verify port listening
ss -tlnp | grep ':80\|:443'

# Test local service
curl -I http://localhost/health

# Ensure security group allows SLB health‑check (100.64.0.0/10)
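
The last comment above has no command attached. One way to open the health‑check range is the ECS `AuthorizeSecurityGroup` API through the aliyun CLI. The sketch below wraps it in a hypothetical helper, `allow_health_check`, that only prints the command unless `DRY_RUN=0` is set; the security group ID `sg-xxx`, the port, and the default region are placeholders.

```shell
#!/bin/sh
# Source range used by SLB health checks, as stated in this article.
HEALTH_CHECK_CIDR="100.64.0.0/10"

# Hypothetical helper: allow the health-check range to reach one TCP port.
# Usage: allow_health_check <security-group-id> <port> [region]
allow_health_check() (
  sg_id=$1; port=$2; region=${3:-cn-hangzhou}
  set -- aliyun ecs AuthorizeSecurityGroup \
    --RegionId "$region" \
    --SecurityGroupId "$sg_id" \
    --IpProtocol tcp \
    --PortRange "$port/$port" \
    --SourceCidrIp "$HEALTH_CHECK_CIDR"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"      # dry run (default): print the command instead of running it
  else
    "$@"           # DRY_RUN=0: actually call the aliyun CLI
  fi
)

# Print the command that would be executed for port 80.
allow_health_check sg-xxx 80
```

Defaulting to a dry run keeps the snippet safe to paste; review the printed command before re‑running with `DRY_RUN=0`.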

Terraform infrastructure

# main.tf (excerpt)
provider "alicloud" {
  region = "cn-hangzhou"
}

resource "alicloud_vpc" "main" {
  vpc_name   = "prod-vpc"
  cidr_block = "10.0.0.0/8"
}

resource "alicloud_vswitch" "zone_a" {
  vpc_id       = alicloud_vpc.main.id
  cidr_block   = "10.0.10.0/24"
  zone_id      = "cn-hangzhou-h"
  vswitch_name = "prod-vsw-a"
}

/* similar definitions for zone_b, security groups, and ECS instances */

2. Core Configuration

Create CLB instance (console)

Log in to SLB console → Instance Management → Create Load Balancer.

Select configuration:

Instance type: Classic Load Balancer (CLB)

Specification: performance‑guaranteed (choose per workload)

Network: public or private

Primary AZ: cn‑hangzhou‑h

Backup AZ: cn‑hangzhou‑i

Create CLB with Terraform

resource "alicloud_slb_load_balancer" "main" {
  load_balancer_name = "prod-clb"
  address_type       = "internet"
  load_balancer_spec = "slb.s3.medium"
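  # vswitch_id applies to VPC (intranet) instances; the provider docs note it is ignored when address_type = "internet".
  vswitch_id         = alicloud_vswitch.zone_a.id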
  master_zone_id     = "cn-hangzhou-h"
  slave_zone_id      = "cn-hangzhou-i"

  tags = { Environment = "prod" }
}

resource "alicloud_slb_listener" "http" {
  load_balancer_id          = alicloud_slb_load_balancer.main.id
  backend_port              = 80
  frontend_port             = 80
  protocol                  = "http"
  bandwidth                 = -1
  sticky_session            = "on"
  sticky_session_type       = "insert"
  cookie_timeout            = 86400
  health_check              = "on"
  health_check_type         = "http"
  health_check_uri          = "/health"
  health_check_connect_port = 80
  healthy_threshold         = 3
  unhealthy_threshold       = 3
  health_check_timeout      = 5
  health_check_interval     = 2
  health_check_http_code    = "http_2xx,http_3xx"
  gzip                      = true
  request_timeout           = 60
  idle_timeout              = 15
}

Create ALB instance (Terraform)

resource "alicloud_alb_load_balancer" "main" {
  vpc_id                  = alicloud_vpc.main.id
  address_type            = "Internet"
  address_allocated_mode  = "Dynamic"
  load_balancer_name      = "prod-alb"
  load_balancer_edition   = "Standard"
  load_balancer_billing_config {
    pay_type = "PayAsYouGo"
  }
  zone_mappings {
    vswitch_id = alicloud_vswitch.zone_a.id
    zone_id    = "cn-hangzhou-h"
  }
  zone_mappings {
    vswitch_id = alicloud_vswitch.zone_b.id
    zone_id    = "cn-hangzhou-i"
  }
}

Server groups, health checks, sticky sessions, listeners, HTTPS certificates, and redirect rules are defined similarly (full snippets omitted for brevity).

3. Validation

Check SLB status via CLI

# Describe load balancers
aliyun slb DescribeLoadBalancers --RegionId cn-hangzhou --LoadBalancerId lb-xxx

# Describe listeners
aliyun slb DescribeLoadBalancerListeners --RegionId cn-hangzhou --LoadBalancerId lb-xxx

# Health status
aliyun slb DescribeHealthStatus --RegionId cn-hangzhou --LoadBalancerId lb-xxx --ListenerPort 80

Functional testing

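# Get public IP (requires jq; reads Address from the first matching instance)
SLB_IP=$(aliyun slb DescribeLoadBalancers --RegionId cn-hangzhou --LoadBalancerId lb-xxx | jq -r '.LoadBalancers.LoadBalancer[0].Address')

# HTTP request
curl -I "http://$SLB_IP/"

# Session persistence test (all five responses should name the same backend)
curl -c cookie.txt "http://$SLB_IP/"
for i in {1..5}; do curl -b cookie.txt -s "http://$SLB_IP/server-info" | jq '.hostname'; done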

# HTTPS test
curl -I https://www.example.com/
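
Reading five hostnames by eye does not scale to longer runs. As a rough sketch, the session‑persistence test can be reduced to a pass/fail check by counting how many distinct backends answered. The helper names are illustrative, and the `/server-info` endpoint is the same assumed endpoint used in the test above.

```shell
#!/bin/sh
# Read one backend identifier per line on stdin and print the number of distinct values.
count_distinct_backends() {
  sort -u | grep -c .
}

# Succeed only when every line on stdin names the same backend.
is_sticky() {
  [ "$(count_distinct_backends)" -eq 1 ]
}

# Usage against a live listener (not executed here):
# for i in 1 2 3 4 5; do
#   curl -b cookie.txt -s "http://$SLB_IP/server-info" | jq -r '.hostname'
# done | is_sticky && echo "session persistence OK"
```

Running the same loop without `-b cookie.txt` should normally yield more than one distinct backend, which confirms that traffic is being spread when no cookie is presented.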

Stress testing

# wrk
wrk -t12 -c400 -d30s http://$SLB_IP/

# ApacheBench
ab -n 10000 -c 100 http://$SLB_IP/
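
To turn a stress run into a repeatable gate, the throughput line can be pulled out of the wrk report and compared with a target. This is a minimal sketch: `wrk_rps` and `rps_at_least` are hypothetical helper names, and the target of 30000 in the usage comment is only an example value.

```shell
#!/bin/sh
# Extract the number from the "Requests/sec:" line of a wrk report read on stdin.
wrk_rps() {
  awk '/^Requests\/sec:/ { print $2 }'
}

# Succeed when measured throughput ($1) is at least the target ($2).
# awk handles the comparison because wrk reports a decimal value.
rps_at_least() {
  awk -v measured="$1" -v target="$2" 'BEGIN { exit !(measured >= target) }'
}

# Usage against a live listener (not executed here):
# rps=$(wrk -t12 -c400 -d30s "http://$SLB_IP/" | wrk_rps)
# rps_at_least "$rps" 30000 || echo "throughput below target: $rps"
```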

Example Configurations

Production‑grade ALB (Terraform)

# variables.tf, main.tf, server groups, listeners, forwarding rules, ACL, DDoS protection, alarms, etc.

Full snippets are available in the original article.

Best Practices & Caveats

Performance Optimisation

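Choose an appropriate CLB specification (e.g., slb.s3.medium is rated for roughly 30 k QPS and 50 k new connections per second; confirm against the current specification table before sizing).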

Health‑check interval 2 s, timeout 5 s, thresholds 3 / 3: a backend that stops responding is removed after about 19 s in the worst case, and sooner when checks fail immediately.

Enable HTTP Keep‑Alive on backend servers.
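
The health‑check settings trade detection speed against false positives. Alibaba Cloud's health‑check documentation describes the failure time window as timeout × unhealthy threshold + interval × (unhealthy threshold − 1). The sketch below evaluates that formula so different settings can be compared; `failure_window` is an illustrative helper name, and the formula should be confirmed against the current documentation.

```shell
#!/bin/sh
# Worst-case seconds before a hung backend is marked unhealthy.
# Args: <timeout_s> <interval_s> <unhealthy_threshold>
failure_window() {
  echo $(( $1 * $3 + $2 * ($3 - 1) ))
}

# Settings used in this article: timeout 5 s, interval 2 s, threshold 3.
failure_window 5 2 3    # prints 19

# A tighter variant for comparison: timeout 2 s, interval 2 s, threshold 2.
failure_window 2 2 2    # prints 6
```

Shorter windows detect failures faster but make it more likely that a brief stall, such as a long garbage‑collection pause, takes a healthy backend out of rotation.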

Security Hardening

Use TLS 1.2+ cipher policy (e.g., tls_cipher_policy_1_2).

Configure ACL whitelist for office IP ranges.

Enable DDoS protection via alicloud_ddoscoo_instance.

High‑Availability Design

Deploy across at least two AZs.

Distribute backend ECS instances evenly.

Test failover by stopping the backends in one AZ (or removing them from the server group) and verifying that traffic shifts to the remaining AZ automatically.
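
Whether backends are spread evenly can be checked from a plain inventory list. The sketch below assumes input lines of the form `<instance-id> <zone-id>` (a format chosen here for illustration, not produced by any specific CLI command) and fails when fewer than two zones are used or when zone sizes differ by more than one instance.

```shell
#!/bin/sh
# Read "<instance-id> <zone-id>" lines on stdin, print a summary,
# and return non-zero when the layout is not balanced across at least two zones.
zone_balance() {
  awk '
    NF >= 2 {
      if (!($2 in count)) zones++
      count[$2]++
    }
    END {
      for (z in count) {
        if (min == "" || count[z] < min) min = count[z]
        if (count[z] > max) max = count[z]
      }
      printf "zones=%d min=%d max=%d\n", zones, min, max
      exit (zones < 2 || max - min > 1) ? 1 : 0
    }'
}

# Example inventory: two instances in zone h, one in zone i.
printf 'i-aaa cn-hangzhou-h\ni-bbb cn-hangzhou-h\ni-ccc cn-hangzhou-i\n' | zone_balance
```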

Troubleshooting & Monitoring

Common Issues

Health check failure : likely security‑group missing 100.64.0.0/10.

502 Bad Gateway : backend unreachable or returned an invalid response; check that the service is running and listening on the backend port.

504 Gateway Timeout : backend processing too slow; optimise the backend or increase request_timeout.

Session not sticky : cookie configuration error; verify sticky_session settings.

HTTPS certificate error : mismatch or expiry; renew certificate.

Connection exhaustion : insufficient CLB spec; upgrade.

High latency : cross‑region traffic; use GTM for nearest‑region routing.

Uneven traffic : weight or session‑persistence mis‑config; adjust.

Health‑Check Debugging Steps

# Verify security group
aliyun ecs DescribeSecurityGroupAttribute --RegionId cn-hangzhou --SecurityGroupId sg-xxx --Direction ingress | grep -F '100.64.'

# Simulate health check from another ECS
curl -I http://backend-ip:80/health

# Check backend service status
ssh backend-server "systemctl status nginx"
ssh backend-server "curl -I localhost/health"
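
The manual curl checks above can be folded into a probe that applies the same pass criteria as the listener configured earlier (`health_check_http_code = "http_2xx,http_3xx"`, 5 s timeout). This is a sketch only: `is_healthy_code` and `probe_backend` are hypothetical helpers, and the HEAD request mirrors what `curl -I` sends rather than guaranteeing identical behaviour to the SLB prober.

```shell
#!/bin/sh
# Succeed for status codes the listener treats as healthy (2xx and 3xx).
is_healthy_code() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe a health URL with a HEAD request and a 5 s limit, then report the verdict.
probe_backend() (
  # -w prints only the status code; curl prints 000 when no response arrives.
  code=$(curl -s -o /dev/null -I -m 5 -w '%{http_code}' "$1") || code=000
  if is_healthy_code "$code"; then
    echo "healthy ($code)"
  else
    echo "unhealthy ($code)"
    exit 1
  fi
)

# Usage from another ECS instance in the same VPC (not executed here):
# probe_backend "http://backend-ip:80/health"
```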

Performance Monitoring

QPS, ActiveConnection, NewConnection, TrafficRX/TX – alert when >80 % of spec.

StatusCode5xx – alert >1 % of total requests.

Rt (average response time) – alert >500 ms.

UnhealthyServerCount – alert when ≥1.

Terraform examples create CMS alarms for these metrics.
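
The thresholds listed above can also be evaluated in a script, for example against numbers exported from monitoring. The sketch below uses integer arithmetic only; the function names and argument order are illustrative, and the rated QPS of the instance is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Succeed when USED exceeds PCT percent of LIMIT (integers only).
# Args: <used> <limit> <pct>
exceeds_pct() {
  [ $(( $1 * 100 )) -gt $(( $2 * $3 )) ]
}

# Print one line per breached threshold.
# Args: <qps> <spec_qps> <count_5xx> <total_requests> <rt_ms> <unhealthy_servers>
check_alerts() {
  exceeds_pct "$1" "$2" 80 && echo "ALERT qps"        # QPS above 80 % of spec
  exceeds_pct "$3" "$4" 1  && echo "ALERT 5xx"        # 5xx above 1 % of requests
  [ "$5" -gt 500 ]         && echo "ALERT rt"         # average response time above 500 ms
  [ "$6" -ge 1 ]           && echo "ALERT unhealthy"  # at least one unhealthy server
  return 0
}

# Example: 25000 QPS on an instance rated for 30000, everything else within limits.
check_alerts 25000 30000 50 10000 120 0    # prints "ALERT qps"
```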

Backup & Disaster Recovery

Export configuration via CLI scripts, import into Terraform, and use Terraform plan/apply for recovery. DNS switch and CDN source update are part of the DR workflow.
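
A sketch of the export step might look like the following. It saves the output of `DescribeLoadBalancerAttribute` for one CLB instance to a timestamped JSON file; the helper name `backup_clb`, the output directory, and the file naming scheme are assumptions made for illustration.

```shell
#!/bin/sh
# Save the attributes of one CLB instance to <out_dir>/<lb_id>-<timestamp>.json
# and print the path of the file that was written.
# Args: <load-balancer-id> <region> <out_dir>
backup_clb() (
  lb_id=$1; region=$2; out_dir=$3
  mkdir -p "$out_dir" || exit 1
  out_file="$out_dir/${lb_id}-$(date +%Y%m%d-%H%M%S).json"
  aliyun slb DescribeLoadBalancerAttribute \
    --RegionId "$region" --LoadBalancerId "$lb_id" > "$out_file" || exit 1
  echo "$out_file"
)

# Usage (not executed here):
# backup_clb lb-xxx cn-hangzhou ./slb-backups
#
# An existing instance can then be brought under Terraform management with:
# terraform import alicloud_slb_load_balancer.main lb-xxx
```

Listener settings are not part of this export; they would need separate Describe calls per listener port.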

Conclusion

Key Takeaways

Product selection: CLB for legacy, ALB for Layer 7, NLB for high‑performance Layer 4.

High‑availability: multi‑AZ, health checks, balanced backend distribution.

HTTPS: certificate management, TLS policy, HTTP‑to‑HTTPS redirect.

Traffic management: path/header routing, session persistence, canary releases.

Security: ACL, DDoS protection, strict TLS.

Monitoring: critical metrics, alarms, performance analysis.

Further Learning

GTM global traffic management.

DCDN acceleration.

WAF web application firewall.

Service mesh (ALB + ASM).

Kubernetes Ingress with ALB.
