
How to Build Reusable Multi‑Cloud Infrastructure with Terraform

Learn how to replace manual, error‑prone cloud console clicks with reusable, Terraform‑driven multi‑cloud infrastructure. This guide covers why multi‑cloud matters, Terraform fundamentals, project layout, example networking and compute modules for AWS and Alibaba Cloud, CI/CD integration, security scanning, cost optimization, and best‑practice guidelines.

Raymond Ops

Introduction: From manual clicks to code‑defined infrastructure

Manual, button‑driven provisioning in cloud consoles leads to errors, delays, and high operational cost, especially when the same environment must be replicated across multiple providers. Terraform enables a shift from labor‑intensive tasks to declarative, version‑controlled infrastructure code.

Why Multi‑Cloud is inevitable

Business‑driven multi‑cloud needs

Cost optimization: Different providers price the same services differently; compute‑heavy workloads may be cheaper on AWS, storage‑heavy workloads on Alibaba Cloud.

Risk diversification: A single‑provider outage (e.g., the 2021 AWS us‑east‑1 incident) can cause widespread disruption; multi‑cloud mitigates this risk.

Compliance requirements: Data sovereignty laws may mandate that data stays in specific regions, which a multi‑cloud strategy can satisfy.

Technical complementarity: Each provider offers unique strengths, such as AWS machine learning services, Azure enterprise integration, and Alibaba Cloud's network advantages within China.

Challenges of traditional operations

Configuration inconsistency: Manual setups cannot guarantee identical configurations across clouds.

Rising operational cost: Managing multiple consoles and APIs multiplies effort.

Security risk: Human error and configuration drift are hard to detect.

Poor scalability: As the business grows, manually expanding infrastructure becomes slow and error‑prone.

Terraform: Best practice for Infrastructure as Code

What is Terraform?

Terraform is an open‑source IaC tool from HashiCorp that uses the declarative HCL language to define and manage infrastructure across providers.

Core advantages

Declarative configuration – describe the desired end state; Terraform computes the required actions.

State management – a state file tracks real resources, ensuring consistency.

Plan preview – terraform plan shows changes before execution.

Rich provider ecosystem – supports 1000+ providers covering all major clouds.
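To make this concrete, here is a minimal, self‑contained sketch of the declarative workflow (the bucket name is a placeholder and must be globally unique):

```hcl
# main.tf — minimal example: one provider, one resource
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

# Declare the desired end state; `terraform plan` shows what would change
resource "aws_s3_bucket" "demo" {
  bucket = "example-demo-bucket" # placeholder; must be globally unique
}
```

Running terraform init, terraform plan, and terraform apply downloads the provider, previews the change, and creates the bucket.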

Hands‑on: Building a reusable multi‑cloud stack

Project layout

terraform-multicloud/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── terraform.tfvars
│   │   └── versions.tf
│   ├── staging/
│   └── prod/
├── modules/
│   ├── networking/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── versions.tf
│   ├── compute/
│   └── database/
├── shared/
│   ├── backend.tf
│   └── provider.tf
└── scripts/
    ├── deploy.sh
    └── destroy.sh

Reusable networking module (excerpt)

# modules/networking/variables.tf
variable "environment" {
  description = "Environment name"
  type        = string
}

variable "cloud_provider" {
  description = "Cloud provider"
  type        = string
  validation {
    condition     = contains(["aws", "alicloud", "azure"], var.cloud_provider)
    error_message = "Supported providers: aws, alicloud, azure."
  }
}

variable "vpc_cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "AZ list"
  type        = list(string)
}

# modules/networking/main.tf
locals {
  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
    Project     = "multicloud-demo"
  }
}
resource "aws_vpc" "main" {
  count                = var.cloud_provider == "aws" ? 1 : 0
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = merge(local.common_tags, { Name = "${var.environment}-vpc" })
}
# Similar resources for alicloud_vpc and subnets omitted for brevity
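For reference, the Alibaba Cloud counterpart might look like the following (attribute names follow the aliyun/alicloud provider; verify them against the provider documentation):

```hcl
resource "alicloud_vpc" "main" {
  count      = var.cloud_provider == "alicloud" ? 1 : 0
  cidr_block = var.vpc_cidr
  vpc_name   = "${var.environment}-vpc"
  tags       = merge(local.common_tags, { Name = "${var.environment}-vpc" })
}
```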

Compute module example

# modules/compute/variables.tf
variable "environment" {
  type = string
}

variable "cloud_provider" {
  type = string
}

variable "instance_count" {
  type    = number
  default = 2
}

variable "subnet_ids" {
  type = list(string)
}

variable "vpc_id" {
  type = string
}

# modules/compute/main.tf
locals {
  instance_types = {
    aws = { small = "t3.micro", medium = "t3.small", large = "t3.medium" }
    alicloud = { small = "ecs.t6-c1m1.large", medium = "ecs.t6-c2m1.large", large = "ecs.t6-c4m1.large" }
  }
  instance_size = var.environment == "prod" ? "large" : "small"
  instance_type = local.instance_types[var.cloud_provider][local.instance_size]
}
resource "aws_instance" "app" {
  count         = var.cloud_provider == "aws" ? var.instance_count : 0
  ami           = data.aws_ami.ubuntu[0].id
  instance_type = local.instance_type
  subnet_id     = var.subnet_ids[count.index % length(var.subnet_ids)]
  tags = {
    Name        = "${var.environment}-app-${count.index + 1}"
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}
# Corresponding alicloud_instance and data sources omitted for brevity
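The AMI lookup referenced above (data.aws_ami.ubuntu) might be declared as below; the owner ID is Canonical's official account, and the name filter is illustrative:

```hcl
data "aws_ami" "ubuntu" {
  count       = var.cloud_provider == "aws" ? 1 : 0
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}
```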

Environment configuration

# environments/prod/main.tf
terraform {
  required_version = ">= 1.0"
  backend "s3" {
    bucket = "my-terraform-state-bucket"
    key    = "prod/terraform.tfstate"
    region = "us-west-2"
  }
  required_providers {
    aws      = { source = "hashicorp/aws",      version = "~> 5.0" }
    alicloud = { source = "aliyun/alicloud", version = "~> 1.200" }
  }
}
provider "aws" { region = var.aws_region }
provider "alicloud" { region = var.alicloud_region }

module "aws_networking" {
  source            = "../../modules/networking"
  environment       = "prod"
  cloud_provider    = "aws"
  vpc_cidr          = "10.1.0.0/16"
  availability_zones = ["us-west-2a","us-west-2b","us-west-2c"]
}
module "aws_compute" { /* omitted for brevity */ }
module "alicloud_networking" { /* omitted for brevity */ }
module "alicloud_compute" { /* omitted for brevity */ }
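For reference, the omitted aws_compute call might be wired to the networking module like this (the output names vpc_id and subnet_ids are assumptions about the networking module's outputs.tf):

```hcl
module "aws_compute" {
  source         = "../../modules/compute"
  environment    = "prod"
  cloud_provider = "aws"
  instance_count = 3
  vpc_id         = module.aws_networking.vpc_id
  subnet_ids     = module.aws_networking.subnet_ids
}
```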

Advanced features

Conditional deployment

# Enable CloudWatch dashboard only in prod
resource "aws_cloudwatch_dashboard" "main" {
  count          = var.environment == "prod" ? 1 : 0
  dashboard_name = "${var.environment}-dashboard"
  dashboard_body = jsonencode({
    widgets = [{
      type = "metric",
      properties = {
        metrics = [["AWS/EC2","CPUUtilization","InstanceId",aws_instance.app[0].id]],
        period  = 300,
        stat    = "Average",
        region  = var.aws_region,
        title   = "EC2 Instance CPU"
      }
    }]
  })
}

Dynamic configuration per provider

locals {
  cloud_configs = {
    aws = {
      machine_types = ["t3.micro","t3.small","t3.medium"]
      storage_types = ["gp3","io1","sc1"]
      network_acl_rules = [{ rule_number = 100, protocol = "tcp", rule_action = "allow", port_range = "80", cidr_block = "0.0.0.0/0" }]
    }
    alicloud = {
      machine_types = ["ecs.t6-c1m1.large","ecs.t6-c2m1.large"]
      storage_types = ["cloud_efficiency","cloud_ssd","cloud_essd"]
      security_rules = [{ type = "ingress", ip_protocol = "tcp", port_range = "80/80", cidr_ip = "0.0.0.0/0" }]
    }
  }
  current_config = local.cloud_configs[var.cloud_provider]
}
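Elsewhere in the module, the active provider's settings can then be consumed uniformly (the local names here are illustrative):

```hcl
locals {
  # Pick sensible defaults from the active provider's config
  default_machine_type = local.current_config.machine_types[0]
  default_storage_type = local.current_config.storage_types[0]
}
```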

Remote state and team collaboration

# shared/backend.tf
# Note: backend blocks cannot interpolate variables, so the state key is
# supplied at init time via partial configuration, e.g.:
#   terraform init -backend-config="key=environments/prod/terraform.tfstate"
terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"
    region         = "us-west-2"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
resource "aws_dynamodb_table" "terraform_state_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
  attribute {
    name = "LockID"
    type = "S"
  }
  tags = { Name = "TerraformStateLock", ManagedBy = "terraform" }
}

CI/CD integration (GitLab)

# .gitlab-ci.yml
stages:
  - validate
  - plan
  - apply

variables:
  TF_ROOT: "${CI_PROJECT_DIR}"
  TF_IN_AUTOMATION: "true"

.terraform-base:
  image: hashicorp/terraform:1.5
  before_script:
    - cd "${TF_ROOT}/environments/${ENVIRONMENT}"
    - terraform init

validate:
  extends: .terraform-base
  stage: validate
  script:
    - terraform validate
    - terraform fmt -check=true -diff=true

plan:
  extends: .terraform-base
  stage: plan
  script:
    - terraform plan -out=tfplan
  artifacts:
    paths:
      - "${TF_ROOT}/environments/${ENVIRONMENT}/tfplan"
  only:
    - merge_requests
    - master

apply:
  extends: .terraform-base
  stage: apply
  script:
    - terraform apply tfplan
  dependencies:
    - plan
  only:
    - master
  when: manual

Security scanning integration

#!/bin/bash
# scripts/security-scan.sh
set -euo pipefail

echo "🔍 Starting Terraform security scan..."
tfsec . --format json --out tfsec-results.json
checkov -d . --output json --output-file-path .
terraform-compliance -f compliance-tests/ -p plan.out  # -p expects a saved plan (e.g. terraform plan -out=plan.out)
echo "✅ Security scan completed"

Tag standardization

# modules/common/tags.tf
locals {
  common_tags = {
    Environment = var.environment
    Project     = var.project_name
    ManagedBy   = "terraform"
    Team        = var.team
    CostCenter  = var.cost_center
    CreatedAt   = timestamp() # caveat: changes every run and causes perpetual diffs; consider a static value or ignore_changes
    Terraform   = "true"
  }
  environment_tags = {
    dev = { AutoShutdown = "true", ShutdownTime = "18:00" }
    prod = { Backup = "true", Monitoring = "enabled", AlertLevel = "critical" }
  }
  final_tags = merge(local.common_tags, lookup(local.environment_tags, var.environment, {}))
}

Cost optimization

Automated shutdown for development

# Development environment auto‑shutdown
resource "aws_lambda_function" "auto_shutdown" {
  count          = var.environment == "dev" ? 1 : 0
  filename       = "auto-shutdown.zip"
  function_name  = "dev-auto-shutdown"
  role           = aws_iam_role.lambda_role[0].arn
  handler        = "index.handler"
  runtime        = "python3.9"
  environment {
    variables = { ENVIRONMENT = var.environment }
  }
}
resource "aws_cloudwatch_event_rule" "shutdown_schedule" {
  count               = var.environment == "dev" ? 1 : 0
  name                = "dev-shutdown-schedule"
  description         = "Auto‑close dev resources"
  schedule_expression = "cron(0 18 ? * MON-FRI *)" # AWS cron requires "?" in day-of-month when day-of-week is set
}

Dynamic resource scaling

locals {
  business_hours_scaling = {
    weekday_work_hours = { min_capacity = var.environment == "prod" ? 2 : 1, max_capacity = var.environment == "prod" ? 10 : 3, target_cpu = 70 }
    off_hours          = { min_capacity = var.environment == "prod" ? 1 : 0, max_capacity = var.environment == "prod" ? 3 : 1, target_cpu = 80 }
  }
}
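The off‑hours profile above could be applied through an Auto Scaling scheduled action, sketched below (the aws_autoscaling_group named "app" is assumed to exist elsewhere in the module):

```hcl
resource "aws_autoscaling_schedule" "off_hours" {
  scheduled_action_name  = "${var.environment}-off-hours"
  autoscaling_group_name = aws_autoscaling_group.app.name
  min_size               = local.business_hours_scaling.off_hours.min_capacity
  max_size               = local.business_hours_scaling.off_hours.max_capacity
  desired_capacity       = local.business_hours_scaling.off_hours.min_capacity
  recurrence             = "0 18 * * 1-5" # weekdays at 18:00 UTC
}
```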

Best practices and pitfalls

Module design principles

Single responsibility: each module should manage only one type of resource.

Testability: modules must be easy to unit‑test.

Documentation: every variable and output needs a clear description.

Common traps

State file handling: never edit the state file manually; use terraform import to bring existing resources under management.

Sensitive data: store secrets in Vault or a cloud KMS instead of plain variables.

Implicit dependencies: when Terraform cannot infer ordering from resource references, declare an explicit depends_on to control creation order.
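The last two points can be illustrated briefly (the resource names and AMI ID here are hypothetical):

```hcl
# Redact secrets from plan/apply output
variable "db_password" {
  type      = string
  sensitive = true # Terraform hides this value in CLI output
}

# Make a hidden ordering explicit when Terraform cannot infer it
resource "aws_instance" "app" {
  ami           = "ami-12345678" # placeholder
  instance_type = "t3.micro"
  depends_on    = [aws_internet_gateway.main]
}
```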

Conclusion

Terraform transforms repetitive, error‑prone manual steps into reliable, version‑controlled code, delivering consistency, reusability, maintainability, and cost efficiency across AWS, Alibaba Cloud, and Azure. Adopt a modular layout, remote state, CI/CD pipelines, and security checks to fully realize the benefits of infrastructure as code.

Next actions

Start with a small project to gain Terraform experience.

Build a shared module library and document best practices.

Integrate the pipeline into GitLab or other CI systems for full automation.

Set up monitoring and alerting to keep the infrastructure healthy.

Repository references: https://github.com/raymond999999, https://gitee.com/raymond9.
