Master Ansible: Automate Hundreds of Linux Servers with Ease
This guide explains why Ansible suits large‑scale Linux server management, then shows how to set up control and target nodes, configure the inventory and SSH keys, and tune Ansible settings. It also provides ready‑to‑run playbooks for system initialization, Nginx cluster deployment, and application releases, plus advanced tips on Vault, dynamic inventory, role‑based structure, performance tuning, monitoring, and troubleshooting.
Why Choose Ansible?
Ansible reduces manual server management through agentless deployment, built‑in idempotence, and a large module ecosystem covering networking, storage, and cloud platforms.
No agent is required on target machines: Ansible connects over SSH, and most modules only need Python on the target.
Simple YAML syntax makes learning easy.
Over 3,000 modules and an active community ensure continuous updates.
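Idempotence, listed among the benefits above, means a task describes a desired end state, so running it a second time changes nothing. The Python sketch below only illustrates that idea and is not how Ansible modules are implemented; `ensure_line` is a hypothetical helper in the spirit of a module such as `lineinfile`.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            existing = fh.read().splitlines()
    if line in existing:
        return False  # desired state already holds, nothing to do
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return True


# Demo on a throwaway file: the first run changes it, the second does not.
demo_path = os.path.join(tempfile.mkdtemp(), "sshd_config")
first_run = ensure_line(demo_path, "PermitRootLogin no")
second_run = ensure_line(demo_path, "PermitRootLogin no")
print(first_run, second_run)  # True False
```

The returned boolean plays the role of Ansible's "changed" status: it is what lets a second playbook run report that nothing needed to be done.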
Practical Environment Setup
Architecture
# Control node: CentOS 8 (Ansible Server)
# Target nodes: Ubuntu 20.04 x10 (Web Servers)
Install Ansible
# CentOS/RHEL
sudo yum install epel-release -y
sudo yum install ansible -y
# Ubuntu/Debian
sudo apt update
sudo apt install ansible -y
# Verify installation
ansible --version
Core Configuration Details
1. Host Inventory
Create /etc/ansible/hosts with groups and variables:
[webservers]
web01 ansible_host=192.168.1.10
web02 ansible_host=192.168.1.11
web03 ansible_host=192.168.1.12
[databases]
db01 ansible_host=192.168.1.20
db02 ansible_host=192.168.1.21
[production:children]
webservers
databases
[production:vars]
ansible_user=root
ansible_ssh_private_key_file=~/.ssh/id_rsa
2. SSH Password‑less Login
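One note on the inventory from step 1 before moving on: `[production:children]` makes `production` a parent group, so it contains every host of `webservers` and `databases`, and `[production:vars]` applies to all of them. The Python sketch below resolves that nesting for a trimmed copy of the inventory. It is a simplified illustration, not Ansible's inventory parser; `parse_inventory` and `hosts_in` are hypothetical names.

```python
INVENTORY = """\
[webservers]
web01 ansible_host=192.168.1.10
web02 ansible_host=192.168.1.11

[databases]
db01 ansible_host=192.168.1.20

[production:children]
webservers
databases
"""


def parse_inventory(text):
    """Return (hosts_by_group, children_by_group) for a simple INI inventory."""
    hosts, children = {}, {}
    section, kind = None, "hosts"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section, _, suffix = line[1:-1].partition(":")
            kind = suffix or "hosts"
            hosts.setdefault(section, [])
            children.setdefault(section, [])
        elif kind == "hosts":
            hosts[section].append(line.split()[0])  # first token is the host alias
        elif kind == "children":
            children[section].append(line)
        # ":vars" sections are ignored in this sketch
    return hosts, children


def hosts_in(group, hosts, children):
    """Collect a group's own hosts plus those of all child groups, in order."""
    found = list(hosts.get(group, []))
    for child in children.get(group, []):
        for host in hosts_in(child, hosts, children):
            if host not in found:
                found.append(host)
    return found


hosts, children = parse_inventory(INVENTORY)
print(hosts_in("production", hosts, children))  # ['web01', 'web02', 'db01']
```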
# Generate key pair
ssh-keygen -t rsa -b 4096
# Distribute public key to targets
for i in {10..12}; do ssh-copy-id root@192.168.1.$i; done
3. Optimize ansible.cfg
[defaults]
host_key_checking = False
timeout = 30
forks = 50
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts_cache
fact_caching_timeout = 3600
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
pipelining = True
Hands‑On Playbooks
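Before the playbooks, a quick look back at the ansible.cfg above: `forks` caps how many hosts Ansible works on at once, so it sets a floor on how many rounds a task needs across a fleet. The sketch below reads a trimmed copy of that file with Python's standard `configparser` and computes that lower bound. The fleet size of 200 hosts is an assumption for illustration, and real run times also depend on task duration and the execution strategy.

```python
import configparser
import math

ANSIBLE_CFG = """\
[defaults]
host_key_checking = False
forks = 50

[ssh_connection]
pipelining = True
"""

cfg = configparser.ConfigParser()
cfg.read_string(ANSIBLE_CFG)

forks = cfg.getint("defaults", "forks")
host_key_checking = cfg.getboolean("defaults", "host_key_checking")
pipelining = cfg.getboolean("ssh_connection", "pipelining")

fleet_size = 200  # assumed number of target hosts
min_rounds = math.ceil(fleet_size / forks)

print(forks, host_key_checking, pipelining, min_rounds)  # 50 False True 4
```

Note that `host_key_checking = False` skips SSH host key verification. That is convenient when bootstrapping a lab, but it is worth revisiting before using the same file in production.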
Case 1: System Initialization
---
- name: Linux Server Standardization
  hosts: webservers
  become: yes
  tasks:
    - name: Update apt cache and upgrade
      apt:
        update_cache: yes
        upgrade: dist
    - name: Install common tools
      apt:
        name:
          - vim
          - htop
          - curl
          - wget
          - git
          - tree
        state: present
    - name: Set timezone
      timezone:
        name: Asia/Shanghai
    - name: Create devops user
      user:
        name: devops
        groups: sudo
        shell: /bin/bash
        create_home: yes
    - name: Configure firewall
      ufw:
        rule: allow
        port: "{{ item }}"
        proto: tcp
      loop:
        - 22
        - 80
        - 443
Case 2: Nginx Cluster Deployment
---
- name: Deploy Nginx Cluster
  hosts: webservers
  become: yes
  vars:
    nginx_version: "1.20.2"
    document_root: "/var/www/html"
  tasks:
    - name: Install Nginx
      apt:
        name: nginx
        state: present
    - name: Create website directory
      file:
        path: "{{ document_root }}"
        state: directory
        owner: www-data
        group: www-data
        mode: '0755'
    - name: Deploy Nginx config
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/sites-available/default
        backup: yes
      notify: restart nginx
    - name: Deploy website files
      copy:
        src: "{{ item }}"
        dest: "{{ document_root }}/"
        owner: www-data
        group: www-data
      with_fileglob:
        - "files/web/*"
      notify: restart nginx
    - name: Ensure Nginx is running
      systemd:
        name: nginx
        state: started
        enabled: yes
  handlers:
    - name: restart nginx
      systemd:
        name: nginx
        state: restarted
Case 3: Application Release Automation
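Before the next playbook, a note on the `notify: restart nginx` lines in Case 2: a handler is queued only when the notifying task reports a change, and it runs once at the end of the play no matter how many tasks notified it. The sketch below models that bookkeeping in Python. It is a simplified illustration rather than Ansible's implementation, and `run_play` is a hypothetical name.

```python
def run_play(tasks, handlers):
    """tasks: list of (name, changed, notify); handlers: dict name -> callable.

    Returns the names of the handlers that ran, in handler-definition order.
    """
    notified = set()
    for name, changed, notify in tasks:
        if changed and notify:
            notified.add(notify)  # duplicates collapse: a handler is queued once
    ran = []
    for handler_name, action in handlers.items():
        if handler_name in notified:
            action()
            ran.append(handler_name)
    return ran


restarts = []
handlers = {"restart nginx": lambda: restarts.append("nginx")}

tasks = [
    ("Install Nginx", False, None),
    ("Deploy Nginx config", True, "restart nginx"),
    ("Deploy website files", True, "restart nginx"),
]

first = run_play(tasks, handlers)  # two notifications, one restart
second = run_play([("Deploy Nginx config", False, "restart nginx")], handlers)

print(first, second, restarts)  # ['restart nginx'] [] ['nginx']
```

This is why a second run of the Case 2 playbook with unchanged templates and files leaves Nginx running without a restart.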
---
- name: Application Release Pipeline
  hosts: webservers
  serial: 2 # Rolling update, two hosts at a time
  max_fail_percentage: 0 # Abort the rollout if any host in a batch fails
  tasks:
    - name: Health check
      uri:
        url: "http://{{ ansible_host }}/health"
        method: GET
        status_code: 200
      register: health_check
      failed_when: health_check.status != 200
    - name: Remove node from load balancer
      uri:
        url: "http://lb.example.com/api/remove/{{ ansible_host }}"
        method: POST
      delegate_to: localhost
    - name: Stop application service
      systemd:
        name: myapp
        state: stopped
    # The backup directory /opt/backup must already exist
    - name: Backup current version
      archive:
        path: /opt/myapp
        dest: "/opt/backup/myapp-{{ ansible_date_time.epoch }}.tar.gz"
    # app_version must be defined, for example with -e app_version=<version>
    - name: Deploy new version
      unarchive:
        src: "files/myapp-{{ app_version }}.tar.gz"
        dest: /opt/
        owner: myapp
        group: myapp
    - name: Update configuration
      template:
        src: app.conf.j2
        dest: /opt/myapp/conf/app.conf
    - name: Start application service
      systemd:
        name: myapp
        state: started
    - name: Wait for service to start
      wait_for:
        port: 8080
        host: "{{ ansible_host }}"
        delay: 10
        timeout: 60
    - name: Add node back to load balancer
      uri:
        url: "http://lb.example.com/api/add/{{ ansible_host }}"
        method: POST
      delegate_to: localhost
Advanced Tips & Best Practices
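Looking back at Case 3 first: `serial: 2` splits the host list into batches of two, and `max_fail_percentage: 0` aborts the play as soon as any host in a batch fails, so a bad release never reaches the remaining batches. The Python sketch below models that control flow only. It treats each host's deployment as a single step, which is a simplification; `rolling_update` and the `deploy` callback are hypothetical, and the host names are examples.

```python
def rolling_update(hosts, deploy, serial=2, max_fail_percentage=0):
    """Deploy batch by batch; stop when a batch's failure rate exceeds the limit.

    Returns (updated_hosts, failed_hosts, untouched_hosts).
    """
    updated, failed = [], []
    for start in range(0, len(hosts), serial):
        batch = hosts[start:start + serial]
        batch_failed = [h for h in batch if not deploy(h)]
        updated += [h for h in batch if h not in batch_failed]
        failed += batch_failed
        failure_pct = 100 * len(batch_failed) / len(batch)
        if failure_pct > max_fail_percentage:
            untouched = hosts[start + serial:]
            return updated, failed, untouched
    return updated, failed, []


hosts = ["web01", "web02", "web03", "web04", "web05", "web06"]
result = rolling_update(hosts, deploy=lambda h: h != "web03")
print(result)
# (['web01', 'web02', 'web04'], ['web03'], ['web05', 'web06'])
```

In this run the failure of `web03` stops the rollout, so `web05` and `web06` keep serving the old version while the problem is investigated.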
1. Protect Secrets with Ansible Vault
# Create encrypted file
ansible-vault create secrets.yml
# Edit encrypted file
ansible-vault edit secrets.yml
# Use in playbook (load secrets.yml, e.g. via vars_files, and run with --ask-vault-pass)
- name: Configure database connection
  template:
    src: database.conf.j2
    dest: /etc/myapp/database.conf
  vars:
    db_password: "{{ vault_db_password }}"
2. Dynamic Inventory Script (Python)
#!/usr/bin/env python3
import json

import boto3


def get_aws_instances():
    ec2 = boto3.client('ec2')
    response = ec2.describe_instances()
    inventory = {'_meta': {'hostvars': {}}, 'webservers': {'hosts': []}, 'databases': {'hosts': []}}
    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            if instance['State']['Name'] == 'running':
                ip = instance['PrivateIpAddress']
                tags = {t['Key']: t['Value'] for t in instance.get('Tags', [])}
                if tags.get('Role') == 'web':
                    inventory['webservers']['hosts'].append(ip)
                elif tags.get('Role') == 'db':
                    inventory['databases']['hosts'].append(ip)
                inventory['_meta']['hostvars'][ip] = {
                    'ansible_host': ip,
                    'ec2_instance_id': instance['InstanceId'],
                    'ec2_instance_type': instance['InstanceType']
                }
    return inventory


if __name__ == '__main__':
    print(json.dumps(get_aws_instances(), indent=2))
3. Role‑Based Management
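One more note on the inventory script from the previous tip: its grouping logic can be exercised without AWS credentials by feeding it a hand-written response. The sketch below assumes the same `Role` tag convention and the `Reservations`/`Instances` layout the script already relies on; `build_inventory`, `ROLE_TO_GROUP`, and the sample data are hypothetical. It also skips instances without a private IP and only records hostvars for hosts that landed in a group, which are design choices of this sketch rather than of the original script.

```python
import json

ROLE_TO_GROUP = {"web": "webservers", "db": "databases"}


def build_inventory(response):
    """Turn a describe_instances-style dict into Ansible's JSON inventory layout."""
    inventory = {"_meta": {"hostvars": {}}}
    for group in ROLE_TO_GROUP.values():
        inventory[group] = {"hosts": []}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("State", {}).get("Name") != "running":
                continue
            ip = instance.get("PrivateIpAddress")
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            group = ROLE_TO_GROUP.get(tags.get("Role"))
            if ip is None or group is None:
                continue
            inventory[group]["hosts"].append(ip)
            inventory["_meta"]["hostvars"][ip] = {
                "ansible_host": ip,
                "ec2_instance_id": instance["InstanceId"],
            }
    return inventory


# Hand-written sample: one running web node, one stopped and one running db node.
sample = {"Reservations": [{"Instances": [
    {"InstanceId": "i-0001", "State": {"Name": "running"},
     "PrivateIpAddress": "10.0.0.5", "Tags": [{"Key": "Role", "Value": "web"}]},
    {"InstanceId": "i-0002", "State": {"Name": "stopped"},
     "PrivateIpAddress": "10.0.0.6", "Tags": [{"Key": "Role", "Value": "db"}]},
    {"InstanceId": "i-0003", "State": {"Name": "running"},
     "PrivateIpAddress": "10.0.0.7", "Tags": [{"Key": "Role", "Value": "db"}]},
]}]}

inventory = build_inventory(sample)
print(json.dumps(inventory, indent=2))
```

Keeping the grouping in a pure function like this makes it easy to unit-test tag conventions before pointing the script at a real account.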
# Create role skeletons
ansible-galaxy init roles/nginx
ansible-galaxy init roles/php
ansible-galaxy init roles/mysql
ansible-galaxy init roles/monitoring
# Playbook using roles
---
- name: Deploy LEMP stack
  hosts: webservers
  roles:
    - nginx
    - php
    - mysql
    - monitoring
4. Performance Optimizations
# Asynchronous bulk file download
# Note: copy is an action plugin and does not support async, so this example
# uses get_url instead; files_list is expected to hold URLs.
- name: Download files in parallel
  get_url:
    url: "{{ item }}"
    dest: /tmp/
  with_items: "{{ files_list }}"
  async: 300 # timeout 5 minutes
  poll: 0
  register: copy_jobs
- name: Wait for all transfers to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: copy_results
  until: copy_results.finished
  retries: 30
  delay: 10
  with_items: "{{ copy_jobs.results }}"
Monitoring & Troubleshooting
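Looking back at the async example in tip 4: `async` with `poll: 0` starts each job and returns immediately, and `async_status` with `until`, `retries`, and `delay` then polls until the job reports finished. The same start-then-poll shape can be sketched in plain Python, using a thread pool as a stand-in for remote jobs. `transfer` and `wait_for_job` are hypothetical, and the delays are shortened for the demo.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transfer(name):
    """Stand-in for a slow remote job."""
    time.sleep(0.05)
    return f"{name}: done"


def wait_for_job(job, retries=30, delay=0.02):
    """Poll a job the way async_status is used: check, sleep, retry, then give up."""
    for _ in range(retries):
        if job.done():
            return job.result()
        time.sleep(delay)
    raise TimeoutError("job did not finish within retries * delay seconds")


files = ["a.tar.gz", "b.tar.gz", "c.tar.gz"]
with ThreadPoolExecutor(max_workers=len(files)) as pool:
    jobs = [pool.submit(transfer, f) for f in files]  # start everything, do not wait
    results = [wait_for_job(job) for job in jobs]     # then poll each job until finished

print(results)  # ['a.tar.gz: done', 'b.tar.gz: done', 'c.tar.gz: done']
```

As in the playbook, the total patience of the poller is `retries * delay`, so those two values should cover at least the `async` timeout.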
Execution Monitoring
# Verbose run
ansible-playbook -i inventory site.yml -v
# Check mode (dry run)
ansible-playbook -i inventory site.yml --check
# Show differences
ansible-playbook -i inventory site.yml --check --diff
Common Issues
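A note on the `--check --diff` run above: check mode reports what would change without touching the host, and `--diff` shows the before and after text of affected files. The idea can be sketched with Python's standard `difflib`; `preview_change` is a hypothetical helper and the config lines are made up for the demo.

```python
import difflib


def preview_change(current, desired, path):
    """Return (would_change, unified_diff_text) without modifying anything."""
    diff = list(difflib.unified_diff(
        current.splitlines(), desired.splitlines(),
        fromfile=f"{path} (current)", tofile=f"{path} (desired)", lineterm="",
    ))
    return bool(diff), "\n".join(diff)


current = "worker_processes 1;\nkeepalive_timeout 65;\n"
desired = "worker_processes auto;\nkeepalive_timeout 65;\n"

would_change, diff_text = preview_change(current, desired, "/etc/nginx/nginx.conf")
print(would_change)
print(diff_text)

# Comparing the desired state with itself reports no change.
unchanged, _ = preview_change(desired, desired, "/etc/nginx/nginx.conf")
print(unchanged)  # False
```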
SSH connectivity
# Test connectivity
ansible all -m ping
# Debug SSH
ansible all -m ping -vvv
Privilege escalation
- name: Restart nginx with sudo
  command: systemctl restart nginx
  become: yes
  become_method: sudo
Idempotence problems
- name: Check service status
  systemd:
    name: nginx
  register: service_status
- name: Reload if active
  command: nginx -s reload
  when: service_status.status.ActiveState == "active"
Cost‑Optimization Case Study
During a major e‑commerce event, the team used Ansible to provision 200 servers in two hours, cutting deployment time by 90%, reducing human error by 95%, and saving 60% in labor costs while decreasing downtime by 80%.
Learning Roadmap
Beginner (1‑2 weeks)
Master YAML syntax.
Learn basic Ansible modules.
Complete simple system configuration tasks.
Intermediate (3‑4 weeks)
Write full Playbooks.
Use variables and templates.
Understand roles and Ansible Galaxy.
Advanced (2‑3 months)
Develop custom modules.
Integrate dynamic inventory.
Connect Ansible with CI/CD pipelines.
Manage large‑scale clusters.
Conclusion & Outlook
Ansible is a cornerstone of modern DevOps, dramatically improving efficiency and reliability. Mastering it equips you with the essential skills for automated infrastructure management and positions you at the forefront of operational excellence.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
