
Ansible Best Practices: From Playbooks to Roles

This hands‑on guide walks DevOps engineers through common pitfalls of raw Playbooks, then shows how to structure inventories, variables, templates, handlers, and roles, secure secrets with Vault, optimise performance, and integrate testing and CI/CD for production‑ready Ansible automation.

Ops Community

Problem background

Many teams write monolithic Playbooks with dozens of tasks, store variables inconsistently, embed secrets in Git, and run the same file for development, staging and production, which leads to maintenance headaches and audit failures.

Core concepts

Inventory – static INI or YAML files versus dynamic plugins (AWS, Alibaba) that pull hosts from cloud APIs. Dynamic inventory reduces manual updates but requires cloud API credentials (access key/secret key, AK/SK), which should be scoped to least privilege.

Modules – use idempotent built‑in modules (apt, yum, service, copy). Non‑idempotent modules (command, shell, raw) must be paired with changed_when, or for command and shell with creates / removes, to preserve idempotence.

Playbook structure – hosts, become, gather_facts, vars, tasks, handlers, notify. Use --check --diff to dry‑run and verify changes before execution.

Variable precedence – the official documentation lists 22 levels. Among actual variables, role defaults (defaults/main.yml) rank lowest and --extra-vars highest. The article provides a simplified ladder and a mnemonic to remember the order.

Handlers – run at the end of the play, and only when a notifying task reports changed. Use changed_when: true for tasks that must always fire a handler.
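
As a minimal sketch of the module and handler rules above, the play below pairs command with creates and changed_when, and forces a handler with changed_when: true. The script path, marker file, and handler name are hypothetical placeholders, not taken from the article.

```yaml
---
- name: Idempotence and handler sketch
  hosts: webservers
  become: true
  tasks:
    # creates: the task is skipped once the marker file exists
    - name: Run one-time initialisation script (hypothetical path)
      ansible.builtin.command:
        cmd: /opt/app/bin/init.sh
        creates: /opt/app/.initialised

    # Read-only command: never report "changed"
    - name: Read kernel version
      ansible.builtin.command:
        cmd: uname -r
      register: kernel_version
      changed_when: false

    # changed_when: true forces "changed", so the handler is always notified
    - name: Always trigger a cache refresh
      ansible.builtin.debug:
        msg: "Kernel is {{ kernel_version.stdout }}"
      changed_when: true
      notify: Refresh cache

  handlers:
    # Hypothetical handler; it runs once, after all tasks in the play
    - name: Refresh cache
      ansible.builtin.debug:
        msg: "Handler ran at the end of the play"
```

Running this twice should show the first task as skipped on the second run, while the handler fires on every run because of changed_when: true.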

Installation

# Ubuntu
sudo apt update && sudo apt install -y software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible

# CentOS
sudo yum install -y epel-release && sudo yum install -y ansible

# pip (recommended for latest version)
python3 -m venv ~/ansible-venv
source ~/ansible-venv/bin/activate
pip install ansible-core ansible

ansible --version

Inventory examples

Static INI:

# /etc/ansible/hosts
[webservers]
web1.example.com
web2.example.com
[dbservers]
db1.example.com
[production:children]
webservers
dbservers
[webservers:vars]
http_port=80
ansible_user=deploy

YAML (recommended for readability):

all:
  vars:
    ansible_user: deploy
    ansible_ssh_private_key_file: ~/.ssh/id_rsa
  children:
    webservers:
      hosts:
        web1.example.com:
          http_port: 80
        web2.example.com:
          http_port: 8080
    dbservers:
      hosts:
        db1.example.com:
        db2.example.com:
    production:
      children:
        webservers:
        dbservers:

First Playbook – a simple nginx deployment

# deploy_nginx.yml
---
- name: Deploy nginx
  hosts: webservers
  become: true
  gather_facts: true
  vars:
    # Intended for use in the nginx.conf.j2 template
    nginx_port: 80
    nginx_user: www-data
  tasks:
    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true
    - name: Deploy configuration
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        owner: root
        group: root
        mode: '0644'
      # Queue the handler only when the rendered file actually changes
      notify: restart nginx
    - name: Start nginx service
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
  handlers:
    # Runs once at the end of the play, and only if notified
    - name: restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted

Run in check mode:

ansible-playbook -i inventory/hosts.yml deploy_nginx.yml --check --diff

Roles – reusable units

A role has a fixed directory layout (defaults/, vars/, tasks/, handlers/, templates/, files/, meta/). The article shows a complete nginx role with defaults, variables, tasks, handlers and metadata.
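
The full role from the article is not reproduced here. As a rough sketch of how the layout fits together, the multi-document YAML below shows what a few of the role's files might contain; each document stands for one file, named in its leading comment. The variable values and the site.yml playbook are illustrative assumptions.

```yaml
---
# roles/nginx/defaults/main.yml  (lowest-precedence values, meant to be overridden)
nginx_port: 80
nginx_user: www-data
---
# roles/nginx/tasks/main.yml
- name: Install nginx
  ansible.builtin.apt:
    name: nginx
    state: present
    update_cache: true

- name: Deploy configuration
  ansible.builtin.template:
    src: nginx.conf.j2            # looked up in roles/nginx/templates/
    dest: /etc/nginx/nginx.conf
    mode: "0644"
  notify: Restart nginx
---
# roles/nginx/handlers/main.yml
- name: Restart nginx
  ansible.builtin.service:
    name: nginx
    state: restarted
---
# site.yml  (hypothetical playbook that applies the role)
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - role: nginx
      vars:
        nginx_port: 8080          # overrides the role default
```

Keeping tunables in defaults/main.yml rather than vars/main.yml leaves them easy to override from inventory or the playbook.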

Include vs import

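import_* – static, parsed at playbook load time; better performance when the content never changes based on variables. Tags and when conditions set on an import are applied to every imported task.

include_* – dynamic, evaluated at execution time; needed for loops, or when the file to load depends on facts or other runtime values. Tags set on an include apply only to the include statement itself.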

Guideline: use import_* whenever possible.
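
A small sketch of the difference, assuming hypothetical task files under tasks/ (common.yml, one file per OS family, and create_user.yml):

```yaml
---
- name: Import versus include sketch
  hosts: webservers
  tasks:
    # Static: parsed at load time, so the file name cannot depend on facts
    - name: Import common tasks
      ansible.builtin.import_tasks: tasks/common.yml

    # Dynamic: resolved at run time, so a gathered fact can choose the file
    - name: Include OS-specific tasks
      ansible.builtin.include_tasks: "tasks/{{ ansible_facts['os_family'] | lower }}.yml"

    # Dynamic includes can loop; imports cannot
    - name: Create each application user
      ansible.builtin.include_tasks: tasks/create_user.yml
      loop:
        - alice
        - bob
      loop_control:
        loop_var: app_user   # create_user.yml would reference app_user
```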

Check mode, limits and serial execution

# Dry‑run only
ansible-playbook -i inventory/hosts.yml deploy_nginx.yml --check --diff

# Run only a subset
ansible-playbook -i inventory/hosts.yml deploy_nginx.yml --limit "web1.example.com"

# Rolling update (30% batch): ansible-playbook has no --serial flag;
# set serial: "30%" on the play itself, then run the playbook as usual
ansible-playbook -i inventory/hosts.yml deploy_nginx.yml
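
Batch size is controlled by the play-level serial keyword rather than a command-line option. A minimal sketch, in which the health check and the max_fail_percentage value are illustrative choices rather than the article's:

```yaml
---
- name: Rolling nginx update
  hosts: webservers
  become: true
  serial: "30%"             # process 30% of the hosts per batch
  max_fail_percentage: 0    # stop the rollout if any host in a batch fails
  tasks:
    - name: Upgrade nginx
      ansible.builtin.apt:
        name: nginx
        state: latest
        update_cache: true

    - name: Wait until nginx answers on port 80
      ansible.builtin.wait_for:
        port: 80
        timeout: 30
```

Each batch must finish all tasks, including the port check, before the next batch starts.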

Error handling

Use block / rescue / always to emulate try/catch/finally. Example shows a risky command wrapped in a block with a rescue step and an always step for logging.
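
The article's example is not reproduced here; a comparable sketch, with hypothetical script paths, could look like this:

```yaml
---
- name: Error-handling sketch
  hosts: webservers
  become: true
  tasks:
    - name: Apply a risky change with rollback
      block:
        # "try": any failure here jumps to rescue
        - name: Run the migration script (hypothetical path)
          ansible.builtin.command:
            cmd: /opt/app/bin/migrate.sh
          changed_when: true
      rescue:
        # "catch": runs only if a task in the block failed
        - name: Roll back after a failure (hypothetical path)
          ansible.builtin.command:
            cmd: /opt/app/bin/rollback.sh
          changed_when: true
      always:
        # "finally": runs whether the block succeeded or not
        - name: Log the outcome
          ansible.builtin.debug:
            msg: "Migration attempt finished on {{ inventory_hostname }}"
```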

Templates (Jinja2)

Templates generate configuration files. The article lists common filters (default, upper, join, to_nice_yaml) and control structures ({% if %}, {% for %}).
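
The same filters and control structures also work in inline expressions, which makes them easy to try out before writing a template file. A minimal sketch with made-up variables:

```yaml
---
- name: Jinja2 filter sketch
  hosts: localhost
  gather_facts: false
  vars:
    upstream_hosts:
      - web1.example.com
      - web2.example.com
    app_settings:
      workers: 4
      debug: false
  tasks:
    - name: Show filters in action
      ansible.builtin.debug:
        msg:
          - "{{ nginx_port | default(80) }}"      # nginx_port is undefined here, so the default applies
          - "{{ 'production' | upper }}"          # PRODUCTION
          - "{{ upstream_hosts | join(', ') }}"   # web1.example.com, web2.example.com
          - "{{ app_settings | to_nice_yaml }}"   # the dict rendered as indented YAML text

    - name: Show control structures in an inline template
      ansible.builtin.debug:
        msg: "{% for h in upstream_hosts %}server {{ h }}{% if not loop.last %}; {% endif %}{% endfor %}"
```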

Ansible Vault – secret management

# Create an encrypted file
ansible-vault create vars/secrets.yml
# Edit an existing vault file
ansible-vault edit vars/secrets.yml

# Reference it in a playbook (YAML, inside the play)
vars_files:
  - vars/secrets.yml

# Run the playbook and prompt for the vault password
ansible-playbook deploy.yml --ask-vault-pass

Risk notes: keep vault password files at mode 0600, never commit them, and use separate vault IDs for dev/prod.
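
One common convention, offered here as an assumption rather than something the article prescribes, is to keep variable names in a plain file and point them at vault_-prefixed values stored in the encrypted file. The file paths and variable names below are hypothetical.

```yaml
---
# group_vars/production/vars.yml  (plain text, safe to grep and review)
db_user: app
db_password: "{{ vault_db_password }}"
---
# group_vars/production/vault.yml  (contents before encryption)
vault_db_password: "change-me"
```

The second file could then be encrypted with an environment-specific vault ID, for example `ansible-vault encrypt --vault-id prod@prompt group_vars/production/vault.yml`, so that dev and prod secrets use different passwords.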

ansible.cfg – global tuning

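[defaults]
inventory = ./inventory/hosts.yml
remote_user = deploy
forks = 20
# Skips SSH host key verification: convenient for fresh hosts, risky in production
host_key_checking = False
stdout_callback = yaml
log_path = /var/log/ansible.log

[ssh_connection]
# pipelining is a connection setting, so it is set here rather than under [defaults]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s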

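Key parameters explained: forks sets how many hosts are processed in parallel; gathering (not set in the sample above) controls fact collection, and gathering = smart combined with a fact cache avoids re-collecting facts; pipelining reduces the number of SSH operations per task; host_key_checking = False skips host key verification, which helps with first‑time connections to freshly provisioned hosts but removes protection against man‑in‑the‑middle attacks, so leave it enabled wherever known_hosts can be managed.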

Performance optimisation

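Increase forks for larger fleets (e.g., 50 for 100 hosts). A value above the number of targeted hosts brings no further benefit, and every fork consumes CPU and memory on the control node.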

Enable pipelining and ensure Defaults requiretty is disabled in /etc/sudoers.

Turn off fact gathering when not needed (gather_facts: no).

Use strategy: free for independent host execution, or serial for rolling updates.

Consider the Mitogen accelerator (strategy = mitogen_linear) for 1.25‑7× speed gains.
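
Two of the options above are set per play rather than in ansible.cfg. A minimal sketch, where the directory path is a hypothetical placeholder:

```yaml
---
- name: Fast, independent per-host run
  hosts: webservers
  become: true
  gather_facts: false   # skip fact collection; the task below references no facts
  strategy: free        # each host proceeds without waiting for the others
  tasks:
    - name: Ensure the application directory exists (hypothetical path)
      ansible.builtin.file:
        path: /opt/app
        state: directory
        mode: "0755"
```

With strategy: free, output from different hosts interleaves, so this fits plays whose hosts do not depend on one another.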

Testing and CI integration

Static analysis tools:

# Lint Playbooks
ansible-lint playbooks/deploy_nginx.yml
# YAML syntax check
python3 -c "import yaml; yaml.safe_load(open('playbooks/deploy_nginx.yml'))"
# YAML style check (the article's yamllint configuration sets line length to 160 and tunes truthy values)
yamllint playbooks/
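
The exact yamllint configuration is not included in this summary. A sketch consistent with the description (line length 160, a truthy rule), where the allowed values are an assumption:

```yaml
---
# .yamllint
extends: default
rules:
  line-length:
    max: 160
  truthy:
    # Assumed setting: accept both true/false and the yes/no style used in older playbooks
    allowed-values: ["true", "false", "yes", "no"]
```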

Role testing with Molecule (Docker driver):

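# Initialise a role test scaffold inside roles/
# (Molecule 5 and earlier; Molecule 6 removed "init role", so there create the role
# with "ansible-galaxy role init nginx" and run "molecule init scenario" inside it)
cd roles
molecule init role nginx -d docker
# Run the full test cycle
cd nginx
molecule test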
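
The scenario that molecule test runs is described by a molecule.yml file. A minimal sketch for the Docker driver follows; the container image is an assumption chosen for illustration, and any image with Python installed should work.

```yaml
---
# roles/nginx/molecule/default/molecule.yml
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: instance
    image: geerlingguy/docker-ubuntu2204-ansible:latest   # assumed image
    pre_build_image: true
provisioner:
  name: ansible
verifier:
  name: ansible
```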

CI examples (GitHub Actions and GitLab CI) show installing dependencies, running ansible-lint, yamllint, and molecule test in parallel jobs.
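
The article's CI files are not reproduced in this summary. As a hedged sketch of the GitHub Actions variant, the workflow below runs linting and Molecule as two jobs that start in parallel because neither declares needs. The workflow file name, Python version, and paths are assumptions.

```yaml
---
# .github/workflows/ansible-ci.yml
name: ansible-ci
on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install linters
        run: pip install ansible ansible-lint yamllint
      - name: Run yamllint
        run: yamllint .
      - name: Run ansible-lint
        run: ansible-lint playbooks/

  molecule:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install Molecule with the Docker driver
        run: pip install ansible molecule "molecule-plugins[docker]"
      - name: Run the role tests
        run: molecule test
        working-directory: roles/nginx
```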

Best‑practice checklist

Standard directory layout: inventory/, playbooks/, roles/, group_vars/, host_vars/, vars/.

Encapsulate reusable logic in roles; avoid inter‑role dependencies.

Follow the variable precedence ladder (role defaults → group_vars → host_vars → play vars → role vars → extra‑vars).

Never store plain secrets; always encrypt with Vault.

Give every task a descriptive name.

Prefer fully qualified collection names (FQCN) like ansible.builtin.apt.

Pair command / shell with changed_when or creates / removes.

Run --check --diff on every new playbook.

Disable fact gathering unless required.

Use handlers for service restarts; order matters, because handlers run in the order they are defined, not the order in which they are notified.

Integrate ansible-lint, yamllint and Molecule into CI pipelines.

Document each role with a README that lists variables, defaults and usage.

Provide rollback scripts and test them.

Use serial or limit for blue‑green or canary deployments.

Monitor service health after each batch.

FAQ highlights

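What is the difference between Ansible Tower and AWX? Tower (now the automation controller in Red Hat Ansible Automation Platform) is the commercially supported product with a web UI, RBAC and scheduling; AWX is its open‑source upstream project.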

How does Ansible compare to Puppet or Salt? Ansible is agentless, easier to start, and fits small‑to‑medium fleets; Salt excels at massive scale with its own agent model; Puppet focuses on strict configuration enforcement.

Can Ansible manage Windows? Yes, via WinRM (ansible_connection=winrm) and the win_* modules, which now live in the ansible.windows and community.windows collections.

How to perform rolling updates? Use serial: "30%" or combine limit with health‑check tasks.

Where to store secrets? Encrypt with Vault, keep password files out of Git, and inject them in CI via secret stores.

By following the systematic approach described above, teams can turn ad‑hoc Ansible scripts into maintainable, idempotent, and auditable automation pipelines suitable for production environments.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
