From IDC Selection to Salt Automation: A DevOps Engineer’s Practical Journey
This transcript shares a senior operations engineer’s step‑by‑step experience covering IDC and bandwidth selection, hardware checks, OS installation, initial configuration, migration from Puppet to Salt, user authentication, audit logging, and KVM virtualization, offering concrete tips and real‑world examples.
Introduction
The speaker, Liu Xin, a senior moderator of the ChinaUnix cluster and high‑availability section, explains why many learning resources focus on isolated skills and offers a holistic view of an operations engineer’s workflow.
Choosing an IDC and Bandwidth
Key factors are the data center's price‑to‑quality ratio and the provider's credentials: an up‑to‑date business licence, organization code certificate, tax registration, general taxpayer proof, ICP licence, ISP licence, ISO‑9001 certification, and an AAA ("3A") credit rating. Test connectivity thoroughly with ping, traceroute, or tcpping for at least 24 hours, especially during holidays, to assess packet loss and stability.
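A long-running connectivity test of this kind can be sketched in Python. This is a minimal illustration, not the speaker's tooling: it measures TCP connect time in the spirit of tcpping and reports the share of failed attempts as a rough stand-in for packet loss. The function names, the default timeout, and the example host are assumptions.

```python
import socket
import time
from statistics import mean
from typing import List, Optional


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the attempt failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return None


def summarize(samples: List[Optional[float]]) -> dict:
    """Compute the failure percentage and latency statistics from probe results."""
    ok = [s for s in samples if s is not None]
    lost = len(samples) - len(ok)
    return {
        "sent": len(samples),
        "lost": lost,
        "loss_pct": 100.0 * lost / len(samples) if samples else 0.0,
        "avg_ms": mean(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


def run_probes(host: str, port: int, count: int, interval: float = 1.0) -> dict:
    """Probe host:port `count` times, sleeping `interval` seconds between attempts."""
    samples = []
    for _ in range(count):
        samples.append(tcp_probe(host, port))
        time.sleep(interval)
    return summarize(samples)
```

Calling run_probes("idc-candidate.example.com", 80, count=86400) would cover roughly 24 hours at one probe per second; the hostname is a placeholder.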
Server Hardware Procurement
When buying servers, verify the age and condition of the components: hard drives may be reused, and memory may be from a previous generation. After the hardware inspection, install the OS via PXE or another bulk‑installation method.
Initial System Configuration
Basic setup includes installing common environments such as Python and Java, and tuning system files like fstab, sysctl.conf, and limits.conf. Early automation can be done with custom scripts (see http://blog.chinaunix.net/uid-10915175-id-3209542.html).
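As an illustration of scripting this kind of tuning, the sketch below sets keys in sysctl.conf‑style text idempotently, so the script can be rerun safely on every new machine. It is not the script from the linked post. The function name and the example values are assumptions, and it works on text in memory rather than writing to /etc directly.

```python
from typing import Dict


def apply_settings(text: str, settings: Dict[str, str]) -> str:
    """Return sysctl.conf-style text in which each requested key is set exactly once."""
    remaining = dict(settings)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        is_setting = (
            stripped
            and not stripped.startswith(("#", ";"))  # sysctl.conf comment markers
            and "=" in stripped
        )
        if is_setting:
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                # First occurrence of a managed key: rewrite it in place.
                out.append(f"{key} = {remaining.pop(key)}")
                continue
            if key in settings:
                # Later duplicate of a managed key: drop it.
                continue
        out.append(line)
    # Append managed keys that were not present in the file.
    for key, value in remaining.items():
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


# Hypothetical values, for illustration only.
WANTED = {"net.core.somaxconn": "1024", "fs.file-max": "655350"}
```

Running apply_settings twice on the same text produces the same result, which is the property that matters once the script is applied across many servers.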
Migration from Puppet to Salt
Initially the team used Puppet with about 20 modules on 410‑class machines, but performance problems appeared once it was managing 160‑200 servers, which prompted a switch to Salt. The transition is straightforward; the main difference lies in module syntax. In the example Salt user‑management module, most of the logic lives in install.sls, which init.sls then includes.
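The module from the talk is not reproduced in this summary, so the following is only a sketch of what an install.sls for user management could look like. It uses Salt's pure‑Python ("py") renderer instead of the more common YAML, and the state module name (users), the account names, and the key material are hypothetical placeholders.

```python
#!py
# users/install.sls, rendered by Salt's "py" renderer: Salt calls run() and
# treats the returned dictionary as the state data for this SLS file.

# Hypothetical accounts; in practice these would more likely come from pillar.
USERS = [
    {"name": "alice", "shell": "/bin/bash", "key": "PLACEHOLDER_PUBLIC_KEY_ALICE"},
    {"name": "bob", "shell": "/bin/bash", "key": "PLACEHOLDER_PUBLIC_KEY_BOB"},
]


def run():
    """Return one user.present and one ssh_auth.present state per account."""
    states = {}
    for user in USERS:
        user_id = f"user_{user['name']}"
        states[user_id] = {
            "user.present": [
                {"name": user["name"]},
                {"shell": user["shell"]},
            ]
        }
        states[f"sshkey_{user['name']}"] = {
            "ssh_auth.present": [
                {"name": user["key"]},
                {"user": user["name"]},
                {"enc": "ssh-rsa"},
                # Install the key only after the account exists.
                {"require": [{"user": user_id}]},
            ]
        }
    return states
```

With this layout, init.sls would contain only an include of users.install, so targeting the users state pulls in the logic above.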
User Authentication and Auditing
Because there was no dedicated authentication system, the team relied on SSH keys, and Salt modules now centralise user creation. For audit trails, the team records terminal sessions with /usr/bin/script (invoked from /etc/profile) and replays them with scriptreplay. An alternative is the Python tool TermRecord (install with pip install TermRecord), which provides HTML‑based playback.
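The recording and replay commands can be sketched as follows. This is an assumed layout, not the team's actual /etc/profile snippet: the log directory and file naming are hypothetical, and the --timing=FILE form of the script option depends on the installed util‑linux version, so check man script before relying on it.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Tuple


def session_files(
    user: str, started: datetime, base_dir: str = "/var/log/sessions"
) -> Tuple[Path, Path]:
    """Return (typescript, timing) paths for one login session (hypothetical layout)."""
    stamp = started.strftime("%Y%m%d-%H%M%S")
    base = Path(base_dir) / user
    return base / f"{stamp}.log", base / f"{stamp}.timing"


def record_command(log: Path, timing: Path) -> List[str]:
    """Build the script(1) invocation that records a session."""
    # -q: no start/done banner, -f: flush output after every write,
    # --timing=FILE: write the timing data that scriptreplay needs.
    return ["/usr/bin/script", "-q", "-f", f"--timing={timing}", str(log)]


def replay_command(log: Path, timing: Path) -> List[str]:
    """Build the scriptreplay(1) invocation: timing file first, then typescript."""
    return ["scriptreplay", str(timing), str(log)]
```

Keeping one typescript and one timing file per user and login time makes it possible to replay a specific session later without untangling interleaved logs.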
Kerberos and FreeIPA
Kerberos offers single‑sign‑on for small‑scale environments; for larger or more feature‑rich needs, FreeIPA can be deployed, though it requires more configuration.
Chroot Jump Hosts
To limit user actions, a chroot environment is used on jump servers, allowing only essential commands such as ssh and scp.
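Building such a chroot means copying each permitted binary together with the shared libraries it links against. The talk summary does not describe how the team did this; the sketch below shows one common approach under that assumption, parsing the output of ldd to find the libraries and mapping each path to its location inside the jail. The function names and the /srv/jail root are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List

# Matches the absolute library path that precedes the load address, e.g.
# "libc.so.6 => /lib64/libc.so.6 (0x00007f...)" or "/lib64/ld-linux-x86-64.so.2 (0x...)".
_LDD_PATH = re.compile(r"(/\S+)\s+\(0x[0-9a-fA-F]+\)")


def parse_ldd(output: str) -> List[str]:
    """Extract absolute shared-library paths from ldd output, keeping order."""
    libs: List[str] = []
    for line in output.splitlines():
        match = _LDD_PATH.search(line)
        # Lines without a path (such as linux-vdso) are skipped.
        if match and match.group(1) not in libs:
            libs.append(match.group(1))
    return libs


def chroot_destinations(chroot_root: str, paths: Iterable[str]) -> Dict[str, str]:
    """Map each host path to the same path underneath the chroot root."""
    root = Path(chroot_root)
    return {p: str(root / p.lstrip("/")) for p in paths}
```

In practice the ldd output for /usr/bin/ssh and /usr/bin/scp would be fed to parse_ldd, and the resulting mapping used to copy files into the jail.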
Virtualization with KVM
The team prefers KVM for production virtualization, maintaining clean and application‑specific templates. Management tools like Proxmox or OpenNode can be added for web‑based control.
Q&A Highlights
Q1: Direct purchase from manufacturers is possible, but quantity and lead‑time matter.
Q2: Server specs should match the application load; a standard R420 can handle roughly 1,000 requests per second with load balancers, and additional resources can be co‑located for efficiency.
Q3: When cost is not a constraint, use RAID 1 on 15k RPM drives for the OS and RAID 5 for data.
Q4: Proxmox is a VM management tool and supports dual‑node high availability.
Conclusion
The session emphasizes that every operation—hardware selection, OS deployment, configuration management, authentication, auditing, and virtualization—must be documented and reproducible to ensure reliability and accountability in production environments.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
