Essential Linux Ops Interview Questions & Answers for High‑Paying Jobs

A comprehensive collection of Linux operations interview questions covering fundamentals, server management, RAID, load balancing, MySQL, networking, security, scripting, and best‑practice solutions to help candidates ace high‑salary positions.

MaGe Linux Operations

Aug 17, 2021

1. What is Operations? What is Game Operations?

Operations (Ops) refers to the maintenance of an organization’s established network hardware and software to ensure services are online and running smoothly. It encompasses networking, systems, databases, development, security, and monitoring. Ops includes many specializations such as DBA Ops, website Ops, virtualization Ops, monitoring Ops, and game Ops.

Game Ops is divided into three parts: development Ops, application Ops, and system Ops. Development Ops builds tools and platforms for application Ops. Application Ops uses those tools to launch, maintain, and troubleshoot business services. System Ops provides the underlying infrastructure (systems, networks, monitoring, hardware) for application Ops. The three work together in a tightly coupled manner.

2. What do Operations staff need to do when working with Product (business operations) staff?

The game's business operations team not only coordinates internal work but also communicates with the distribution platforms to plan server launch times, the number of servers to open, user acquisition, and event schedules. Ops staff need to prepare servers and capacity to match these plans.

3. How would you manage 300 servers?

Management approach:

Set up a jump host with a unified account for secure and convenient login.

Use configuration management tools such as salt, ansible, or puppet for unified scheduling and configuration.

Build a simple CMDB to record system, configuration, and application information for each server.

4. Explain the principles and characteristics of RAID0, RAID1, and RAID5.

RAID aggregates multiple disks into a single logical volume and can provide redundancy. Common levels are 0, 1, 5, and 10.

RAID0 : Stripes data across disks for high read/write speed. No redundancy; a single disk failure results in total data loss.

RAID1 : Mirrors data between two disks. Provides 100% redundancy, but capacity is halved. Read performance improves; write performance is similar to a single disk.

RAID5 : Requires at least three disks, distributes parity across all disks. Allows one disk to fail without data loss. Performance is moderate; reads are decent, writes are slower due to parity calculations.
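
The capacity trade-offs above can be checked with a little arithmetic. The following is a minimal sketch, assuming identical disks; the function name raid_usable_gb is made up for illustration, and RAID10 (listed above among the common levels) is treated as half of the raw capacity.

```shell
# Print usable capacity in GB for N identical disks (illustrative helper, not a real tool).
raid_usable_gb() {
  local level="$1" disks="$2" size_gb="$3"
  case "$level" in
    0)  echo $(( disks * size_gb )) ;;          # striping only, no redundancy
    1)  [ "$disks" -ge 2 ] || { echo "RAID1 needs at least 2 disks" >&2; return 1; }
        echo "$size_gb" ;;                       # every disk holds the same data
    5)  [ "$disks" -ge 3 ] || { echo "RAID5 needs at least 3 disks" >&2; return 1; }
        echo $(( (disks - 1) * size_gb )) ;;    # one disk's worth of space holds parity
    10) [ "$disks" -ge 4 ] && [ $(( disks % 2 )) -eq 0 ] ||
          { echo "RAID10 needs an even number of disks, at least 4" >&2; return 1; }
        echo $(( disks / 2 * size_gb )) ;;      # striped mirrors: half the raw capacity
    *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
  esac
}

raid_usable_gb 5 4 2000    # four 2000 GB disks in RAID5
```

With four 2000 GB disks this prints 6000; the same disks give 8000 in RAID0 and 4000 in RAID10.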

Typical usage:

Single‑server OS disk: RAID1.

Database primary: RAID10; replica: RAID5 or RAID0 (cost‑effective).

Web servers with little data: RAID5 or RAID0 (single disk).

Multiple application servers: RAID0 or RAID5.

5. What are the differences among LVS, Nginx, and HAProxy? How do you choose in practice?

LVS : Layer‑4 forwarding (IP level).

HAProxy : Layer‑4 and Layer‑7 forwarding; a professional proxy server.

Nginx : Web server, cache server, and reverse proxy; supports Layer‑7 forwarding.

Difference: LVS forwards only on Layer-4 information (IP address and port) and cannot route by URL or path, while HAProxy and Nginx can.

Selection guideline: For very high concurrency, choose LVS. For small‑to‑medium traffic, HAProxy or Nginx is sufficient. HAProxy is preferred for medium‑size enterprises because it is a dedicated proxy with simple configuration.

6. What are the differences among Squid, Varnish, and Nginx? How do you choose?

All three are proxy servers.

What is a proxy server?

A proxy acts on behalf of a client to fetch resources from the Internet and can cache the responses locally. When a cached copy exists, the proxy returns it directly; otherwise it retrieves the resource from the upstream server.

Differences:

Nginx is primarily a web server and reverse proxy; with its proxy cache enabled it can also act as a cache, though in practice it is used mostly for static content.

Varnish and Squid are dedicated caching solutions. Varnish uses in‑memory caching with high performance and supports regex‑based invalidation; Squid has a larger ecosystem and more extensive documentation.

Selection:

For pure caching needs, prefer Varnish or Squid.

7. What are the differences between Tomcat and Resin? How would you choose?

Tomcat has a larger user base and more documentation; it is the standard Java servlet container. Resin has fewer users and less documentation. Tomcat offers better compatibility and stability for Java applications, while Resin can deliver higher performance. Large enterprises often choose Resin for performance; small‑to‑medium companies prefer Tomcat for stability.

8. What is middleware? What is JDK?

Middleware is independent system software or services that enable distributed applications to share resources across different technologies. It sits above the OS, managing resources and network communication, allowing heterogeneous systems to exchange information.

JDK (Java Development Kit) is the development environment for building Java applications, applets, and components on the Java platform.

9. Explain the meanings of Tomcat ports 8005, 8009, and 8080.

8005 – shutdown port; Tomcat listens here for the shutdown command.

8009 – AJP connector port; used by front-end web servers such as Apache httpd to communicate with Tomcat via the AJP protocol.

8080 – HTTP connector port; serves application requests.

10. What is a CDN?

A Content Delivery Network distributes website content to edge locations closest to users, reducing latency and improving access speed.

11. What is a gray‑release (canary) deployment?

Gray release is a smooth transition deployment method between black (no users) and white (all users). AB testing is a form of gray release where a subset of users receives version B while the rest stay on version A; if B is stable, the rollout expands to all users.
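
One common way to pick the subset of users for version B is to hash a stable user identifier into a bucket from 0 to 99 and compare it with the rollout percentage. The sketch below assumes that approach; the function name in_canary and the use of cksum as the hash are illustrative choices, not something the original prescribes.

```shell
# Succeed (exit 0) if the user falls inside the canary percentage.
# The same user ID always maps to the same bucket, so a user does not flip
# between versions A and B from one request to the next.
in_canary() {
  local user_id="$1" percent="$2" hash bucket
  hash=$(printf '%s' "$user_id" | cksum | cut -d ' ' -f 1)
  bucket=$(( hash % 100 ))
  [ "$bucket" -lt "$percent" ]
}

for user in alice bob carol; do
  if in_canary "$user" 10; then
    echo "$user -> version B"
  else
    echo "$user -> version A"
  fi
done
```

Raising the percentage only ever adds users to version B, because each user's bucket stays fixed.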

12. Briefly describe the DNS resolution process.

When a user accesses www.example.com, the client first checks its local cache and hosts file, then asks its configured (recursive) DNS server. If that server has no cached answer, it queries a root server, which refers it to the .com TLD servers, which in turn refer it to the domain's authoritative server; the authoritative server returns the IP address. The result is cached along the way for future queries.
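
The hosts-file lookup that comes before any DNS query is simple enough to reproduce. This is a minimal sketch; hosts_lookup is a hypothetical helper, and the sample file stands in for /etc/hosts.

```shell
# Print the first address mapped to a name in a hosts-format file (default /etc/hosts).
# Exits non-zero when the name is absent, which is when a real resolver would go to DNS.
hosts_lookup() {
  local name="$1" file="${2:-/etc/hosts}"
  awk -v n="$name" '
    { sub(/#.*/, "") }                       # drop comments
    { for (i = 2; i <= NF; i++)              # field 1 is the address, the rest are names
        if ($i == n) { print $1; found = 1; exit } }
    END { exit !found }
  ' "$file"
}

# Demo against a throwaway file
sample=$(mktemp)
printf '%s\n' '127.0.0.1 localhost' '10.0.0.5 www.example.com web  # app server' > "$sample"
hosts_lookup www.example.com "$sample"     # prints 10.0.0.5
hosts_lookup db.example.com "$sample" || echo "not in hosts file; the resolver would query DNS next"
rm -f "$sample"
```

The DNS side of the walk (root, then TLD, then authoritative server) can be watched with dig +trace www.example.com, provided dig is installed.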

13. What is RabbitMQ?

RabbitMQ is a message‑queue middleware that stores messages during transmission and routes them from producers to consumers. It ensures reliable delivery, even if the consumer is temporarily unavailable.

14. Explain the working principle of Keepalived.

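Keepalived implements VRRP (Virtual Router Redundancy Protocol). In a virtual router group, one node is MASTER and periodically sends VRRP advertisements. BACKUP nodes listen; if they stop receiving advertisements for the master-down interval (roughly three advertisement intervals, so a few seconds with the default 1 s interval), the highest-priority BACKUP becomes MASTER and takes over the virtual IP, keeping the service available. VRRP advertisements can be authenticated (auth_type PASS or AH in Keepalived), but they are not encrypted.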
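
How fast a BACKUP takes over follows from the protocol's timers. In VRRPv2 (RFC 3768) a BACKUP declares the MASTER down after 3 × advertisement interval + skew time, where the skew time is (256 − priority) / 256 seconds. The sketch below only evaluates that formula; the function name is illustrative and is not part of Keepalived.

```shell
# Master-down interval in seconds for a BACKUP node (VRRPv2 formula).
#   $1 = this node's priority (1-254), $2 = advertisement interval in seconds (default 1)
vrrp_master_down_interval() {
  local priority="$1" advert_int="${2:-1}"
  awk -v p="$priority" -v a="$advert_int" 'BEGIN {
    skew = (256 - p) / 256          # higher priority means a shorter skew, so it wins the election
    printf "%.3f\n", 3 * a + skew
  }'
}

vrrp_master_down_interval 100 1    # prints 3.609
```

A BACKUP with priority 254 would wait about 3.008 s instead, which is why the highest-priority BACKUP is the first to claim the MASTER role.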

15. Describe the three LVS load‑balancing modes.

1) NAT mode (VS‑NAT)

The load balancer rewrites the destination IP of each client packet to a real server (RS) IP and forwards it. Responses return through the load balancer, which rewrites the source IP back to the virtual IP (VIP) before sending them to the client. All traffic, in both directions, passes through the balancer.

2) IP‑tunnel mode (VS‑TUN)

The balancer encapsulates each client packet in a new IP packet (IP-in-IP) whose destination is the RS and sends it on. The RS decapsulates it, processes the request, and replies directly to the client, so response traffic bypasses the balancer. The RS kernel must support IP tunnelling.

3) Direct‑routing mode (VS‑DR)

Both the balancer and the RS are configured with the same virtual IP. Only the balancer answers ARP requests for it; the RS hold the VIP on an interface that does not answer ARP (typically loopback). The balancer rewrites the destination MAC address of incoming frames to the RS's MAC and forwards them. The RS reply directly to the client using the shared IP. The balancer and RS must be in the same broadcast domain.
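
The three modes differ by a single forwarding flag in ipvsadm: -m (masquerading, NAT), -i (IP-in-IP tunnel), and -g (gatewaying, direct routing). The sketch below prints the commands rather than running them, since applying them needs root and a real LVS host; the function name, the round-robin scheduler, port 80, and the addresses are all illustrative assumptions.

```shell
# Print the ipvsadm commands for one virtual service in the chosen mode (dry run).
#   $1 = nat | tun | dr, $2 = virtual IP, remaining arguments = real server IPs
lvs_rules() {
  local mode="$1" vip="$2" flag rs
  shift 2
  case "$mode" in
    nat) flag=-m ;;   # VS-NAT: masquerading
    tun) flag=-i ;;   # VS-TUN: IP-in-IP encapsulation
    dr)  flag=-g ;;   # VS-DR: gatewaying (direct routing)
    *)   echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  echo "ipvsadm -A -t $vip:80 -s rr"
  for rs in "$@"; do
    echo "ipvsadm -a -t $vip:80 -r $rs:80 $flag"
  done
}

lvs_rules dr 10.0.0.100 10.0.0.11 10.0.0.12
```

In DR and TUN modes each real server additionally needs the VIP configured locally, which this sketch does not cover.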

16. How to locate InnoDB lock issues and reduce MySQL master‑slave replication delay?

Lock diagnosis:

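SHOW ENGINE INNODB STATUS;  -- shows the latest detected deadlock and current lock waits
SELECT * FROM information_schema.innodb_trx;   -- running transactions
-- The next two tables exist up to MySQL 5.7; MySQL 8.0 replaced them with
-- performance_schema.data_locks and performance_schema.data_lock_waits
SELECT * FROM information_schema.innodb_locks; -- current locks
SELECT * FROM information_schema.innodb_lock_waits; -- lock wait relationships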

Delay reduction strategies:

Ensure slave hardware is not weaker than master.

Enable multi‑threaded replication on newer MySQL versions.

Identify and optimise slow SQL statements.

Reduce network latency.

Balance load: add buffers/caching in front of the master, distribute reads across multiple slaves.

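Tune the reconnection parameters: slave_net_timeout (default 3600 s before MySQL 5.7.7, 60 s since) controls how long the slave waits for data before treating the connection as broken, and MASTER_CONNECT_RETRY (default 60 s, set with CHANGE MASTER TO; the old --master-connect-retry server option was removed in MySQL 5.5) controls the delay between reconnection attempts.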
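
Before tuning anything it helps to measure the delay. The Seconds_Behind_Master field of SHOW SLAVE STATUS reports it, and is NULL when replication is not running. A minimal sketch that parses that output from stdin; the function names and the 30-second threshold are assumptions for illustration.

```shell
# Read "SHOW SLAVE STATUS\G" output on stdin and print Seconds_Behind_Master.
replication_lag() {
  awk -F ': *' '$1 ~ /Seconds_Behind_Master$/ { print $2; exit }'
}

# Report the lag and return non-zero when it is unknown or above the threshold ($1).
check_lag() {
  local threshold="$1" lag
  lag=$(replication_lag)
  case "$lag" in
    ''|NULL) echo "CRITICAL: replication is not running"; return 2 ;;
  esac
  if [ "$lag" -gt "$threshold" ]; then
    echo "WARNING: slave is ${lag}s behind master"
    return 1
  fi
  echo "OK: slave is ${lag}s behind master"
}

# Typical use (assumes a mysql client with credentials already configured):
#   mysql -e 'SHOW SLAVE STATUS\G' | check_lag 30
# Demo on captured output:
printf '%s\n' '   Slave_IO_Running: Yes' '   Seconds_Behind_Master: 12' | check_lag 30
```

The demo prints "OK: slave is 12s behind master".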

17. How to reset the MySQL root password?

When the current password is known:

mysqladmin -u root -p password "new_password"   # prompts for old password

Or via SQL:

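-- MySQL 5.6 and earlier; from 5.7.6 on use
-- ALTER USER 'root'@'localhost' IDENTIFIED BY 'new_password'; instead
UPDATE mysql.user SET password = PASSWORD('new_password') WHERE user='root';
FLUSH PRIVILEGES;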

When the password is forgotten:

Stop MySQL:

service mysqld stop

Start MySQL without loading the grant tables:

/usr/local/mysql/bin/mysqld_safe --skip-grant-tables &

Log in without a password (mysql -u root) and reset it:

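-- MySQL 5.6 and earlier; from 5.7.6 on, run FLUSH PRIVILEGES; first and then
-- ALTER USER 'root'@'localhost' IDENTIFIED BY 'new_password'; instead
UPDATE mysql.user SET password = PASSWORD('new_password') WHERE user='root';
FLUSH PRIVILEGES;

Alternatively, on versions before 8.0 (run FLUSH PRIVILEGES; first so that the server accepts account statements again):

GRANT ALL ON *.* TO 'root'@'localhost' IDENTIFIED BY 'new_password';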

18. Advantages and disadvantages of LVS, Nginx, and HAProxy

Nginx advantages

Operates at Layer 7, allowing HTTP‑level routing based on domain, path, etc.

Powerful regular‑expression rules, stronger than HAProxy.

Low dependency on network stability; works as long as the host is reachable.

Simple installation and configuration; logs errors clearly.

Handles high concurrent connections efficiently.

Can act as static file server and cache server.

Nginx disadvantages

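Proxies mainly HTTP, HTTPS, and mail protocols; generic TCP/UDP proxying requires the stream module (Nginx 1.9.0 and later).

Open-source Nginx performs only passive health checks (a backend is marked down after failed requests); active URL-based checks need Nginx Plus or a third-party module.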

No built‑in session persistence (can use ip_hash).

LVS advantages

Strong load‑handling at Layer 4 with minimal CPU/memory usage.

Simple configuration reduces human error.

Highly stable with built‑in high‑availability (e.g., LVS+Keepalived).

Does not process payload traffic, avoiding bottlenecks.

Works for any TCP/UDP service (web, DB, chat, etc.).

LVS disadvantages

Lacks Layer‑7 features such as URL‑based routing.

Complex to deploy for large web applications; requires additional tools for health checks.

HAProxy characteristics

Supports virtual hosting and session persistence (cookies, source IP).

Provides many load‑balancing algorithms: roundrobin, static‑rr, leastconn, source, uri, hdr, etc.

Handles TCP load balancing (e.g., MySQL read‑replicas).

Generally faster than Nginx for pure load‑balancing tasks.

19. MySQL backup tools

mysqldump

Built-in logical backup tool; suitable for small databases. Supports hot backup for InnoDB (with --single-transaction) but is slower than physical methods.

LVM snapshot backup

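Physical backup using filesystem snapshots. Tables must be locked briefly (FLUSH TABLES WITH READ LOCK) while the snapshot is taken so that the files are consistent; this matters most for non-InnoDB engines, which cannot recover from an inconsistent copy.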

Percona XtraBackup

Physical hot backup for InnoDB, supports full and incremental backups, fast, works with separate tablespaces.

20. Keepalived health‑check configuration example

HTTP_GET|SSL_GET {
    url {
        path /   # multiple URLs can be checked
        digest <STRING>   # digest generated by genhash
        status_code 200    # expected HTTP status
    }
    connect_port 80
    bindto <IPADDR>
    connect_timeout 3
    nb_get_retry 3
    delay_before_retry 2
}

21. Find the top‑10 IPs by request count in an Nginx access log

awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
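
uniq -c only merges adjacent identical lines, so the addresses have to be sorted before they are counted. A minimal sketch wrapping the pipeline in a function (top_ips is an illustrative name; it assumes the client address is the first field, as in Nginx's default combined log format):

```shell
# Print the N most frequent client IPs in an access log (default 10), busiest first.
top_ips() {
  local log="$1" n="${2:-10}"
  awk '{ print $1 }' "$log" | sort | uniq -c | sort -rn | head -n "$n"
}

# Demo on a small sample where requests from the same IP are not adjacent
sample=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [17/Aug/2021:10:00:00 +0800] "GET / HTTP/1.1" 200 612' \
  '10.0.0.2 - - [17/Aug/2021:10:00:01 +0800] "GET / HTTP/1.1" 200 612' \
  '10.0.0.1 - - [17/Aug/2021:10:00:02 +0800] "GET /a HTTP/1.1" 200 100' > "$sample"
top_ips "$sample" 2
rm -f "$sample"
```

The demo lists 10.0.0.1 first with a count of 2; without the first sort, the two 10.0.0.1 lines would be counted as separate runs.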

22. Capture traffic to host 192.168.1.1 on TCP port 80 and save to a file

tcpdump -nn 'host 192.168.1.1 and tcp port 80' -w tcpdump.pcap   # read back later with: tcpdump -nn -r tcpdump.pcap

23. Forward local port 80 to 8080 on host 192.168.2.1

iptables -t nat -A PREROUTING -d 192.168.2.1 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.2.1:8080

24. RAID0/1/5 principles (re‑stated)

RAID0 provides high performance without redundancy. RAID1 mirrors data for full redundancy but halves usable capacity. RAID5 distributes parity across disks, allowing one disk failure while offering a balance of performance and redundancy.

25. Understanding of a modern Operations Engineer

An Ops engineer must keep the company's services as available, fast, stable, and secure as possible for the company and its customers. A small mistake can cause major losses, so the role requires rigor and innovative thinking.

26. Real‑time capture of TCP port 80 traffic

tcpdump -nn tcp port 80

27. Server fails to boot – step‑by‑step troubleshooting

Possible causes include hardware failure, power issues, BIOS misconfiguration, corrupted bootloader, or filesystem errors. Diagnose by checking power, BIOS settings, boot messages, and using rescue media to repair the filesystem or reinstall the bootloader.

28. How to deal with a Linux virus

Simplest method: reinstall the system.

Identify suspicious processes with top or ps aux, locate their binaries (for example with ls -l /proc/<PID>/exe or lsof -p <PID>), kill the processes, and delete the files with rm -f.

Check cron jobs, startup scripts, and hidden directories for persistence mechanisms.

After removal, backup data and perform a clean reinstall to ensure no remnants remain.

29. Virus file auto‑recreates after deletion – solution

Identify the parent process that continuously recreates the file (e.g., a hidden /usr/bin/.sshd binary). Stop external network access, use tools like iftop, netstat, lsof, ps to trace the malicious process, kill it, and delete the binary.
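
Tracing a process back to whatever keeps respawning it can be done from /proc alone. The helpers below are a minimal, Linux-only sketch with illustrative names; they only read information and do not kill or delete anything.

```shell
# Path of the executable behind a PID (needs permission to read the link).
proc_exe() { readlink "/proc/$1/exe"; }

# Parent PID of a process, read from /proc/<pid>/status.
proc_parent() { awk '/^PPid:/ { print $2 }' "/proc/$1/status"; }

# Walk up the process tree from a PID, printing each ancestor's PID and binary.
proc_ancestry() {
  local pid="$1"
  while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
    echo "$pid $(proc_exe "$pid" 2>/dev/null)"
    pid=$(proc_parent "$pid")
  done
}

proc_ancestry $$    # demo on the current shell; substitute the suspicious PID
```

An ancestor whose binary sits in an unusual or hidden path is a likely candidate for the process that recreates the file.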

30. OSI seven-layer model

Application: protocols such as HTTP, FTP, SMTP, DNS, etc.

Presentation: data representation, encryption, compression.

Session: establishment, management, termination of sessions.

Transport: TCP/UDP, flow control, error checking.

Network: IP, ICMP, ARP, routing.

Data Link: MAC addressing, error detection.

Physical: actual transmission media and signaling.

31. Common Nginx modules and their purposes

rewrite : URL rewriting.

access : access control.

ssl : TLS/SSL encryption.

gzip : response compression.

proxy : reverse proxy.

upstream : define backend server pools.

cache_purge : purge cached content.

32. Typical web‑server load architectures

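Common building blocks are Nginx, HAProxy, and LVS for load balancing, usually paired with Keepalived for failover (for example LVS + Keepalived or Nginx + Keepalived).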

33. View HTTP concurrent requests and TCP connection states

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
# Check the open-file limit of the current shell (commonly 1024 by default)
ulimit -n
# To raise it persistently, add these two lines to /etc/security/limits.conf:
#   * soft nofile 10240
#   * hard nofile 10240
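
On newer distributions netstat is often not installed and ss (from iproute2) takes its place. The same per-state count can be taken from ss -ant, whose first column is the connection state. A minimal sketch, with count_tcp_states as an illustrative name that reads ss output on stdin:

```shell
# Count TCP connections per state from "ss -ant" output (first line is the header).
count_tcp_states() {
  awk 'NR > 1 { ++count[$1] } END { for (s in count) print s, count[s] }' | sort
}

# Typical use:  ss -ant | count_tcp_states
# Demo on captured output:
printf '%s\n' \
  'State      Recv-Q Send-Q Local Address:Port  Peer Address:Port' \
  'LISTEN     0      128    0.0.0.0:80          0.0.0.0:*' \
  'ESTAB      0      0      10.0.0.1:80         10.0.0.2:51234' \
  'ESTAB      0      0      10.0.0.1:80         10.0.0.3:40022' \
  'TIME-WAIT  0      0      10.0.0.1:80         10.0.0.4:33100' | count_tcp_states
```

Note that ss abbreviates some state names (ESTAB, TIME-WAIT) compared with netstat (ESTABLISHED, TIME_WAIT).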

34. Use tcpdump to find the top IPs accessing port 80

tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20

35. Bash script to ping 192.168.1.0/24 and list online hosts

#!/bin/bash
# Probe hosts .1 to .254 in parallel (.0 is the network address, .255 the broadcast address)
for ip in $(seq 1 254); do
  ping -c 1 -W 1 192.168.1.$ip > /dev/null 2>&1 && echo 192.168.1.$ip UP || echo 192.168.1.$ip DOWN &
done
wait

36. Keep only the latest 7 days of Apache logs

Delete files older than 7 days:

find /application/logs/ -type f -mtime +7 -name "*.log" -exec rm -f {} \;
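
Because the command deletes files, it is worth listing the matches first. The sketch below splits the job into a dry-run function and a deleting one and tries both in a scratch directory; the function names are illustrative, and touch -d with a relative date is a GNU extension.

```shell
# List *.log files under a directory that are older than N days (dry run).
list_old_logs() {
  find "$1" -type f -name '*.log' -mtime +"$2" -print
}

# Delete them; -delete is supported by GNU and BSD find.
prune_logs() {
  find "$1" -type f -name '*.log' -mtime +"$2" -delete
}

# Demo in a scratch directory
demo=$(mktemp -d)
touch -d '10 days ago' "$demo/old.log"
touch "$demo/new.log"
list_old_logs "$demo" 7     # shows only old.log
prune_logs "$demo" 7
ls "$demo"                  # prints new.log
rm -rf "$demo"
```

In a cron job the call would look like prune_logs /application/logs 7.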

37. General Linux optimization tips

Create non‑root users and grant sudo privileges.

Change the default SSH port and disable root remote login.

Synchronise system time automatically.

Use a nearby (in-country) yum mirror for faster package downloads.

Disable SELinux and unnecessary iptables rules (enable only when needed).

Increase the maximum number of file descriptors.

Disable unneeded startup services, keeping only the essentials such as crond, rsyslog, network, and sshd.

Tune kernel parameters via /etc/sysctl.conf.

Set appropriate locale (prefer UTF‑8).

Lock critical system files and clear /etc/issue.

38. Extract the IP address of eth0 using cut (also awk/sed examples)

# All three assume the older net-tools output format ("inet addr:192.168.1.10  Bcast:...")
# cut method
ifconfig eth0 | sed -n '2p' | cut -d ':' -f2 | cut -d ' ' -f1
# awk method
ifconfig eth0 | awk 'NR==2' | awk -F ':' '{print $2}' | awk '{print $1}'
# sed method
ifconfig eth0 | sed -n '/inet addr/p' | sed -r 's#^.*addr:(.*) B.*$#\1#g'
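
Newer systems ship the ip command, and newer ifconfig versions no longer print "inet addr:", so parsing ip's one-line output is often more robust. A minimal sketch, assuming the usual "ip -4 -o addr show" layout in which the fourth field is address/prefix; first_ipv4 is an illustrative name.

```shell
# Extract the first IPv4 address from "ip -4 -o addr show dev <iface>" output on stdin.
first_ipv4() {
  awk '$3 == "inet" { sub(/\/.*/, "", $4); print $4; exit }'
}

# Typical use:  ip -4 -o addr show dev eth0 | first_ipv4
# Demo on a captured sample line:
echo '2: eth0    inet 192.168.1.10/24 brd 192.168.1.255 scope global eth0' | first_ipv4
```

The demo prints 192.168.1.10.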

39. SecureCRT shortcut key functions

Ctrl+a – move cursor to line start.

Ctrl+c – terminate current program.

Ctrl+d – delete character under cursor or exit if line is empty.

Ctrl+e – move cursor to line end.

Ctrl+l – clear screen.

Ctrl+u – cut text before cursor.

Ctrl+k – cut text after cursor.

Ctrl+y – paste previously cut text.

Ctrl+r – search command history.

Tab – command or path completion.

Ctrl+Shift+c – copy.

Ctrl+Shift+v – paste.

40. Nightly backup of /var/www/html to /data with timestamped archive

#!/bin/bash
# Saved as /root/backup_www.sh; the archive name carries year, month, day, and hour
cd /var/www && tar zcf /data/html-$(date +%Y-%m-%d_%H).tar.gz html/
# crontab entry (runs every day at 00:00)
0 0 * * * /bin/bash /root/backup_www.sh
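
A nightly job like this fills the disk eventually unless old archives are removed. The following is a minimal sketch of the same backup as a reusable function with a retention step; the function names and the 30-day retention in the example are assumptions, not part of the original answer.

```shell
# Archive a directory into dest as <name>-<YYYY-MM-DD_HHMMSS>.tar.gz and print the archive path.
backup_dir() {
  local src="$1" dest="$2" name stamp archive
  name=$(basename "$src")
  stamp=$(date +%Y-%m-%d_%H%M%S)
  archive="$dest/$name-$stamp.tar.gz"
  mkdir -p "$dest" || return 1
  tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
  echo "$archive"
}

# Remove archives in dest that are older than N days.
prune_backups() {
  find "$1" -type f -name '*.tar.gz' -mtime +"$2" -delete
}

# Example (paths from the question):
#   backup_dir /var/www/html /data && prune_backups /data 30
```

Using tar's -C option instead of cd keeps the archive paths relative (html/...) without changing the script's working directory.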

