Mastering Ops: Security, High Availability, and Fault Diagnosis for Interviews
This article compiles concise, high‑scoring answers to essential operations interview questions, covering security hardening, intrusion response, high‑availability architecture, disaster‑recovery design, Redis replication and clustering, Docker fundamentals and networking, Kubernetes components, monitoring, CI/CD pipelines, and the evolving role of DevOps.
61. What is the principle of Redis master‑slave replication?
High‑score answer: Replication is asynchronous. A replica sends SYNC (or PSYNC since Redis 2.8) to the master; the master forks to create an RDB snapshot and buffers write commands that arrive during the transfer. After the replica loads the snapshot, the master sends it the buffered commands, and from then on streams every write to all replicas asynchronously. PSYNC additionally allows partial resynchronization from a replication backlog after short disconnects. The result is eventual consistency at the cost of some replication lag.
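A minimal, illustrative replica-side configuration for the setup described above (addresses and the password are placeholder values):

```conf
# redis.conf on the replica (example values)
replicaof 192.168.1.10 6379    # follow this master ("slaveof" before Redis 5)
masterauth s3cret              # master's requirepass, if one is set
replica-read-only yes          # default: replicas reject writes
repl-backlog-size 64mb         # backlog that enables PSYNC partial resync
```

Replication state can then be checked on either side with `redis-cli INFO replication`.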
62. What is the role of Redis Sentinel?
High‑score answer: Sentinel provides Redis high availability. Sentinels continuously monitor the health of master and replica nodes; when a quorum of Sentinels agrees the master is down, they elect a leader Sentinel, which promotes a replica to master, reconfigures the remaining replicas to follow it, and publishes the new address so clients can switch. Sentinel itself runs as a cluster (typically three or more nodes) to avoid a single point of failure and is the most common HA solution for small‑to‑medium Redis deployments.
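A sketch of the corresponding Sentinel configuration, assuming three Sentinels and a quorum of two (master name and address are placeholders):

```conf
# sentinel.conf (example values)
sentinel monitor mymaster 192.168.1.10 6379 2    # quorum of 2 to declare the master down
sentinel down-after-milliseconds mymaster 5000   # unreachable for 5s -> subjectively down
sentinel failover-timeout mymaster 60000         # overall failover time budget
sentinel parallel-syncs mymaster 1               # reconfigure replicas one at a time
```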
63. How does Redis Cluster work?
High‑score answer: Redis Cluster uses a decentralized architecture that shards data across 16,384 hash slots, with each master responsible for a subset of slots. Clients can connect to any node; if the key is not owned by that node, the node returns a MOVED redirect. The cluster supports master‑slave replication and automatic failover without Sentinel, suitable for large‑scale, high‑throughput scenarios, but it does not support multi‑key transactions across slots.
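The slot mapping is simple enough to reproduce. Below is an illustrative Python version of the keyslot function the Redis Cluster specification describes — CRC16 (XModem variant) modulo 16,384 — including the hash-tag rule that lets related keys land in the same slot:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XModem): polynomial 0x1021, initial value 0.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if (crc & 0x8000) else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Hash-tag rule: if the key contains a non-empty {...} section,
    # only that substring is hashed, so "{user1000}.following" and
    # "{user1000}.followers" map to the same slot.
    s = key.find("{")
    if s != -1:
        e = key.find("}", s + 1)
        if e > s + 1:          # non-empty tag between the first { and }
            key = key[s + 1:e]
    return crc16(key.encode()) % 16384
```

Keys sharing a hash tag can be used together in multi-key commands, since cross-slot operations are otherwise rejected.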
64. How to prevent Redis cache penetration, breakdown, and avalanche?
High‑score answer: Penetration (queries for keys that exist in neither cache nor DB): cache empty results with a short TTL, or use a Bloom filter to pre‑check existence. Breakdown (a hot key expires and concurrent requests all hit the DB): give truly hot keys no expiry, or use a mutex (e.g., SET key value NX EX ttl) so only one thread rebuilds the cache. Avalanche (many keys expire at once): add random jitter to expiration times to avoid mass expiry and sudden DB load spikes.
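The breakdown and avalanche defenses can be sketched in a few lines. This is an in-memory illustration only — `CACHE` and `REBUILD_LOCK` stand in for Redis and a SETNX-style distributed lock, and all names are hypothetical:

```python
import random
import threading

CACHE = {}                       # stand-in for Redis (key -> value)
REBUILD_LOCK = threading.Lock()  # stand-in for a SET ... NX EX distributed lock

def jittered_ttl(base_seconds: int, jitter_seconds: int) -> int:
    # Avalanche defense: spread expirations so keys don't all expire together.
    return base_seconds + random.randint(0, jitter_seconds)

def get_with_rebuild(key, load_from_db):
    # Breakdown defense: only one caller rebuilds an expired hot key.
    value = CACHE.get(key)
    if value is not None:
        return value
    with REBUILD_LOCK:
        value = CACHE.get(key)   # re-check after acquiring the lock
        if value is None:
            value = load_from_db(key)
            # Penetration defense: cache even "not found" results
            # (in Redis, with a short TTL).
            CACHE[key] = value if value is not None else "<empty>"
    return CACHE[key]
```

In real Redis the lock would carry its own expiry so a crashed rebuilder cannot block others forever.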
65. Difference between Docker and a virtual machine?
High‑score answer: Docker is OS‑level virtualization; containers share the host kernel, start in seconds, and have minimal overhead. A VM is hardware‑level virtualization; each VM runs a full OS, offering stronger isolation but slower startup and higher resource consumption. Docker suits micro‑service rapid deployment, while VMs are better for strong isolation or multiple OS environments.
66. Common Dockerfile instructions?
High‑score answer: Core directives include FROM (base image), RUN (execute build command, creates a layer), COPY / ADD (copy files; ADD also supports archive extraction and URLs), WORKDIR (set working directory), EXPOSE (declare ports), and CMD / ENTRYPOINT (define container start command). Best practice: combine RUN steps to reduce layers and use a .dockerignore file.
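A small illustrative Dockerfile applying these directives (image, file, and port are placeholder choices for a Python web app):

```dockerfile
# Base image (placeholder; pick what your app actually needs)
FROM python:3.12-slim
# Working directory for all subsequent instructions
WORKDIR /app
# Copy the dependency list first so this layer caches well
COPY requirements.txt .
# Chain commands in one RUN to keep the layer count low
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY . .
# Document the listening port
EXPOSE 8000
# Default start command (overridable at docker run)
CMD ["python", "app.py"]
```

A `.dockerignore` excluding `.git`, caches, and build artifacts keeps the build context small.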
67. Docker network modes?
High‑score answer: Four default modes: bridge (containers communicate via docker0 bridge, requires port mapping), host (shares host network namespace, high performance but port conflicts), none (no network, full isolation), and container (shares another container’s network stack). Production typically uses bridge or a custom overlay network (e.g., Swarm/K8s).
68. Docker data persistence methods?
High‑score answer: Two main approaches: bind mount (mount host directory into container, good for debugging but path‑dependent) and volume (Docker‑managed storage, lifecycle independent of containers, recommended for production). Additionally, tmpfs can store temporary sensitive data. Containers should remain stateless; data must be externalized.
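A docker-compose sketch contrasting the two mount types (service, image, and path names are examples):

```yaml
services:
  db:
    image: mysql:8.0
    volumes:
      # Named volume: Docker-managed, survives container removal (production choice)
      - dbdata:/var/lib/mysql
      # Bind mount: host path, handy for config files and local debugging
      - ./conf/my.cnf:/etc/mysql/my.cnf:ro
volumes:
  dbdata: {}   # declared volume, lifecycle independent of the service
```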
69. Core Kubernetes components and their functions?
High‑score answer: Control plane: apiserver (cluster entry point), etcd (state storage), scheduler (assign Pods to nodes), controller‑manager (maintains desired state). Nodes: kubelet (manages local containers), kube‑proxy (implements Service networking). Together they enable automated deployment, scaling, and self‑healing.
70. What is a Pod and its types?
High‑score answer: A Pod is the smallest K8s scheduling unit, containing one or more containers that share network and storage. Types: standalone Pods created directly (no controller, no self‑healing) and controller‑managed Pods (created by Deployments, StatefulSets, etc.) which provide replica control, rolling updates, and high availability.
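A minimal standalone Pod manifest for illustration (names are placeholders); in practice this spec would live inside a Deployment template to gain self-healing:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  labels:
    app: demo
spec:
  containers:
    - name: web
      image: nginx:1.25   # containers in one Pod share network and volumes
      ports:
        - containerPort: 80
```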
71. Difference between Deployment and StatefulSet?
High‑score answer: Deployments manage stateless applications; Pods receive random names and no stable network identity, suitable for web services. StatefulSets manage stateful applications; Pods have stable names (e.g., web‑0), fixed storage, and ordered start/stop, ideal for databases or ZooKeeper where identity and persistence matter.
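The distinguishing StatefulSet fields can be sketched as follows (all names are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web-headless   # headless Service giving web-0, web-1... stable DNS names
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:       # each replica gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

`serviceName` and `volumeClaimTemplates` are exactly what a Deployment lacks: stable identity and per-replica storage.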
72. Service purpose and types?
High‑score answer: A Service provides a stable network endpoint for a set of Pods selected by labels. Types: ClusterIP (default, internal access), NodePort (exposes IP:Port on each node), LoadBalancer (cloud LB), and ExternalName (maps to external DNS). Services are core to K8s service discovery and load balancing.
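An illustrative NodePort Service (names and ports are placeholders; drop `type` and `nodePort` for the default ClusterIP behavior):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-svc
spec:
  type: NodePort
  selector:
    app: demo           # routes to Pods carrying this label
  ports:
    - port: 80          # Service port inside the cluster
      targetPort: 8080  # container port
      nodePort: 30080   # optional; must fall in the 30000-32767 range
```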
73. What is Ingress and its relation to Service?
High‑score answer: Ingress defines layer‑7 HTTP/HTTPS routing rules that forward external requests to different Services by host and path. Ingress itself is only a set of rules; an Ingress Controller (e.g., Nginx) implements the actual routing. Services handle layer‑4 traffic, while Ingress adds domain‑ and path‑based routing on top.
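A sketch of the rule structure (host, paths, and Service name are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
spec:
  ingressClassName: nginx        # which Ingress Controller implements these rules
  rules:
    - host: app.example.com      # domain-based routing
      http:
        paths:
          - path: /api           # path-based routing
            pathType: Prefix
            backend:
              service:
                name: demo-svc   # traffic is still delivered via a Service
                port:
                  number: 80
```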
74. How to troubleshoot a failing Pod?
High‑score answer: Steps: kubectl get pods to check the status first (Pending, ImagePullBackOff, CrashLoopBackOff each point in a different direction); kubectl describe pod <name> to view events (e.g., image pull errors, scheduling failures, resource limits); kubectl logs <pod> for application logs, adding -p to see the previous container's logs after a crash; finally verify CPU/memory requests, storage mounts, and health‑probe configurations.
75. How does K8s achieve high availability?
High‑score answer: Deploy multiple instances of the control‑plane components (apiserver, etcd, scheduler, controller‑manager) behind a load balancer, running etcd as an odd‑sized cluster (typically three or five nodes). Replicate workloads across multiple worker nodes via Deployments with pod anti‑affinity rules. Back up etcd regularly; the apiserver is stateless and scales horizontally. Such a setup tolerates any single node failure.
76. What is Helm and its purpose?
High‑score answer: Helm is the package manager for Kubernetes, analogous to yum/apt. It bundles resources such as Deployments, Services, and ConfigMaps into a Chart, enabling versioned, parameterized, one‑click deployments of complex applications like MySQL or Prometheus.
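Every chart starts from a small metadata file; an illustrative Chart.yaml (name and versions are placeholders):

```yaml
# Chart.yaml
apiVersion: v2
name: demo-app
version: 0.1.0        # chart version, bumped on each packaging change
appVersion: "1.2.3"   # version of the application being packaged
```

It is then installed with values overrides, e.g. `helm upgrade --install demo ./demo-app -f prod-values.yaml`.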
77. How to monitor a K8s cluster?
High‑score answer: The standard stack is Prometheus + Grafana + Alertmanager. Prometheus discovers targets via ServiceMonitor and scrapes node, pod, and container metrics. Grafana visualizes the data, while Alertmanager routes alerts. Logs can be integrated via EFK (Elasticsearch‑Fluentd‑Kibana) and tracing via Jaeger for a full observability loop.
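A sketch of a Prometheus Operator ServiceMonitor, which is how Prometheus discovers scrape targets in this stack (names and labels are placeholders):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: demo-monitor
  labels:
    release: prometheus   # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: demo           # scrape Services carrying this label
  endpoints:
    - port: metrics       # named Service port exposing /metrics
      interval: 30s
```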
78. What is CI/CD and common tools?
High‑score answer: CI (continuous integration) automates build and test after code commits; CD (continuous delivery/deployment) automates release to test or production. Common toolchain: GitLab CI or Jenkins for pipelines, Docker for image building, K8s/Helm for deployment, SonarQube for code quality, and Nexus for artifact storage.
79. How to implement automated deployment?
High‑score answer: Design a standardized pipeline: code commit → trigger Jenkins/GitLab CI → unit tests → build Docker image → push to registry (e.g., Harbor) → update Helm chart version → roll out to K8s with rolling upgrade. Key points: tie image tags to Git commit IDs, perform health checks before release, enable automatic rollback, and codify all steps as Infrastructure‑as‑Code.
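The pipeline above can be sketched as a GitLab CI config (stage names, the registry URL, and the chart path are placeholders):

```yaml
stages: [test, build, deploy]

unit-test:
  stage: test
  script:
    - make test   # run unit tests on every commit

build-image:
  stage: build
  script:
    # tie the image tag to the Git commit for traceability
    - docker build -t harbor.example.com/app:$CI_COMMIT_SHORT_SHA .
    - docker push harbor.example.com/app:$CI_COMMIT_SHORT_SHA

deploy-prod:
  stage: deploy
  script:
    # rolling upgrade via Helm; failed health probes halt the rollout
    - helm upgrade --install app ./chart --set image.tag=$CI_COMMIT_SHORT_SHA
  when: manual    # gate production releases behind a human approval
```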
80. What is DevOps and the role of operations?
High‑score answer: DevOps is a culture and practice that aligns development and operations to shorten delivery cycles and improve service quality. Operations engineers evolve from "fire‑fighters" into "platform engineers": building CI/CD pipelines, designing HA architectures, providing monitoring and alerting, and enabling self‑service for developers while ensuring system stability.
Xiao Liu Lab
An operations lab passionate about server tinkering 🔬 Sharing automation scripts, high-availability architecture, alert optimization, and incident reviews. Using technology to reduce overtime and experience to avoid major pitfalls. Follow me for easier, more reliable operations!