Building an Enterprise‑Grade Server Security Audit System: Design, Tools, and Implementation
This article outlines the motivation, design principles, architecture, component choices, and step‑by‑step implementation of a comprehensive server security audit system, covering server information collection, log gathering, access control checks, local vulnerability detection, abnormal traffic analysis, and integration with ELK, Hadoop, and open‑source tools like Lynis and OSSEC.
Design Goals
Functionality – combine log auditing and system security auditing, collect raw data, clean it, store it centrally, and expose analysis APIs.
Operations – support mass deployment, easy upgrades, and health monitoring of each component.
Performance – handle large‑scale log volumes with low latency; the audit system itself must not become a bottleneck.
Architecture
The system is divided into five logical parts:
Client – lightweight agent (daemon or crontab) that gathers host information and forwards it.
Collector – high‑throughput, fault‑tolerant log collector.
Storage – scalable, high‑IO storage for massive data sets (e.g., Elasticsearch).
Analyzer – reporting engine that produces dashboards and alerts.
Scheduler – central brain that coordinates deployment, configuration pushes, and task scheduling.
Implementation Details
Client
The client can be built in‑house or based on existing open‑source projects. It must support both client‑server and agent‑less modes and be configurable for Linux (daemon) or Windows (service).
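A minimal agent of this kind can be sketched in a few lines. The snippet below gathers a small host snapshot and ships it to the collector as one JSON line over TCP; the field names and the collector address (`collector.example.com:5140`) are illustrative assumptions, not part of any particular agent.

```python
import json
import platform
import socket
import time

def collect_host_info():
    """Gather a minimal host snapshot; the field names are illustrative."""
    return {
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "kernel": platform.release(),
        "timestamp": int(time.time()),
    }

def forward(snapshot, collector_addr=("collector.example.com", 5140)):
    """Ship the snapshot to the collector as one newline-delimited JSON record."""
    payload = (json.dumps(snapshot) + "\n").encode("utf-8")
    with socket.create_connection(collector_addr, timeout=5) as conn:
        conn.sendall(payload)
```

In practice such a script would run from crontab (or as a daemon) and extend `collect_host_info` with whatever inventory the audit requires.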
Collector
The collector must guarantee high availability and low latency; any data loss or backlog creates a weakest-link ("bucket") effect, where the slowest component caps the throughput of the whole pipeline.
Storage
A distributed store with large capacity and high I/O, such as Elasticsearch, is recommended.
Analyzer
Generates statistical tables (e.g., server inventory, user command usage) for decision‑making.
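A per-user command-usage table, for example, is a simple aggregation over parsed command events. The sketch below assumes each event is a dict with `user` and `cmd` keys (the shape produced by whatever parser sits upstream); it is illustrative, not a fixed schema.

```python
from collections import Counter

def command_usage(events):
    """Tally the leading command word per user.

    Each event is assumed to be a dict with 'user' and 'cmd' keys,
    as produced by an upstream log parser.
    """
    table = {}
    for ev in events:
        # count only the command name, not its arguments
        table.setdefault(ev["user"], Counter())[ev["cmd"].split()[0]] += 1
    return table
```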
Scheduler
Acts as the orchestration layer; each component may also contain its own scheduler to keep the system loosely coupled.
Open‑Source Tools Comparison
Two widely used security‑audit tools are Lynis (Shell‑based, *nix) and OSSEC (cross‑platform HIDS). Both can be integrated with the ELK stack for centralized analysis.
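Lynis writes its findings to a flat key=value report file (by default `/var/log/lynis-report.dat`), where repeated keys carry a `[]` suffix; that format is easy to turn into structured events for the ELK stack. A hedged parsing sketch:

```python
def parse_lynis_report(text):
    """Parse Lynis's key=value report format.

    Keys ending in '[]' repeat (e.g. warning[]=...) and are
    collected into lists; everything else maps to a single value.
    """
    report = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip comments and blank lines
        key, _, value = line.partition("=")
        if key.endswith("[]"):
            report.setdefault(key[:-2], []).append(value)
        else:
            report[key] = value
    return report
```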
Lynis installation:

apt-get install lynis

OSSEC installation:

apt-get install ossec-hids ossec-hids-agent   # Debian Linux
ossec-win32/64-agent.exe                      # Windows

Log Auditing on Linux
Forward selected syslog facilities to a central collector:
echo 'kern.*;security.*;auth.info;authpriv.info;user.info @x.y.z.com:514' > /etc/rsyslog.d/logaudit.conf && /etc/init.d/rsyslog force-reload

Capture user command history via PROMPT_COMMAND and send it to syslog:
echo "export PROMPT_COMMAND='{ echo \"HISTORY:PID=$$ PPID=$PPID SID=$$ USER=${USER} CMD=$(history 1 | tr -s [[:blank:]] |cut -d\" \" -f 3-100)\" ; } | logger -p user.info'" > /etc/profile.d/logger_userlog.sh; source /etc/profile.d/logger_userlog.sh

Data Collection, Storage and Analysis
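Downstream, the HISTORY records emitted by the PROMPT_COMMAND hook can be parsed back into structured fields before indexing. A sketch, assuming the exact `PID=... PPID=... SID=... USER=... CMD=...` layout produced above:

```python
import re

HISTORY_RE = re.compile(
    r"HISTORY:PID=(?P<pid>\d+) PPID=(?P<ppid>\d+) "
    r"SID=(?P<sid>\d+) USER=(?P<user>\S+) CMD=(?P<cmd>.*)"
)

def parse_history_line(line):
    """Extract the fields of one HISTORY syslog record.

    Returns a dict of named fields, or None if the line is
    not a HISTORY record.
    """
    m = HISTORY_RE.search(line)
    return m.groupdict() if m else None
```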
The ELK stack (Elasticsearch, Logstash, Kibana) is used for real-time log processing and visualization, with Kafka and ZooKeeper providing a buffer between the collectors and Logstash. Reference versions:
logstash‑2.3.2
elasticsearch‑2.3.2
kafka
zookeeper
kibana‑4.5.1
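Parsed events reach Elasticsearch through its `_bulk` endpoint, whose body is newline-delimited JSON: one action line per document followed by the document itself. A sketch of building such a payload; the index name `audit-logs` and type name `event` are illustrative (the 2.x line referenced above still expects a `_type`):

```python
import json

def build_bulk_payload(index, docs):
    """Build an Elasticsearch _bulk request body (NDJSON).

    Each document becomes an action line plus a source line.
    The "_type" field is needed on the 2.x versions listed above;
    the name "event" here is an illustrative choice.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": "event"}}))
        lines.append(json.dumps(doc))
    # the bulk API requires a trailing newline
    return "\n".join(lines) + "\n"
```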
For offline batch analysis, Hadoop with the Python mrjob library can be employed:
from mrjob.job import MRJob
from mrjob.step import MRStep
import heapq

class UrlRequest(MRJob):
    """Two-step job: count requests per URL, then keep the top 10."""

    def steps(self):
        return [MRStep(mapper=self.mapper,
                       reducer=self.reducer_sum),
                MRStep(reducer=self.reducer_top10)]

    def mapper(self, _, line):
        # field 7 is the URL in a combined access-log line
        # (an assumption about the input format)
        yield line.split()[6], 1

    def reducer_sum(self, url, counts):
        # funnel all (count, url) pairs to a single reducer key
        yield None, (sum(counts), url)

    def reducer_top10(self, _, pairs):
        for count, url in heapq.nlargest(10, pairs):
            yield url, count

Deployment and Operations
Configuration management can be handled by Puppet, Ansible, or SaltStack. For high‑availability scheduling, HAProxy/Nginx + Keepalived is recommended.
Example Nginx proxy configuration (used as a reverse proxy for the Flask API):
# log_format must be declared in the http {} context, not inside a location block
log_format postdata '$remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent" "$request_body"';

location / {
    proxy_connect_timeout 75s;
    proxy_read_timeout 300s;
    try_files $uri @gunicorn_proxy;
}

location @gunicorn_proxy {
    access_log /home/test/var/log/access.log postdata;
    proxy_read_timeout 300s;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    proxy_pass http://127.0.0.1:8001;
}

Flask provides a RESTful API for report CRUD operations:
@app.route('/language/<string:name>')                            # read
def get_language(name): ...

@app.route('/language', methods=['POST'])                        # create
def create_language(): ...

@app.route('/language/<string:name>', methods=['PUT', 'PATCH'])  # update
def update_language(name): ...

@app.route('/language/<string:name>', methods=['DELETE'])        # delete
def delete_language(name): ...

Future Extensions
Feature iteration – improve UX, reduce client heterogeneity, and optionally fork Lynis/OSSEC for custom needs.
Security knowledge base – collect discovered vulnerabilities into a centralized repository (e.g., https://github.com/hanc00l/wooyun_public) and automate patch deployment via CI/CD.
External scanning & threat intel
OpenVAS – https://github.com/mikesplain/openvas-docker
CVE‑search – https://github.com/cve-search/cve-search
OSTrICa – https://github.com/Ptr32Void/OSTrICa
These extensions aim to evolve the platform from a pure detection‑analysis system into a full “detect‑analyze‑block” security framework.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.