Automate Nginx Config Audits: Python Script to Export Structured Excel Reports
Learn how a lightweight Python script can automatically parse complex Nginx configuration files, extract upstream, server, and location details, and generate a structured Excel report for easy auditing, analysis, and collaboration, streamlining operations and configuration management.
Script Overview
This script parses Nginx configuration files, extracts upstream, server, and location blocks, and outputs a structured Excel (.xlsx) report for auditing, analysis, and sharing.
Key Features
Parse Nginx configuration: extract upstream and server information.
Structure data: convert parsing results to a Pandas DataFrame.
Export to Excel: save as nginx_config_table.xlsx for easy viewing and collaboration.
Implementation Details
1. Parse Nginx configuration content
Use regular expressions to match all upstream blocks and extract backend server addresses.
Split all server blocks and parse each for:
Listening ports (supports listen 80; or listen 443 ssl;).
Domain names (server_name; multiple names supported).
Each location block:
Route path (e.g., /api/).
Forward target (proxy_pass) or static path (root / alias).
Associated upstream for load balancing.
Other configuration lines and comments are retained as remarks.
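As a quick illustration of the regex approach described above, here is a minimal, self-contained sketch; the sample configuration text and upstream name are invented for demonstration only:

```python
import re

# Hypothetical sample config used only for demonstration
sample = """
upstream backend {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}
"""

# Same pattern the script uses: capture the upstream name and its body
blocks = re.findall(r'upstream\s+(\w+)\s*\{\s*([^}]*?)\s*\}', sample, re.DOTALL)
upstreams = {}
for name, body in blocks:
    # Each "server host:port;" line inside the body is a backend address
    upstreams[name] = re.findall(r'server\s+([^\s;]+);', body)

print(upstreams)  # {'backend': ['10.0.0.1:8080', '10.0.0.2:8080']}
```

Because the body pattern `[^}]*?` stops at the first closing brace, this approach assumes flat blocks without nested braces.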
2. Generate structured table
Each location becomes a row with fields: Domain/Service, Listening Port, Route‑Location, Target/Path, Load Balancing, and Parameters/Remarks.
3. Export to Excel file
Use pandas.to_excel() to save the DataFrame as nginx_config_table.xlsx, preserving the full structure and supporting filtering, sorting, and sharing.
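The export step can be sketched as follows; the row contents and file name are placeholders, and pandas needs the optional openpyxl package to write `.xlsx` files:

```python
import pandas as pd

# Placeholder rows mimicking the parser's output (one dict per location)
rows = [
    {'Domain/Service': 'example.com', 'Listening Port': '80',
     'Route-Location': '/api/', 'Target/Path': 'http://backend',
     'Load Balancing': '10.0.0.1:8080, 10.0.0.2:8080',
     'Parameters/Remarks': 'proxy_pass http://backend;'},
]

df = pd.DataFrame(rows)
try:
    # index=False drops the numeric row index from the exported sheet
    df.to_excel('nginx_config_table.xlsx', index=False)
except ImportError:
    print("openpyxl is required for .xlsx export: pip install openpyxl")
```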
Full Python Script
import re
import sys

try:
    import pandas as pd
except ImportError:
    sys.exit("Error: pandas not found. Install it with: pip install pandas openpyxl")


def parse_nginx_config(config_content):
    """Parse Nginx configuration, extract upstream and server info."""
    servers = []
    upstreams = {}
    # Extract all upstream blocks and their backend server addresses
    upstream_blocks = re.findall(r'upstream\s+(\w+)\s*\{\s*([^}]*?)\s*\}', config_content, re.DOTALL)
    for upstream_name, upstream_body in upstream_blocks:
        upstream_servers = re.findall(r'server\s+([^\s;]+);', upstream_body)
        upstreams[upstream_name] = ', '.join(upstream_servers)
    # Split the content into server blocks
    server_blocks = re.split(r'\n\s*server\s*\{', config_content)
    for block in server_blocks:
        if not block.strip():
            continue
        server_info = {}
        # Extract listening ports (supports "listen 80;" and "listen 443 ssl;")
        listen_matches = re.findall(r'listen\s+(\d+)(?:\s+ssl)?;', block)
        server_info['Listening Port'] = ', '.join(listen_matches) if listen_matches else None
        # Extract server_name (multiple names supported)
        server_name_match = re.search(r'server_name\s+([^\s;]+(?:\s+[^\s;]+)*);', block)
        if server_name_match:
            server_info['Domain/Service'] = ', '.join(server_name_match.group(1).split())
        else:
            server_info['Domain/Service'] = 'N/A'
        # Extract location blocks
        locations = []
        location_blocks = re.findall(r'location\s+([^\s{]+)\s*\{\s*([^}]*?)\s*\}', block, re.DOTALL)
        for loc_path, loc_content in location_blocks:
            location_info = {
                'Route-Location': loc_path,
                'Target/Path': None,
                'Load Balancing': None,
                'Parameters/Remarks': [],
            }
            # proxy_pass target
            proxy_pass_match = re.search(r'proxy_pass\s+(http://[^\s;/]+)[^\s;]*;', loc_content)
            if proxy_pass_match:
                proxy_target = proxy_pass_match.group(1)
                location_info['Target/Path'] = proxy_target
                # Link the proxy target back to an upstream block, if any
                for upstream_name, servers_list in upstreams.items():
                    if proxy_target.endswith(f"/{upstream_name}"):
                        location_info['Load Balancing'] = servers_list
                        break
            # Static paths (root / alias)
            if not location_info['Target/Path']:
                root_match = re.search(r'root\s+([^\s;]+);', loc_content)
                if root_match:
                    location_info['Target/Path'] = root_match.group(1)
                else:
                    alias_match = re.search(r'alias\s+([^\s;]+);', loc_content)
                    if alias_match:
                        location_info['Target/Path'] = alias_match.group(1)
            if not location_info['Target/Path']:
                location_info['Target/Path'] = 'N/A'
            # Retain the original configuration lines as remarks
            lines = [line.strip() for line in loc_content.strip().split('\n') if line.strip()]
            location_info['Parameters/Remarks'] = '\n'.join(lines)
            locations.append(location_info)
        server_info['locations'] = locations
        servers.append(server_info)
    return servers


def generate_table(servers):
    """Convert parsing results to a Pandas DataFrame, one row per location."""
    rows = []
    for server in servers:
        for loc in server.get('locations', []):
            rows.append({
                'Domain/Service': server.get('Domain/Service'),
                'Listening Port': server.get('Listening Port'),
                'Route-Location': loc.get('Route-Location'),
                'Target/Path': loc.get('Target/Path'),
                'Load Balancing': loc.get('Load Balancing'),
                'Parameters/Remarks': loc.get('Parameters/Remarks'),
            })
    return pd.DataFrame(rows)


def main():
    if len(sys.argv) != 2:
        print("Usage: python3 parse_nginx.py <nginx.conf>")
        return
    file_path = sys.argv[1]
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            config_content = f.read()
    except FileNotFoundError:
        print(f"Error: file {file_path} not found; please check the path.")
        return
    except Exception as e:
        print(f"An error occurred: {e}")
        return
    servers = parse_nginx_config(config_content)
    table = generate_table(servers)
    print(table)
    excel_file_path = 'nginx_config_table.xlsx'
    table.to_excel(excel_file_path, index=False)
    print(f"Nginx configuration parse results saved to {excel_file_path}")


if __name__ == '__main__':
    main()
Usage
Install dependencies: pip install pandas openpyxl (openpyxl is an optional pandas dependency required for .xlsx output).
Run the script: python3 parse_nginx.py nginx.conf.
Application Scenarios
Configuration audit: quickly list all service routes and backend dependencies.
Migration/refactoring: export configuration structure to assist architectural changes.
Team collaboration: transform complex configs into readable Excel tables.
Security checks: identify unauthorized exposed paths or misrouted traffic.
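Once the parsed report is a DataFrame, the same data supports quick audit queries directly in pandas. The column names below match the script's output, while the sample rows are invented for illustration:

```python
import pandas as pd

# Invented sample of the parser's tabular output
df = pd.DataFrame([
    {'Domain/Service': 'example.com', 'Route-Location': '/api/',
     'Target/Path': 'http://backend'},
    {'Domain/Service': 'example.com', 'Route-Location': '/static/',
     'Target/Path': '/var/www/static'},
])

# Audit query: list only routes that proxy to a backend
# (candidates for an exposure review)
proxied = df[df['Target/Path'].str.startswith('http://')]
print(proxied['Route-Location'].tolist())  # ['/api/']
```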
Conclusion
This lightweight Python script enables operations engineers to visualize complex Nginx configurations efficiently, greatly improving configuration management productivity. Because it relies on regular-expression parsing, it works for most standard, flat Nginx setups; blocks containing nested braces (for example, an if block inside a location) may not be matched correctly.