Why X86 Bare‑Metal Services Matter and How to Build Them in the Cloud
This article explains why X86 bare‑metal services are essential for high‑performance, security‑critical workloads, describes their architecture and management processes, and outlines the steps—standardization, automation, service‑orientation, and self‑service—used by Hengfeng Bank to implement and operate them.
Why Is Bare‑Metal Service Needed?
Most public clouds, such as AWS and Alibaba Cloud, provide virtualized instances. Many workloads at Hengfeng Bank, however, cannot run efficiently or securely on virtual machines and require physical X86 servers. These include high‑performance computing, core databases, hardware‑dependent applications, and services subject to regulatory requirements.
What Is Bare‑Metal Service?
Bare‑metal service offers managed X86 physical servers that can be operated like virtual machines while retaining the benefits of cloud services such as storage, networking, and security. The service is split into two parts: a user‑facing side for lifecycle, metering, monitoring, and fault reporting, and a provider‑facing side for resource‑pool, specification, and maintenance management.
Implementation Steps
Standardization
Define uniform server models, network cards, and disks, and standardize rack‑level wiring so that new physical resources can be racked and brought online quickly.
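As a rough illustration of what standardization buys, a discovered server can be compared against a small catalog of approved configurations. This is only a sketch: the flavor names, model names, and the ServerSpec fields are hypothetical and do not come from Hengfeng Bank's actual catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hardware attributes that the standard fixes (illustrative fields)."""
    model: str
    nic: str
    disk: str


# Hypothetical catalog of approved configurations, keyed by flavor name.
APPROVED_SPECS = {
    "compute-std": ServerSpec(model="model-A", nic="2x10GbE", disk="2x480GB-SSD"),
    "db-std": ServerSpec(model="model-B", nic="2x25GbE", disk="8x1.92TB-SSD"),
}


def matches_standard(flavor: str, discovered: ServerSpec) -> bool:
    """Return True only if the discovered hardware equals the approved spec."""
    expected = APPROVED_SPECS.get(flavor)
    return expected is not None and expected == discovered
```

A server that fails this check would be flagged before it enters a resource pool, which keeps every machine of a given flavor interchangeable.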
Automation
Automate OS installation, hardware discovery, and routine operations (start, stop, reboot, destroy). Automate tenant network configuration using SDN and provide automated storage provisioning for both local disks and shared storage.
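The routine operations listed above can be thought of as a small state machine. The sketch below is one possible model, not the actual implementation: the state names ("available", "active", "stopped") and the "deploy" action standing in for automated OS installation are assumptions made for illustration.

```python
# Hypothetical lifecycle states and actions; the names are illustrative only.
TRANSITIONS = {
    ("available", "deploy"): "active",    # automated OS installation
    ("active", "stop"): "stopped",
    ("stopped", "start"): "active",
    ("active", "reboot"): "active",
    ("active", "destroy"): "available",   # cleaned and returned to the pool
    ("stopped", "destroy"): "available",
}


def apply_action(state: str, action: str) -> str:
    """Return the next state, or raise ValueError if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} a server in state {state!r}") from None


def run(actions, state="available"):
    """Apply a sequence of actions and return the final state."""
    for action in actions:
        state = apply_action(state, action)
    return state
```

Rejecting invalid transitions up front (for example, rebooting a server that was never deployed) is what makes these operations safe to expose to automation.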
Service‑Orientation
Package standardized and automated resources into service APIs, create resource pools, and integrate with other cloud services (e.g., cloud storage, networking, security) through orchestration.
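A resource pool can be sketched as a simple allocator that hands out free servers of a requested flavor and takes them back on release. The ResourcePool class, its method names, and the server identifiers are hypothetical; a real pool would sit behind the service API and persist its state.

```python
class ResourcePool:
    """In-memory sketch of a bare-metal resource pool grouped by flavor."""

    def __init__(self, servers):
        # servers: mapping of flavor name -> iterable of server IDs
        self._free = {flavor: list(ids) for flavor, ids in servers.items()}
        self._allocated = {}  # server ID -> (tenant, flavor)

    def allocate(self, tenant: str, flavor: str) -> str:
        """Assign the next free server of the given flavor to a tenant."""
        free = self._free.get(flavor, [])
        if not free:
            raise LookupError(f"no free servers of flavor {flavor!r}")
        server_id = free.pop(0)
        self._allocated[server_id] = (tenant, flavor)
        return server_id

    def release(self, server_id: str) -> None:
        """Return a server to the free list of its flavor."""
        _tenant, flavor = self._allocated.pop(server_id)
        self._free[flavor].append(server_id)

    def free_count(self, flavor: str) -> int:
        return len(self._free.get(flavor, []))
```

Because standardized servers of one flavor are interchangeable, the pool only needs to track counts and ownership, not individual hardware differences.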
Self‑Service
Expose APIs and a service catalog so that users can request servers, adjust quotas, and manage their own machines, with multi‑tenant isolation and SLAs.
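Self-service requests are typically gated by per-tenant quotas. The following is a minimal sketch under assumed names (TenantQuotas, QuotaExceeded, and the tenant identifiers are all hypothetical); it only shows how a request is checked against a limit and how a limit can be adjusted.

```python
class QuotaExceeded(Exception):
    """Raised when a request would push a tenant past its server quota."""


class TenantQuotas:
    """Tracks per-tenant server limits and usage (illustrative only)."""

    def __init__(self, limits):
        self._limits = dict(limits)
        self._used = {tenant: 0 for tenant in limits}

    def request(self, tenant: str, count: int = 1) -> None:
        """Reserve `count` servers for a tenant if the quota allows it."""
        if tenant not in self._limits:
            raise KeyError(f"unknown tenant {tenant!r}")
        if self._used[tenant] + count > self._limits[tenant]:
            raise QuotaExceeded(f"{tenant!r} would exceed its quota")
        self._used[tenant] += count

    def adjust(self, tenant: str, new_limit: int) -> None:
        """Change a tenant's limit; it may not drop below current usage."""
        if new_limit < self._used.get(tenant, 0):
            raise ValueError("new limit is below current usage")
        self._limits[tenant] = new_limit
        self._used.setdefault(tenant, 0)

    def remaining(self, tenant: str) -> int:
        return self._limits[tenant] - self._used[tenant]
```

Each tenant's usage is counted separately, so one tenant exhausting its quota does not affect what another tenant can request.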
Architecture Overview
The solution uses the Mogan project (similar to OpenStack Ironic) to provide a dedicated bare‑metal API layer, resource‑pool management, and driver plug‑ins contributed by Huawei and Intel. It integrates with OpenStack services for networking, storage, and monitoring.
Operations
Provisioned servers are accessed securely via VPN and 4A (account, authentication, authorization, audit) console tools. Monitoring combines Zabbix, SNMP hardware metrics, and network traffic mirroring. Images are kept up to date with regular security patches, and resource pools are resized dynamically based on usage to optimize cost.
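One simple way to adjust a pool based on usage is to size it so that the servers in use make up a target share of the pool. The sketch below assumes a 70% target utilization purely for illustration; the article does not state the policy actually used, and the function names are hypothetical.

```python
def recommended_pool_size(in_use: int, target_utilization_pct: int = 70) -> int:
    """Smallest pool size that keeps utilization at or below the target.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if in_use < 0:
        raise ValueError("in_use must be non-negative")
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target_utilization_pct must be in (0, 100]")
    return -(-in_use * 100 // target_utilization_pct)


def pool_adjustment(current_size: int, in_use: int,
                    target_utilization_pct: int = 70) -> int:
    """Positive result: servers to add. Negative result: servers to free up."""
    return recommended_pool_size(in_use, target_utilization_pct) - current_size
```

With 35 servers in use and a 70% target, the recommended pool size is 50, so a pool of 60 could give back 10 servers while a pool of 40 would need 10 more.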
Current Status
Typical provisioning time is 5–10 minutes, and 99.9% availability is achieved through high‑availability group scheduling.
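To put the availability figure in concrete terms, it can be converted into a yearly downtime budget. This is plain arithmetic rather than anything specific to the bank's setup; the function name is hypothetical and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year permitted by an availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (100 - availability_pct) / 100 * HOURS_PER_YEAR
```

For 99.9% availability this works out to roughly 8.76 hours of downtime per year.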
Future Plans
Improve provisioning speed by combining image‑based deployment with dynamic configuration, strengthen lifecycle management for the 3–5 year server refresh cycle, provide rapid fault recovery, enable OS re‑installation, and offer web‑based console access for users.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.