ARM Architecture CPUs and AWS Graviton: Performance and Cost Analysis of M6G Instances
This article examines the rise of ARM‑based CPUs, details AWS's Graviton 2 processor and its deployment in M6G instances, and presents extensive performance, cost, and third‑party benchmark comparisons to illustrate the emerging shift toward ARM in cloud computing.
A New Shift: the ARM Architecture CPU
Since their debut in 1985, ARM CPUs have come to dominate the mobile chip market thanks to low power consumption, strong functionality, and low cost.
With Moore's Law slowing, Intel's generational performance gains are shrinking, while ARM offers lower procurement costs and lower power consumption, which opens a market opportunity.
In 2019, AWS announced its ARM‑based Graviton 2 processor (built on the Neoverse N1 core) for M6G/R6G/C6G instances, making AWS the first major cloud provider to use ARM in mainstream servers. Huawei's Kunpeng 920 and Apple's ARM‑based Macs arrived in the same period.
The ARM‑driven shift is spreading from industrial servers to everyday laptops.
AWS and ARM CPU
Historically, cloud providers relied on Intel Xeon (and some AMD) processors, and x86‑64 dominated the market.
AWS announced that Graviton 2 delivers 7× the performance, 4× the cores, 2× the cache, and 5× faster memory compared to the first‑generation Graviton chip.
AWS Graviton Parameters
Based on the author's testing of AWS M6G instances and the official Neoverse N1 documentation, the second‑generation Graviton chip is built on a 7nm process, with higher transistor density and higher clock speeds, while consuming less power than comparable Intel CPUs.
Intel CPUs, for their part, support a broader instruction set (98 instructions), which gives lower overhead for certain workloads.
Some x86‑64‑dependent applications are not yet compatible with ARM, but ecosystem growth is reducing this gap.
AWS M6G Instance Exploration
M6G Basic Information
M6G uses the Graviton 2 chip. Each vCPU maps to a full physical core rather than a hardware thread, so every vCPU delivers full‑core performance in multithreaded scenarios, an advantage over M5.
Optimized Linux kernels (e.g., Debian 10) and GCC flags further boost performance.
The vCPU‑to‑memory ratio is 1:4 (1 vCPU per 4 GB), ranging from m6g.medium (1 vCPU, 4 GB) to m6g.16xlarge (64 vCPU, 256 GB).
M6G Cost Overhead
According to AWS data, M6G instances are about 20% cheaper than comparable M5 (Intel) instances.
Basic Performance Insights
Test Environment
All test data were collected from AWS‑provided M6G instances running Debian 10 ARM (AWS Marketplace version).
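For readers reproducing this setup, the `CPU implementer` and `CPU part` fields in the listing below are what identify the core design: `0x41` is ARM's implementer code and `0xd0c` is the part number of the Neoverse N1. The following is a minimal sketch of decoding those two fields from `/proc/cpuinfo`‑style text. The function name `identify_cpu` and the two lookup tables are illustrative assumptions, and the tables contain only the values that appear in this article.

```python
# Lookup tables limited to the values seen in this article (assumption:
# extend them yourself if you need to recognise other cores).
IMPLEMENTERS = {0x41: "ARM"}
PARTS = {(0x41, 0xD0C): "Neoverse N1"}


def identify_cpu(cpuinfo_text):
    """Return (implementer, core) names parsed from /proc/cpuinfo-style text."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Keep the first occurrence; multi-core systems repeat each field.
            fields.setdefault(key.strip(), value.strip())
    impl = int(fields["CPU implementer"], 16)
    part = int(fields["CPU part"], 16)
    # Fall back to the raw hex value for anything not in the tables.
    return IMPLEMENTERS.get(impl, hex(impl)), PARTS.get((impl, part), hex(part))


# Sample copied from the CPU parameters recorded in this article.
SAMPLE = """\
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x3
CPU part : 0xd0c
CPU revision : 1
"""

print(identify_cpu(SAMPLE))  # ('ARM', 'Neoverse N1')
```

On a live instance the same function could be fed the contents of `/proc/cpuinfo` instead of the sample string.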
* System kernel version
Linux ip-172-31-1-39 4.19.0-7-arm64 #1 SMP Debian 4.19.87-1 (2019-12-03) aarch64 GNU/Linux
* OS version
10.2
* NIC driver version
ena 2.1.0k
* CPU parameters:
BogoMIPS : 243.75
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x3
CPU part : 0xd0c
CPU revision : 1
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: ARM
Model: 1
Stepping: r3p1
L1d cache: 64K
L1i cache: 64K
L2 cache: 1024K
L3 cache: 32768K
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
Vertical Comparison
Performance measured with pystone shows significant gains.
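To make the measurement method concrete, here is a rough sketch of a pystone‑style single‑core timing run. This is not pystone itself and not the author's test script: `workload` is a hypothetical integer‑heavy stand‑in loop, and the iteration count and repeat count are arbitrary assumptions. Like pystone, it reports a rate (iterations per second) that can be compared across instance types when run with the same interpreter version.

```python
import time


def workload(n):
    """Integer-heavy busy loop; a stand-in, not the pystone benchmark."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def best_rate(n=200_000, repeats=5):
    """Return iterations per second, using the fastest of several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload(n)
        best = min(best, time.perf_counter() - start)
    return n / best


if __name__ == "__main__":
    print(f"{best_rate():,.0f} iterations/s")
```

Taking the fastest of several runs reduces noise from other activity on the host, which matters on shared cloud hardware.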
Horizontal Comparison
CPU compute performance (pystone) comparison.
Network performance comparison shows comparable results to M5.
Compared to A1, M6G single‑core compute performance improves 2.6×.
Compared to M5, M6G single‑core compute is about 89% of M5 (pystone 1.1).
Network performance is on par with M5.
Simulated Business Scenarios
Running real workloads on M6G demonstrates strong performance even without ARM‑specific optimizations.
Test Results
Under equal load, CPU and memory utilization metrics were captured (images omitted).
Summary and Conclusion
Testing shows that M6G matches M5's network performance and trails it by roughly 10% in single‑core pystone performance, while delivering about 2.6× the single‑core performance of the previous A1 generation.
For compute‑intensive workloads, M6G performs on par with or better than M5 even without ARM‑specific tuning.
Pricing is about 20% lower than comparable M5 instances.
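The single‑core and pricing figures can be combined into a rough price‑performance estimate. The sketch below assumes the 89% pystone figure from the horizontal comparison and the roughly 20% price difference quoted above. The helper name `perf_per_dollar` is illustrative, and the result covers single‑core pystone only, not real workloads.

```python
def perf_per_dollar(relative_perf, relative_price):
    """Performance per unit cost, relative to a baseline of 1.0."""
    return relative_perf / relative_price


# M5 is the baseline (1.0, 1.0); M6G values are taken from this article.
M6G_RELATIVE_PERF = 0.89   # single-core pystone, about 89% of M5
M6G_RELATIVE_PRICE = 0.80  # about 20% cheaper than M5

ratio = perf_per_dollar(M6G_RELATIVE_PERF, M6G_RELATIVE_PRICE)
print(f"M6G single-core performance per dollar vs M5: {ratio:.2f}x")  # 1.11x
```

Under these assumptions, even the weakest result for M6G (single‑core pystone) still works out to about 11% more performance per dollar than M5.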
Third‑Party Data
KeyDB reported 1.65× faster performance on m6g.large vs m5.large, and 1.45× on xlarge.
Phoronix published horizontal benchmark results for M6G.
NGINX reported 47%–63% higher request throughput on M6G compared to M5.
NetEase Game Operations Platform
The NetEase Game Automated Operations Platform delivers stable services for thousands of NetEase titles, focusing on efficient ops workflows, intelligent monitoring, and virtualization.