Cloud Computing 12 min read

ARM Architecture CPUs and AWS Graviton: Performance and Cost Analysis of M6G Instances

This article examines the rise of ARM‑based CPUs, details AWS's Graviton 2 processor and its deployment in M6G instances, and presents extensive performance, cost, and third‑party benchmark comparisons to illustrate the emerging shift toward ARM in cloud computing.

NetEase Game Operations Platform

A New Transformation: ARM Architecture CPUs

Since their debut in 1985, ARM CPUs have come to dominate mobile chips thanks to their low power consumption, capable feature set, and low cost.

With Moore's Law slowing, Intel’s performance gains are waning while ARM offers lower procurement costs and power consumption, opening market opportunities.

In 2019, AWS announced its ARM‑based Graviton 2 processor (built on Neoverse N1 cores) for the M6G, R6G, and C6G instance families, making it the first major cloud provider to use ARM in mainstream servers. Huawei’s Kunpeng 920 also debuted in 2019, and Apple’s ARM‑based Macs followed in 2020.

The ARM‑driven shift is expanding from industrial servers to everyday laptops.

AWS and ARM CPU

Historically, cloud providers relied on Intel Xeon (and some AMD) processors, and x86‑64 dominated the market.

AWS states that, compared with the first‑generation Graviton, Graviton 2 delivers 7× the performance, 4× the cores, 2× the cache, and 5× faster memory.

AWS Graviton Parameters

Based on the author’s testing of AWS M6G instances and the official Neoverse N1 documentation, the Graviton 2 chip is built on a 7 nm process, with higher transistor density and higher clock speeds, while consuming less power than comparable Intel CPUs.

Intel's x86‑64 supports a broader instruction set (the author cites 98 instructions), which can mean lower overhead for certain workloads.

Some x86‑64‑dependent applications are not yet compatible with ARM, but ecosystem growth is reducing this gap.

AWS M6G Instance Exploration

M6G Basic Information

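M6G uses the Graviton 2 chip. Each vCPU maps to a full physical core rather than to a hyper‑thread, as it does on the Intel‑based M5, so every vCPU delivers full‑core performance in multithreaded scenarios, where M6G surpasses M5.

Linux distributions with ARM‑optimized kernels (e.g., Debian 10) and ARM‑specific GCC flags further boost performance.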

The vCPU‑to‑memory ratio is 1:4 (4 GB per vCPU), with sizes ranging from m6g.medium (1 vCPU, 4 GB) to m6g.16xlarge (64 vCPUs, 256 GB).
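The fixed 1:4 ratio means memory can be derived from the vCPU count alone. The following is a minimal sketch of that arithmetic. Only the medium and 16xlarge entries come from this article; the intermediate sizes follow AWS's public M6g listing and should be treated as assumptions here, and the helper name `m6g_memory_gb` is purely illustrative.

```python
# vCPU count per m6g size. Only "medium" and "16xlarge" are stated in the
# article; the other entries are assumed from AWS's public M6g size list.
M6G_VCPUS = {
    "medium": 1,
    "large": 2,
    "xlarge": 4,
    "2xlarge": 8,
    "4xlarge": 16,
    "8xlarge": 32,
    "12xlarge": 48,
    "16xlarge": 64,
}

GB_PER_VCPU = 4  # the 1:4 vCPU-to-memory ratio described above


def m6g_memory_gb(size: str) -> int:
    """Return the memory (in GB) implied by the 1:4 ratio for an m6g size."""
    return M6G_VCPUS[size] * GB_PER_VCPU


if __name__ == "__main__":
    for size in M6G_VCPUS:
        print(f"m6g.{size}: {M6G_VCPUS[size]} vCPU, {m6g_memory_gb(size)} GB")
```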

M6G Cost

According to AWS data, M6G instances are about 20% cheaper than comparable M5 (Intel) instances.

Basic Performance Insights

Test Environment

All test data were collected from AWS‑provided M6G instances running Debian 10 ARM (AWS Marketplace version).

* Kernel version
Linux ip-172-31-1-39 4.19.0-7-arm64 #1 SMP Debian 4.19.87-1 (2019-12-03) aarch64 GNU/Linux
* OS version
10.2
* NIC driver version
ena 2.1.0k
* CPU parameters:
BogoMIPS    : 243.75
Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer    : 0x41
CPU architecture: 8
CPU variant    : 0x3
CPU part    : 0xd0c
CPU revision    : 1
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              1
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           ARM
Model:               1
Stepping:            r3p1
L1d cache:           64K
L1i cache:           64K
L2 cache:            1024K
L3 cache:            32768K
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
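The raw identifiers in this dump can be decoded programmatically. The sketch below parses "key : value" lines of the kind shown above and maps implementer 0x41 to ARM and part 0xd0c to Neoverse N1. The lookup tables are deliberately tiny and not exhaustive, and the function names (`parse_kv`, `describe_cpu`) are illustrative rather than part of any tool. It also reads "Thread(s) per core" to confirm that each vCPU is a full core and not an SMT thread.

```python
# Minimal lookup tables; only the IDs relevant to this article are listed.
IMPLEMENTERS = {0x41: "ARM"}
ARM_PARTS = {0xD08: "Cortex-A72", 0xD0C: "Neoverse N1"}


def parse_kv(text: str) -> dict:
    """Parse 'key : value' lines as printed by /proc/cpuinfo or lscpu."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def describe_cpu(text: str) -> dict:
    """Decode vendor, core type, and SMT status from a cpuinfo/lscpu dump."""
    f = parse_kv(text)
    implementer = int(f["CPU implementer"], 16)
    part = int(f["CPU part"], 16)
    vendor = IMPLEMENTERS.get(implementer, hex(implementer))
    if implementer == 0x41:
        core = ARM_PARTS.get(part, hex(part))
    else:
        core = hex(part)
    threads_per_core = int(f.get("Thread(s) per core", "1"))
    return {"vendor": vendor, "core": core, "smt": threads_per_core > 1}


# Excerpt of the dump from the test instance above.
SAMPLE = """\
CPU implementer    : 0x41
CPU part    : 0xd0c
Thread(s) per core:  1
"""

if __name__ == "__main__":
    print(describe_cpu(SAMPLE))
```

On a live instance, the same function could be fed the concatenated contents of /proc/cpuinfo and the output of lscpu instead of the `SAMPLE` string.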

Vertical Comparison

Performance measured with pystone shows significant gains.
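For readers who want to reproduce this kind of single-core measurement, here is a minimal timing sketch. It is not pystone: pystone shipped in older Python releases as `test.pystone` and is absent from current ones, so the workload below is a hypothetical stand-in, and its scores are not comparable to the pystone figures in this article. It only illustrates the method of running a fixed single-threaded workload and reporting iterations per second.

```python
import time


def tiny_workload(n: int) -> int:
    """Integer- and list-heavy loop; a stand-in for pystone, not pystone itself."""
    total = 0
    items = list(range(64))
    for i in range(n):
        total += items[i % 64] * (i & 7)
    return total


def passes_per_second(loops: int = 200_000, repeats: int = 3) -> float:
    """Run the workload several times and score the fastest run."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        tiny_workload(loops)
        best = min(best, time.perf_counter() - start)
    return loops / best


if __name__ == "__main__":
    print(f"{passes_per_second():,.0f} iterations/second (single core)")
```

Running the same script on two instance types and dividing the scores gives a relative single-core figure, which is how ratios such as "X% of M5" are typically derived.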

Horizontal Comparison

CPU compute performance (pystone) comparison.

Network performance comparison shows comparable results to M5.

Compared to A1, M6G single‑core compute performance improves 2.6×.

Compared to M5, M6G single‑core compute is about 89% of M5 (pystone 1.1).

Network performance is on par with M5.

Simulated Business Scenarios

Running real workloads on M6G demonstrates strong performance even without ARM‑specific optimizations.

Test Results

Under equal load, CPU and memory utilization metrics were captured (images omitted).

Summary and Conclusion

Testing shows that M6G matches M5’s network performance and trails it by roughly 11% in single‑core pystone compute (about 89% of M5), while delivering about 2.6× the single‑core performance of the previous‑generation A1.

For compute‑intensive workloads, M6G performs on par with or better than M5, even without ARM‑specific tuning.

Pricing is about 20% lower than comparable M5 instances.
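Combining the two headline figures gives a rough price‑performance estimate. The sketch below divides relative performance by relative price, using the article's approximate numbers: single‑core pystone at 89% of M5 and a price about 20% lower. Both inputs are rounded, the result applies to single‑core pystone only, and the function name `perf_per_dollar` is illustrative.

```python
def perf_per_dollar(relative_perf: float, relative_price: float) -> float:
    """Performance per unit cost, relative to a baseline of 1.0.

    relative_perf:  candidate performance / baseline performance
    relative_price: candidate price / baseline price
    """
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_perf / relative_price


if __name__ == "__main__":
    # Article figures: pystone at ~89% of M5, price ~20% lower (0.80 of M5).
    ratio = perf_per_dollar(0.89, 0.80)
    print(f"M6G single-core pystone per dollar vs M5: {ratio:.2f}x")
```

With these inputs the ratio works out to about 1.11, meaning that even where M6G is slower per core, it still does roughly 11% more pystone work per dollar than M5 under the stated assumptions.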

Third‑Party Data

KeyDB reported 1.65× the performance on m6g.large compared with m5.large, and 1.45× at the xlarge size.

Phoronix published horizontal benchmark results for M6G.

NGINX reported 47%–63% higher request throughput on M6G compared to M5.

Written by

NetEase Game Operations Platform

The NetEase Game Automated Operations Platform delivers stable services for thousands of NetEase titles, focusing on efficient ops workflows, intelligent monitoring, and virtualization.
