Unlocking AI Scale‑Up: Inside SUE, OISA, ALS and ETH+ High‑Performance Interconnects

This article introduces four cutting‑edge AI networking technologies—SUE, OISA, ALS, and ETH+—detailing their backgrounds, architectural designs, and performance enhancements that enable ultra‑high bandwidth, low‑latency, and scalable interconnects for modern AI compute clusters.

1. SUE (Scale Up Ethernet)

SUE, proposed by Broadcom, is a new interconnect framework that brings the advantages of Ethernet into the Scale‑Up domain of AI systems, enabling high‑speed, reliable, and open communication between XPU chips (including GPUs).

(a) Background and Goals

The SUE framework lets XPU clusters scale to rack or multi‑rack size, supporting large datasets, deep neural network training, and parallel workloads. It builds the transport and data‑link layers on Ethernet and moves memory transactions directly between XPUs. It supports single‑hop switched or direct‑mesh topologies with flexible port configurations per instance (1, 2, or 4 ports; e.g., 800G can be split into 1×800G, 2×400G, or 4×200G). Multiple SUE instances per XPU deliver ultra‑high bandwidth (e.g., 64 XPUs, each with 12 × 800G SUE instances, gives 9.6 Tbps of Scale‑Up bandwidth per XPU).
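As a quick sanity check on the figures above, the short script below (a sketch using only the numbers quoted in this section) verifies that every port split preserves the 800G rate and that 12 instances of 800G add up to 9.6 Tbps per XPU.

```python
# Back-of-the-envelope check of the SUE bandwidth figures quoted above.
# All values come from the article; the script only does the arithmetic.

PORT_SPEED_GBPS = 800          # one SUE instance exposes an 800G Ethernet pipe
SPLIT_OPTIONS = {              # flexible port configurations per instance
    "1x800G": (1, 800),
    "2x400G": (2, 400),
    "4x200G": (4, 200),
}
INSTANCES_PER_XPU = 12         # e.g. 12 SUE instances per XPU
NUM_XPUS = 64                  # single-hop Scale-Up domain size in the example

for name, (ports, speed) in SPLIT_OPTIONS.items():
    assert ports * speed == PORT_SPEED_GBPS, "every split preserves total rate"

per_xpu_tbps = INSTANCES_PER_XPU * PORT_SPEED_GBPS / 1000
print(f"Scale-Up bandwidth per XPU: {per_xpu_tbps:.1f} Tbps")        # 9.6 Tbps
print(f"Aggregate injection bandwidth of the {NUM_XPUS}-XPU domain: "
      f"{NUM_XPUS * per_xpu_tbps:.0f} Tbps")
```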

(b) Technical Architecture

SUE exposes an AXI‑like full‑duplex interface, uses virtual channels (VCs) to map transactions onto traffic classes, and supports both strictly ordered and unordered transmission modes. Its protocol stack has three layers:

Mapping & Packing Layer: aggregates transactions to the same target (XPU, VC) into a maximum 4096‑byte SUE PDU.

Transport Layer: adds a reliability header (RH) carrying a packet sequence number (PSN), the VC, and an acknowledgment sequence number (RPSN), plus a 32‑bit CRC. Retransmission uses a simplified Go‑Back‑N scheme, combined with PFC/credit‑based flow control and link‑level retry (a header‑packing sketch follows this list).

Network Layer: supports standard Ethernet/IPv4/IPv6/UDP, optimized AI forwarding headers (AFH Gen1) and highly compressed AFH Gen2 (6‑12 bytes) to reduce overhead.
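
To make the transport layer concrete, here is a minimal sketch of packing a SUE‑style reliability header with a trailing CRC‑32. The article names the fields (PSN, VC, RPSN) and the 32‑bit CRC but not their widths or byte layout, so the layout below is an assumption for illustration only; in the real protocol the PSN is what the Go‑Back‑N retransmission keys on.

```python
# Minimal sketch of a SUE-style reliability header (RH), for illustration only.
# Field widths and byte layout are assumptions; the article only names the
# fields (PSN, VC, acknowledged RPSN) and a 32-bit CRC.
import struct
import zlib

def pack_sue_pdu(psn: int, vc: int, rpsn: int, payload: bytes) -> bytes:
    """Prepend an assumed reliability header and append a CRC-32."""
    assert len(payload) <= 4096, "a SUE PDU carries at most 4096 bytes"
    # Hypothetical layout: 24-bit PSN, 8-bit VC, 24-bit RPSN, 8 reserved bits.
    header = struct.pack(">II",
                         ((psn & 0xFFFFFF) << 8) | (vc & 0xFF),
                         (rpsn & 0xFFFFFF) << 8)
    body = header + payload
    return body + struct.pack(">I", zlib.crc32(body))

def check_sue_pdu(pdu: bytes) -> bool:
    """Receiver-side integrity check: recompute CRC-32 over header + payload."""
    body, crc = pdu[:-4], struct.unpack(">I", pdu[-4:])[0]
    return zlib.crc32(body) == crc

pdu = pack_sue_pdu(psn=42, vc=3, rpsn=40, payload=b"\x00" * 256)
assert check_sue_pdu(pdu)
```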

SUE provides three interface types: an XPU command interface (FIFO credit or AXI4), an XPU management interface (AXI‑based register configuration), and an Ethernet interface (200G/100G SerDes, compatible with PFC/CBFC and LLR, with dynamic switch‑over away from faulty links).

Transactions are packed in real time into a single Ethernet frame (up to 2 KB) without adding latency (a packing sketch follows below). The design targets end‑to‑end RTT below 2 µs, supports up to 1,024 XPUs in a single‑hop network, and achieves sub‑520 ns one‑way latency over 10 m of hollow‑core fiber.
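The packing idea itself is simple: coalesce small transactions headed to the same destination until the next one would overflow the frame budget, then flush. In the sketch below, only the 2 KB frame budget comes from the text; the greedy flush policy is a simplifying assumption.

```python
# Sketch of coalescing small transactions into one Ethernet frame (<= 2 KB).
# The 2 KB budget comes from the article; the greedy flush policy is assumed.
from typing import List

FRAME_BUDGET = 2048  # bytes of transaction payload per Ethernet frame

def pack_transactions(transactions: List[bytes]) -> List[bytes]:
    """Greedily pack transactions into frames of at most FRAME_BUDGET bytes."""
    frames, current = [], b""
    for txn in transactions:
        assert len(txn) <= FRAME_BUDGET, "single transaction exceeds the budget"
        if current and len(current) + len(txn) > FRAME_BUDGET:
            frames.append(current)      # flush the full frame
            current = b""
        current += txn
    if current:
        frames.append(current)
    return frames

# 50 transactions of 128 B each travel in 4 frames instead of 50 small ones.
frames = pack_transactions([b"\xAA" * 128] * 50)
print(len(frames), [len(f) for f in frames])
```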

[Figure: SUE architecture diagram]

2. OISA (Omni‑directional Intelligent Sensing Express Architecture)

OISA, introduced by China Mobile, is an open GPU interconnect protocol aimed at breaking the communication wall in trillion‑parameter AI model training.

(a) Background and Goals

Training massive models requires frequent GPU data exchange, and communication overhead limits linear scaling of compute power. OISA seeks to provide an efficient, intelligent, flexible, and open GPU‑to‑GPU interconnect supporting large‑model training, inference, and HPC workloads.

(b) Protocol Architecture

OISA adopts a layered design: transaction layer, data layer, and physical layer.

Transaction Layer: encapsulates data; supports message, memory, and multi‑semantic modes; and introduces selective‑repeat (SR) retransmission for higher efficiency in Scale‑Up scenarios (a minimal SR receiver sketch follows this list).

Data Layer: defines flow‑aware packet structures, enabling dynamic link resource and rate adjustments, and incorporates CBFC and PFC flow control plus data‑layer retransmission.

Physical Layer: split into logical and electrical sub‑layers; the logical layer handles encoding and timing, while the electrical layer converts to signals, ensuring compatibility.
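
To see why selective repeat helps, compare it with Go‑Back‑N: after a single loss, Go‑Back‑N retransmits every subsequent packet, while an SR receiver buffers out‑of‑order arrivals and asks only for the holes. The sketch below is a generic textbook SR receiver, not OISA's actual wire behavior.

```python
# Minimal selective-repeat (SR) receiver sketch: only missing packets are
# re-requested, unlike Go-Back-N, which retransmits everything after a gap.
# Generic illustration only, not OISA's actual retransmission logic.
from typing import Dict, List, Tuple

class SelectiveRepeatReceiver:
    def __init__(self) -> None:
        self.expected = 0                    # next in-order sequence number
        self.buffer: Dict[int, bytes] = {}   # out-of-order packets held back

    def receive(self, seq: int, data: bytes) -> Tuple[List[bytes], List[int]]:
        """Return (in-order data released upward, missing seqs to re-request)."""
        if seq >= self.expected:
            self.buffer[seq] = data
        delivered = []
        while self.expected in self.buffer:            # release contiguous run
            delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        missing = [s for s in range(self.expected, max(self.buffer, default=-1) + 1)
                   if s not in self.buffer]             # holes still outstanding
        return delivered, missing

rx = SelectiveRepeatReceiver()
rx.receive(0, b"p0")
print(rx.receive(2, b"p2"))   # ([], [1])  -> only packet 1 is re-requested
print(rx.receive(1, b"p1"))   # ([b'p1', b'p2'], [])
```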

OISA interfaces support AXI Stream for high‑speed data and AXI Lite for control, and can interoperate with vendor‑specific GPU interfaces. The architecture delivers high‑speed, low‑latency, lossless, and reliable GPU communication.

[Figure: OISA architecture diagram]

3. ALS (ALink System)

At the 2024 ODCC Open Data Center Conference, Alibaba Cloud and its partners launched ALS, an open ecosystem for AI network interconnects that supports the UALink protocol, addressing the ultra‑high‑bandwidth and low‑latency challenges of Scale‑Up.

ALS‑D (data plane) uses UALink, offering native memory‑semantic access, GPU memory sharing, and switch‑based networking, delivering ultra‑high bandwidth, ultra‑low latency, and in‑network computing features.

ALS‑M (management plane) provides standardized access for various chips, supporting both open ecosystem and proprietary interconnects, and offers flexible single‑tenant or multi‑tenant configurations for cloud‑scale AI cluster management.
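To unpack "native memory‑semantic access" on the ALS‑D data plane: an accelerator reads and writes a peer's exported memory with load/store‑style operations rather than message passing. The toy sketch below only illustrates that idea; the request layout, opcodes, and the RemoteMemoryWindow helper are hypothetical and are not UALink's actual encoding.

```python
# Toy illustration of memory-semantic access: remote GPU memory is addressed
# with loads/stores instead of explicit send/receive messages. The opcodes and
# request layout below are hypothetical, not UALink's actual encoding.
from dataclasses import dataclass

@dataclass
class MemRequest:
    op: str          # "read" or "write"
    target_gpu: int  # destination accelerator in the Scale-Up domain
    addr: int        # byte offset in the target's exported memory window
    data: bytes = b""

class RemoteMemoryWindow:
    """Stand-in for a peer GPU's memory segment exported over the fabric."""
    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)

    def handle(self, req: MemRequest) -> bytes:
        if req.op == "write":
            self.mem[req.addr:req.addr + len(req.data)] = req.data
            return b""
        return bytes(self.mem[req.addr:req.addr + 8])   # fixed 8-byte read

peer = RemoteMemoryWindow(4096)
peer.handle(MemRequest("write", target_gpu=1, addr=0x100, data=b"gradient"))
print(peer.handle(MemRequest("read", target_gpu=1, addr=0x100)))  # b'gradient'
```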

4. ETH+ (High‑Throughput Ethernet)

Developed by the ETH+ Consortium (including the Institute of Computing Technology, Chinese Academy of Sciences, Alibaba Cloud, etc.), ETH+ is a new Ethernet protocol released in September 2024 (v1.0) and updated to v1.1 in August 2025.

Key enhancements:

Frame‑format optimization raises the effective payload ratio to as much as 74 % (a baseline payload‑ratio calculation follows this list).

Link‑layer and physical‑layer retransmission mechanisms provide fast loss recovery, reducing end‑to‑end latency.

Integration of RDMA and in‑network computing enables direct remote memory access and on‑the‑fly aggregation, boosting collective communication performance by over 30 % for AI workloads.
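
For context on what "effective payload ratio" measures, the baseline calculation for a plain Ethernet/IP/UDP frame is payload divided by payload plus all per‑frame overhead. The constants below are standard Ethernet numbers; the article does not detail ETH+'s own frame changes, so only this baseline arithmetic is shown.

```python
# Effective payload ratio = payload / (payload + per-frame overhead).
# Overhead figures are standard Ethernet constants; ETH+'s own frame-format
# changes are not spelled out in the article, so only the baseline is shown.
PREAMBLE_SFD = 8        # bytes
ETH_HEADER = 14         # destination MAC + source MAC + EtherType
FCS = 4                 # frame check sequence
IFG = 12                # inter-frame gap (in byte times)
L3_L4_HEADERS = 20 + 8  # e.g. IPv4 + UDP when an IP/UDP encapsulation is used

def payload_ratio(payload: int, extra_headers: int = L3_L4_HEADERS) -> float:
    overhead = PREAMBLE_SFD + ETH_HEADER + extra_headers + FCS + IFG
    return payload / (payload + overhead)

# Small messages are where framing overhead hurts most:
for size in (64, 256, 1024, 4096):
    print(f"{size:5d} B payload -> {payload_ratio(size):.0%} effective")
```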

ETH+ also introduced the first domestic 400 G smart NIC chip, a 25.6 T switch chip, and silicon‑photonic components, supporting up to 64‑node super‑nodes.
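As a small sanity check on those figures, a 25.6 Tbps switch carved into 400G ports yields exactly 64 ports, which lines up with the 64‑node super‑node scale (an observation for orientation, not a stated design rationale):

```python
# Sanity check: a 25.6 Tbps switch carved into 400G ports yields 64 ports,
# consistent with the 64-node super-node scale mentioned above.
SWITCH_CAPACITY_GBPS = 25_600
PORT_SPEED_GBPS = 400
print(SWITCH_CAPACITY_GBPS // PORT_SPEED_GBPS)   # 64
```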

[Figure: ETH+ overview]

These technologies collectively advance AI‑centric high‑performance networking, enabling scalable, low‑latency, and bandwidth‑rich interconnects for future AI compute clusters.

Tags: high performance computing, Scale‑Up, Ethernet, Interconnect, AI networking