Cloud Computing 32 min read

Rethinking Cloud Computing: How Alibaba’s CIPU Redefines Compute Power

This article revisits cloud computing by tracing the evolution of compute power, exploring Alibaba Cloud’s infrastructure breakthroughs such as the CIPU processor and its core platforms, and analyzing how these advances reshape elastic, big‑data, high‑performance, and AI workloads while highlighting trust, cost, and self‑service challenges.

Alibaba Cloud Developer

Origins

The author’s renewed understanding of cloud computing stems from three touchpoints: reading two insightful articles (Daoge’s "My Understanding of Computing" and Wu Jun’s "China’s Computing Power: Risks and Opportunities"), studying Wang Jian’s book "Online", and completing Alibaba Cloud’s internal AEPC exam, which offered a comprehensive view of Alibaba’s product ecosystem.

What Is Compute Power?

Compute power is the ability of a device to process data and produce results. It is commonly measured in FLOPS (floating‑point operations per second). It differs from "computing" itself, which is the broader concept of performing calculations. Cloud computing is fundamentally a service built around compute power.
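
As a rough illustration of how a FLOPS figure is derived, the sketch below estimates the theoretical peak of a CPU server as sockets × cores × clock rate × floating‑point operations per cycle. The function names and the hardware numbers are hypothetical, chosen only to show the arithmetic; sustained throughput on real workloads is usually well below the theoretical peak.

```python
def peak_flops(sockets: int, cores_per_socket: int,
               clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak floating-point operations per second for one server."""
    return sockets * cores_per_socket * clock_hz * flops_per_cycle


def format_flops(flops: float) -> str:
    """Render a FLOPS value with a metric prefix (G, T, P, E)."""
    for prefix, scale in (("E", 1e18), ("P", 1e15), ("T", 1e12), ("G", 1e9)):
        if flops >= scale:
            return f"{flops / scale:.2f} {prefix}FLOPS"
    return f"{flops:.0f} FLOPS"


# Hypothetical server (assumed values, not a real product):
# 2 sockets, 64 cores per socket, 2.5 GHz, 32 FLOPs per cycle per core.
server = peak_flops(sockets=2, cores_per_socket=64,
                    clock_hz=2.5e9, flops_per_cycle=32)
print(format_flops(server))
```

The same helper can be used to put platform‑level figures in perspective, for example by dividing a quoted aggregate figure by the per‑server estimate to get an order‑of‑magnitude server count.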

Alibaba Cloud’s Vision

From its inception, Alibaba Cloud embraced the belief that computing should be a public service, aiming to make an entire data center behave like a single computer. Its core missions have been to expand compute‑infrastructure scale, improve compute‑management efficiency, and provide abundant compute products to simplify computation.

Evolution of Compute Power

Compute power has evolved along two dimensions:

Physical evolution : From vacuum‑tube computers (1946) to transistor‑based machines (1958), integrated circuits (1964), and modern semiconductor chips, whose transistor density has roughly followed Moore’s Law. Today, CPUs (x86, ARM), GPUs, ASICs, FPGAs, and DPUs constitute the hardware backbone.

Commercial evolution : From mainframe leasing, to personal computers, to data‑center clusters (IDC), and finally to on‑demand cloud computing, with edge and IoT extending compute everywhere.

Current compute power falls into three categories:

General‑purpose compute: CPU‑based servers.

Intelligent compute: Accelerated platforms (GPU, FPGA, ASIC) for AI training and inference.

Super‑computing: High‑performance clusters for massive parallel workloads.

Key milestones:

July 2019 – Feitian basic‑compute platform scaled to 100 000 servers, forming a massive compute pool.

30 August 2022 – Alibaba Cloud launched the Feitian AI platform (15 EFLOPS), billed at launch as the world’s largest AI compute platform.

3 November 2022 – Alibaba announced that its self‑designed CPU, the “Yitian‑710” (first unveiled in 2021), had been deployed at scale in Alibaba Cloud data centers.

Evolution of Computing

Computing has progressed from ancient manual tools (abacus) to mechanical calculators (Pascal, Leibniz, Babbage), to electronic computers, and finally to modern cloud‑based architectures. Notable theoretical milestones include Hilbert’s 23 problems, Gödel’s incompleteness theorems, Church’s lambda calculus, and Turing’s universal machine, which laid the foundation for modern computer science.

Von Neumann’s 1945 report introduced the stored‑program architecture that still underpins most computers today. Emerging paradigms such as quantum computing, photonic computing, and compute‑in‑memory are challenging the von Neumann model.

Cloud Computing Evolution

Cloud computing emerged as an internet‑scale evolution of computing, delivering shared resources over the network. Early milestones include McCarthy’s 1960s vision of computing as a public utility, the 1997 coining of "cloud computing" by Ramnath Chellappa, Amazon’s 2006 S3 launch, and the subsequent rise of AWS EC2, Google Cloud, and Alibaba Cloud.

John McCarthy (1960s): Predicted that computing may someday be organized as a public utility.

Ramnath Chellappa (1997): First academic use of the term "cloud computing".

Amazon S3 (2006): Early cloud storage offering that helped pioneer public cloud services.

Alibaba Cloud and Cloud Computing

Alibaba Cloud’s core belief is that computing is a public service. Its infrastructure strategy focuses on three layers:

Compute (Shenlong) : Manages and schedules CPU/GPU resources with near‑zero virtualization loss.

Storage (Pangu) : Distributed, fault‑tolerant storage platform supporting billions of files and high IOPS.

Network (Luoshen) : Large‑scale virtualized L2 network enabling seamless VM migration across data centers.

The Cloud Infrastructure Processing Unit (CIPU) is a custom ASIC designed for the Feitian operating system. It abstracts physical servers into cloud‑native resources, accelerates compute, storage, and network operations, and provides elastic RDMA capabilities.

Shenlong Compute

Shenlong replaces software‑only virtualization with dedicated hardware, reducing virtualization overhead to near zero so that almost all physical resources can be allocated directly to user workloads.

Pangu Storage

Pangu unifies various storage services (OSS, EBS, NAS, OTS, ODPS, DFS) under a distributed architecture that offers elastic scaling, automatic load balancing, and high reliability.

Luoshen Network

Luoshen evolved from a data‑center L2 network (2010) to a global WAN (2016‑2020) and now to an intelligent cloud‑edge‑device network (2020‑present), supporting massive multi‑tenant virtualization.

Compute Workloads in Alibaba Cloud

Elastic Compute : Scalable ECS, bare‑metal, and cloud desktop services that adjust resources on demand.

Big‑Data Compute : MaxCompute (offline) and Hologres (real‑time) warehouses, integrated with Flink for stream‑batch convergence and with PAI for AI‑enhanced analytics.

High‑Performance Compute : Graphic Computing Service (GCS) for cloud gaming, metaverse rendering, and scientific visualization.

Intelligent Compute : PAI, AI chatbots, recommendation engines, and large‑model training platforms (e.g., GPT‑3.5‑scale clusters).

Complex System Compute : Edge‑cloud‑device integration covering IoT, edge cloud, and distributed AI workloads.
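
The on‑demand adjustment described under Elastic Compute can be sketched with a simple proportional scaling rule: resize the fleet so that average utilization moves toward a target, within fixed bounds. This is a generic illustration, not Alibaba Cloud’s actual scaling algorithm; the function name, parameters, and thresholds are all hypothetical.

```python
import math


def desired_instances(current: int, cpu_percent: int, target_percent: int,
                      min_instances: int = 1, max_instances: int = 100) -> int:
    """Proportional rule: pick a fleet size that brings utilization near the target."""
    if current < 1 or target_percent <= 0:
        raise ValueError("current must be >= 1 and target_percent must be > 0")
    # Round up so the fleet is never undersized for the observed load.
    raw = math.ceil(current * cpu_percent / target_percent)
    # Clamp to the configured bounds (assumed limits, chosen for illustration).
    return max(min_instances, min(max_instances, raw))


# Load spike: 10 instances at 90% CPU with a 60% target leads to scaling out.
scale_out = desired_instances(current=10, cpu_percent=90, target_percent=60)
# Quiet period: 10 instances at 20% CPU with a 60% target leads to scaling in.
scale_in = desired_instances(current=10, cpu_percent=20, target_percent=60)
print(scale_out, scale_in)
```

Rounding up and clamping are deliberate choices in this sketch: the first favors availability over cost, and the second keeps a misreported metric from shrinking the fleet to zero or growing it without limit.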

Trust, Cost, and Self‑Service

Cloud computing is fundamentally a trust‑based business; users must rely on providers for security, stability, and clear service boundaries. Cost efficiency is crucial: cloud should become as cheap as electricity, turning fixed IT expenses into variable, usage‑based costs. Self‑service capabilities (APIs, consoles, documentation) empower users to provision resources instantly, analogous to plugging into a power socket.
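
To make the shift from fixed to usage‑based cost concrete, the sketch below compares an owned server, amortized monthly, with a pay‑per‑hour instance and computes the break‑even usage. All prices and function names are made‑up assumptions for illustration, not actual Alibaba Cloud pricing.

```python
def fixed_monthly_cost(server_price: float, amortization_months: int,
                       monthly_opex: float) -> float:
    """Owned hardware: purchase price spread over its lifetime, plus running costs."""
    return server_price / amortization_months + monthly_opex


def on_demand_monthly_cost(hourly_rate: float, hours_used: float) -> float:
    """Cloud instance: pay only for the hours actually used."""
    return hourly_rate * hours_used


def break_even_hours(fixed_monthly: float, hourly_rate: float) -> float:
    """Monthly usage above which owning becomes cheaper than renting."""
    return fixed_monthly / hourly_rate


# Hypothetical figures: a 7200-unit server amortized over 36 months,
# 100 per month for power and operations, versus 0.75 per hour on demand.
owned = fixed_monthly_cost(server_price=7200, amortization_months=36, monthly_opex=100)
rented = on_demand_monthly_cost(hourly_rate=0.75, hours_used=100)
threshold = break_even_hours(owned, hourly_rate=0.75)
print(owned, rented, threshold)
```

Under these assumed numbers, a workload that runs only 100 hours a month is far cheaper on demand, while one that runs nearly around the clock is not; the point of the comparison is that the variable model lets cost track actual usage.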

Conclusion

By examining compute power, infrastructure evolution, and Alibaba Cloud’s proprietary technologies (CIPU, Shenlong, Pangu, Luoshen), the article offers a holistic view of cloud computing’s past, present, and future, emphasizing that broader, more affordable access to compute will drive the next wave of innovation.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
