
Thoughts on the Next‑Generation AI Infrastructure: Green and Shared Model‑as‑a‑Service

In this conference talk, He Zhengyu of Ant Group outlines the challenges of large‑model AI, proposes a green, shared, model‑centric infrastructure built on foundation models, cloud‑native MLOps, and Model‑as‑a‑Service (MaaS) to lower cost and accelerate AI adoption across industries.


He Zhengyu, chair of Ant Group's Infrastructure Committee, delivered a keynote at the 2022 Global AI Technology Conference, sharing his perspective on the next generation of AI infrastructure based on Ant's practical experience.

The rapid growth of large models (e.g., GPT‑3, with roughly 175 billion parameters) has driven steep increases in data and compute requirements, making AI development a costly "luxury" that many small- and mid-size companies cannot afford.

To lower this barrier, he proposes three core principles for AI infrastructure: ease of use through standardized APIs, efficiency via foundation (pre‑trained) models and cloud‑native distributed training, and sharing enabled by privacy‑preserving data collaboration, open‑source code, and community‑driven model contributions.

He explains that foundation models act as reusable pre‑trained backbones (e.g., Ant's OminiRec) that dramatically cut hardware costs and training time, with up to a 90% reduction compared with training from scratch, thereby accelerating iteration cycles.
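The savings come from reusing a pre-trained backbone and training only a small task-specific head. The following is a minimal sketch of that pattern; the layer names and parameter counts are illustrative assumptions, not Ant's actual OminiRec design.

```python
# Sketch of the foundation-model reuse pattern: freeze a large pre-trained
# backbone and train only a small task head. Layer names and sizes here are
# hypothetical, chosen only to illustrate the cost reduction.
from dataclasses import dataclass, field


@dataclass
class Layer:
    name: str
    n_params: int
    trainable: bool = True


@dataclass
class Model:
    layers: list = field(default_factory=list)

    def trainable_params(self) -> int:
        return sum(l.n_params for l in self.layers if l.trainable)

    def total_params(self) -> int:
        return sum(l.n_params for l in self.layers)


def build_finetune_model(backbone_layers, head_layers) -> Model:
    """Freeze the pre-trained backbone; only the task head is updated."""
    for layer in backbone_layers:
        layer.trainable = False
    return Model(layers=backbone_layers + head_layers)


backbone = [Layer("embedding", 9_000_000), Layer("encoder", 90_000_000)]
head = [Layer("task_head", 1_000_000)]
model = build_finetune_model(backbone, head)

# Only a small fraction of parameters are updated during fine-tuning, which
# is where the large cuts in hardware cost and training time come from.
frac = model.trainable_params() / model.total_params()
print(f"trainable fraction: {frac:.2%}")  # -> trainable fraction: 1.00%
```

In a real framework the freezing step corresponds to disabling gradient updates for the backbone's weights; the backbone is trained once and amortized across many downstream tasks.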

The proposed Model‑as‑a‑Service (MaaS) architecture centers on a unified AI service framework called Maya, built on the open‑source projects Triton and Ray, which provides standardized model inference, automatic scaling, and a high‑performance inference library.
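The core MaaS idea is that every model sits behind one uniform inference entry point and scales with load. The sketch below illustrates that pattern with a hypothetical registry; it is not Maya's real API, which the talk describes only as built on Triton and Ray.

```python
# Illustrative MaaS-style serving layer: a registry exposes every model
# behind one standardized predict() call and adjusts replica counts with
# load. All names here are hypothetical.
class ModelService:
    def __init__(self, name, predict_fn, max_qps_per_replica=100):
        self.name = name
        self.predict_fn = predict_fn
        self.replicas = 1
        self.max_qps_per_replica = max_qps_per_replica

    def autoscale(self, current_qps: int) -> None:
        """Naive rule: one replica per max_qps_per_replica of traffic."""
        needed = -(-current_qps // self.max_qps_per_replica)  # ceiling division
        self.replicas = max(1, needed)


class MaaSRegistry:
    def __init__(self):
        self._services = {}

    def deploy(self, name, predict_fn, **kwargs) -> ModelService:
        self._services[name] = ModelService(name, predict_fn, **kwargs)
        return self._services[name]

    def predict(self, name, payload):
        # One uniform entry point, regardless of the underlying model.
        return self._services[name].predict_fn(payload)


registry = MaaSRegistry()
svc = registry.deploy("sentiment", lambda text: "pos" if "good" in text else "neg")
result = registry.predict("sentiment", "a good product")  # -> "pos"
svc.autoscale(current_qps=250)  # -> 3 replicas
```

A production framework replaces the lambda with a real model runtime (e.g., a Triton backend) and the scaling rule with metrics-driven policies, but the caller-facing contract stays the same.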

Ant also offers an intelligent distributed training platform named Zhishen, which abstracts away hardware details and automatically selects and tunes distributed training strategies, allowing developers to focus on single‑node model code.
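Automatic strategy selection of this kind can be pictured as a decision rule over model size and available device memory. The heuristic below is a hypothetical sketch for illustration; Zhishen's actual selection and tuning logic is not described in the talk.

```python
# Hypothetical sketch of automatic distributed-strategy selection: the user
# writes single-node model code, and the platform picks how to distribute it.
# The thresholds and strategy names are illustrative assumptions.
def choose_strategy(model_gb: float, device_mem_gb: float, n_devices: int) -> str:
    if model_gb <= device_mem_gb:
        # Model fits on one device: replicate it and split the data batches.
        return "data_parallel"
    if model_gb <= device_mem_gb * n_devices:
        # Model fits only across devices: shard its parameters/layers.
        return "model_parallel"
    # Model exceeds aggregate device memory: stage execution and offload.
    return "pipeline_parallel_with_offload"


print(choose_strategy(model_gb=8, device_mem_gb=16, n_devices=4))    # -> data_parallel
print(choose_strategy(model_gb=40, device_mem_gb=16, n_devices=4))   # -> model_parallel
print(choose_strategy(model_gb=200, device_mem_gb=16, n_devices=4))  # -> pipeline_parallel_with_offload
```

A real platform would also tune strategy parameters (sharding degree, pipeline stages, communication overlap) from profiling runs rather than static thresholds.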

Cloud‑native principles are emphasized: AI workloads are designed to run natively on the cloud, leveraging elasticity, mixed training‑inference workloads, and resource‑leveling to reduce overall cost. Hardware virtualization (XPU abstraction) unifies GPUs, NPUs, TPUs, etc., enabling flexible resource scheduling across heterogeneous devices.
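The XPU abstraction amounts to giving heterogeneous accelerators one common interface so a scheduler can place work without vendor-specific logic. The following is a minimal sketch of that idea; the class names and best-fit policy are assumptions for illustration, not Ant's implementation.

```python
# Illustrative "XPU" abstraction: GPUs, NPUs, etc. share one interface, so a
# scheduler can place jobs across heterogeneous devices uniformly.
# Names and the placement policy are hypothetical.
from abc import ABC, abstractmethod


class XPU(ABC):
    def __init__(self, name: str, free_mem_gb: float):
        self.name = name
        self.free_mem_gb = free_mem_gb

    @abstractmethod
    def kind(self) -> str: ...


class GPU(XPU):
    def kind(self) -> str:
        return "gpu"


class NPU(XPU):
    def kind(self) -> str:
        return "npu"


def schedule(job_mem_gb: float, devices: list) -> str:
    """Place the job on the device with the most free memory that fits it."""
    candidates = [d for d in devices if d.free_mem_gb >= job_mem_gb]
    if not candidates:
        raise RuntimeError("no device can fit this job")
    best = max(candidates, key=lambda d: d.free_mem_gb)
    best.free_mem_gb -= job_mem_gb  # reserve the memory
    return best.name


pool = [GPU("gpu-0", 16), GPU("gpu-1", 8), NPU("npu-0", 32)]
placed = schedule(10, pool)  # the NPU has the most free memory that fits
print(placed)
```

Because the scheduler only sees the `XPU` interface, adding a new accelerator type means implementing one subclass, not rewriting placement logic; this is what enables the resource leveling across mixed training and inference workloads described above.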

A practical MaaS case study on AI remote sensing for agricultural finance is presented, showing how satellite imagery and foundation models can replace costly field visits, improve loan‑risk assessment, and cut both data‑labeling and compute consumption by up to 50%.

He concludes that a green, shared AI infrastructure built on foundation models, cloud‑native MLOps, and MaaS can democratize AI, allowing developers of any scale to harness its benefits while minimizing environmental impact.

Tags: cloud-native, MLOps, AI infrastructure, Green computing, Foundation Models, Model-as-a-Service
Written by

AntTech

Technology is the core driver of Ant's future.
