
Thoughts on the Next‑Generation AI Infrastructure: Green and Shared Model‑as‑a‑Service

In this conference talk, He Zhengyu of Ant Group outlines the challenges of large‑model AI, proposes a green, shared, model‑centric infrastructure built on foundation models, cloud‑native MLOps, and Model‑as‑a‑Service (MaaS) to lower cost and accelerate AI adoption across industries.


He Zhengyu, chair of Ant Group's Infrastructure Committee, delivered a keynote at the 2022 Global AI Technology Conference, sharing his perspective on the next generation of AI infrastructure based on Ant's practical experience.

The rapid growth of large models (e.g., GPT‑3, with roughly 175 billion parameters) has driven steep increases in data and compute requirements, making AI development a costly "luxury" that many small- and mid-size companies cannot afford.

To lower this barrier, he proposes three core principles for AI infrastructure: ease of use through standardized APIs, efficiency via foundation (pre‑trained) models and cloud‑native distributed training, and sharing enabled by privacy‑preserving data collaboration, open‑source code, and community‑driven model contributions.

He explains that foundation models act as reusable pre‑trained backbones (e.g., Ant's OminiRec) that dramatically cut hardware costs and training time, with up to a 90% reduction compared with training from scratch, thereby accelerating iteration cycles.
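The savings come from reusing a pre-trained backbone and training only a small task-specific head. The following is a minimal sketch of that pattern; the layer names and parameter counts are illustrative assumptions, not Ant's actual OminiRec design.

```python
# Sketch of the foundation-model reuse pattern: freeze a large pre-trained
# backbone and train only a small task head. Layer names and sizes here are
# hypothetical, chosen only to illustrate the cost reduction.
from dataclasses import dataclass, field


@dataclass
class Layer:
    name: str
    n_params: int
    trainable: bool = True


@dataclass
class Model:
    layers: list = field(default_factory=list)

    def trainable_params(self) -> int:
        return sum(l.n_params for l in self.layers if l.trainable)

    def total_params(self) -> int:
        return sum(l.n_params for l in self.layers)


def build_finetune_model(backbone_layers, head_layers) -> Model:
    """Freeze the pre-trained backbone; only the task head is updated."""
    for layer in backbone_layers:
        layer.trainable = False
    return Model(layers=backbone_layers + head_layers)


backbone = [Layer("embedding", 9_000_000), Layer("encoder", 90_000_000)]
head = [Layer("task_head", 1_000_000)]
model = build_finetune_model(backbone, head)

# Only a small fraction of parameters are updated during fine-tuning, which
# is where the large cuts in hardware cost and training time come from.
frac = model.trainable_params() / model.total_params()
print(f"trainable fraction: {frac:.2%}")  # -> trainable fraction: 1.00%
```

In a real framework the freezing step corresponds to disabling gradient updates for the backbone's weights; the backbone is trained once and amortized across many downstream tasks.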

The proposed Model‑as‑a‑Service (MaaS) architecture centers on a unified AI service framework called Maya, built on the open‑source projects Triton and Ray, which provides standardized model inference, automatic scaling, and a high‑performance inference library.
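The core MaaS idea is that every model sits behind one uniform inference entry point and scales with load. The sketch below illustrates that pattern with a hypothetical registry; it is not Maya's real API, which the talk describes only as built on Triton and Ray.

```python
# Illustrative MaaS-style serving layer: a registry exposes every model
# behind one standardized predict() call and adjusts replica counts with
# load. All names here are hypothetical.
class ModelService:
    def __init__(self, name, predict_fn, max_qps_per_replica=100):
        self.name = name
        self.predict_fn = predict_fn
        self.replicas = 1
        self.max_qps_per_replica = max_qps_per_replica

    def autoscale(self, current_qps: int) -> None:
        """Naive rule: one replica per max_qps_per_replica of traffic."""
        needed = -(-current_qps // self.max_qps_per_replica)  # ceiling division
        self.replicas = max(1, needed)


class MaaSRegistry:
    def __init__(self):
        self._services = {}

    def deploy(self, name, predict_fn, **kwargs) -> ModelService:
        self._services[name] = ModelService(name, predict_fn, **kwargs)
        return self._services[name]

    def predict(self, name, payload):
        # One uniform entry point, regardless of the underlying model.
        return self._services[name].predict_fn(payload)


registry = MaaSRegistry()
svc = registry.deploy("sentiment", lambda text: "pos" if "good" in text else "neg")
result = registry.predict("sentiment", "a good product")  # -> "pos"
svc.autoscale(current_qps=250)  # -> 3 replicas
```

A production framework replaces the lambda with a real model runtime (e.g., a Triton backend) and the scaling rule with metrics-driven policies, but the caller-facing contract stays the same.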

Ant also offers an intelligent distributed training platform named Zhishen, which abstracts away hardware details and automatically selects and tunes distributed training strategies, allowing developers to focus on single‑node model code.
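Automatic strategy selection of this kind can be pictured as a decision rule over model size and available device memory. The heuristic below is a hypothetical sketch for illustration; Zhishen's actual selection and tuning logic is not described in the talk.

```python
# Hypothetical sketch of automatic distributed-strategy selection: the user
# writes single-node model code, and the platform picks how to distribute it.
# The thresholds and strategy names are illustrative assumptions.
def choose_strategy(model_gb: float, device_mem_gb: float, n_devices: int) -> str:
    if model_gb <= device_mem_gb:
        # Model fits on one device: replicate it and split the data batches.
        return "data_parallel"
    if model_gb <= device_mem_gb * n_devices:
        # Model fits only across devices: shard its parameters/layers.
        return "model_parallel"
    # Model exceeds aggregate device memory: stage execution and offload.
    return "pipeline_parallel_with_offload"


print(choose_strategy(model_gb=8, device_mem_gb=16, n_devices=4))    # -> data_parallel
print(choose_strategy(model_gb=40, device_mem_gb=16, n_devices=4))   # -> model_parallel
print(choose_strategy(model_gb=200, device_mem_gb=16, n_devices=4))  # -> pipeline_parallel_with_offload
```

A real platform would also tune strategy parameters (sharding degree, pipeline stages, communication overlap) from profiling runs rather than static thresholds.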

Cloud‑native principles are emphasized: AI workloads are designed to run natively on the cloud, leveraging elasticity, mixed training‑inference workloads, and resource‑leveling to reduce overall cost. Hardware virtualization (XPU abstraction) unifies GPUs, NPUs, TPUs, etc., enabling flexible resource scheduling across heterogeneous devices.
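The XPU abstraction amounts to giving heterogeneous accelerators one common interface so a scheduler can place work without vendor-specific logic. The following is a minimal sketch of that idea; the class names and best-fit policy are assumptions for illustration, not Ant's implementation.

```python
# Illustrative "XPU" abstraction: GPUs, NPUs, etc. share one interface, so a
# scheduler can place jobs across heterogeneous devices uniformly.
# Names and the placement policy are hypothetical.
from abc import ABC, abstractmethod


class XPU(ABC):
    def __init__(self, name: str, free_mem_gb: float):
        self.name = name
        self.free_mem_gb = free_mem_gb

    @abstractmethod
    def kind(self) -> str: ...


class GPU(XPU):
    def kind(self) -> str:
        return "gpu"


class NPU(XPU):
    def kind(self) -> str:
        return "npu"


def schedule(job_mem_gb: float, devices: list) -> str:
    """Place the job on the device with the most free memory that fits it."""
    candidates = [d for d in devices if d.free_mem_gb >= job_mem_gb]
    if not candidates:
        raise RuntimeError("no device can fit this job")
    best = max(candidates, key=lambda d: d.free_mem_gb)
    best.free_mem_gb -= job_mem_gb  # reserve the memory
    return best.name


pool = [GPU("gpu-0", 16), GPU("gpu-1", 8), NPU("npu-0", 32)]
placed = schedule(10, pool)  # the NPU has the most free memory that fits
print(placed)
```

Because the scheduler only sees the `XPU` interface, adding a new accelerator type means implementing one subclass, not rewriting placement logic; this is what enables the resource leveling across mixed training and inference workloads described above.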

A practical MaaS case study on AI remote sensing for agricultural finance is presented, showing how satellite imagery and foundation models can replace costly field visits, improve loan‑risk assessment, and cut both data‑labeling and compute consumption by up to 50%.

He concludes that a green, shared AI infrastructure built on foundation models, cloud‑native MLOps, and MaaS can democratize AI, allowing developers of any scale to harness its benefits while minimizing environmental impact.

Tags: cloud-native, MLOps, AI infrastructure, Green computing, Foundation Models, Model-as-a-Service
Written by

AntTech

Technology is the core driver of Ant's future.
