How GLM-Image Generates High-Quality Images from Text on Huawei Ascend Chips

GLM-Image, a Chinese text-to-image model trained end-to-end on Huawei Ascend Atlas 800T A2 hardware, combines an autoregressive generator with a diffusion decoder, supports resolutions up to 2048×2048, and is available as open-source code and through an API. Detailed example prompts demonstrate its strong layout and typography capabilities.

Baobao Algorithm Notes

GLM-Image is a Chinese text-to-image generation model released by Zhipu AI. It was trained entirely on Huawei Ascend Atlas 800T A2 hardware using the MindSpore framework.

Training pipeline

Data preprocessing, large‑scale pre‑training, supervised fine‑tuning (SFT) and reinforcement learning from human feedback (RLHF) are executed on the Ascend platform. Optimizations such as dynamic‑graph multi‑stage pipelining, multi‑stream parallelism, and fused operators (AdamW‑EMA, COC, RMS‑Norm) reduce host‑side bottlenecks and improve training stability and performance.
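To make the fused AdamW‑EMA operator more concrete, here is a minimal pure-Python sketch of the arithmetic such an operator combines: the AdamW parameter update and the exponential moving average (EMA) of the weights, computed in a single pass over the parameters. This only illustrates the math. The function name `adamw_ema_step` and all hyperparameter values are assumptions for illustration, and the actual Ascend operator is a fused NPU kernel whose interface the article does not describe.

```python
import math

def adamw_ema_step(params, grads, m, v, ema, step,
                   lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
                   weight_decay=0.01, ema_decay=0.999):
    """One AdamW update plus EMA tracking, done in a single loop.

    All lists are updated in place. `step` is the 1-based step counter.
    Hyperparameter defaults are illustrative, not taken from GLM-Image.
    """
    bias_corr1 = 1.0 - beta1 ** step
    bias_corr2 = 1.0 - beta2 ** step
    for i, g in enumerate(grads):
        # First and second moment estimates (standard Adam).
        m[i] = beta1 * m[i] + (1.0 - beta1) * g
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
        m_hat = m[i] / bias_corr1
        v_hat = v[i] / bias_corr2
        # Decoupled weight decay, then the Adam step (AdamW).
        params[i] -= lr * weight_decay * params[i]
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        # EMA of the freshly updated weight, in the same pass.
        ema[i] = ema_decay * ema[i] + (1.0 - ema_decay) * params[i]
    return params, ema
```

Doing both updates in one loop means each parameter is read and written once per step instead of twice, which is the kind of saving a fused operator aims for.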

Model architecture

The architecture pairs an autoregressive generator with a diffusion decoder: the autoregressive stage handles global instruction understanding, while the diffusion stage synthesizes fine-grained detail. It natively supports arbitrary resolutions from 384×384 up to 2048×2048.
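When requesting a custom size, it can help to normalize the dimensions before calling the model. The sketch below clamps each side to the 384 to 2048 range stated above and rounds it down to a multiple of a fixed step. The step of 32 and the helper name `snap_resolution` are assumptions for illustration; check the model card for the real alignment requirement.

```python
MIN_SIDE = 384   # lower bound stated in the article
MAX_SIDE = 2048  # upper bound stated in the article
STEP = 32        # ASSUMPTION: hypothetical alignment; verify in the model card

def snap_resolution(width, height):
    """Clamp each side to [MIN_SIDE, MAX_SIDE] and round down to a multiple of STEP."""
    def snap(side):
        clamped = max(MIN_SIDE, min(MAX_SIDE, int(side)))
        # Both bounds are multiples of STEP, so flooring stays within range.
        return (clamped // STEP) * STEP
    return snap(width), snap(height)
```

For example, `snap_resolution(1000, 1500)` gives `(992, 1472)`, and out-of-range requests are pulled back to the nearest supported bound.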

Capabilities and examples

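GLM-Image can generate text-rich posters, holiday cards, scientific infographics, and everyday scenes with accurate layout, typography, and lighting. Example prompts, translated from Chinese:

Draw a weather card poster showing a clear, 45-degree top-down isometric view of a miniature 3D cartoon Beijing scene that includes its most iconic landmarks and elements. Use soft, refined textures with realistic PBR materials and gentle, lifelike lighting and shadows. Integrate snowy weather directly into the city environment to create an immersive atmosphere. Use a clean, minimalist composition with a soft, solid-color background. Place a large bold title "Beijing" at the top center, with a weather icon, the date (2026-01-14), and the temperature (0°C) below it, all text center-aligned.
3D New Year greeting card poster. Vertical composition with a window-shaped recess in the center; the whole scene has the realistic texture of felt and chunky knitted wool, combined with a blind-box toy style. The main character is a chibi felt pony wearing a red vest and a tiger-head hat, surrounded by red lanterns, kumquat trees, felt firecrackers, and similar elements. The text uses a 3D fluid-art typeface. Rendered in C4D + Octane at 8K resolution.
An 8-bit pixel-art popular-science infographic with a game-factory theme, in horizontal composition, showing the three stages of how a CPU works (fetch, decode, execute), using vivid 8-bit pixel art with Chinese and English labels.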

Open‑source release and access

The model weights, training scripts, and inference code are publicly available on GitHub, Hugging Face, and ModelScope, enabling researchers to reproduce the results or fine‑tune the model for specific tasks.

GitHub: https://github.com/zai-org/GLM-Image

Hugging Face: https://huggingface.co/zai-org/GLM-Image

ModelScope: https://modelscope.cn/models/ZhipuAI/GLM-Image

API usage

Through the Zhipu Open Platform, generating one image costs approximately 0.1 CNY. A speed-optimized version is planned for release.

API documentation: https://docs.bigmodel.cn/cn/guide/models/image-generation/glm-image
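As a rough illustration of calling the API, the sketch below builds an HTTP request with the Python standard library and estimates cost from the approximate per-image price quoted above. The endpoint URL, the `glm-image` model identifier, and the payload field names are assumptions and must be checked against the API documentation linked above; `build_request`, `estimated_cost_cny`, and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# ASSUMPTION: endpoint, model id, and field names are not confirmed by the
# article; verify them in the official API documentation before use.
API_URL = "https://open.bigmodel.cn/api/paas/v4/images/generations"
PRICE_PER_IMAGE_CNY = 0.1  # approximate price quoted in the article

def build_request(prompt, api_key, size="1024x1024", model="glm-image"):
    """Build (but do not send) a POST request for one image."""
    payload = {"model": model, "prompt": prompt, "size": size}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def estimated_cost_cny(num_images):
    """Rough cost estimate at the article's approximate price."""
    return round(num_images * PRICE_PER_IMAGE_CNY, 2)

def generate(prompt, api_key):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return json.load(resp)
```

At the quoted price, a batch of 250 images would cost roughly `estimated_cost_cny(250)`, that is, about 25 CNY.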

Written by

Baobao Algorithm Notes

Author of the BaiMian large model, offering technology and industry insights.
