Tagged articles
4 articles
Page 1 of 1
Machine Heart
Machine Heart
Apr 28, 2026 · Artificial Intelligence

How SenseNova U1’s Unified Architecture Eliminates Multimodal ‘Frankenstein’ Models

SenseNova U1 Lite, an 8‑billion‑parameter open‑source multimodal model from SenseTime, uses the NEO‑Unify architecture to fuse vision and language in a single space, achieving commercial‑grade efficiency and benchmark scores that surpass much larger proprietary models while supporting continuous image‑text generation.

Multimodal AINEO-UnifySenseNova U1
0 likes · 12 min read
How SenseNova U1’s Unified Architecture Eliminates Multimodal ‘Frankenstein’ Models
JD Cloud Developers
JD Cloud Developers
Apr 8, 2026 · Artificial Intelligence

How JoyAI-Image-Edit Brings Spatial Intelligence to Open‑Source Image Editing

JoyAI-Image-Edit, an open‑source multimodal foundation model from JD Research Institute, integrates text‑to‑image generation, image understanding, and instruction‑driven spatial editing, achieving world‑leading spatial perception and editing capabilities that unlock new applications across e‑commerce, robotics, 3D reconstruction, and design.

Computer VisionGenerative ModelsMultimodal AI
0 likes · 7 min read
How JoyAI-Image-Edit Brings Spatial Intelligence to Open‑Source Image Editing
Old Zhang's AI Learning
Old Zhang's AI Learning
Jan 27, 2026 · Artificial Intelligence

Can Kimi K2.5’s Visual Agent Swarm Make It the New Open‑Source AI King?

Kimi K2.5, Moonshot’s latest open‑source multimodal model trained on 15 trillion image‑text tokens, adds native vision capabilities and a 100‑agent swarm that speeds complex tasks by 4.5×, achieves top‑tier benchmark scores, and can be deployed with vLLM, while demanding significant resources and hardware.

Agent SwarmKimi-K2.5Multimodal AI
0 likes · 10 min read
Can Kimi K2.5’s Visual Agent Swarm Make It the New Open‑Source AI King?
Tencent Cloud Developer
Tencent Cloud Developer
May 15, 2024 · Artificial Intelligence

Tencent Open-Sources HunYuan DiT: First Chinese-Native Text-to-Image Model with 1.5B Parameters

Tencent has open‑sourced its upgraded 1.5‑billion‑parameter HunYuan DiT model—the first Chinese‑native, bilingual (Chinese‑English) text‑to‑image diffusion‑with‑transformer system—delivering about 20% visual quality improvement, multi‑round generation, video‑generation potential, and free commercial use, with full weights, inference code, and algorithms available on Hugging Face and GitHub for developers and enterprises.

Chinese-native AIDiT architectureDiffusion Transformer
0 likes · 6 min read
Tencent Open-Sources HunYuan DiT: First Chinese-Native Text-to-Image Model with 1.5B Parameters