Tagged articles

open-source model

7 articles
21CTO
Jun 29, 2026 · Information Security

GLM 5.2 Beats Claude in IDOR Security Benchmark with 39% F1

Semgrep’s benchmark shows that the open‑source GLM 5.2 model, using only a unified prompt and a lightweight Pydantic AI scheduler, achieves a 39% F1 score on IDOR vulnerability detection—outperforming Claude Code’s best 37.4% while costing only about $0.17 per discovered flaw.

AI security · Claude · F1 score
0 likes · 13 min read
JD Tech Talk
Jun 23, 2026 · Artificial Intelligence

From Q&A to Real‑Time Seeing and Speaking: JD’s World‑First Open‑Source JoyAI‑VL‑Interaction

JD’s open‑source JoyAI‑VL‑Interaction model moves large language models from static question answering to continuous visual‑language interaction, enabling proactive judgment, instant responses, and intelligent task delegation. It reports benchmark win rates of up to 87.9% against leading competitors, and the full‑stack code, model, and dataset are released for real‑world deployment.

AI assistant · Benchmark · JoyAI-VL-Interaction
0 likes · 9 min read
SuanNi
Jun 5, 2026 · Artificial Intelligence

How Google’s Gemma 4 12B Packs Multimodal Power into a Laptop‑Friendly Model

Google’s Gemma 4 12B delivers near‑26B performance with half the memory, runs on a 16 GB laptop GPU, and uses a novel encoder‑free unified architecture that natively handles vision, audio, and text, making high‑quality multimodal AI truly local.

Gemma-4-12B · Multimodal AI · audio-visual integration
0 likes · 6 min read
Machine Heart
Apr 28, 2026 · Artificial Intelligence

How SenseNova U1’s Unified Architecture Eliminates Multimodal ‘Frankenstein’ Models

SenseNova U1 Lite, an 8‑billion‑parameter open‑source multimodal model from SenseTime, uses the NEO‑Unify architecture to fuse vision and language in a single space, achieving commercial‑grade efficiency and benchmark scores that surpass much larger proprietary models while supporting continuous image‑text generation.

Benchmark · Multimodal AI · NEO-Unify
0 likes · 12 min read
JD Cloud Developers
Apr 8, 2026 · Artificial Intelligence

How JoyAI-Image-Edit Brings Spatial Intelligence to Open‑Source Image Editing

JoyAI-Image-Edit, an open‑source multimodal foundation model from JD Research Institute, integrates text‑to‑image generation, image understanding, and instruction‑driven spatial editing, achieving world‑leading spatial perception and editing capabilities that unlock new applications across e‑commerce, robotics, 3D reconstruction, and design.

Multimodal AI · computer vision · generative models
0 likes · 7 min read
Old Zhang's AI Learning
Jan 27, 2026 · Artificial Intelligence

Can Kimi K2.5’s Visual Agent Swarm Make It the New Open‑Source AI King?

Kimi K2.5, Moonshot’s latest open‑source multimodal model trained on 15 trillion image‑text tokens, adds native vision capabilities and a 100‑agent swarm that speeds complex tasks by 4.5×, achieves top‑tier benchmark scores, and can be deployed with vLLM, though it demands significant hardware resources.

Agent Swarm · Benchmark · Kimi K2.5
0 likes · 10 min read
Tencent Cloud Developer
May 15, 2024 · Artificial Intelligence

Tencent Open-Sources HunYuan DiT: First Chinese-Native Text-to-Image Model with 1.5B Parameters

Tencent has open‑sourced its upgraded 1.5‑billion‑parameter HunYuan DiT model—the first Chinese‑native, bilingual (Chinese‑English) text‑to‑image diffusion‑with‑transformer system—delivering about 20% visual quality improvement, multi‑round generation, video‑generation potential, and free commercial use, with full weights, inference code, and algorithms available on Hugging Face and GitHub for developers and enterprises.

Chinese-native AI · DiT architecture · Multimodal Generation
0 likes · 6 min read