Tagged articles

open-source model

7 articles · Page 1 of 1

Jun 29, 2026 · Information Security

GLM 5.2 Beats Claude in IDOR Security Benchmark with 39% F1

Semgrep’s benchmark shows that the open‑source GLM 5.2 model, using only a unified prompt and a lightweight Pydantic AI scheduler, achieves a 39% F1 score on IDOR vulnerability detection. That edges out Claude Code’s best result of 37.4%, at a cost of about $0.17 per discovered flaw.

AI security · Claude · F1 score

0 likes · 13 min read

JD Tech Talk

Jun 23, 2026 · Artificial Intelligence

From Q&A to Real‑Time Seeing and Speaking: JD’s World‑First Open‑Source JoyAI‑VL‑Interaction

JD’s open‑source JoyAI‑VL‑Interaction model moves large language models from static question answering to continuous visual‑language interaction, enabling proactive judgment, instant responses, and intelligent task delegation. It reports benchmark win rates of up to 87.9% against leading competitors, and the full‑stack code, model, and dataset have been released for real‑world deployment.

AI assistant · Benchmark · JoyAI-VL-Interaction

0 likes · 9 min read

SuanNi

Jun 5, 2026 · Artificial Intelligence

How Google’s Gemma 4 12B Packs Multimodal Power into a Laptop‑Friendly Model

Google’s Gemma 4 12B delivers performance close to that of a 26B model with half the memory, runs on a 16 GB laptop GPU, and uses a novel encoder‑free unified architecture that natively handles vision, audio, and text, bringing high‑quality multimodal AI to local hardware.

Gemma-4-12B · Multimodal AI · audio-visual integration

0 likes · 6 min read

Machine Heart

Apr 28, 2026 · Artificial Intelligence

How SenseNova U1’s Unified Architecture Eliminates Multimodal ‘Frankenstein’ Models

SenseNova U1 Lite, an 8‑billion‑parameter open‑source multimodal model from SenseTime, uses the NEO‑Unify architecture to fuse vision and language in a single space, achieving commercial‑grade efficiency and benchmark scores that surpass much larger proprietary models while supporting continuous image‑text generation.

Benchmark · Multimodal AI · NEO-Unify

0 likes · 12 min read

JD Cloud Developers

Apr 8, 2026 · Artificial Intelligence

How JoyAI-Image-Edit Brings Spatial Intelligence to Open‑Source Image Editing

JoyAI-Image-Edit, an open‑source multimodal foundation model from JD Research Institute, integrates text‑to‑image generation, image understanding, and instruction‑driven spatial editing, achieving world‑leading spatial perception and editing capabilities that unlock new applications across e‑commerce, robotics, 3D reconstruction, and design.

Multimodal AI · computer vision · generative models

0 likes · 7 min read

Old Zhang's AI Learning

Jan 27, 2026 · Artificial Intelligence

Can Kimi K2.5’s Visual Agent Swarm Make It the New Open‑Source AI King?

Kimi K2.5, Moonshot’s latest open‑source multimodal model trained on 15 trillion image‑text tokens, adds native vision capabilities and a 100‑agent swarm that speeds up complex tasks by 4.5×. It achieves top‑tier benchmark scores and can be deployed with vLLM, though it demands significant hardware resources.

Agent Swarm · Benchmark · Kimi K2.5

0 likes · 10 min read

Tencent Cloud Developer

May 15, 2024 · Artificial Intelligence

Tencent Open-Sources HunYuan DiT: First Chinese-Native Text-to-Image Model with 1.5B Parameters

Tencent has open‑sourced its upgraded 1.5‑billion‑parameter HunYuan DiT model, the first Chinese‑native, bilingual (Chinese‑English) text‑to‑image diffusion‑with‑transformer system. It delivers an improvement of about 20% in visual quality, supports multi‑round generation, shows potential for video generation, and is free for commercial use, with full weights, inference code, and algorithms available on Hugging Face and GitHub for developers and enterprises.

Chinese-native AI · DiT architecture · Multimodal Generation