Data Party THU
Sep 26, 2025 · Artificial Intelligence

How Keye‑VL‑1.5 Redefines Video Understanding with Slow‑Fast Encoding

Keye‑VL‑1.5, an 8‑billion‑parameter multimodal large language model, introduces a Slow‑Fast video encoding strategy, a four‑stage progressive pre‑training pipeline with 128K context, and a sophisticated post‑training regime that together achieve state‑of‑the‑art performance on video and vision‑language benchmarks while maintaining strong general capabilities.

benchmark · large language model · multimodal LLM
21 min read
Kuaishou Large Model
Sep 8, 2025 · Artificial Intelligence

Keye-VL-1.5-8B: The New Multimodal LLM That Beats GPT-4o on Vision Benchmarks

Kwai's newly released multimodal large language model, Keye-VL-1.5-8B, improves visual understanding, reasoning, and temporal understanding. It achieves top scores on public video benchmarks and surpasses closed‑source models such as GPT‑4o, and it ships as an open‑source release with detailed technical documentation.

Vision-Language · benchmark performance · multimodal LLM
11 min read
Kuaishou Tech
Sep 5, 2025 · Artificial Intelligence

How Keye‑VL‑1.5‑8B Sets New Benchmarks in Multimodal AI

Short‑video platform Kuaishou (Kwai) has open‑sourced Keye‑VL‑1.5, an 8‑billion‑parameter multimodal LLM that introduces slow‑fast frame encoding, a progressive four‑stage pre‑training pipeline, and an automated data construction workflow. It achieves state‑of‑the‑art results on video and vision‑language benchmarks and surpasses many closed‑source models.

benchmark performance · large language model · multimodal AI
12 min read