Tagged articles
3 articles
Old Zhang's AI Learning
Mar 7, 2026 · Artificial Intelligence

vLLM 0.17.0 Release: Full Qwen 3.5 Support and Anthropic API Compatibility

The vLLM 0.17.0 release brings FlashAttention 4 integration, a mature Model Runner V2, complete Qwen 3.5 series support, a one‑click performance‑mode flag, Anthropic API compatibility, advanced weight offloading, hardware support beyond NVIDIA, and ASR model integration. The article also gives detailed upgrade and installation guidance.

ASR · Anthropic API · FlashAttention
Java Architecture Diary
Apr 2, 2025 · Artificial Intelligence

Run AI Models Locally with Docker Model Runner and Java Integration

This article explains how Docker Model Runner enables effortless local execution of AI models, details platform support, provides a full command reference, shows how to use the REST endpoint, and demonstrates integration with Java via LangChain4j, including code examples and a feature comparison with Ollama.

AI · Docker · LangChain4j