Old Zhang's AI Learning
Author

AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.

229 articles · 0 likes · 713 views · 0 comments
Recent Articles

Latest from Old Zhang's AI Learning

Old Zhang's AI Learning
May 6, 2026 · Artificial Intelligence

Google Boosts Gemma 4 Inference Speed Up to 3× with MTP Drafter and Day‑0 vLLM Support

Google’s new Multi‑Token Prediction (MTP) drafter for Gemma 4 delivers up to three‑fold inference speedups across hardware and frameworks while preserving identical output quality. The gains are validated by official benchmarks and independent DGX Spark tests, and the drafter is immediately usable via Hugging Face, vLLM, MLX, Ollama and edge‑device runtimes.

Apple Silicon · Gemma 4 · LLM inference
0 likes · 9 min read
Old Zhang's AI Learning
May 6, 2026 · Information Security

Why Large‑Model AI Agents Need Strict Security Controls

The article compares AWS Rex, which enforces Cedar policies on Rhai scripts, with Vercel deepsec, which lets powerful coding agents hunt vulnerabilities, showing how both defensive and offensive approaches are shaping the emerging security model for AI agents in production.

AI agents · Cedar · Rex
0 likes · 12 min read
Old Zhang's AI Learning
May 6, 2026 · Artificial Intelligence

GPT-5.5 Instant Arrives: Smarter, Clearer, More Personalized AI

OpenAI has silently replaced the default ChatGPT model with GPT‑5.5 Instant, delivering a 52.5% drop in hallucinations, 30% shorter responses, deeper personalization via memory sources, and higher benchmark scores across a range of professional tasks, while rolling out new pricing and usage tiers.

AI benchmarks · ChatGPT · GPT-5.5
0 likes · 11 min read
Old Zhang's AI Learning
May 6, 2026 · Artificial Intelligence

Solving RAG’s Biggest Pain Point: Introducing the Open‑Source CocoIndex

The biggest pain point in RAG and agent context is stale data, not chunking or reranking. CocoIndex, a Rust‑based incremental engine with a declarative Python API, offers fresh, delta‑processed context, automatic schema evolution, and production‑grade features, demonstrated through PDF‑to‑Markdown pipelines and a podcast knowledge‑graph case study.

Agent Context · Knowledge Graph · Python
0 likes · 13 min read
Old Zhang's AI Learning
May 6, 2026 · Frontend Development

Testing Open‑Slide: A React‑Based PPT Framework Built for AI Agents

Open‑slide is a React and Tailwind powered slide framework designed for AI coding agents such as Claude Code. It lets natural‑language prompts generate 1920×1080 decks and provides agent‑native authoring, inspector comments, asset management, presenter mode, and static deployment. The article closes with a hands‑on evaluation of its strengths and limitations.

AI agents · Claude Code · Frontend
0 likes · 11 min read
Old Zhang's AI Learning
May 5, 2026 · Artificial Intelligence

Claude Enters Finance: 10 Open‑Source Financial Agent Templates Unveiled

Anthropic released ten ready‑to‑use financial agent templates that bundle skills, data connectors and sub‑agents and run natively in Excel, PowerPoint, Word and Outlook. The templates are open‑sourced on GitHub, support two deployment modes, score 64.37% on the Vals AI finance benchmark, and integrate dozens of market data sources. The article weighs their strengths against their notable limitations.

Agent Templates · Claude · Data Connectors
0 likes · 14 min read
Old Zhang's AI Learning
May 5, 2026 · Artificial Intelligence

vLLM 0.20.1 Fixes Instability and Speed Issues for DeepSeek V4

The vLLM 0.20.1 patch, released shortly after 0.20.0, consolidates stability fixes and performance optimizations for DeepSeek V4, adds several bug fixes, updates installation instructions, and provides targeted upgrade recommendations for different user scenarios.

Bug Fix · DeepSeek V4 · GPU inference
0 likes · 9 min read
Old Zhang's AI Learning
May 4, 2026 · Artificial Intelligence

How DeepSeek’s New Paper Redefines Multimodal Reasoning with Visual Primitives

DeepSeek’s new paper "Thinking with Visual Primitives" tackles the reference gap in multimodal models by introducing points and boxes as reasoning units, achieving up to 8× token efficiency and leading benchmark scores in counting, spatial reasoning, and maze navigation compared with GPT‑5.4, Claude‑Sonnet‑4.6 and Gemini‑3‑Flash.

Chain-of-Thought · DeepSeek · Visual Primitives
0 likes · 10 min read
Old Zhang's AI Learning
May 3, 2026 · Artificial Intelligence

One‑Command Setup of Reusable Claude Code Configurations (Full Toolkit)

The article reviews the GitHub project claude-code-templates, which aggregates over 100 reusable Claude Code assets—including agents, commands, MCPs, settings, hooks, and skills—into an npm‑like repository and a web dashboard, showing how a single npx command can install a complete development stack, detailing usage examples, pros, cons, and target audiences.

AI coding · CLI · Claude Code
0 likes · 9 min read