Old Zhang's AI Learning

AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.

141 articles · 0 likes · 3 views · 0 comments
Recent Articles

Mar 20, 2026 · Artificial Intelligence

Auto‑Detect Which LLMs Your PC Can Run and Launch a Coding Agent

This article shows how the HF‑agent plugin uses llmfit to analyze your hardware, recommend runnable large language models, start a llama.cpp server, and automatically launch the Pi coding agent, with step‑by‑step commands and a real‑world test on an M2 MacBook Air.

HF-agent · coding agent · llama.cpp
0 likes · 5 min read
Mar 19, 2026 · Artificial Intelligence

Testing the Trending oMLX on Mac: Claude‑Opus‑4.6 Distilled and Qwen3.5‑9B Performance Review

The article evaluates oMLX, a Mac‑only LLM runtime built on Apple Silicon and MLX, covering installation, UI features, memory usage, single‑request speed, benchmark results for Claude‑Opus‑4.6 and Qwen3.5‑9B, continuous‑batching gains, Claude Code optimizations, multi‑model support, and a failed attempt to run a 27B model.

Apple Silicon · Claude Opus · MLX
0 likes · 9 min read
Mar 18, 2026 · Artificial Intelligence

Running Claude‑Opus‑4.6‑Distilled Qwen3.5 27B on a Single RTX 4090 with llama.cpp: 46 tokens/s Performance

The article details a hands‑on test of the Claude‑Opus‑4.6‑distilled Qwen3.5 27B model running on a single RTX 4090 via llama.cpp, showing a steady 46 tokens per second generation speed and a 64K context window; it walks through a step‑by‑step Docker‑based setup, compares the model against GLM‑4.7‑Flash‑AWQ‑4bit, and discusses llama.cpp's limitations for multi‑GPU inference.

Claude Opus · Docker · LLM inference
0 likes · 5 min read
Mar 16, 2026 · Artificial Intelligence

Testing Claude‑Opus‑4.6 Distilled Qwen3.5 9B Model Locally via LM Studio and Claude Code

The article evaluates the GGUF‑quantized Claude‑Opus‑4.6 distilled Qwen3.5 9B model on a 16 GB Mac Mini M4 using LM Studio, detailing model sizes, performance metrics, deployment steps, API integration with Claude Code, and concluding that while the 9B version is usable, its capabilities remain limited compared to larger models.

Claude Opus · GGUF · LM Studio
0 likes · 12 min read
Mar 15, 2026 · Artificial Intelligence

How Claude Code + Obsidian Automate Your Knowledge Management

Claude Code can read your project’s code, git history, and structure, then automatically create or update Obsidian notes, generate dev logs, and even produce architecture Canvas files with a single /obsidian command, offering a local‑first, plugin‑free workflow for AI‑driven knowledge management.

AI automation · Canvas · Claude Code
0 likes · 10 min read
Mar 13, 2026 · Artificial Intelligence

OpenClaw v3.12: Revamped Dashboard, 20+ Security Fixes & Fast Mode

OpenClaw v3.12 introduces a completely rebuilt Dashboard, a unified Fast Mode switch, a provider‑plugin architecture for easy model integration, extensive security hardening across command execution, permissions and webhooks, plus new iOS/macOS UI upgrades and Kubernetes deployment guides.

AI agents · Dashboard · Fast Mode
0 likes · 10 min read
Mar 13, 2026 · Artificial Intelligence

Nvidia’s New OpenClaw‑Optimized Model Cracks Top‑5 on PinchBench – Free to Use

Nvidia's open‑source Nemotron‑3‑Super model achieves an 85.6% success rate on the PinchBench OpenClaw benchmark, ranking in the top five as the only open‑source entry; the article explains its architecture, quantization, training pipeline, performance numbers, usage options, and practical limitations.

AI coding agent · MoE · NVFP4
0 likes · 10 min read