Old Zhang's AI Learning

AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.

141 articles · 0 likes · 3 views · 0 comments
Recent Articles

Old Zhang's AI Learning
Mar 5, 2026 · Artificial Intelligence

Timber: The “Ollama” for Traditional Machine Learning Models

Timber is a multi‑pass compiler that transforms classic ML models such as XGBoost and LightGBM into zero‑dependency C99 binaries. It delivers microsecond‑level inference latency, HTTP‑compatible serving, and substantial performance gains over Python runtimes, making it well suited to high‑throughput, low‑latency production scenarios.

LightGBM · ML compiler · Timber
0 likes · 8 min read
Old Zhang's AI Learning
Mar 4, 2026 · Artificial Intelligence

Unlock the Full Power of LM Studio for Local LLM Deployment

This article explores LM Studio’s evolution into a complete local AI development platform, detailing version 0.4’s architectural overhaul, headless daemon, parallel request handling, stateful REST API, and UI refresh, along with a suite of hidden developer features: OpenAI‑ and Anthropic‑compatible APIs, CLI tools, native SDKs, and the LM Link remote‑model solution.

Anthropic API · CLI · LM Link
0 likes · 12 min read
Old Zhang's AI Learning
Mar 3, 2026 · Artificial Intelligence

How to Deploy and Fine‑Tune Qwen3.5 Small Models (0.8B‑9B) Locally

This guide walks you through deploying Qwen3.5's 0.8B, 2B, 4B, and 9B models on CPUs or modest GPUs using Unsloth's GGUF quantization, explains the hardware requirements, shows how to run them with llama.cpp, llama‑server, vLLM, or SGLang, and provides a free Colab fine‑tuning workflow with export options.

AI Models · Fine-tuning · GGUF
0 likes · 19 min read
Old Zhang's AI Learning
Mar 2, 2026 · Artificial Intelligence

Qwen3.5 Small Models Unveiled: From 0.8B to 9B with Full Capabilities

The article introduces the newly released Qwen3.5 small model series (0.8B, 2B, 4B, 9B), explains their shared Gated Delta Networks architecture, early multimodal token fusion, 201‑language support and up to 1 million‑token context, and presents benchmark data that show the 9B model rivaling much larger LLMs, followed by practical guidance on model selection and deployment.

Gated Delta Networks · Local Deployment · Qwen3.5
0 likes · 10 min read
Old Zhang's AI Learning
Mar 2, 2026 · Artificial Intelligence

Why the Qwen3.5 Series Makes Qwen3.5-27B the No‑Brainer Choice

The author reviews the Qwen3.5 model family, showing that the 27‑billion‑parameter dense Qwen3.5-27B offers the best balance of size, stability, low‑cost local deployment, and comprehensive capabilities, making it the default pick for most users.

AI benchmarking · Large Language Model · Local Deployment
0 likes · 6 min read
Old Zhang's AI Learning
Mar 2, 2026 · Operations

7 One-Click Automation Scenarios with Obsidian CLI to Supercharge Your Knowledge Management

This guide introduces the newly released Obsidian CLI, showing how to configure it and leverage seven automation scenarios—from instant idea capture and Git workflow integration to meeting timers, AI assistant linking, daily reviews, tmux shortcuts, and fuzzy file search—enabling rapid, command‑line‑driven knowledge management.

Automation · CLI · Obsidian
0 likes · 10 min read
Old Zhang's AI Learning
Mar 1, 2026 · Artificial Intelligence

OpenWork: Open‑Source Alternative to Claude Cowork with Full‑Feature Windows Client

OpenWork is an open‑source, local‑first replacement for Claude Cowork that packages AI agents into a desktop app usable by non‑technical teammates, offering multi‑threaded execution, automation, a reusable Skills system, and native Slack/Telegram integration; the article also gives a clear comparison against Claude Cowork and Codex.

AI agents · Automation · Claude Cowork
0 likes · 11 min read
Old Zhang's AI Learning
Feb 28, 2026 · Artificial Intelligence

How OpenAI Engineers Leverage Codex: 6 Proven Best Practices

The article reveals how OpenAI’s engineering teams integrate Codex into their daily workflows, detailing seven core application scenarios, from code understanding and refactoring to performance optimization and flow maintenance, and presenting six concrete best‑practice guidelines for maximizing AI‑assisted development efficiency.

AI code generation · Codex · Performance optimization
0 likes · 7 min read