Old Zhang's AI Learning
Author

AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.

141 articles · 0 likes · 3 views · 0 comments
Recent Articles

Latest from Old Zhang's AI Learning

100 recent articles max
Old Zhang's AI Learning
Apr 19, 2026 · Artificial Intelligence

8 Hard-Hitting AI Career Tips from Andrew Ng’s Stanford Lecture

In a dense 1‑hour‑44‑minute Stanford talk, Andrew Ng outlines eight actionable insights for AI professionals, including the rapid acceleration of AI capabilities, the shift from coding to product decisions, the importance of product intuition, rapid iteration, staying on cutting‑edge tools, leveraging supportive communities, and evaluating the debt created by AI‑generated code.

AI · AI tools · Agentic AI
0 likes · 8 min read
Old Zhang's AI Learning
Apr 18, 2026 · Artificial Intelligence

How to Run MiniMax‑M2.7 on Mac: Comparing Two Quantization Paths

This article explains why standard uniform quantization fails for the 228‑billion‑parameter MiniMax‑M2.7 MoE model on macOS and compares two practical solutions: JANGTQ + MLX Studio, whose 2‑bit mixed‑precision quantization reaches 91.5% on MMLU in 56.5 GB, and LM Studio + GGUF, which is easier to set up but requires at least 138 GB of RAM and yields lower accuracy.

JANGTQ · LM Studio · MLX Studio
0 likes · 8 min read
Old Zhang's AI Learning
Apr 18, 2026 · Artificial Intelligence

NVIDIA Nemotron 3 Super: 7× Faster Than Qwen3.5 – Inside Hybrid Mamba‑Attention, LatentMoE, and MTP

NVIDIA’s Nemotron 3 Super, a 120.6 B‑parameter flagship model supporting 1 M‑token context, combines Hybrid Mamba‑Attention, LatentMoE, and Multi‑Token Prediction to achieve up to 7.5× higher inference throughput than Qwen3.5 while matching or surpassing its accuracy across a range of benchmarks.

Hybrid Mamba-Attention · Large Language Model · LatentMoE
0 likes · 11 min read
Old Zhang's AI Learning
Apr 17, 2026 · Artificial Intelligence

Four Powerful Projects to Supercharge Your Claude Code

This article reviews four high‑quality open‑source Claude Code ecosystem projects—Everything Claude Code, GacUI CLAUDE.md, Waza, and Ars Contexta—detailing their core capabilities, installation steps, unique workflows, and practical recommendations for different developer needs.

AI Agent · Claude Code · knowledge management
0 likes · 13 min read
Old Zhang's AI Learning
Apr 17, 2026 · Artificial Intelligence

Google Strikes Back: Gemini’s New Features Take on Claude Code

The article reviews Google Gemini’s three‑pronged rollout: a Mac desktop app with global shortcuts and window sharing, a Gemini CLI enhanced with Subagents that keep context clean and run expert tasks in parallel, and the Gemini 3.1 Flash TTS model with Audio Tags. It compares each to competitors and highlights practical use cases and limitations.

AI coding · Artificial Intelligence · Gemini CLI
0 likes · 12 min read
Old Zhang's AI Learning
Apr 17, 2026 · Artificial Intelligence

How DFlash Achieves 8× Lossless Acceleration for Large‑Model Inference (Qwen3.5‑27B Example)

The article explains how DFlash’s block‑diffusion draft model and KV Injection boost speculative decoding speed by 5‑8× without sacrificing output quality, and how DDTree further raises the gain to over 8×, backed by benchmark results and integration guides for major inference frameworks.

DDTree · DFlash · Large Language Model Inference
0 likes · 7 min read
Old Zhang's AI Learning
Apr 16, 2026 · Artificial Intelligence

Claude Opus 4.7 Arrives with a Massive Leap in Programming Power

Claude Opus 4.7 dramatically outperforms Opus 4.6 and rivals GPT‑5.4 and Gemini 3.1 Pro across benchmarks, boosts programming task success by up to 13%, triples bug‑fixing on SWE‑bench, raises visual resolution three‑fold, adds a finer‑grained xhigh effort level, tightens security controls, and keeps pricing unchanged.

AI model · Claude · Opus 4.7
0 likes · 10 min read
Old Zhang's AI Learning
Apr 16, 2026 · Artificial Intelligence

How a 24/7 Online AI Assistant Transforms My Workflow

The article reviews iFlytek’s cloud‑based AstronClaw and desktop AI companion Loomy, detailing their installation steps, six core use‑cases, built‑in skills, model options, and permission settings, and concludes with a side‑by‑side comparison that helps readers decide which 24/7 AI agent best fits their workflow.

AI Agent · AstronClaw · Loomy
0 likes · 8 min read
Old Zhang's AI Learning
Apr 15, 2026 · Industry Insights

Is the Era of Commercial‑Ready Chinese Open‑Source LLMs Ending? MiniMax M2.7 License Update

The MiniMax M2.7 model changed its open‑source license to forbid commercial use, igniting a heated debate over what counts as commercial activity. A community clarification stated that self‑hosted coding remains free, and a revised license now explicitly permits personal, academic, and non‑profit use. The article also highlights the broader market pressure from cloud providers that is reshaping the open‑source LLM ecosystem.

AI industry · Kimi · LLM licensing
0 likes · 10 min read