Wu Shixiong's Large Model Academy
Author

We regularly share large‑model know‑how to help you master the core skills (LLM, RAG, fine‑tuning, deployment) from zero to job offer, tailored for career switchers, autumn campus‑recruitment candidates, and anyone seeking a stable large‑model position.

111 articles · 0 likes · 237 views · 0 comments
Recent Articles

Showing up to the 100 most recent articles.
Wu Shixiong's Large Model Academy
May 26, 2026 · Artificial Intelligence

Why Anthropic Caps SKILL.md at Under 5K Tokens and How to Structure Yours

The article explains Anthropic's official 5K‑token limit for SKILL.md files, breaks down the three‑level loading architecture, demonstrates progressive disclosure with concrete token calculations, and provides a step‑by‑step refactoring guide that reduces token usage while improving skill accuracy.

AI engineering · Anthropic · Claude
0 likes · 16 min read
Wu Shixiong's Large Model Academy
May 13, 2026 · Artificial Intelligence

How to Explain a Jump from 71% to 94% Tool‑Calling Accuracy in a JD Interview

The article walks through a JD interview scenario where a candidate explains how a tool‑calling accuracy metric rose from 71% to 94% by detailing the full SFT data‑engineering pipeline, teacher‑model trajectory generation, quality validation, evaluation methodology, and interview‑ready talking points.

Data engineering · Evaluation · Interview Preparation
0 likes · 19 min read
Wu Shixiong's Large Model Academy
Apr 30, 2026 · Artificial Intelligence

When Is Claude Code’s Memory Injected into system_prompt? Interview Insight

The article explains that Claude Code loads persisted memory once at REPL startup via _build_system(), inserts it as the 10th segment of system_prompt, enforces a 200‑line limit on MEMORY.md, deliberately avoids side‑effects in get_memory_dir(), and only refreshes the prompt with the /model command.

Claude Code · Interview Preparation · LLM
0 likes · 11 min read
Wu Shixiong's Large Model Academy
Apr 29, 2026 · Interview Experience

ByteDance Interviewer Asks: What Rank r Do You Use for LoRA? I Said 64—He Said I'm Wasting GPU Memory

The article examines a common interview scenario where candidates are asked about LoRA rank selection, outlines two typical mistakes—guessing or staying silent—and presents a three‑step strategy of honest boundary setting, logical derivation, and asking a focused question, illustrating the approach with concrete LoRA calculations and a vLLM case study.

AI engineering · LoRA · Interview Strategy
0 likes · 13 min read
Wu Shixiong's Large Model Academy
Apr 28, 2026 · Artificial Intelligence

Why Bigger Context Fails for Deep Research Agents and How IterResearch Fixes It

Interviewers point out that simply enlarging the LLM’s context window cannot prevent forgetting early conclusions in long‑step Deep Research tasks; the article explains the ReAct context issues, introduces the IterResearch framework with evolving reports, and compares its accuracy, cost, and scalability against ReAct and ReSum.

Context management · Deep Research · IterResearch
0 likes · 17 min read
Wu Shixiong's Large Model Academy
Apr 27, 2026 · Artificial Intelligence

Can Your RAG Pass the Demo? Scaling to 5,000 Docs for Reliable Answers

The article walks through the practical challenges of turning a RAG demo into a production system for 5,000 insurance documents, covering knowledge‑base chunking, embedding model selection, recall‑threshold tuning, hybrid vector‑BM25 retrieval, intent‑aware query routing, prompt constraints, confidence scoring, and operational scaling, with concrete metrics and code examples.

Embedding · Hybrid retrieval · Intent Recognition
0 likes · 16 min read
Wu Shixiong's Large Model Academy
Apr 23, 2026 · Industry Insights

Should You Take a Tencent AI Internship? Key Factors to Consider

The article examines whether a Tencent AI internship is worth pursuing by analyzing the program’s growth stage, unique user ecosystem, mentorship structure, compensation model, and early‑year advantages, illustrated with real intern case studies, to help students decide what they aim to gain from the experience.

AI internship · AI research · Career Guidance
0 likes · 14 min read
Wu Shixiong's Large Model Academy
Apr 22, 2026 · Artificial Intelligence

How to Classify and Manage Agent Memories for Better Retrieval

This article dissects Claude Code's memory system, explains why unstructured memory degrades performance, introduces four distinct memory types with concrete examples and schema, shows how to handle expiration and retrieval strategies, and provides step‑by‑step implementation code to improve agent reliability.

Agent Memory · LLM · Python
0 likes · 19 min read
Wu Shixiong's Large Model Academy
Apr 21, 2026 · Artificial Intelligence

When Should an LLM Agent Extract Memory? A Deep Dive into Trigger Strategies

The article analyzes why memory extraction in LLM‑driven agents incurs cost, compares four frameworks—Claude Code, Generative Agents, MemGPT, and Mem0—detailing their trigger mechanisms, concurrency handling, and trade‑offs, and offers practical guidance for choosing the right strategy in real‑time, social, or batch‑processing scenarios.

AI engineering · Agent design · LLM
0 likes · 18 min read
Wu Shixiong's Large Model Academy
Apr 20, 2026 · Artificial Intelligence

Why Java Skills Alone Won’t Cut It for LLM Application Engineering

The article debunks the myth that Java developers only need a bit of AI knowledge to succeed in LLM application roles, explaining the full engineering stack—from retrieval and prompt design to deployment and performance tuning—through real‑world examples, metrics, and interview‑ready advice.

AI engineering · Backend · Interview Preparation
0 likes · 13 min read