Wu Shixiong's Large Model Academy

We continuously share large‑model know‑how, helping you master core skills—LLM, RAG, fine‑tuning, deployment—from zero to job offer, tailored for career‑switchers, autumn recruiters, and those seeking stable large‑model positions.

107 articles · 0 likes · 33 views · 0 comments
Recent Articles

Wu Shixiong's Large Model Academy
Dec 1, 2025 · Artificial Intelligence

Why ReAct Is the Dominant Framework for Building Reliable AI Agents

The article explains why the ReAct (Reason + Act) framework outperforms simple Chain‑of‑Thought prompting by adding executable actions, environment‑state awareness, and feedback loops, turning large language models into controllable, reproducible, and error‑recoverable agents suited to real‑world applications and interview discussions.

Function Call · Interview Tips · ReAct
0 likes · 9 min read
Wu Shixiong's Large Model Academy
Nov 27, 2025 · Artificial Intelligence

Why Your Enterprise AI Agent Fails and How to Fix the Four Biggest Pitfalls

This article explains why many enterprise AI agents break down in real projects, identifies four common pitfalls—including mistaking agents for chatbots, lacking schema‑level tool logic, missing memory and variable injection, and absent end‑to‑end pipelines—and offers concrete engineering solutions to build robust, task‑driven agents.

AI Agent · End-to-End Pipeline · Enterprise AI
0 likes · 8 min read
Wu Shixiong's Large Model Academy
Nov 24, 2025 · Artificial Intelligence

Why Dynamic Function Routing Is the Key to Stable LLM Agents

This article shows that in real‑world LLM agents, giving the model too many tools at once leads to frequent function‑call errors, demonstrates how dynamic function routing that narrows the candidate set dramatically reduces the error rate—from over 20% down to around 1%—and offers clear guidelines on when and how to implement it.

Agent · Dynamic Routing · Function Calling
0 likes · 9 min read
Wu Shixiong's Large Model Academy
Nov 22, 2025 · Artificial Intelligence

Why Your RAG System Slows Down Over Time and How to Fix It

The article explains why a production Retrieval‑Augmented Generation (RAG) system becomes slower as it runs—due to growing embedding costs, expanding vector databases, heavier re‑ranking, and larger prompts—and provides concrete engineering optimizations such as batching, async concurrency, caching, partitioned retrieval, HNSW tuning, replica scaling, answer caching, and prompt sparsification to keep performance stable.

AI engineering · Performance optimization · RAG
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 21, 2025 · Artificial Intelligence

How to Build a Multi‑Layer Cache for Dynamic RAG Systems

This article explains why dynamic Retrieval‑Augmented Generation (RAG) requires a layered caching strategy rather than simple result caching, details a four‑level cache architecture—including embedding, search, answer, and pipeline caches—provides practical key‑generation and TTL guidelines, and outlines dirty‑data defenses to keep caches consistent and performant.

AI engineering · Caching · LLM
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 20, 2025 · Artificial Intelligence

How to Build a Quantifiable Data Quality Framework for Dynamic Incremental RAG

This article explains why static RAG metrics don’t apply to dynamic pipelines, introduces five essential dimensions—Parseability, Deduplication, Relevance, Chunk Quality, and Freshness—and shows how to combine them into a weighted score that enables monitoring, alerts, and continuous improvement of dynamic RAG systems.

Data quality · Dynamic RAG · Metrics
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 19, 2025 · Artificial Intelligence

How to Build a Reliable Dynamic Incremental RAG Pipeline for Real‑Time Data

This article explains why dynamic incremental RAG is harder than static RAG, identifies the three main points where recall accuracy breaks, and presents a three‑stage engineering pipeline—including a quality‑control layer, two‑stage retrieval, and reference‑injection generation—to keep real‑time data retrieval both accurate and robust.

AI · Dynamic Data · Quality Control
0 likes · 13 min read
Wu Shixiong's Large Model Academy
Nov 18, 2025 · Artificial Intelligence

How to Make LLM Agents’ Function Calls Stable and Accurate: 5 Proven Strategies

This article breaks down why function‑call reliability is the biggest bottleneck for LLM agents and presents a systematic five‑step loop—schema quality, prompt context, sampling, training data, and runtime defenses—plus concrete optimization techniques such as dynamic tool routing, plan‑execute, validation layers, memory injection, and log‑driven tuning, illustrated with real‑world cases.

Agent · Function Call · LLM
0 likes · 12 min read
Wu Shixiong's Large Model Academy
Nov 16, 2025 · Artificial Intelligence

How to Slash RAG First‑Token Latency: Practical Engineering Strategies

This guide breaks down the three layers of a RAG pipeline—embedding, vector retrieval, and system architecture—and provides concrete engineering tactics such as batch embedding, async concurrency, caching, ANN indexing, partitioning, connection pooling, and async pipelines to dramatically reduce Time‑to‑First‑Token latency.

Async Pipeline · Embedding · RAG
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 15, 2025 · Artificial Intelligence

How to Build Robust Function Call Training Data for LLM Agents

This article explains why function call capabilities in large language model agents require dedicated training, outlines the four core abilities to teach, describes the structure and sources of effective training data, and compares lightweight LoRA fine‑tuning with full supervised fine‑tuning approaches.

Data Generation · Fine-tuning · LLM training
0 likes · 11 min read