Author

Wu Shixiong's Large Model Academy

We continuously share large‑model know‑how, helping you master core skills—LLM, RAG, fine‑tuning, deployment—from zero to job offer, tailored for career‑switchers, autumn recruiters, and those seeking stable large‑model positions.

111

Articles

Likes

242

Views

Comments

Latest from Wu Shixiong's Large Model Academy

100 recent articles max

Wu Shixiong's Large Model Academy

Nov 19, 2025 · Artificial Intelligence

How to Build a Reliable Dynamic Incremental RAG Pipeline for Real‑Time Data

This article explains why dynamic incremental RAG is harder than static RAG, identifies the three main points where recall accuracy breaks, and presents a three‑stage engineering pipeline—including a quality‑control layer, two‑stage retrieval, and reference‑injection generation—to keep real‑time data retrieval both accurate and robust.

AIDynamic DataRAG

0 likes · 13 min read

How to Build a Reliable Dynamic Incremental RAG Pipeline for Real‑Time Data

Wu Shixiong's Large Model Academy

Nov 18, 2025 · Artificial Intelligence

How to Make LLM Agents’ Function Calls Stable and Accurate: 5 Proven Strategies

This article breaks down why function‑call reliability is the biggest bottleneck for LLM agents and presents a systematic five‑step loop—schema quality, prompt context, sampling, training data, and runtime defenses—plus concrete optimization techniques such as dynamic tool routing, plan‑execute, validation layers, memory injection, and log‑driven tuning, illustrated with real‑world cases.

AgentLLMTool Routing

0 likes · 12 min read

How to Make LLM Agents’ Function Calls Stable and Accurate: 5 Proven Strategies

Wu Shixiong's Large Model Academy

Nov 16, 2025 · Artificial Intelligence

How to Slash RAG First‑Token Latency: Practical Engineering Strategies

This guide breaks down the three layers of a RAG pipeline—embedding, vector retrieval, and system architecture—and provides concrete engineering tactics such as batch embedding, async concurrency, caching, ANN indexing, partitioning, connection pooling, and async pipelines to dramatically reduce Time‑to‑First‑Token latency.

Async PipelineEmbeddingRAG

0 likes · 10 min read

How to Slash RAG First‑Token Latency: Practical Engineering Strategies

Wu Shixiong's Large Model Academy

Nov 15, 2025 · Artificial Intelligence

How to Build Robust Function Call Training Data for LLM Agents

This article explains why function call capabilities in large language model agents require dedicated training, outlines the four core abilities to teach, describes the structure and sources of effective training data, and compares lightweight LoRA fine‑tuning with full supervised fine‑tuning approaches.

Agent SystemsData GenerationLLM training

0 likes · 11 min read

How to Build Robust Function Call Training Data for LLM Agents

Wu Shixiong's Large Model Academy

Nov 14, 2025 · Artificial Intelligence

How to Engineer Reliable Function Calls for LLM Agents: An End‑to‑End Framework

This article explains why function‑call accuracy is critical for LLM agents, identifies four common failure causes, and presents a systematic, five‑step engineering framework—including dynamic routing, chain‑of‑thought planning, result validation, memory injection, and log‑driven optimization—backed by concrete examples and quantitative improvements.

Agent EngineeringInterview PreparationLLM

0 likes · 10 min read

How to Engineer Reliable Function Calls for LLM Agents: An End‑to‑End Framework

Wu Shixiong's Large Model Academy

Nov 12, 2025 · Artificial Intelligence

Agent Memory Modules Explained: Short‑Term vs Long‑Term Strategies for LLM Agents

This article breaks down the memory systems behind LLM‑based agents, explaining why persistent memory is needed, the differences between short‑term context buffers and long‑term vector stores, practical implementation choices, maintenance strategies, and how to articulate these concepts effectively in technical interviews.

AgentLLMRetrieval

0 likes · 14 min read

Agent Memory Modules Explained: Short‑Term vs Long‑Term Strategies for LLM Agents

Wu Shixiong's Large Model Academy

Nov 6, 2025 · Artificial Intelligence

How to Optimize RAG Knowledge Base Construction: Parsing, Chunking, and Retrieval

This article explains why building a high‑quality RAG knowledge base is critical, outlines offline parsing techniques for multi‑format documents, presents semantic chunking strategies that preserve structure and context, and shows how to answer interview questions with a robust, production‑ready pipeline.

AI InterviewChunkingKnowledge Base

0 likes · 8 min read

How to Optimize RAG Knowledge Base Construction: Parsing, Chunking, and Retrieval

Wu Shixiong's Large Model Academy

Nov 5, 2025 · Artificial Intelligence

Why Production-Ready RAG Is Ten Times Harder Than a Simple Demo

Building a Retrieval‑Augmented Generation (RAG) system may be straightforward in code, but making it reliable, accurate, and scalable in production involves challenges across data preparation, vector retrieval, query rewriting, generation control, and system integration, turning a demo into a truly useful AI service.

AILLMPrompt Engineering

0 likes · 8 min read

Why Production-Ready RAG Is Ten Times Harder Than a Simple Demo

Wu Shixiong's Large Model Academy

Nov 4, 2025 · Artificial Intelligence

Why Financial RAG Fails and How to Solve Its Core Challenges

This article explains why Retrieval‑Augmented Generation (RAG) projects in the financial sector often underperform, highlighting data‑structure complexities, document‑parsing hurdles, chunking strategies, compliance constraints, evaluation metrics, and engineering requirements, and offers practical solutions and code examples.

ChunkingComplianceEngineering

0 likes · 10 min read

Why Financial RAG Fails and How to Solve Its Core Challenges

Wu Shixiong's Large Model Academy

Nov 2, 2025 · Artificial Intelligence

Why Document Parsing Is the Real Bottleneck in RAG Projects (And How to Fix It)

The article explains that in Retrieval‑Augmented Generation projects the hardest challenge lies in robust document parsing—handling PDFs, PPTs, scanned contracts, OCR errors, and preserving structure—to ensure high‑quality retrieval and avoid hallucinations.

AIOCRRAG

0 likes · 10 min read

Why Document Parsing Is the Real Bottleneck in RAG Projects (And How to Fix It)