Baobao Algorithm Notes
Author

Baobao Algorithm Notes

Author of the BaiMian large model, offering technology and industry insights.

295
Articles
0
Likes
379
Views
0
Comments
Recent Articles

Latest from Baobao Algorithm Notes

100 recent articles max
Baobao Algorithm Notes
Baobao Algorithm Notes
Aug 15, 2025 · Artificial Intelligence

Unlocking LLM Performance: Classic Deep RL Tricks Reimagined for Modern Training

This article systematically adapts classic deep reinforcement‑learning techniques—such as multi‑step returns, TD(λ)/GAE, V‑trace corrections, uncertainty‑aware weighting, safety constraints, distribution‑robust optimization, and value‑guided decoding—to improve large language model training and inference, providing concrete formulas, implementation tips, and empirical results.

Deep RLGAELLM
0 likes · 17 min read
Unlocking LLM Performance: Classic Deep RL Tricks Reimagined for Modern Training
Baobao Algorithm Notes
Baobao Algorithm Notes
Aug 14, 2025 · Artificial Intelligence

Why Standard SFT Fails to Generalize and How One‑Line Dynamic Fine‑Tuning Fixes It

The article analyzes the poor generalization of supervised fine‑tuning (SFT) for large language models, reveals its gradient as a high‑variance inverse‑probability policy gradient, proposes a one‑line Dynamic Fine‑Tuning correction, and shows substantial gains on challenging math and offline RL benchmarks.

Dynamic Fine-TuningGeneralizationLLM alignment
0 likes · 7 min read
Why Standard SFT Fails to Generalize and How One‑Line Dynamic Fine‑Tuning Fixes It
Baobao Algorithm Notes
Baobao Algorithm Notes
Aug 11, 2025 · Industry Insights

Why AI Infrastructure Must Be Close to Models and Hardware – Insights from Zhu Yibo

In a WAIC 2025 interview, Zhu Yibo, co‑founder of Jiejie Xingchen, shares deep insights on AI infrastructure, covering its evolution, the need for tight model‑hardware co‑design, cost‑efficiency metrics, industry challenges, and future directions for large‑scale AI systems.

AI Infrastructurehardware optimizationindustry insights
0 likes · 36 min read
Why AI Infrastructure Must Be Close to Models and Hardware – Insights from Zhu Yibo
Baobao Algorithm Notes
Baobao Algorithm Notes
Aug 4, 2025 · Artificial Intelligence

Why GPT‑OSS Chooses a 64‑Dimensional Attention Head and 2880 Hidden Size

This article analyzes the surprising design choices of the rumored GPT‑OSS 120B model, explaining the rationale behind a 64‑dimensional attention head, the equal hidden and intermediate sizes, and other quirky parameters such as MLP bias and KV‑sink SWA, backed by theoretical formulas and empirical benchmarks.

Attention HeadGPT-OSSMLP Ratio
0 likes · 13 min read
Why GPT‑OSS Chooses a 64‑Dimensional Attention Head and 2880 Hidden Size
Baobao Algorithm Notes
Baobao Algorithm Notes
Aug 1, 2025 · Artificial Intelligence

Unlocking Qwen3-Coder-30B: Features, Fast Start, and Agentic Coding Guide

The article introduces Qwen3‑Coder‑30B‑A3B‑Instruct (aka Qwen3‑Coder‑Flash), detailing its architecture, 256K‑to‑1M token context, agentic coding capabilities, installation steps with Transformers, sample code for tool use, optimal sampling parameters, and deployment tips across various runtimes.

AI coding assistantAgentic CodingQwen3
0 likes · 6 min read
Unlocking Qwen3-Coder-30B: Features, Fast Start, and Agentic Coding Guide
Baobao Algorithm Notes
Baobao Algorithm Notes
Jul 29, 2025 · Artificial Intelligence

Qwen3‑30B‑A3B‑Instruct‑2507: New Instruction Model with Boosted General and Multilingual Skills

The Qwen3‑30B‑A3B‑Instruct‑2507 model, an updated non‑thinking version of Qwen3‑30B‑A3B, delivers significant gains in instruction following, reasoning, multilingual knowledge coverage, and 256K context length, and its performance is benchmarked against leading LLMs across a wide range of tasks.

Instruction TuningMixture‑of‑ExpertsQwen3
0 likes · 6 min read
Qwen3‑30B‑A3B‑Instruct‑2507: New Instruction Model with Boosted General and Multilingual Skills
Baobao Algorithm Notes
Baobao Algorithm Notes
Jul 28, 2025 · Industry Insights

Why AWS Bedrock AgentCore Signals a New Era for Agentic AI Infrastructure

The article analyzes AWS Bedrock AgentCore and related hardware and software requirements for Agentic AI, covering runtime isolation with microVMs, memory architectures, identity and gateway design, zero‑trust networking, and the challenges of multi‑tenant KVCache and context engineering.

AWS BedrockInfrastructureMicroVM
0 likes · 15 min read
Why AWS Bedrock AgentCore Signals a New Era for Agentic AI Infrastructure
Baobao Algorithm Notes
Baobao Algorithm Notes
Jul 18, 2025 · Artificial Intelligence

30+ Expert Q&A on Large Language Model Architecture, Training, and Deployment

This article compiles more than thirty interview‑style questions and detailed answers covering large‑model fundamentals such as encoder‑decoder trade‑offs, self‑attention versus RNN, context length, tokenization, embedding strategies, FlashAttention, RoPE, prompt design, retrieval‑augmented generation, safety measures, fine‑tuning, and model distillation, providing a comprehensive technical reference for practitioners.

attention mechanismretrieval-augmented generation
0 likes · 53 min read
30+ Expert Q&A on Large Language Model Architecture, Training, and Deployment
Baobao Algorithm Notes
Baobao Algorithm Notes
Jul 17, 2025 · Artificial Intelligence

How QK-Clip Tames MaxLogit Explosions in Trillion‑Parameter LLMs

The article introduces QK-Clip, a lightweight per‑head weight‑clipping technique that uses the MaxLogit signal to prevent uncontrolled logit growth in massive LLMs, explains its design, compares it with prior methods, and shows that it stabilizes training without harming model performance.

Attention stabilityLLM trainingMaxLogit
0 likes · 15 min read
How QK-Clip Tames MaxLogit Explosions in Trillion‑Parameter LLMs