Author

Baobao Algorithm Notes

Author of the BaiMian large model, offering technology and industry insights.

Latest from Baobao Algorithm Notes

Baobao Algorithm Notes

Jul 2, 2026 · Artificial Intelligence

How to Connect Chinese LLMs to Codex: A Hands‑On Tutorial

This article walks through installing Codex, adding the open‑source CC Switch tool, and configuring Chinese large language models such as Kimi so they can serve as the backend for Codex’s AI agent, with step‑by‑step screenshots and performance examples.

AI Agent · CC Switch · Chinese LLM

0 likes · 11 min read

Baobao Algorithm Notes

Jun 2, 2026 · Artificial Intelligence

MiniMax M3: How a 1M‑Token, Multimodal Agent Reproduces ICLR Research and Automates Kaggle Competitions

The MiniMax M3 model combines a 1‑million‑token context window, native multimodal training, and a new MiniMax Sparse Attention architecture that cuts token compute to one‑twentieth of its predecessor's and achieves up to 15× faster decoding. Its interactive user‑simulator training enables fully autonomous agents that can reproduce ICLR‑2025 research and tackle Auto‑Kaggle competitions at a fraction of the cost of Western models.

Auto Kaggle · M3 · MiniMax

0 likes · 9 min read

Baobao Algorithm Notes

May 26, 2026 · Artificial Intelligence

How On-Policy Distillation (OPD) Solves Core Challenges in Large-Model Post-Training

The article explains how On-Policy Distillation (OPD) combines on‑policy sampling with dense teacher feedback via reverse KL to address low signal density, distribution shift, and capability interference in large‑model post‑training, and compares implementations by Qwen3, GLM‑5, MiMo‑V2 and DeepSeek‑V4.

Model Compression · OPD · Reverse KL

0 likes · 20 min read

Baobao Algorithm Notes

May 22, 2026 · Artificial Intelligence

How LiteScale Cuts Wait Times in Large‑Model Post‑Training with Gradient Accumulation

The article examines the bottleneck of synchronous rollout in large‑model post‑training, proposes an asynchronous design using gradient accumulation and a global micro‑batch count to preserve loss equivalence, and introduces LogitsExpress for efficient top‑K knowledge‑distillation communication, all implemented in the lightweight LiteScale framework.

Distributed Training · Post-Training · asynchronous rollout

0 likes · 16 min read

Baobao Algorithm Notes

Apr 27, 2026 · Artificial Intelligence

Deep Dive into DeepSeek‑V4: Efficient Million‑Token Context, Hybrid Attention, and Muon Optimizer

The article provides an in‑depth technical analysis of DeepSeek‑V4, detailing its novel hybrid attention architecture (CSA and HCA), the manifold‑constrained hyper‑connection (mHC), massive KV‑cache reductions, FLOPs savings across token lengths, and the Muon optimizer with Newton‑Schulz orthogonalization, all backed by concrete benchmark tables and code snippets.

DeepSeek · Efficient Attention · KV cache reduction

0 likes · 61 min read

Baobao Algorithm Notes

Apr 20, 2026 · Industry Insights

From Prompt Writer to Harness Architect: Redefining the Algorithm Engineer in the LLM Era

The article analyzes how the rise of foundation models shifts algorithm engineers from hand‑crafting models to building robust Harness environments, detailing OpenAI’s agent‑first experiments, the new "Model + Harness" formula, and practical steps for staying valuable in a prompt‑centric world.

AI engineering · Harness architecture · LLM

0 likes · 9 min read

Baobao Algorithm Notes

Apr 14, 2026 · Industry Insights

Why Mastering AI Agents Is the Most Critical Skill Right Now

The article argues that leveraging AI agents like Claude Code is now the top priority for developers, explaining how agents boost productivity, the importance of their operating environment, and why embracing them is essential for future success in the AI-driven workplace.

Agent Training · Claude Code · LLM

0 likes · 10 min read

Baobao Algorithm Notes

Mar 20, 2026 · Artificial Intelligence

Can AI Self‑Iterate? Inside MiniMax M2.7’s Self‑Improving Magic

The article examines MiniMax M2.7’s claim of self‑iteration, its impressive Kaggle record, and a series of technical tests—including code refactoring, real‑time chart generation, futures backtesting, business analysis, PPT creation, and news tracking—to evaluate the model’s practical AI self‑evolution capabilities.

AI · AutoML · Kaggle

0 likes · 8 min read

Baobao Algorithm Notes

Mar 3, 2026 · Artificial Intelligence

Boosting LLM Post-Training with RL: Tips for Efficiency and Stability

This article shares practical insights and pitfalls from six months of applying reinforcement learning to fine‑tune large language models, covering exploration efficiency, training stability, model selection, and special considerations for thinking‑oriented agents.

AI · LLM · Post-Training

0 likes · 12 min read

Baobao Algorithm Notes

Mar 2, 2026 · Artificial Intelligence

How “Skills” Turn LLM Prompts into Portable, Engineered Workflows

This article dissects the evolution of LLM prompts into structured, version‑controlled skill packages, explains the AgentSkills specification, details OpenClaw’s implementation, compares prompts, memory, MCP and skills, and provides end‑to‑end examples with code, flowcharts and best‑practice recommendations.

Agent Skills · Automation · LLM

0 likes · 40 min read
