Baobao Algorithm Notes
Dec 16, 2024 · Artificial Intelligence

What Do Leading Open‑Source LLMs Do After Pretraining? A Deep Dive into Post‑Training Strategies

This article surveys the post‑training pipelines of major open‑source large language models released in 2024, detailing their alignment algorithms, data synthesis, reward modeling, DPO/GRPO variants, long‑context handling, tool use, and model‑averaging techniques, and highlights emerging trends such as data‑centric pipelines and iterative weak‑to‑strong alignment.

AI research · LLM · alignment
99 min read
NewBeeNLP
Oct 11, 2024 · Artificial Intelligence

Inside Llama 3: Training, Architecture, and Performance Secrets

An extensive review of Meta’s Llama 3 breaks down its pre‑training data pipeline, scaling laws, and architectural tweaks such as GQA and RoPE, covers post‑training methods including SFT, DPO, and reward modeling, and evaluates benchmark results, offering practical insights for researchers and engineers building large language models.

Llama 3 · Quantization · benchmarking
32 min read
Baobao Algorithm Notes
Jul 25, 2024 · Artificial Intelligence

Why LLaMA 3 405B Matches GPT‑4o: Architecture, Training, and Industry Impact

The article provides an in‑depth analysis of LLaMA 3 405B, covering its dense Transformer architecture, three‑stage pre‑training (initial, long‑context, and annealing), iterative post‑training with reward‑model‑guided rejection sampling, the decision against a mixture‑of‑experts (MoE) design, and the broader implications for both large and small model development.

405B · Synthetic Data · model architecture
17 min read