NewBeeNLP
Author

Always insightful, always fun

119 Articles · 0 Likes · 1 View · 0 Comments
Recent Articles

Latest from NewBeeNLP

NewBeeNLP
Jul 31, 2024 · Artificial Intelligence

How Continual Pre‑Training Boosts Llama‑3’s Chinese and Scientific Reasoning

This report presents a continual pre-training approach that significantly enhances Llama-3 (8B)'s Chinese language proficiency and scientific reasoning, using a carefully mixed corpus of existing and synthetic data. It details the bilingual-adaptation and synthetic-enhancement stages, the data-mixing and curriculum strategies (a toy mixing sketch follows this entry), and strong results across multilingual and scientific benchmarks achieved without sacrificing the model's original capabilities.

LLM · Llama-3 · benchmarking
0 likes · 9 min read
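
A hedged aside, not taken from the report itself: in practice, the data-mixing step such a curriculum describes boils down to sampling each training document from several corpora with scheduled proportions. The sketch below uses invented corpus names and weights purely for illustration.

```python
import random

# Hypothetical mixture weights, for illustration only; the report's actual
# corpus names and proportions are not reproduced here.
MIXTURE = {
    "english_web": 0.45,
    "chinese_web": 0.30,
    "synthetic_scientific": 0.15,
    "original_domain_replay": 0.10,  # replaying original-domain data guards against forgetting
}

def sample_source(rng: random.Random) -> str:
    """Pick which corpus the next training document is drawn from."""
    names = list(MIXTURE)
    return rng.choices(names, weights=[MIXTURE[n] for n in names], k=1)[0]

# Sanity check: empirical draw frequencies should track the scheduled weights.
rng = random.Random(0)
counts = {name: 0 for name in MIXTURE}
for _ in range(10_000):
    counts[sample_source(rng)] += 1
print(counts)
```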
NewBeeNLP
Jul 31, 2024 · Artificial Intelligence

Training 7B–13B LLMs: Practical Tips, Hyperparameters, and Scaling Challenges

The article shares hands-on experience from training 7- and 13-billion-parameter language models, covering the essential hyper-parameters (an illustrative configuration follows this entry), hardware requirements, data-quality considerations, and open dataset resources, as well as the systemic difficulties that arise when scaling toward trillion-parameter models.

LLM training · hyperparameters · large language models
0 likes · 8 min read
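
For orientation only: the values below are assumptions drawn from commonly published 7B training recipes, not numbers quoted from the article. They show the shape of the hyper-parameter set that pins down such a run.

```python
# Illustrative settings for pre-training a ~7B decoder-only LLM.
# Every value is an assumption in the commonly reported range.
CONFIG_7B = {
    "hidden_size": 4096,
    "num_layers": 32,
    "num_attention_heads": 32,
    "seq_length": 4096,
    "tokens_per_step": 4_000_000,    # global batch size measured in tokens
    "optimizer": "AdamW",
    "adam_betas": (0.9, 0.95),
    "weight_decay": 0.1,
    "peak_lr": 3e-4,                 # decayed toward min_lr over training
    "min_lr": 3e-5,
    "lr_schedule": "linear warmup + cosine decay",
    "warmup_steps": 2_000,
    "grad_clip_norm": 1.0,
    "precision": "bf16 mixed precision",
}
```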
NewBeeNLP
Jul 26, 2024 · Industry Insights

What the Leaked Llama 3.1 405B Reveals About Meta’s Newest LLM

A leaked 405‑billion‑parameter Llama 3.1 model shows mixed benchmark results—outperforming GPT‑4o on some tasks while lagging on others—along with massive hardware requirements, extensive training data, and new safety considerations that could reshape AI deployment.

Llama 3.1 · Meta
0 likes · 11 min read
NewBeeNLP
Jul 25, 2024 · Artificial Intelligence

Llama 3.1 Unveiled: How the New Open‑Source Giant Matches GPT‑4o and Claude 3.5

Meta has officially released Llama 3.1, a 405-billion-parameter open-source model that matches or surpasses GPT-4o and Claude 3.5 on over 150 benchmarks, expands the context window to 128K tokens, and supports eight languages. The release is accompanied by a detailed 100-page paper describing its data, training stack, architecture, quantization, safety measures, and ecosystem support.

AI safety · Llama 3.1 · Meta
0 likes · 15 min read
NewBeeNLP
Jul 24, 2024 · Industry Insights

From Black Iron to Silver: The Evolution of Large Model Infrastructure (2019‑2024)

The article traces the evolution of large‑model training and inference infrastructure from the early “black‑iron” era (2019‑2021) through the “golden” boom (2022‑2023) to the emerging “silver” phase (2024‑), highlighting key research breakthroughs, open‑source frameworks, hardware trends, market dynamics, and practical challenges for engineers entering the field.

AI infrastructure · Industry trends · Inference
0 likes · 22 min read
NewBeeNLP
Jul 22, 2024 · Artificial Intelligence

How Meta Scales User Modeling for Ads: Inside the SUM Framework

This article examines Meta's SUM (Scaling User Modeling) system, detailing its upstream‑downstream architecture, the SOAP online asynchronous serving platform, production optimizations, and extensive offline and online experiments that demonstrate significant gains in ad personalization performance.

Meta · Recommendation Systems · deep learning
0 likes · 19 min read
NewBeeNLP
Jul 16, 2024 · Artificial Intelligence

Can Item Language Models Bridge LLMs and Collaborative Filtering for Conversational Recommendation?

This paper identifies three challenges in applying large language models to recommendation systems and proposes an Item Language Model that pairs a trainable item encoder with a frozen LLM (a minimal encoder sketch follows this entry). Extensive experiments demonstrate that language-item alignment and interaction knowledge significantly improve conversational recommendation performance.

Q-Former · collaborative filtering · conversational recommendation
0 likes · 10 min read
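
A minimal sketch of the bridging idea the summary describes, under stated assumptions: PyTorch, with module names and dimensions of my own choosing rather than the paper's. A trainable, Q-Former-style item encoder compresses item features into a few "soft tokens" in the frozen LLM's embedding space.

```python
import torch
import torch.nn as nn

class ItemEncoder(nn.Module):
    """Q-Former-style bridge: compress item features into k soft tokens
    that can be prepended to a frozen LLM's text embeddings."""

    def __init__(self, item_dim: int, llm_dim: int, num_query_tokens: int = 8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_query_tokens, llm_dim))
        self.item_proj = nn.Linear(item_dim, llm_dim)
        self.attn = nn.MultiheadAttention(llm_dim, num_heads=8, batch_first=True)

    def forward(self, item_feats: torch.Tensor) -> torch.Tensor:
        # item_feats: (batch, num_items, item_dim) -> (batch, k, llm_dim)
        kv = self.item_proj(item_feats)
        q = self.queries.unsqueeze(0).expand(item_feats.size(0), -1, -1)
        soft_tokens, _ = self.attn(q, kv, kv)
        return soft_tokens

# Usage: only the encoder trains; the LLM's weights stay frozen.
encoder = ItemEncoder(item_dim=64, llm_dim=1024)
soft_tokens = encoder(torch.randn(2, 5, 64))  # -> (2, 8, 1024)
```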
NewBeeNLP
Jul 10, 2024 · Artificial Intelligence

Can Large Language Models Master Co‑Temporal Reasoning? Introducing COTEMPQA

This article presents the COTEMPQA benchmark for evaluating large language models on co-temporal reasoning, covering its four scenario types, construction pipeline, experimental results across models, and error analysis. It then proposes the MR-COT strategy, which leverages mathematical reasoning to significantly improve performance.

LLM evaluation · MR-COT · benchmark dataset
0 likes · 11 min read
NewBeeNLP
Jul 8, 2024 · Artificial Intelligence

How LLMs Transform Recommendation Systems: The LEARN Framework Explained

This article reviews the Kuaishou paper on adapting large language models for recommendation, detailing the LEARN framework's dual-tower architecture, embedding generation, and loss functions (an InfoNCE sketch follows this entry), along with experimental results addressing cold-start and long-tail challenges in modern recommender systems.

InfoNCE · LLM · Long Tail
0 likes · 8 min read
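
A minimal sketch of the InfoNCE objective a dual-tower setup like LEARN's typically optimizes (PyTorch; function and variable names are mine, not the paper's): matched user-item pairs in a batch are pulled together while in-batch mismatches serve as negatives.

```python
import torch
import torch.nn.functional as F

def info_nce(user_emb: torch.Tensor, item_emb: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """user_emb, item_emb: (batch, dim); row i of each tower is a positive pair."""
    user_emb = F.normalize(user_emb, dim=-1)
    item_emb = F.normalize(item_emb, dim=-1)
    logits = user_emb @ item_emb.T / temperature  # (batch, batch) cosine similarities
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)        # diagonal entries are the positives

# Example with random tower outputs:
loss = info_nce(torch.randn(32, 128), torch.randn(32, 128))
```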