Tagged articles

11 articles

Machine Learning Algorithms & Natural Language Processing

Mar 14, 2026 · Artificial Intelligence

Can Large Language Models Get Stronger Without Human Language Training? A New Pre‑Pre‑Training Path

A recent study shows that pre‑training Transformers on synthetic, non‑language data generated by Neural Cellular Automata can boost language‑model performance by up to 6%, accelerate convergence by 40%, and improve downstream reasoning, even outperforming models trained on massive natural‑text corpora.

In-Context Learning · Neural Cellular Automata · Pre‑training

0 likes · 12 min read

AI Architecture Hub

Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Embedding · LLM

0 likes · 9 min read

AI Algorithm Path

Apr 2, 2025 · Artificial Intelligence

Master the Three Essential LLM Training Stages for 2025

The article breaks down the three core stages of large‑language‑model training—pre‑training, supervised fine‑tuning, and RLHF—explaining their purpose, methods, and concrete examples while noting DeepSeek‑R1’s recent breakthrough and its implications for AI development.

AI training · DeepSeek · LLM

0 likes · 5 min read

Architect

Feb 19, 2025 · Artificial Intelligence

Does Scaling Law Still Hold for Grok 3? A Deep Dive into LLM Training Economics

The article examines whether the pre‑training scaling law still applies to Grok 3, compares its compute usage and model size with DeepSeek and OpenAI models, evaluates the cost‑effectiveness of pre‑training, RL, and test‑time scaling, and explores how these findings shape future large‑language‑model development strategies.

Grok-3 · Large Language Models · Pre‑training

0 likes · 11 min read

Fighter's World

Dec 21, 2024 · Artificial Intelligence

Is Pre‑training Coming to an End? Evaluating Data Sufficiency

The article examines Ilya Sutskever’s claim that pre‑training will end, argues that scaling laws still hold and data is not yet a bottleneck, highlights the scarcity of high‑quality frontier data, and explains why the industry is shifting toward inference‑time compute (o1) as a more sustainable path for large language models.

AI trends · Data Wall · Inference‑time Compute

0 likes · 13 min read

DataFunTalk

Jul 2, 2024 · Artificial Intelligence

Application of Large Language Models in Recommendation Systems: Overview and Future Directions

This article surveys how large language models (LLMs) are applied in recommendation systems. It covers two main paradigms (LLM+RS, where the LLM is one component, and the LLM as a standalone recommender), their impact on pre‑training, fine‑tuning, and prompting, and open research challenges.

Fine-tuning · Future Directions · LLM

0 likes · 6 min read

DataFunSummit

Feb 17, 2024 · Artificial Intelligence

When to Pre‑Train Graph Neural Networks: Data‑Active Pre‑Training and a Graph Generator Framework

This article examines the conditions under which graph neural network pre‑training is beneficial, proposes a data‑centric generator framework to assess transferability, introduces a data‑active pre‑training strategy that selects informative graphs, and presents experimental results showing that using less, well‑chosen data can outperform full‑scale pre‑training.

Pre‑training · data selection · graph generator

0 likes · 16 min read

Rare Earth Juejin Tech Community

Jul 24, 2023 · Artificial Intelligence

Comprehensive Survey of Large Language Models: History, Key Technologies, Resources, and Future Directions

This article provides a detailed overview of large language models (LLMs), tracing their evolution from statistical and neural language models to modern pre‑trained transformers, discussing scaling, training, adaptation, utilization, evaluation methods, available resources, and outlining current challenges and future research directions.

Large Language Models · Model Scaling · Pre‑training

0 likes · 26 min read

Tencent Tech

Jun 14, 2023 · Artificial Intelligence

How Tencent’s Robot Dog Max Gains Human‑Like Decision‑Making with Pre‑trained AI and RL

Tencent Robotics X unveiled how its robot dog Max combines pre‑trained AI models with reinforcement learning across three learning stages, enabling it to acquire, store, and apply skills for autonomous decision‑making in complex tasks such as the World Chase Tag competition.

AI · Pre‑training · Reinforcement Learning

0 likes · 6 min read

Baobao Algorithm Notes

Mar 24, 2022 · Artificial Intelligence

Exploring WuDaoMM: A 650M Chinese‑English Multimodal Dataset for Pre‑training

The article introduces WuDaoMM and WuDaoCorpora 2.0, massive Chinese‑English multimodal datasets—including 650 million image‑text pairs, 3 TB of text, 93 TB of images, and 181 GB of dialogue—detailing their composition, formats, access options, and potential research applications.

Chinese AI · Pre‑training · WuDaoMM

0 likes · 6 min read

DataFunTalk

Feb 28, 2019 · Artificial Intelligence

A Comprehensive Introduction to BERT: Architecture, Pre‑training, and Implementation

This article provides an in‑depth overview of BERT, covering its NLP background, GLUE benchmark achievements, Transformer‑based architecture, pre‑training strategies (MLM and NSP), and downstream fine‑tuning methods. It also includes detailed PyTorch implementations of its core components.

BERT · NLP · Pre‑training

0 likes · 19 min read
