Tagged articles
11 articles
Machine Learning Algorithms & Natural Language Processing
Mar 14, 2026 · Artificial Intelligence

Can Large Language Models Get Stronger Without Human Language Training? A New Pre‑Pre‑Training Path

A recent study shows that pre‑training Transformers on synthetic, non‑language data generated by Neural Cellular Automata can boost language‑model performance by up to 6%, accelerate convergence by 40%, and improve downstream reasoning, even outperforming models trained on massive natural‑text corpora.

In-Context Learning · Neural Cellular Automata · Pre‑training
12 min read
AI Architecture Hub
Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Embedding · LLM
9 min read
AI Algorithm Path
Apr 2, 2025 · Artificial Intelligence

Master the Three Essential LLM Training Stages for 2025

The article breaks down the three core stages of large‑language‑model training—pre‑training, supervised fine‑tuning, and RLHF—explaining their purpose, methods, and concrete examples while noting DeepSeek‑R1’s recent breakthrough and its implications for AI development.

AI training · DeepSeek · LLM
5 min read
Architect
Feb 19, 2025 · Artificial Intelligence

Does Scaling Law Still Hold for Grok 3? A Deep Dive into LLM Training Economics

The article critically examines whether the pre‑training Scaling Law still applies to Grok 3, compares its compute usage and model size with DeepSeek and OpenAI models, evaluates the cost‑effectiveness of pre‑training, RL and test‑time scaling, and explores how these insights shape future large‑language‑model development strategies.

Grok-3 · Large Language Models · Pre‑training
11 min read
Fighter's World
Dec 21, 2024 · Artificial Intelligence

Is Pre‑training Coming to an End? Evaluating Data Sufficiency

The article examines Ilya Sutskever’s claim that pre‑training will end, argues that scaling laws still hold and data is not yet a bottleneck, highlights the scarcity of high‑quality frontier data, and explains why the industry is shifting toward inference‑time compute (as in OpenAI’s o1) as a more sustainable path for large language models.

AI trends · Data Wall · Inference‑time Compute
13 min read
DataFunTalk
Jul 2, 2024 · Artificial Intelligence

Application of Large Language Models in Recommendation Systems: Overview and Future Directions

This article surveys how large language models (LLMs) are applied in recommendation systems. It covers two main paradigms, the LLM as a component of a recommender (LLM+RS) and the LLM as a standalone recommender, and discusses pre‑training, fine‑tuning, prompting, and open research challenges.

Fine-tuning · Future Directions · LLM
6 min read
DataFunSummit
Feb 17, 2024 · Artificial Intelligence

When to Pre‑Train Graph Neural Networks: Data‑Active Pre‑Training and a Graph Generator Framework

This article examines the conditions under which graph neural network pre‑training is beneficial, proposes a data‑centric generator framework to assess transferability, introduces a data‑active pre‑training strategy that selects informative graphs, and presents experimental results showing that using less, well‑chosen data can outperform full‑scale pre‑training.

Pre‑training · data selection · graph generator
16 min read
Rare Earth Juejin Tech Community
Jul 24, 2023 · Artificial Intelligence

Comprehensive Survey of Large Language Models: History, Key Technologies, Resources, and Future Directions

This article provides a detailed overview of large language models (LLMs), tracing their evolution from statistical and neural language models to modern pre‑trained transformers. It discusses scaling, training, adaptation, utilization, evaluation methods, and available resources, and outlines current challenges and future research directions.

Large Language Models · Model Scaling · Pre‑training
26 min read