Tagged articles

LLM scaling

8 articles · Page 1 of 1

Jun 24, 2026 · Artificial Intelligence

Why 1M Context Length Matters: Inside GLM 5.2’s New Techniques

The article examines how 1‑million‑token context has become a standard feature in modern LLMs, explains the compute and memory challenges it brings, reviews the sparse‑attention and token‑selection tricks (including GLM 5.2’s IndexShare and LayerSplit), and outlines practical evaluation methods for measuring long‑context effectiveness.

1M context · GLM-5.2 · IndexShare

0 likes · 10 min read


SuanNi

May 20, 2026 · Industry Insights

Why Karpathy’s Sudden Move to Anthropic Could Shift the AI IPO Landscape

Andrej Karpathy announced his return to frontline AI research by joining Anthropic just as both companies prepare for IPOs, a move that leverages his extensive background, reflects shifting LLM scaling priorities, and signals a strategic talent and technology win for Anthropic in the competitive AI market.

AI industry · AI talent · Andrej Karpathy

0 likes · 12 min read


Wuming AI

Apr 19, 2026 · Artificial Intelligence

Why Bigger LLMs Aren’t Smarter: Karpathy Blames Junk Training Data

Karpathy argues that the rapid growth in large language model size is driven more by noisy, low‑quality training data than by a need for greater intelligence, and urges separating a clean cognitive core from external memory to achieve smarter, more efficient AI.

AI model efficiency · Karpathy · LLM scaling

0 likes · 5 min read


Machine Heart

Apr 19, 2026 · Artificial Intelligence

Are Small Models the Core Component of Agent Systems?

The article analyzes how advancing small‑model capabilities are shifting agent system design from merely checking whether a model can run under resource limits to evaluating its suitability for specific tasks, thereby redefining model selection logic and workflow partitioning.

Agent systems · LLM scaling · model selection

0 likes · 7 min read


AI Frontier Lectures

Jan 10, 2026 · Artificial Intelligence

How Monadic Context Engineering Transforms AI Agent Reliability and Scaling

This article examines recent research on Monadic Context Engineering and Recursive Language Models, explaining how monadic abstractions can improve error handling, state management, and parallel execution in AI agents, and how REPL‑based recursive language models address long‑context limitations through divide‑and‑conquer and token‑as‑instruction techniques.

AI agents · Context Engineering · Functional Programming

0 likes · 15 min read


PaperAgent

Jan 6, 2026 · Artificial Intelligence

How Recursive Language Models Enable Unlimited Context for LLMs

Recursive Language Models (RLMs) offer a cost‑effective alternative to expanding LLM context windows: they store prompts as variables and allow recursive calls, letting models process over 100,000 tokens. Experiments show superior performance and lower median costs compared with baseline approaches.

AI research · LLM scaling · Long Context

0 likes · 5 min read


Baobao Algorithm Notes

Sep 5, 2024 · Artificial Intelligence

Why Small LLMs Are the Secret Weapon for Scaling Large Model Research

The article explains how homologous small language models—trained on the same tokenizer and data as their large counterparts—serve as cheap, fast experimental platforms that can predict large‑model performance, guide pre‑training decisions, and support techniques like distillation and reward modeling.

AI research · LLM scaling · Qwen2

0 likes · 13 min read


Rare Earth Juejin Tech Community

Aug 31, 2024 · Artificial Intelligence

Apple Intelligence and the Scaling Landscape of Large Language Models: Trends, Costs, and Deployment Considerations

An in‑depth analysis of Apple Intelligence and the broader LLM ecosystem, covering recent model scaling breakthroughs, data and compute requirements, pricing dynamics, hardware trends, on‑device versus cloud deployment, and strategic implications for developers, product managers, and AI practitioners.

AI hardware · Apple Intelligence · LLM scaling

0 likes · 58 min read
