Tagged articles

compute efficiency

7 articles

Jun 7, 2026 · Artificial Intelligence

ChatGPT’s Dreaming V3 Memory Upgrade: Free for a Billion Users

OpenAI unveiled Dreaming V3, a new memory architecture that lets ChatGPT silently replay and consolidate daily conversations. It achieves 82.8% context recall, 71.3% preference compliance, and five‑fold compute savings, and is available free to billions of users alongside a transparent memory‑summary interface.

AI memory · ChatGPT · Dreaming V3

9 min read

Machine Heart

Apr 6, 2026 · Industry Insights

Why Cutting Claude Subscriptions Won’t Fix Token Costs – Smarter Compute Is the Answer

Anthropic’s decision to block third‑party agent frameworks from Claude’s subscription model exposes unsustainable token pricing and massive compute waste caused by poor context handling. The article argues that the real solution lies in smarter, more efficient agent design rather than cheaper tokens.

AI pricing · Agent · Anthropic

8 min read

SuanNi

Mar 20, 2026 · Artificial Intelligence

How SkillCraft Shows AI Agents Can Cut Compute Costs by Up to 80%

SkillCraft, a new benchmark from Oxford and partner institutions, evaluates whether AI agents can autonomously combine basic tools into reusable skills, revealing that stronger models dramatically improve task success rates while slashing compute consumption by up to 80%, and exposing the limits of hierarchical skill nesting and cross‑model skill sharing.

AI benchmark · SkillCraft · compute efficiency

15 min read

ShiZhen AI

Mar 17, 2026 · Artificial Intelligence

Kimi’s Attention Residuals Swap a Decade-Old Residual Trick for 1.25× Faster 48B MoE

The Kimi team introduces Attention Residuals, a softmax‑based replacement for the uniform residual connections used in Transformers for a decade, enabling selective aggregation of layer histories, reducing hidden‑state growth, and achieving a 1.25× compute‑efficiency gain on a 48‑billion‑parameter MoE model with less than 2% inference latency increase.

Attention Residuals · Deep Learning · MoE

10 min read

Machine Learning Algorithms & Natural Language Processing

Mar 3, 2026 · Artificial Intelligence

Beyond Dense and MoE: JTok Module Cuts Compute by One‑Third as a New Scaling Path

The paper introduces JTok and its dynamic variant JTok‑M, a token‑indexed parameter scaling method that decouples model capacity from compute, achieving up to 35% compute reduction while delivering consistent performance gains across a wide range of downstream tasks and model sizes.

JTok · Token-indexed scaling · Transformer

16 min read

Baobao Algorithm Notes

Nov 21, 2023 · Artificial Intelligence

How Much Data Do You Need for a 10B LLM? Decoding Scaling Laws

This article explains how scaling laws can answer common LLM development questions—such as the data required for a 10B model, the model size achievable with 1 TB of data, and the optimal compute‑data‑model trade‑off for a fixed GPU budget—by presenting core formulas, practical derivations, and insights from OpenAI, DeepMind and Google.

Data Requirements · Large Language Models · Model Size

12 min read

DataFunTalk

Aug 10, 2021 · Artificial Intelligence

A Comprehensive Review of Industrial-Scale Deep Learning for Click-Through Rate Prediction in Online Advertising

This article provides an extensive retrospective and forward‑looking analysis of the evolution of click‑through‑rate prediction technologies in online advertising, covering the challenges of the shallow‑learning era, the rise of industrial‑scale deep learning, system‑level innovations in recall, coarse ranking, fine ranking, and bidding, and the emerging co‑design of algorithms, compute, and architecture.

Algorithmic Optimization · CTR Prediction · advertising systems

65 min read
