Tagged articles
16 articles
Machine Learning Algorithms & Natural Language Processing
May 11, 2026 · Artificial Intelligence

Heuristic Learning: A New Reinforcement Learning Paradigm for Continual Learning

The article proposes Heuristic Learning (HL) as a way to tackle continual learning’s catastrophic forgetting by using coding agents that iteratively refine rule‑based policies, showing empirical gains on Atari, MuJoCo, and VizDoom tasks and outlining HL’s benefits, challenges, and future integration with neural networks.

LLM · coding agents · continual learning
15 min read
Machine Learning Algorithms & Natural Language Processing
May 9, 2026 · Artificial Intelligence

Heuristic Learning: Reinforcement Without Parameter Updates via .py File

OpenAI researcher Yong Jiayi introduces Heuristic Learning, a reinforcement learning paradigm that replaces gradient‑based neural network updates with code edits driven by GPT‑5.4, reaching the theoretical maximum Atari Breakout score of 864 points and matching or surpassing PPO on multiple Atari and robot control tasks.

Atari Benchmark · GPT-5.4 · Robot Control
8 min read
Machine Learning Algorithms & Natural Language Processing
Apr 10, 2026 · Artificial Intelligence

Agent-Dice: Geometric Consensus Filtering Beats Catastrophic Forgetting in LLM Agents

Agent-Dice introduces a geometric consensus filtering and curvature‑based importance weighting framework that disentangles knowledge updates, preventing catastrophic forgetting in large‑language‑model agents while enhancing plasticity, and demonstrates superior stability‑plasticity trade‑offs on GUI and tool‑use benchmarks across multiple base models.

Agent · Catastrophic Forgetting · GUI
8 min read
AI Tech Publishing
Apr 8, 2026 · Artificial Intelligence

How Model, Harness, and Memory Enable Continual Learning for AI Agents

The article breaks down AI agent continual learning into three layers—model, harness, and context—explains their distinct challenges, shows how traces link them, and argues that focusing on harness and context yields faster, more practical improvements than merely retraining models.

AI agents · Model Training · context memory
9 min read
PaperAgent
Mar 22, 2026 · Artificial Intelligence

Can LLM Agents Self‑Evolve Without Retraining? Inside Memento‑Skills

The article analyzes the Memento‑Skills framework, which treats external memory as executable skills to enable deployment‑time continual learning for frozen LLM agents, detailing its read‑write reflective loop, skill‑as‑memory design, behavior‑trained skill router, experimental validation on GAIA and HLE benchmarks, and theoretical guarantees without gradient updates.

AI · Agent · LLM
9 min read
Machine Learning Algorithms & Natural Language Processing
Mar 5, 2026 · Artificial Intelligence

Can AI Self‑Improve? Inside a Stanford PhD Defense on Continually Self‑Improving AI

Zitong Yang’s Stanford PhD defense introduced “continually self‑improving AI,” a system that autonomously refines its own parameters, generates synthetic training data, and even designs its own learning algorithms, with experiments on synthetic continual training, synthetic‑bootstrap pre‑training, and AI‑design‑AI demonstrating measurable gains over static baselines.

AI research · continual learning · pretraining
35 min read
Data Party THU
Oct 2, 2025 · Artificial Intelligence

Bridging Human and Machine Learning: Meta Prompt Tuning and Lifelong Few-Shot Language Models

This article presents a comprehensive study on enhancing language models with few‑shot and continual learning techniques, introducing Meta Prompt Tuning, Dynamic Module Expansion, and the LFPT5 framework to achieve more human‑like, efficient, and adaptable learning across evolving tasks.

Lifelong Learning · continual learning · language models
8 min read
Data Party THU
Sep 28, 2025 · Artificial Intelligence

Can the OaK Architecture Unlock General AI? A Deep Dive into Continuous Learning and Planning

The article presents Richard Sutton’s OaK architecture—a domain‑general, empirical, open‑ended framework that equips agents with continuously learnable components, meta‑learned step‑sizes, and a five‑stage FC‑STOMP pipeline to build world models, generate sub‑problems, learn options, and plan at run‑time.

AI Architecture · World Models · continual learning
22 min read
AsiaInfo Technology: New Tech Exploration
Feb 24, 2025 · Artificial Intelligence

Can Multi‑Teacher Distillation Overcome Catastrophic Forgetting in Continual Learning?

This paper proposes a multi‑teacher distillation framework for continual learning that combines active data rehearsal with feature‑decoupled distillation, demonstrating superior performance on PASCAL VOC and COCO benchmarks while mitigating catastrophic forgetting and balancing stability‑plasticity trade‑offs.

AI · Catastrophic Forgetting · active rehearsal
12 min read
Cognitive Technology Team
Feb 7, 2025 · Artificial Intelligence

Knowledge Distillation: Concepts, Techniques, Applications, and Future Directions

This article explains knowledge distillation, a technique popularized by Geoffrey Hinton and colleagues that transfers knowledge from large teacher models to compact student models. It covers core concepts, loss functions, distillation strategies, applications in edge computing, federated learning, and continual learning, and emerging research directions.

Deep Learning · Edge Computing · Federated Learning
7 min read
DataFunTalk
Dec 1, 2022 · Artificial Intelligence

Advances and Challenges in Controllable Text Generation with Pretrained Language Models

This report reviews the background, recent research progress, practical applications, and future directions of controllable text generation using transformer‑based pretrained language models, highlighting methods such as decoding strategies, prompt learning, memory networks, continual learning, contrastive training, and knowledge integration.

continual learning · contrastive training · controllable text generation
13 min read
Alimama Tech
Sep 14, 2022 · Artificial Intelligence

Streaming Graph Neural Networks via Generative Replay

The paper introduces SGNN‑GR, a framework that pairs a graph neural network with a GAN‑based generative model to replay synthetic historical nodes, enabling continual learning on evolving graphs without storing raw data, achieving near‑retraining accuracy while being 3–6× faster per iteration.

Incremental Learning · continual learning · generative replay
10 min read
DaTaobao Tech
Aug 30, 2022 · Artificial Intelligence

CTNet: Continual Transfer Learning for Cross-Domain Recommendation

CTNet is a continual transfer learning framework that uses a lightweight Adapter to map source‑domain features onto evolving target‑domain recommendation tasks, preserving all model parameters to avoid catastrophic forgetting and delivering substantial gains in click‑through rate, conversion, and overall business performance in Taobao’s cross‑domain e‑commerce scenario.

Adapter Module · Recommendation Systems · continual learning
12 min read
Meituan Technology Team
Oct 14, 2021 · Artificial Intelligence

Deep Learning Advances for Click‑Through Rate Prediction in Meituan's Location‑Based Advertising

Meituan's ad team uses deep learning to handle location‑based service (LBS) distance constraints and long‑term periodic behavior, introducing DPIN to address position and context bias, an ultra‑long sequence encoder with a spatiotemporal activator, dynamic candidate generation, and memory‑augmented continual learning, boosting RPM by 2‑20% and enabling sub‑millisecond inference.

Advertising · CTR prediction · Deep Learning
29 min read