PaperAgent

Daily updates analyzing cutting-edge AI research papers

170 Articles · 0 Likes · 19 Views · 0 Comments
Latest from PaperAgent

Mar 5, 2026 · Artificial Intelligence

Bridging Agent Runtime and RL: Inside the Claw‑R1 Training Framework

Claw‑R1, a new reinforcement‑learning framework from the USTC Cognitive Intelligence Lab, integrates the OpenClaw Agent Runtime with RL training so that agents learn directly in real environments, closing the gap between simulated benchmarks and real tool calling, multi‑step reasoning, and stable long‑horizon task execution.

AI infrastructure · Claw-R1 · OpenClaw
0 likes · 10 min read
Mar 4, 2026 · Artificial Intelligence

How Doubao-Seed-2.0 Redefines Native Multimodal Agents and Coding

Doubao-Seed-2.0 showcases a native multimodal architecture that unifies vision and language, delivers state‑of‑the‑art visual‑language performance, and dramatically improves code generation for front‑end, bug‑fixing, and research‑assistant tasks, illustrating the shift toward truly functional AI agents.

AI research assistant · Agent Models · Doubao
0 likes · 9 min read
Mar 3, 2026 · Artificial Intelligence

How CharacterFlywheel Scales Engaging LLMs: 15 Iterations of Production Optimization

The article presents CharacterFlywheel, a 15‑generation flywheel methodology that iteratively improves social‑dialogue LLMs in production using data‑driven reward models, rejection sampling, and a mix of SFT, DPO, and RL, with detailed experiments and best‑practice insights.

AI safety · LLM optimization · data pipeline
0 likes · 12 min read
Mar 3, 2026 · Information Security

What 11 Critical Security Flaws Were Uncovered in OpenClaw AI Agents?

A comprehensive study of the OpenClaw framework reveals eleven severe security vulnerabilities in multi‑agent AI systems, ranging from over‑reactive data deletion to identity‑spoofing attacks, resource‑exhaustion loops, and covert manipulation, highlighting systemic social‑coherence failures and the need for robust agent governance.

AI agents · LLM security · OpenClaw
0 likes · 14 min read
Mar 2, 2026 · Artificial Intelligence

SKILLRL: Boosting LLM Agents with Skill Distillation and Recursive Evolution

SKILLRL introduces a novel framework that transforms raw LLM agent trajectories into compact, reusable skills via experience‑driven distillation, hierarchical skill banks, and recursive skill evolution, achieving up to 90% success on ALFWorld and 73% on WebShop while reducing token usage by over 10% compared to memory‑based baselines.

LLM agents · SKILLRL · hierarchical skill bank
0 likes · 10 min read
Mar 1, 2026 · Artificial Intelligence

How On-Policy Context Distillation Enables LLMs to Retain Experience Forever

On-Policy Context Distillation (OPCD) compresses transient in‑context knowledge into LLM parameters, allowing models to permanently retain problem‑solving experience without ground‑truth labels; the article details the OPCD framework, training steps, teacher‑student configurations, and experimental results on math, games, and system‑prompt tasks, highlighting its advantages over traditional context distillation.

Artificial Intelligence · LLM · OPCD
0 likes · 8 min read
Feb 27, 2026 · Artificial Intelligence

How DualPath Eliminates Storage Bandwidth Bottlenecks in Agentic LLM Inference

This article analyzes the DualPath architecture that redesigns KV‑Cache data paths to overcome storage‑NIC saturation in Prefill‑Decode LLM systems, presenting theoretical proofs, detailed engineering solutions, and extensive offline and online benchmarks that demonstrate up to 2.25× performance gains.

DualPath · LLM inference · Performance optimization
0 likes · 9 min read
Feb 27, 2026 · Artificial Intelligence

How HyperRAG Uses N‑ary Hypergraphs to Overcome Binary KG Limitations

HyperRAG introduces an n‑ary hypergraph retrieval framework that replaces binary knowledge‑graph triples with hyperedges, addressing semantic fragmentation and path‑explosion while delivering superior accuracy and efficiency across multiple closed‑ and open‑domain QA benchmarks.

HyperRAG · Hypergraph · Knowledge Graph
0 likes · 6 min read
Feb 26, 2026 · Industry Insights

What the DeepSeek V4 Lite Leak Reveals About Its Specs and Multimodal Power

Recent reports indicate that DeepSeek's unreleased V4 Lite model, which features a 1‑million‑token context window and native multimodal reasoning, has leaked online. Huawei reportedly gained early access while Nvidia was excluded, and the model demonstrates impressive spatial reasoning in generated SVG examples.

DeepSeek · Large Language Model · V4 Lite
0 likes · 3 min read
Feb 26, 2026 · Artificial Intelligence

How In-Context Co‑Player Inference and LLM‑Driven Evolution Are Redefining Multi‑Agent RL

This article analyzes two recent Google papers: one introduces context‑based co‑player inference for robust multi‑agent cooperation, and the other presents AlphaEvolve, an LLM‑guided evolutionary framework that automatically discovers novel multi‑agent learning algorithms. It details their methods, experimental findings, and broader implications for AI research.

AlphaEvolve · LLM-driven algorithm discovery · Predictive Policy Improvement
0 likes · 11 min read