PaperAgent
Author

Daily updates, analyzing cutting-edge AI research papers

170 Articles · 0 Likes · 19 Views · 0 Comments
Recent Articles

Latest from PaperAgent

PaperAgent
Mar 26, 2026 · Artificial Intelligence

TurboQuant: How Google’s New Vector Quantization Cuts KV Memory 6× and Boosts Speed

TurboQuant, presented at ICLR 2026, introduces a theoretically grounded vector quantization technique that reduces large‑language‑model key‑value cache memory by at least six times and achieves up to eight‑fold speedups while maintaining zero accuracy loss. It combines PolarQuant’s polar‑coordinate compression with a 1‑bit QJL error‑correction step, as demonstrated on benchmarks such as LongBench and GloVe.

AI inference · TurboQuant · benchmarking
0 likes · 10 min read
PaperAgent
Mar 22, 2026 · Artificial Intelligence

How AI Agents Like OpenClaw Turn LLMs into Autonomous Assistants

This article explains what AI agents are, how they differ from ordinary language‑model interfaces, and walks through OpenClaw’s workflow, tool usage, security challenges, memory handling, and advanced features such as sub‑agents and context compaction, offering practical insights for building safe autonomous AI systems.

AI Agent · Context Engineering · Large Language Model
0 likes · 27 min read
PaperAgent
Mar 22, 2026 · Artificial Intelligence

Can LLM Agents Self‑Evolve Without Retraining? Inside Memento‑Skills

The article analyzes the Memento‑Skills framework, which treats external memory as executable skills to enable deployment‑time continual learning for frozen LLM agents, detailing its read‑write reflective loop, skill‑as‑memory design, behavior‑trained skill router, experimental validation on GAIA and HLE benchmarks, and theoretical guarantees without gradient updates.

AI · Agent · Continual Learning
0 likes · 9 min read
PaperAgent
Mar 21, 2026 · Artificial Intelligence

How Cursor’s Composer 2 Leverages Self‑Summarization and RL for Long‑Horizon Tasks

The article examines Cursor’s Composer 2 model, detailing its self‑summarization reinforcement‑learning workflow, the limitations of traditional compression methods, token‑efficient results on the CursorBench benchmark, and a challenging Terminal‑Bench case study that demonstrates dramatically reduced token usage while improving performance.

Agentic AI · Composer 2 · Compression
0 likes · 9 min read
PaperAgent
Mar 21, 2026 · Artificial Intelligence

Can AI Truly Be Creative? Inside the CreativeBench Benchmark

This article examines the CreativeBench benchmark, which redefines machine creativity by measuring both the quality and novelty of generated solutions, explains its combinatorial and exploratory task designs, details the self‑evolving task construction process, and discusses key findings and the EvoRePE enhancement method.

AI benchmark · EvoRePE · large language models
0 likes · 18 min read
PaperAgent
Mar 21, 2026 · Artificial Intelligence

Can Peer Review Boost Large Language Model Ensembles? Introducing LLM‑PeerReview

This article analyzes the unsupervised LLM‑PeerReview framework, which uses a peer‑review‑inspired scoring, reasoning, and selection pipeline (including a novel flipped‑triple scoring trick) to combine multiple large language models and achieve significant performance gains over existing ensemble and collaboration baselines.

Artificial Intelligence · Flipped Triple Scoring · LLM Ensemble
0 likes · 11 min read
PaperAgent
Mar 19, 2026 · Artificial Intelligence

How Scale‑SWE’s Real‑World Software Engineering Dataset Supercharges AI Models

The Scale‑SWE project releases a 100k‑task real software‑engineering dataset built with a sandboxed multi‑agent workflow, demonstrating that models fine‑tuned on this data achieve 64% on SWE‑bench‑Verified and surpass leading industrial baselines, highlighting the critical value of authentic SWE data.

AI agents · Qwen3-30A3B-Instruct · Scale-SWE
0 likes · 7 min read
PaperAgent
Mar 19, 2026 · Artificial Intelligence

How MDER‑DR Boosts Multi‑Hop KG QA with Entity‑Centric Summaries

The article presents the MDER‑DR two‑stage framework that tackles semantic loss in knowledge‑graph triple indexing by generating context‑aware entity summaries and using an LLM‑driven decompose‑parse retrieval loop, achieving up to 66% performance gains on multi‑hop question answering benchmarks.

Entity Summarization · KG QA · Knowledge Graph
0 likes · 5 min read
PaperAgent
Mar 17, 2026 · Artificial Intelligence

Can Attention Replace Fixed Residuals? Inside the ‘Attention Residuals’ Breakthrough

This article analyzes the newly released Attention Residuals paper, explaining how learnable attention weighting replaces fixed residual addition to mitigate information dilution in deep LLMs, detailing the proposed Block AttnRes design, engineering trade‑offs, experimental results, and its significance for foundational model architecture.

Attention · Block Attention · LLM
0 likes · 9 min read
PaperAgent
Mar 16, 2026 · Artificial Intelligence

How GLM-5-Turbo Turns an AI Research Lab into a 24‑Hour Autonomous Writer

The article details how the newly released GLM-5-Turbo "lobster" model powers an AI research lab that automatically generates a complete OpenClaw survey paper within an hour, covering topic brainstorming, literature mining, outline drafting, manuscript writing, and AAAI‑style submission. It also showcases benchmark results, prompt templates, and practical skill installations.

AI research automation · AutoClaw · GLM-5-Turbo
0 likes · 10 min read