Tagged articles
16 articles
Machine Heart
May 18, 2026 · Artificial Intelligence

ICML 2026: From Single‑Threaded Thinking to Native Parallel Reasoning in Agents

The paper introduces Native Parallel Reasoner (NPR), a framework that lets language agents generate and maintain multiple reasoning paths using a three‑stage self‑distillation and parallel reinforcement‑learning training paradigm, achieving up to 4.6× speedup and significant accuracy gains across eight reasoning benchmarks.

AI reasoning · Large Language Models · Native Parallel Reasoner
0 likes · 18 min read
PaperAgent
May 13, 2026 · Artificial Intelligence

One-for-All Multi-Agent Collaboration: Adaptive Cross-Task Topology Design

The paper introduces OFA-MAS, a one‑for‑all multi‑agent system that learns a universal topology designer using task‑aware graph encoding and a Mixture‑of‑Experts generator, achieving superior performance, OOD generalization, robustness, and efficiency across six major benchmarks.

LLM · Mixture of Experts · Task-Aware Graph Encoder
0 likes · 14 min read
Machine Heart
May 12, 2026 · Artificial Intelligence

DECS Cuts Overthinking in Models: Halve Inference Tokens and Raise Accuracy

DECS, a novel training framework introduced by researchers from Fudan, Shanghai Jiao Tong, and the Shanghai AI Lab, theoretically exposes the flaws of length‑penalty rewards and, through token‑level reward decoupling and dynamic batch scheduling, reduces inference token counts by over 50% while improving accuracy across multiple benchmarks.

DECS · Large Language Models · benchmark evaluation
0 likes · 9 min read
Machine Heart
May 5, 2026 · Artificial Intelligence

Agent-World: Scaling Real-World Environments for Co‑Evolving Agents and Their Worlds

Agent-World introduces a universal training arena that automatically mines real‑world data from the internet to build over 1,900 diverse environments and 19,800 tools, then generates long‑horizon tasks through graph‑based and programmatic synthesis. The result is a self‑evolving loop in which agents are evaluated and diagnosed and the environment is refined in response, achieving state‑of‑the‑art results on 23 benchmarks.

AI agents · Agent-World · Large-Scale Training
0 likes · 14 min read
Bighead's Algorithm Notes
Apr 14, 2026 · Artificial Intelligence

How Self‑Supervised HINTS Extracts Human Insights from Time Series to Boost Forecast Accuracy

The paper introduces HINTS, a two‑stage self‑supervised framework that leverages Friedkin‑Johnsen opinion dynamics to mine latent human‑driven factors from time‑series residuals, integrates them via attention into state‑of‑the‑art predictors, and demonstrates consistent accuracy gains and interpretability across nine benchmark and real‑world datasets.

Attention Mechanism · Friedkin-Johnsen model · benchmark evaluation
0 likes · 17 min read
SuanNi
Apr 3, 2026 · Artificial Intelligence

How GEMS Lets a 6B Open‑Source Model Beat Top Closed‑Source Image Generators

The article presents the GEMS (Agent‑Native Multimodal Generation with Memory and Skills) framework, detailing its multi‑agent loop, hierarchical memory compression, on‑demand skill modules, and extensive benchmark results that show a lightweight 6B model surpassing larger proprietary systems on complex image‑generation tasks.

GEMS · Image Generation · Multimodal AI
0 likes · 14 min read
SuanNi
Mar 20, 2026 · Artificial Intelligence

How XSKILL Lets Multimodal AI Agents Learn Without Updating Parameters

XSKILL introduces a dual‑stream framework that separates task‑level skills stored as Markdown and action‑level experiences stored as JSON, enabling multimodal large language model agents to continuously improve by extracting, summarizing, and reusing knowledge from past trajectories without modifying model parameters, achieving significant gains across visual tool, multimodal search, and integrated benchmarks.

Agent Framework · Multimodal AI · benchmark evaluation
0 likes · 12 min read
Instant Consumer Technology Team
Dec 18, 2025 · Artificial Intelligence

How a Multi‑Agent Framework Boosts Graph Chain‑of‑Thought Reasoning Efficiency

The paper introduces GLM, a multi‑agent Graph‑CoT framework with an optimized LLM serving architecture that dramatically improves accuracy, reduces token consumption, lowers latency, and increases throughput across diverse domains, as demonstrated by extensive GRBench evaluations.

LLM optimization · Multi-Agent · Token efficiency
0 likes · 10 min read
AntTech
Oct 14, 2025 · Artificial Intelligence

How Ring-1T Achieves Trillion-Scale Deep Thinking and Competitive Benchmarks

The Ring-1T model, a trillion-parameter AI system released as open source, leverages advanced reinforcement learning techniques, extensive benchmark evaluations, and custom training frameworks to deliver balanced performance across math, code, reasoning, and creative tasks while highlighting current limitations and future development plans.

AI model · Reinforcement Learning · benchmark evaluation
0 likes · 8 min read
DataFunTalk
Jun 17, 2025 · Artificial Intelligence

MiniMax M1: Open‑Source LLM That Rivals Gemini 2.5 Pro in Long‑Context Benchmarks

MiniMax’s newly released open‑source M1 model, built on the Lightning Attention‑enhanced MiniMax‑01 base, delivers up to 1 million token context, achieves near‑state‑of‑the‑art performance on MRCR and other long‑context benchmarks, and showcases impressive multilingual translation, code completion, and creative applications.

Lightning Attention · MiniMax · benchmark evaluation
0 likes · 11 min read
AI Frontier Lectures
Apr 6, 2025 · Artificial Intelligence

Can Multi‑Round Thinking Boost LLM Accuracy Without Extra Training?

A new study from the a‑m‑team introduces “Think Twice”, a test‑time multi‑round reasoning technique that, without additional training or model changes, repeatedly prompts large language models to self‑correct, yielding notable accuracy gains across benchmarks such as AIME, MATH‑500, GPQA‑Diamond and LiveCodeBench, while also producing shorter, more confident answers.

Artificial Intelligence · LLM · Multi-round reasoning
0 likes · 6 min read
Alibaba Cloud Big Data AI Platform
Mar 29, 2025 · Artificial Intelligence

How DistilQwen2.5‑R1 Boosts Small‑Model Reasoning with Innovative Knowledge Distillation

The article introduces the DistilQwen2.5‑R1 series, which leverages a novel knowledge‑distillation pipeline—including CoT data evaluation, improvement, and validation—to transfer deep reasoning abilities from large models like DeepSeek‑R1 to compact models, achieving superior performance across math, code, and scientific benchmarks and providing open‑source checkpoints and deployment guides for practical use.

AI inference · Large Language Models · benchmark evaluation
0 likes · 17 min read
Baobao Algorithm Notes
Jun 28, 2024 · Artificial Intelligence

What Makes Gemma 2 a Competitive Open‑Source LLM? Architecture, Training, and Evaluation Insights

The article provides a detailed technical overview of Gemma 2, covering its decoder‑only transformer design, novel attention mechanisms, logit soft‑capping, RMSNorm, knowledge‑distillation training on trillions of tokens, extensive pre‑training infrastructure, and benchmark evaluations that demonstrate its competitiveness against larger proprietary models.

AI · Gemma 2 · Model architecture
0 likes · 14 min read
AntTech
Apr 17, 2024 · Artificial Intelligence

LLMRG: Improving Recommendations through Large Language Model Reasoning Graphs

LLMRG introduces a novel framework that leverages large language models to construct personalized reasoning graphs, integrating chain reasoning, self‑verification, divergent extension, and knowledge‑base self‑improvement, thereby enhancing recommendation accuracy, interpretability, and performance across multiple benchmark datasets without additional user or item information.

AI · Interpretability · Large Language Models
0 likes · 9 min read