Tagged articles
2014 articles
AI Engineering
Jan 30, 2026 · Artificial Intelligence

Why Letting LLMs Argue Improves Their Reasoning Quality

Google’s recent study of over 8,000 reasoning tasks shows that advanced LLMs like DeepSeek‑R1 spontaneously develop multiple internal “expert” personas that debate, and that activating a discovered “social switch” dramatically raises accuracy, revealing that engineered conflict can enhance AI reasoning.

AI debate · Feature Control · LLM
0 likes · 8 min read
PaperAgent
Jan 30, 2026 · Artificial Intelligence

How LLM‑in‑Sandbox Turns Large Models into General‑Purpose Agents Without Extra Training

The LLM‑in‑Sandbox framework places large language models inside a virtual machine that provides external tool access, persistent storage, and code execution, yielding up to a 24.2% performance boost across six benchmark tasks without additional training, and it scales from zero‑shot to reinforcement‑learning‑enhanced agents while remaining cost‑effective.

Agentic AI · LLM · Reinforcement Learning
0 likes · 6 min read
Wuming AI
Jan 29, 2026 · Artificial Intelligence

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

This article explains how to keep essential information from lengthy AI chat histories by using an intelligent summarization prompt, injecting the summary as a system message, and applying a sliding‑window strategy that retains the last three exchanges, thereby reducing token cost and preserving context continuity.

LLM · Prompt engineering · c++
0 likes · 11 min read
AI Engineering
Jan 29, 2026 · Artificial Intelligence

Andrej Karpathy Says He’s Surrendered to AI Coding – A Workflow Revolution

Andrej Karpathy recounts how, within weeks, he shifted from 80% manual coding to 80% AI‑generated code, highlighting AI’s new logical flaws, its tireless persistence, expanded capabilities beyond speed, practical tips, skill erosion, and a 2026 forecast of ubiquitous AI‑produced content.

AI Coding · Andrej Karpathy · LLM
0 likes · 7 min read
Bighead's Algorithm Notes
Jan 28, 2026 · Artificial Intelligence

How HiveMind Optimizes LLM Multi‑Agent Trading Systems via Contribution‑Guided Online Prompts

The HiveMind framework introduces contribution‑guided online prompt optimization (CG‑OPO), which quantifies each LLM‑driven agent’s impact with Shapley values and uses a DAG‑Shapley algorithm to attribute credit efficiently, enabling real‑time adaptive optimization of multi‑agent stock‑trading systems and achieving superior returns with far fewer LLM calls.

DAG-Shapley · Financial Trading · LLM
0 likes · 15 min read
Amap Tech
Jan 28, 2026 · Artificial Intelligence

Can Databases Teach Themselves? Exploring Agents‑Based Self‑Explaining Text‑to‑SQL

This article introduces the Agents‑Companion paradigm for Text‑to‑SQL, detailing how self‑describing database agents autonomously mine schema, statistics and semantics to generate high‑quality evidence, thereby bridging the gap between academic research and industrial deployment and significantly improving query accuracy.

Database Mining · LLM · Text-to-SQL
0 likes · 8 min read
PaperAgent
Jan 27, 2026 · Artificial Intelligence

How Agentic‑R Boosts Multi‑Turn Retrieval for LLMs by 2–3 EM Points

This article analyzes the Agentic‑R framework, which upgrades traditional single‑hop Retrieval‑Augmented Generation by introducing dual‑perspective scoring and a bidirectional flywheel, resulting in absolute EM improvements of 2–3 points across seven QA datasets and a 10–15% reduction in search rounds.

LLM · RAG · agentic search
0 likes · 6 min read
Old Zhang's AI Learning
Jan 27, 2026 · Artificial Intelligence

DeepSeek-OCR 2 Enables AI to Read Images with Human‑Like Logical Flow

DeepSeek-OCR 2 introduces Visual Causal Flow and an LLM‑based visual encoder, achieving 91.09% accuracy on OmniDocBench v1.5. The article covers installation, two inference modes (vLLM and Transformers), and the model's strengths and limitations for complex document processing.

DeepEncoder V2 · DeepSeek-OCR 2 · LLM
0 likes · 9 min read
AI Tech Publishing
Jan 27, 2026 · Artificial Intelligence

Step‑by‑Step: Adding Skill Capabilities to Your Agent System

This article walks through the design patterns, three‑level loading mechanism, and practical implementation steps for integrating reusable, domain‑specific Skills into an existing Agent system, covering both local and distributed deployments with Redis‑based versioning and sandboxed execution.

LLM · Meta-Tool Pattern · Progressive Disclosure
0 likes · 14 min read
AI Cyberspace
Jan 26, 2026 · Artificial Intelligence

How NVFP4 Quantization Supercharges LLM Inference on NVIDIA DGX

This article explains the NVFP4 4‑bit floating‑point quantization technique, shows how to deploy Qwen3‑30B‑A3B models with TensorRT‑LLM and vLLM, compares performance across NVFP4, AWQ and INT8 quantizations, and provides practical profiling commands for NVIDIA DGX systems.

Inference · LLM · NVFP4
0 likes · 23 min read
Alibaba Cloud Developer
Jan 26, 2026 · Artificial Intelligence

How We Scaled a 3.5B MoE LLM for Real‑Time Search Relevance

This article details the engineering challenges and solutions for deploying a 3.5 billion‑parameter MoE LLM in Taobao's search relevance pipeline, covering large‑batch scheduling, dynamic load balancing, intra‑batch KV‑Cache reuse, and MoE kernel tuning to meet sub‑second latency requirements.

Inference Optimization · KV cache · LLM
0 likes · 15 min read
Fun with Large Models
Jan 25, 2026 · Artificial Intelligence

Complete Guide to Agent Skills: Core Concepts, Design Patterns, and Hands‑On Code

This article explains the three‑layer Agent Skills architecture and demonstrates step‑by‑step creation and configuration of a Skill using Claude Code, covering the metadata, instruction, and resource layers and advanced scripting integration, followed by a detailed comparison with MCP that highlights token savings and use‑case differences.

AI Agent · Agent Skills · Claude Code
0 likes · 18 min read
AI Frontier Lectures
Jan 25, 2026 · Artificial Intelligence

Turning Chain‑of‑Thought into Images: The Render‑of‑Thought Breakthrough

Render‑of‑Thought (RoT) proposes a novel visual‑latent reasoning framework that compresses textual chain‑of‑thought into dense image embeddings, achieving faster inference, better interpretability, and plug‑and‑play integration without costly pre‑training, as demonstrated on multiple math and logic benchmarks.

Chain-of-Thought · Implicit CoT · Inference Acceleration
0 likes · 11 min read
PaperAgent
Jan 25, 2026 · Artificial Intelligence

How Deep GraphRAG Solves Retrieval’s Three‑Way Dilemma with Hierarchical Search

Deep GraphRAG tackles the three‑fold dilemma of traditional Retrieval‑Augmented Generation by introducing hierarchical global‑to‑local retrieval, a beam‑search dynamic reordering that cuts latency, and a DW‑GRPO reinforcement‑learning module that adaptively weights rewards, achieving near‑state‑of‑the‑art performance with up to 86% faster inference.

AI research · GraphRAG · Hierarchical Retrieval
0 likes · 5 min read
Baobao Algorithm Notes
Jan 24, 2026 · Artificial Intelligence

What Advances Do GRPO, DAPO, GSPO, and SAPO Bring Over PPO?

After DPO, the typical research trajectory moves through GRPO, DAPO, GSPO, and SAPO, each introducing new optimization objectives, sampling strategies, and reward‑shaping techniques that aim to reduce memory usage, improve gradient stability, and enhance the efficiency of large‑model reinforcement learning.

DAPO · GRPO · GSPO
0 likes · 6 min read
Tech Verticals & Horizontals
Jan 23, 2026 · Artificial Intelligence

Comparing 9 Major Agent Development Frameworks: Choosing the Best Fit

This article provides an in‑depth comparison of nine mainstream AI agent development frameworks—Pydantic AI, SmolAgents, DeepAgents, LlamaIndex, CAMEL, AutoGen, CrewAI, LangGraph, and OpenAI Agents SDK—detailing their design principles, strengths, weaknesses, typical scenarios, and guidance for selecting or mixing them in production.

Agent Frameworks · Comparison · LLM
0 likes · 30 min read
PaperAgent
Jan 23, 2026 · Artificial Intelligence

Top AAAI 2026 Papers: New Vision‑Language‑Action Model, LLM2CLIP and More

AAAI 2026 in Singapore drew 23,680 submissions. This roundup highlights breakthrough papers such as ReconVLA’s reconstructive vision‑language‑action model, LLM2CLIP’s language‑enhanced multimodal representation, a sheaflet‑based hypergraph neural network design, advances in description logic modeling, and a novel causal discovery method for dynamical systems.

AAAI 2026 · AI Papers · LLM
0 likes · 7 min read
Data STUDIO
Jan 23, 2026 · Artificial Intelligence

Choosing the Best AI Agent Framework: A Practical Guide

This article explains the core AI agent loop, why dedicated frameworks are needed, compares eight popular frameworks—including RelevanceAI, smolagents, PhiData, LangChain, LlamaIndex, CrewAI, AutoGen, and LangGraph—offers selection criteria, and provides hands‑on code demos for AutoGen and LangGraph.

AI agents · AutoGen · LLM
0 likes · 19 min read
Node.js Tech Stack
Jan 23, 2026 · Backend Development

Bun’s New --cpu-prof-md Flag Generates AI‑Friendly Markdown Profiling, Prompting a Node.js Response

Bun introduces the --cpu-prof-md flag that outputs CPU profiling data as structured Markdown for large language models, earning praise from Vue creator Evan You and inspiring Node.js core contributor Matteo Collina to release a pprof‑to‑md converter, highlighting a shift toward AI‑oriented CLI tools.

AI debugging · Bun · CLI tools
0 likes · 7 min read
Architecture Digest
Jan 22, 2026 · Artificial Intelligence

Unlock AI-Powered Document Search with WeKnora: A Hands‑On Guide

WeKnora is an open‑source LLM‑driven framework that transforms complex, multi‑format documents into searchable semantic knowledge, offering features such as Agent mode, hybrid retrieval, secure private deployment, and an easy‑to‑use web UI, with step‑by‑step installation instructions and demo screenshots.

LLM · WeKnora · ai
0 likes · 7 min read
DeWu Technology
Jan 21, 2026 · Artificial Intelligence

Breaking the Recommendation Feedback Loop with LLM‑Powered Dynamic User Knowledge Graphs

By integrating large language models to dynamically construct user knowledge graphs and applying two‑hop reasoning, the authors enhance serendipity in a large‑scale e‑commerce community recommendation system, achieving significant online gains in diversity, novelty, and user engagement metrics.

Industrial Deployment · LLM · Serendipity
0 likes · 17 min read
AI Frontier Lectures
Jan 21, 2026 · Artificial Intelligence

How AP2O‑Coder Cuts LLM Code Errors by Up to 3% with Adaptive Preference Optimization

The paper introduces AP2O‑Coder, an adaptive progressive preference optimization framework that systematically captures error types, progressively refines LLM code generation, and dynamically adapts training data, achieving up to a 3% pass@k improvement across multiple open‑source models while reducing data requirements.

AP2O-Coder · LLM · Preference Optimization
0 likes · 11 min read
Alibaba Cloud Infrastructure
Jan 21, 2026 · Artificial Intelligence

Boost LLM Performance: Deploy Qwen3‑235B with PD‑Separation, MoE, SGLang & RBG

This article details how to deploy the 235‑billion‑parameter Qwen3‑235B model using PD‑separation and MoE techniques, explains the associated challenges, and demonstrates a production‑grade solution built on the high‑performance SGLang inference engine and the RoleBasedGroup (RBG) orchestration framework, complete with benchmark results and best‑practice YAML examples.

Inference · Kubernetes · LLM
0 likes · 21 min read
Data Party THU
Jan 21, 2026 · Artificial Intelligence

What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM

Analyzing recent DeepSeek flashmla repository commits, the article uncovers that the mysterious Model1 likely corresponds to DeepSeek‑V4, detailing architectural shifts to a 512‑dimensional head, full support for NVIDIA Blackwell GPUs, token‑level sparse MLA, and new mechanisms such as Value Vector Position Awareness and Engram.

DeepSeek · DeepSeek-V4 · GPU Optimization
0 likes · 6 min read
Zhihu Tech Column
Jan 20, 2026 · Artificial Intelligence

How AI‑Powered Agentic Workflows Cut Costs and Boosted R&D Efficiency by Over 30% – A Real‑World Case Study

This article details a multi‑year, data‑driven transformation in which a product‑research team leveraged large‑model AI and agentic workflows to automate repetitive coding, streamline hot‑topic discussion creation, and replace a seven‑person outsourcing crew, achieving up to a 38.6% reduction in project time, a weekly capacity gain of 22.5–25 PD (person‑days), and a dramatic drop in marginal costs.

Cost reduction · Google ADK · LLM
0 likes · 29 min read
PaperAgent
Jan 20, 2026 · Artificial Intelligence

How Intrinsic Self‑Critique Boosts LLM Planning Accuracy to 89%

Google DeepMind's new "Intrinsic Self‑Critique" method lets large language models iteratively self‑evaluate and rewrite their plans, raising Blocksworld planning accuracy from 49.8% to 89.3% and setting new records across multiple planning benchmarks.

AI research · LLM · Planning
0 likes · 5 min read
AI Tech Publishing
Jan 20, 2026 · Artificial Intelligence

10 Core Architecture Patterns for Scalable LLM Skills and Context Engineering

The article presents a ten‑step architecture for implementing scalable LLM Skills, covering a meta‑tool pattern to avoid tool explosion, progressive three‑level loading to save tokens, script execution outside the LLM context, Redis‑based storage with pub/sub updates, version locking, dynamic addition, batch loading, and file‑system strategies.

Context Engineering · LLM · Meta-Tool
0 likes · 10 min read
Data Party THU
Jan 19, 2026 · Artificial Intelligence

How VersatileFFN Cuts Memory Use While Boosting LLM Performance

The article introduces Huawei's VersatileFFN, an adaptive wide‑and‑deep feed‑forward design for large language models that reuses parameters to slash memory consumption while delivering stronger inference, detailing its dual‑system inspiration, technical mechanisms, experimental gains, and implications for efficient LLM deployment.

Adaptive Computation · LLM · Transformer
0 likes · 8 min read
PaperAgent
Jan 19, 2026 · Artificial Intelligence

How Reinforcement Learning Can Boost LLM Reasoning by Shaping Token Distributions

Recent research shows that applying reinforcement learning to large language models can dramatically improve reasoning performance, but its effectiveness depends on the token distribution produced during pre‑training, prompting a novel rewrite of cross‑entropy as a single‑step policy gradient with controllable entropy parameters.

LLM · Model Optimization · RL
0 likes · 6 min read
AI Engineering
Jan 18, 2026 · Artificial Intelligence

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

The BU Browser Use team open‑sourced bu‑agent‑sdk, a minimal LLM agent framework that treats the agent as a simple for‑loop and adds explicit done tools, context compression, ephemeral messages, and a unified LLM interface, enabling flexible, low‑overhead AI applications.

Agent Framework · LLM · Python
0 likes · 7 min read
MaGe Linux Operations
Jan 18, 2026 · Artificial Intelligence

How to Deploy Scalable LLM Inference on Kubernetes with GPU Autoscaling

This guide walks through building a production‑grade Kubernetes GPU cluster for large language model inference, covering hardware sizing, GPU resource scheduling, model storage options, automated scaling with HPA, health checks, monitoring, troubleshooting, and multi‑model deployment strategies.

Docker · GPU · Inference
0 likes · 49 min read
PaperAgent
Jan 17, 2026 · Artificial Intelligence

Hypergraphs Turn LLMs into Reliable Material Discovery Agents

This article explains how representing multi‑component scientific knowledge as hyperedges, rather than traditional triples, enables large language models to traverse complex material interactions, reduce hallucinations, and generate verifiable experimental designs, demonstrated through a large hypergraph built from thousands of scaffold papers.

AI reasoning · Hypergraph · LLM
0 likes · 7 min read
macrozheng
Jan 16, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

WeKnora is an open‑source Tencent framework that combines large language models with retrieval‑augmented generation to enable fast, accurate semantic search and question answering across heterogeneous documents such as PDFs, Word files, and images, offering a modular, extensible architecture and easy Docker‑based deployment.

LLM · RAG · WeKnora
0 likes · 7 min read
php Courses
Jan 16, 2026 · Artificial Intelligence

From Coding to Validation: How AI Is Redefining the Developer’s Role

The rise of large language models has shifted software development from manual coding to AI‑generated drafts, making verification, security, and business alignment the core responsibilities of modern engineers. The article outlines the skills, workflows, and challenges needed to thrive in this new paradigm.

LLM · ai · code-generation
0 likes · 11 min read
Ops Development & AI Practice
Jan 15, 2026 · Artificial Intelligence

Why Rapid Experimentation Beats Token‑Saving in LLM Development

The article explains how AI development with large language models differs from traditional software engineering, why the work can feel abstract and uncertain to developers, and offers actionable strategies—such as micro‑prototyping, tiered model usage, simple evaluation sheets, and embracing throwaway code—to accelerate learning despite token costs.

LLM · Rapid Prototyping · token management
0 likes · 7 min read
PaperAgent
Jan 15, 2026 · Artificial Intelligence

How GAG Enables Zero‑Retrieval, Single‑Token Private Knowledge Injection in LLMs

The article presents GAG, a third‑generation framework that injects proprietary domain knowledge into frozen large language models using a single token, eliminating retrieval, avoiding base model updates, and maintaining constant inference budget while delivering strong performance on private QA and public benchmarks.

AI Alignment · GAG · LLM
0 likes · 8 min read
HyperAI Super Neural
Jan 15, 2026 · Artificial Intelligence

97% Accuracy: MOFSeq‑LMM Uses LLMs to Efficiently Predict MOF Synthesizability

A joint Princeton and Colorado School of Mines team introduced MOFSeq‑LMM, a large‑language‑model‑based framework that leverages a million‑scale MOF dataset and a novel string representation to predict free energy with an MAE of 0.789 kJ/mol and synthesizability with a 97% F1 score, dramatically accelerating high‑throughput MOF screening.

LLM · MOFs · Materials Informatics
0 likes · 15 min read
Sohu Tech Products
Jan 14, 2026 · Artificial Intelligence

Build a Zero‑Cost Open‑Source RAG Smart Document Q&A System from Scratch

This guide walks through building an open‑source Retrieval‑Augmented Generation (RAG) system that indexes local files with Everything, uses hybrid BM25‑vector search via Elasticsearch, and answers questions with a local LLM, covering architecture, core techniques, deployment steps, performance tweaks, and common pitfalls.

Elasticsearch · LLM · Python
0 likes · 11 min read
Alibaba Cloud Developer
Jan 14, 2026 · Artificial Intelligence

How DataAgent Turns AI into a Virtual Data Analyst for Enterprise Insights

DataAgent, built on Spring AI Alibaba, tackles the "last mile" of AI data analysis by combining deterministic workflow orchestration with large‑model reasoning, offering human‑in‑the‑loop feedback, dynamic prompt configuration, hybrid retrieval, containerized Python execution, streaming SSE, multi‑model scheduling, multi‑source connectivity, and secure API‑key management to deliver instant, insight‑rich reports for business users.

Analytics · DataAgent · LLM
0 likes · 11 min read
Network Intelligence Research Center (NIRC)
Jan 14, 2026 · Artificial Intelligence

From Black‑Box Guessing to Quantitative Deconstruction: Unveiling the Mystery Inside Large Language Models

At EMNLP 2025, the BUPT NIRC team presented a paper that introduces the ARR metric to quantitatively separate latent reasoning from factual shortcuts in LLMs, using Logit Lens and Attention Knockout to reveal distinct internal pathways. The article also shares the team's conference experience.

ARR metric · Attention Knockout · EMNLP2025
0 likes · 6 min read
Data Party THU
Jan 13, 2026 · Artificial Intelligence

How Engram’s ‘Lookup‑Compute Separation’ Boosts LLM Performance

DeepSeek’s newly open‑sourced Engram module introduces a scalable lookup‑based memory that separates knowledge retrieval from computation, enabling O(1) deterministic access and significantly improving large language model performance on knowledge‑heavy, reasoning, code, and math tasks without extra FLOPs.

LLM · Lookup · Memory Architecture
0 likes · 10 min read
AI Tech Publishing
Jan 12, 2026 · Artificial Intelligence

Ralph Loop: Engineering Continuous Iteration for AI Agents

Ralph Loop introduces an externalized iterative loop that forces AI agents to keep working until objective completion criteria are met, dramatically extending effective runtime from hours to a full day or more and shifting human‑agent collaboration from frequent supervision to efficient delegation.

AI Agent · Iterative Automation · LLM
0 likes · 17 min read
Design Hub
Jan 12, 2026 · Artificial Intelligence

Visual AI Prompt Editor Eliminates ‘Spell’ Anxiety, Tweaks Like Ordering Food

The article introduces a visual AI prompt editor that transforms lengthy, complex prompt strings into modular, editable Chinese sections, demonstrating the workflow with two examples—converting a “California girl” portrait to an Asian style and re‑imagining a cinematic skyscraper scene—while detailing step‑by‑step usage and JSON export options.

AI prompt engineering · JSON export · LLM
0 likes · 11 min read
Bighead's Algorithm Notes
Jan 11, 2026 · Artificial Intelligence

FinRpt: A Multi‑Agent Framework for Automatic Generation and Evaluation of Stock Research Reports

FinRpt introduces a novel multi‑agent pipeline that builds a high‑quality equity research report (ERR) dataset from six financial data sources, defines a comprehensive 11‑metric evaluation suite, and demonstrates that supervised‑fine‑tuned and reinforcement‑learned LLM agents significantly outperform single LLM baselines in both accuracy and efficiency.

Dataset · FinRpt · LLM
0 likes · 14 min read
Architect's Alchemy Furnace
Jan 10, 2026 · Artificial Intelligence

Build and Test a Multi‑Agent AI System with MetaGPT

This guide walks through the MetaGPT framework—explaining its multi‑agent architecture, core concepts, predefined roles, team setup, environment preparation, installation, configuration, and troubleshooting steps—so you can quickly build, run, and validate a collaborative AI software‑company simulation.

AI agents · LLM · MetaGPT
0 likes · 14 min read
AI Engineering
Jan 10, 2026 · Artificial Intelligence

Teaching LLMs to Manage Memory Autonomously, Dropping Manual Rules

Alibaba's new AgeMem framework turns long‑term and short‑term memory management for large language model agents into a learnable reinforcement‑learning task, replacing handcrafted rules with a three‑stage training process and achieving significant benchmark gains.

AgeMem · GRPO · LLM
0 likes · 9 min read
JD Tech Talk
Jan 9, 2026 · Artificial Intelligence

How JoyCode Agent Scored 74.6% Pass@1 on SWE‑bench Verified with a Patch‑Test Co‑generation Loop

JoyCode Agent leverages a patch‑test co‑generation and iterative validation framework to achieve a 74.6% Pass@1 score on the SWE‑bench Verified benchmark, reducing resource consumption by 30‑50% and introducing a closed‑loop multi‑agent pipeline that integrates testing, patch generation, trajectory compression, similarity retrieval, and decision arbitration.

LLM · Multi-Agent · SWE-bench
0 likes · 41 min read
PaperAgent
Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

LLM · Multi-hop QA · RAG
0 likes · 7 min read
Meituan Technology Team
Jan 8, 2026 · Artificial Intelligence

Must‑Read AAAI 2026 Papers: Efficient Reasoning, Annealing, Multimodal Diffusion & More

This article curates eight AAAI 2026 papers authored by the Meituan research team, covering verifiable stepwise rewards for LLM reasoning, annealing strategies in large‑scale training, process reward models, competence‑difficulty sampling, high‑fidelity visual text rendering, counterfactual fusion, compress‑then‑rank reranking, and cross‑modal quantization for generative recommendation, with direct PDF links for each work.

AAAI2026 · Counterfactual · LLM
0 likes · 14 min read
Kuaishou Tech
Jan 8, 2026 · Artificial Intelligence

Top 12 Kuaishou Papers Accepted at AAAI 2026: Breakthroughs in Recommendation, Video Generation, and LLM Research

Kuaishou had 12 papers accepted at AAAI 2026, covering advances in search and recommendation systems, multi‑camera video generation, multimodal understanding, generative model fundamentals, video large language models, experimental design, and LLM latent‑space reasoning, with three papers selected for oral presentation.

LLM · Video Generation · ai
0 likes · 22 min read
Alibaba Cloud Developer
Jan 8, 2026 · Artificial Intelligence

How to Build Human‑In‑The‑Loop (HITL) Capabilities into ReactAgent

This article explains how to integrate a Human‑In‑The‑Loop (HITL) mechanism into ReactAgent, detailing the motivation, design of interaction, tool description, XML‑based UI rendering, Redis‑driven waiting loop, and the broader architectural parallels with design patterns and other agent frameworks.

Design Patterns · HITL · Human-in-the-Loop
0 likes · 14 min read
AndroidPub
Jan 8, 2026 · Artificial Intelligence

Unlocking Anthropic’s Agent Skill: Build Reusable AI Task Assistants in 3 Steps

This article explains Anthropic’s open‑standard Agent Skill, how it serves as a reusable task specification for Claude, walks through creating a skill with metadata, instructions, and advanced Reference/Script features, and compares Skill with MCP to help developers choose the right tool.

AI automation · Agent Skill · Anthropic
0 likes · 11 min read
Sohu Tech Products
Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

Knowledge Base · LLM · RAG
0 likes · 14 min read
DaTaobao Tech
Jan 7, 2026 · Artificial Intelligence

5 Design Patterns to Control LLM Output in Generative AI Applications

The article presents five design patterns—Logits Masking, Grammar, Style Transfer, Reverse Neutralization, and Content Optimization—for steering the output of generative AI models, compares their suitable scenarios, advantages, drawbacks, and anti‑patterns, and provides concrete implementation steps, code snippets, and flowcharts to help developers reliably enforce style, format, and compliance constraints.

LLM · Prompt engineering · generative AI
0 likes · 20 min read
Tencent Cloud Developer
Jan 7, 2026 · Artificial Intelligence

How Context Engineering Powers the Next Generation of AI Agents

As AI systems move from simple chatbots to sophisticated agents, context becomes a core variable. This article traces the evolution from prompt engineering to context engineering, the challenges of managing growing context, and practical solutions such as structured context, tool integration, and the MCP framework for reliable AI systems.

LLM · Reliability · agent
0 likes · 20 min read
Wuming AI
Jan 6, 2026 · Artificial Intelligence

Top LLM Leaderboards Explained: How to Choose the Right Model

This article surveys the most popular large‑language‑model leaderboards—including lmarena, Artificial Analysis, SuperCLUE, and llm‑stats—detailing their evaluation methods, coverage areas, URLs, and practical usage tips, while warning readers that rankings are only a reference and real‑world performance may vary.

AI benchmarking · Artificial Intelligence · LLM
0 likes · 5 min read
Bighead's Algorithm Notes
Jan 6, 2026 · Artificial Intelligence

FinRS: A Risk‑Sensitive Trading Framework for Real‑World Financial Markets

FinRS integrates hierarchical market analysis, dual decision agents, and multi‑time‑scale reward feedback to enable risk‑aware multi‑stage trading, achieving higher cumulative returns, better Sharpe ratios, and lower maximum drawdowns than existing LLM‑based and reinforcement‑learning baselines across diverse stocks.

FinRS · LLM · Reinforcement Learning
0 likes · 14 min read
PMTalk Product Manager Community
Jan 6, 2026 · Industry Insights

Strategic Comparison of Dify, n8n, and ComfyUI for AI Applications and Automation

This article provides a multi‑dimensional strategic analysis of three representative AI‑focused platforms—Dify, n8n, and ComfyUI—examining their product positioning, architecture, interaction models, commercialization strategies, and agent capabilities, and offers concrete recommendations for product managers on choosing the right tool based on ease of use, control, scalability, and total cost of ownership.

AI Platforms · LLM · Product Comparison
0 likes · 35 min read
PaperAgent
Jan 6, 2026 · Artificial Intelligence

How Ontology‑Driven GraphRAG Eliminates Noise in AI Knowledge Graphs

This article examines the shortcomings of naïve GraphRAG implementations on clinical data and explains how an ontology‑driven, zero‑noise GraphRAG architecture can create self‑improving, conflict‑free knowledge graphs for AI applications.

Data Quality · GraphRAG · LLM
0 likes · 3 min read
PaperAgent
Jan 5, 2026 · Artificial Intelligence

How QuCo‑RAG Replaces Model Confidence with Objective Evidence to Cut Hallucinations

QuCo‑RAG introduces a dynamic retrieval‑augmented generation framework that quantifies uncertainty using pre‑training corpus statistics, replacing unreliable model confidence with objective frequency and co‑occurrence evidence, achieving millisecond‑level hallucination detection, superior multi‑hop QA performance, and cross‑model transferability across various LLMs.

Dynamic Retrieval · LLM · Retrieval Augmented Generation
0 likes · 9 min read
AI Insight Log
Jan 4, 2026 · Artificial Intelligence

Agent Skills for Context Engineering: 4K Stars, Powering Cursor & Codex

The open‑source ‘Agent Skills for Context Engineering’ project, which amassed over 4,100 stars in a week, demonstrates why managing a model’s attention budget—through foundational, operational, and development‑methodology skills—is essential as context windows grow, and provides platform‑agnostic instructions for Claude Code, Cursor and other AI tools.

Agent Skills · Claude Code · Context Engineering
0 likes · 7 min read
Bighead's Algorithm Notes
Jan 4, 2026 · Artificial Intelligence

How VTA Combines Large‑Model Reasoning for Precise and Explainable Stock Time‑Series Forecasting

The VTA framework integrates large language model reasoning with textual annotation of technical indicators, employs a Time‑GRPO reinforcement‑learning objective and multi‑stage joint conditional training, and achieves state‑of‑the‑art accuracy and expert‑rated interpretability on US, Chinese and European stock datasets.

LLM · Reinforcement Learning · Stock Prediction
0 likes · 19 min read
AI Insight Log
Jan 4, 2026 · Artificial Intelligence

How Playwright + AI Powers a Fully Automated Xianyu Treasure Hunt

The article examines the open‑source ai‑goofish‑monitor project, which combines Playwright‑driven browsing with large‑language‑model analysis to continuously scan Xianyu listings, filter out junk, and highlight high‑quality items, while also discussing its AI‑generated code, benefits, limitations, and security risks.

LLM · Playwright · Web Scraping
0 likes · 7 min read
PaperAgent
Jan 4, 2026 · Artificial Intelligence

How Sophia’s System 3 Turns LLM Agents into Persistent Learners

The article presents Sophia, a System 3‑enabled persistent agent framework that adds a meta‑cognitive layer to LLM‑based agents, enabling identity continuity, self‑scheduled learning, real‑time self‑checks, and autonomous task generation, and validates its benefits through a 24‑hour continuous‑run experiment.

AI agents · Autonomous Agents · LLM
0 likes · 7 min read
Architect
Jan 3, 2026 · Artificial Intelligence

Unlocking AI Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys the emerging field of AI agent memory, presenting a three‑dimensional taxonomy of memory forms, detailing functional categories such as factual, experiential, and working memory, and outlining dynamic processes of formation, evolution, and retrieval, while also highlighting benchmarks, open‑source frameworks, and future research directions.

AI agents · LLM · Memory Architecture
0 likes · 7 min read
NetEase LeiHuo Testing Center
Jan 2, 2026 · Artificial Intelligence

From ChatGPT to LLM‑Native: Building Intelligent AI Agents and Workflows with LangChain

The article explains why traditional chat‑based AI tools are limited to advice, introduces next‑generation LLM‑native applications that can understand, plan, and act, and provides a step‑by‑step guide on designing AI workflows, autonomous agents, hybrid architectures, and the Model Context Protocol (MCP) using LangChain.

AI agents · LLM · LangChain
0 likes · 36 min read
IT Services Circle
Jan 2, 2026 · Artificial Intelligence

Top Open‑Source NotebookLM Alternatives: AI‑Powered Docs, Podcasts & Research Tools

This article surveys the most popular open‑source replacements for Google NotebookLM, detailing each project's star count, supported AI models, multimodal input capabilities, Docker deployment options, and unique features such as multi‑speaker podcast generation, semantic search, and collaborative knowledge‑base integration.

Docker · LLM · Multimodal
0 likes · 8 min read
AI Architecture Hub
Dec 31, 2025 · Artificial Intelligence

Why LangGraph Is the Next‑Generation Framework for LLM Agent Orchestration

This article explains the motivation behind LangGraph, walks through a quick start, details its core syntax and state management, demonstrates conditional branching, parallel execution, tool integration, multi‑agent orchestration, and real‑time monitoring, and finally discusses future directions for the framework.

LLM · LangGraph · Parallel Execution
0 likes · 32 min read
Data Party THU
Dec 29, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: A Deep Dive into Forms, Functions, and Dynamics

This article reviews the survey "Memory in the Age of AI Agents," presenting a comprehensive taxonomy that classifies agent memory by its forms, functions, and dynamic mechanisms, and explores future directions such as generative memory, reinforcement‑learning‑driven management, multimodal storage, and trustworthy handling.

AI agents · Agent Architecture · Future AI
0 likes · 14 min read
Alibaba Cloud Developer
Dec 29, 2025 · Artificial Intelligence

How Alibaba’s Tair KVCache Manager Revolutionizes Enterprise‑Level LLM Cache Management

This article details the architecture and implementation of Tair KVCache Manager, an enterprise‑grade service that centralises KVCache metadata, decouples inference engines from storage, provides elastic scaling, multi‑tenant isolation, high availability, and performance‑optimised cache management for large‑scale LLM inference workloads.

Cache Management · KVCache · LLM
0 likes · 28 min read
MaGe Linux Operations
Dec 27, 2025 · Artificial Intelligence

How to Deploy and Optimize Enterprise‑Scale LLM Inference Services: A Practical Guide

This guide walks you through deploying large language models such as ChatGLM and Llama in production, covering environment setup, model quantization, dynamic batching, service configuration, Nginx load balancing, monitoring, troubleshooting, and best‑practice recommendations for high‑performance, cost‑effective AI inference.

GPU · Inference · LLM
0 likes · 48 min read
AI Architecture Hub
Dec 27, 2025 · Artificial Intelligence

How GraphRAG Turns Knowledge Graphs into Smarter Retrieval for LLMs

GraphRAG extends traditional Retrieval‑Augmented Generation by building a knowledge graph from documents, extracting entities and relationships, performing community detection, and supporting both local and global searches. The article offers detailed step‑by‑step guidance, code examples, configuration tips, and a comparison with classic RAG approaches.

GraphRAG · LLM · Neo4j
0 likes · 28 min read
Alibaba Cloud Native
Dec 27, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: Short‑Term vs Long‑Term Strategies and Framework Integration

This article explains how AI agents overcome context window limits by using memory systems, distinguishes short‑term (session) and long‑term (cross‑session) memory, compares implementations in Google ADK, LangChain and AgentScope, and outlines context‑engineering techniques, core components, challenges, and emerging trends.

AI memory · Agent Frameworks · Context Engineering
0 likes · 20 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How AutoContextMemory Cuts LLM Costs by 70% in Long Conversations

This article explains the challenges of token explosion in long‑running AI agent dialogues and introduces AutoContextMemory, a Java component that automatically compresses, offloads, and summarizes conversation history to dramatically reduce token usage, speed up responses, and preserve critical information.

AgentScope · LLM · context management
0 likes · 12 min read
360 Tech Engineering
Dec 26, 2025 · Artificial Intelligence

15 Chunking Strategies to Supercharge Retrieval‑Augmented Generation

This article presents fifteen practical chunking techniques—ranging from line‑by‑line and fixed‑size chunking to semantic and hierarchical methods—explaining their principles, ideal use‑cases, concrete input examples, chunk outputs, and key advantages or cautions for improving Retrieval‑Augmented Generation with large language models.

Data Retrieval · LLM · RAG
0 likes · 28 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

LLM · Python · RAG
0 likes · 15 min read
Architect
Dec 25, 2025 · Artificial Intelligence

How GraphRAG Boosts Retrieval Accuracy with Knowledge Graphs – A Complete Guide

This article explains why traditional RAG suffers from hallucinations, introduces GraphRAG’s knowledge‑graph‑based approach, walks through its indexing and query pipelines—including text splitting, entity‑relation extraction, graph construction, community detection, and local vs. global retrieval—provides practical setup commands, Neo4j visualization steps, and compares its performance with classic RAG.

Embedding · GraphRAG · LLM
0 likes · 27 min read
360 Tech Engineering
Dec 25, 2025 · Artificial Intelligence

Why LangChain 1.0 Makes AI Agent Development Faster, Safer, and More Scalable

LangChain 1.0 replaces fragmented agent code with a production‑ready framework that unifies model outputs, simplifies tool integration, introduces content_blocks for consistent response handling, and adds a middleware system for privacy, summarization, and human‑in‑the‑loop safety, dramatically improving developer efficiency and reliability.

LLM · LangChain · Python
0 likes · 13 min read
AI Architecture Hub
Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Embedding · LLM
0 likes · 9 min read
Zhuanzhuan Tech
Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR · Knowledge Base · LLM
0 likes · 11 min read