Tagged articles

2014 articles

Jan 30, 2026 · Artificial Intelligence

Deploy Kimi 2.5 LLM on Alibaba Cloud with SGLang, RBG, and Openclaw

This guide walks through preparing the Kimi 2.5 model, uploading it to OSS, configuring persistent storage, and using SGLang, RoleBasedGroup, and Openclaw to deploy a production‑grade inference service on Alibaba Cloud Kubernetes with step‑by‑step commands and YAML examples.

Deployment · Kimi · Kubernetes

0 likes · 14 min read

AI Engineering

Jan 30, 2026 · Artificial Intelligence

Why Letting LLMs Argue Improves Their Reasoning Quality

Google’s recent study of over 8,000 reasoning tasks shows that advanced LLMs like DeepSeek‑R1 spontaneously develop multiple internal “expert” personas that debate, and that activating a discovered “social switch” dramatically raises accuracy, revealing that engineered conflict can enhance AI reasoning.

AI debate · Feature Control · LLM

0 likes · 8 min read

PaperAgent

Jan 30, 2026 · Artificial Intelligence

How LLM‑in‑Sandbox Turns Large Models into General‑Purpose Agents Without Extra Training

The LLM‑in‑Sandbox framework places large language models inside a virtual machine that provides external tool access, persistent storage, and code execution, yielding up to a 24.2% performance boost across six benchmark tasks without additional training, and it scales from zero‑shot to reinforcement‑learning‑enhanced agents while remaining cost‑effective.

Agentic AI · LLM · Reinforcement Learning

0 likes · 6 min read

Wuming AI

Jan 29, 2026 · Artificial Intelligence

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

This article explains how to keep essential information from lengthy AI chat histories by using an intelligent summarization prompt, injecting the summary as a system message, and applying a sliding‑window strategy that retains the last three exchanges, thereby reducing token cost and preserving context continuity.

LLM · Prompt engineering · c++

0 likes · 11 min read

AI Engineering

Jan 29, 2026 · Artificial Intelligence

Andrej Karpathy Says He’s Surrendered to AI Coding – A Workflow Revolution

Andrej Karpathy recounts how, within weeks, he shifted from 80% manual coding to 80% AI‑generated code, highlighting AI’s new logical flaws, its tireless persistence, expanded capabilities beyond speed, practical tips, skill erosion, and a 2026 forecast of ubiquitous AI‑produced content.

AI Coding · Andrej Karpathy · LLM

0 likes · 7 min read

Bighead's Algorithm Notes

Jan 28, 2026 · Artificial Intelligence

How HiveMind Optimizes LLM Multi‑Agent Trading Systems via Contribution‑Guided Online Prompts

The HiveMind framework introduces contribution‑guided online prompt optimization (CG‑OPO), which quantifies each LLM‑driven agent’s impact with Shapley values and uses a DAG‑Shapley algorithm to attribute credit efficiently, enabling real‑time adaptive optimization of multi‑agent stock‑trading systems and achieving superior returns with far fewer LLM calls.

DAG-Shapley · Financial Trading · LLM

0 likes · 15 min read

Amap Tech

Jan 28, 2026 · Artificial Intelligence

Can Databases Teach Themselves? Exploring Agents‑Based Self‑Explaining Text‑to‑SQL

This article introduces the Agents‑Companion paradigm for Text‑to‑SQL, detailing how self‑describing database agents autonomously mine schema, statistics and semantics to generate high‑quality evidence, thereby bridging the gap between academic research and industrial deployment and significantly improving query accuracy.

Database Mining · LLM · Text-to-SQL

0 likes · 8 min read

Ops Development Stories

Jan 28, 2026 · Artificial Intelligence

Understanding MCP, Agent, Skill, and Rule: How LLMs Differ from Traditional APIs

This article systematically explains the concepts of MCP, Agent, Skill, and Rule from an engineering viewpoint, highlighting their roles, differences from traditional API calls, and how they enable large language models to safely and autonomously interact with external tools.

AI Architecture · LLM · MCP

0 likes · 8 min read

PaperAgent

Jan 27, 2026 · Artificial Intelligence

How Agentic‑R Boosts Multi‑Turn Retrieval for LLMs by 2–3 EM Points

This article analyzes the Agentic‑R framework, which upgrades traditional single‑hop Retrieval‑Augmented Generation by introducing dual‑perspective scoring and a bidirectional flywheel, resulting in 2–3 absolute EM improvements across seven QA datasets and a 10–15% reduction in search rounds.

LLM · RAG · agentic search

0 likes · 6 min read

Old Zhang's AI Learning

Jan 27, 2026 · Artificial Intelligence

DeepSeek-OCR 2 Enables AI to Read Images with Human‑Like Logical Flow

DeepSeek-OCR 2 introduces Visual Causal Flow and an LLM‑based visual encoder, achieving 91.09% accuracy on OmniDocBench v1.5, while providing detailed installation, two inference modes (vLLM and Transformers), and an analysis of its strengths and limitations for complex document processing.

DeepEncoder V2 · DeepSeek-OCR 2 · LLM

0 likes · 9 min read

AI Tech Publishing

Jan 27, 2026 · Artificial Intelligence

Step‑by‑Step: Adding Skill Capabilities to Your Agent System

This article walks through the design patterns, three‑level loading mechanism, and practical implementation steps for integrating reusable, domain‑specific Skills into an existing Agent system, covering both local and distributed deployments with Redis‑based versioning and sandboxed execution.

LLM · Meta-Tool Pattern · Progressive Disclosure

0 likes · 14 min read

AI Cyberspace

Jan 26, 2026 · Artificial Intelligence

How NVFP4 Quantization Supercharges LLM Inference on NVIDIA DGX

This article explains the NVFP4 4‑bit floating‑point quantization technique, shows how to deploy Qwen3‑30B‑A3B models with TensorRT‑LLM and vLLM, compares performance across NVFP4, AWQ and INT8 quantizations, and provides practical profiling commands for NVIDIA DGX systems.

Inference · LLM · NVFP4

0 likes · 23 min read

Alibaba Cloud Developer

Jan 26, 2026 · Artificial Intelligence

How We Scaled a 3.5B MoE LLM for Real‑Time Search Relevance

This article details the engineering challenges and solutions for deploying a 3.5 billion‑parameter MoE LLM in Taobao's search relevance pipeline, covering large‑batch scheduling, dynamic load balancing, intra‑batch KV‑Cache reuse, and MoE kernel tuning to meet sub‑second latency requirements.

Inference Optimization · KV cache · LLM

0 likes · 15 min read

Fun with Large Models

Jan 25, 2026 · Artificial Intelligence

Complete Guide to Agent Skills: Core Concepts, Design Patterns, and Hands‑On Code

This article explains the three‑layer Agent Skills architecture and demonstrates step‑by‑step creation and configuration of a Skill using Claude Code, covering the metadata, instruction, and resource layers as well as advanced scripting integration. It closes with a detailed comparison with MCP that highlights token savings and use‑case differences.

AI Agent · Agent Skills · Claude Code

0 likes · 18 min read

AI Frontier Lectures

Jan 25, 2026 · Artificial Intelligence

Turning Chain‑of‑Thought into Images: The Render‑of‑Thought Breakthrough

Render‑of‑Thought (RoT) proposes a novel visual‑latent reasoning framework that compresses textual chain‑of‑thought into dense image embeddings, achieving faster inference, better interpretability, and plug‑and‑play integration without costly pre‑training, as demonstrated on multiple math and logic benchmarks.

Chain-of-Thought · Implicit CoT · Inference Acceleration

0 likes · 11 min read

PaperAgent

Jan 25, 2026 · Artificial Intelligence

How Deep GraphRAG Solves Retrieval’s Three‑Way Dilemma with Hierarchical Search

Deep GraphRAG tackles the three‑fold dilemma of traditional Retrieval‑Augmented Generation by introducing hierarchical global‑to‑local retrieval, a beam‑search dynamic reordering that cuts latency, and a DW‑GRPO reinforcement‑learning module that adaptively weights rewards, achieving near‑state‑of‑the‑art performance with up to 86% faster inference.

AI research · GraphRAG · Hierarchical Retrieval

0 likes · 5 min read

Baobao Algorithm Notes

Jan 24, 2026 · Artificial Intelligence

What Advances Do GRPO, DAPO, GSPO, and SAPO Bring Over PPO?

After DPO, the typical research trajectory moves through GRPO, DAPO, GSPO, and SAPO, each introducing new optimization objectives, sampling strategies, and reward‑shaping techniques that aim to reduce memory usage, improve gradient stability, and enhance the efficiency of large‑model reinforcement learning.

DAPO · GRPO · GSPO

0 likes · 6 min read

Tech Verticals & Horizontals

Jan 23, 2026 · Artificial Intelligence

Comparing 9 Major Agent Development Frameworks: Choosing the Best Fit

This article provides an in‑depth comparison of nine mainstream AI agent development frameworks—Pydantic AI, SmolAgents, DeepAgents, LlamaIndex, CAMEL, AutoGen, CrewAI, LangGraph, and OpenAI Agents SDK—detailing their design principles, strengths, weaknesses, typical scenarios, and guidance for selecting or mixing them in production.

Agent Frameworks · Comparison · LLM

0 likes · 30 min read

PaperAgent

Jan 23, 2026 · Artificial Intelligence

Top AAAI 2026 Papers: New Vision‑Language‑Action Model, LLM2CLIP and More

AAAI 2026 in Singapore showcased 23,680 submissions, highlighting breakthrough papers such as ReconVLA’s reconstructive vision‑language‑action model, LLM2CLIP’s language‑enhanced multimodal representation, a sheaflet‑based hypergraph neural network design, advances in description logic modeling, and a novel causal discovery method for dynamical systems.

AAAI 2026 · AI Papers · LLM

0 likes · 7 min read

Data STUDIO

Jan 23, 2026 · Artificial Intelligence

Choosing the Best AI Agent Framework: A Practical Guide

This article explains the core AI agent loop, why dedicated frameworks are needed, compares eight popular frameworks—including RelevanceAI, smolagents, PhiData, LangChain, LlamaIndex, CrewAI, AutoGen, and LangGraph—offers selection criteria, and provides hands‑on code demos for AutoGen and LangGraph.

AI agents · AutoGen · LLM

0 likes · 19 min read

Node.js Tech Stack

Jan 23, 2026 · Backend Development

Bun’s New --cpu-prof-md Flag Generates AI‑Friendly Markdown Profiling, Prompting a Node.js Response

Bun introduces the --cpu-prof-md flag that outputs CPU profiling data as structured Markdown for large language models, earning praise from Vue creator Evan You and inspiring Node.js core contributor Matteo Collina to release a pprof‑to‑md converter, highlighting a shift toward AI‑oriented CLI tools.

AI debugging · Bun · CLI tools

0 likes · 7 min read

Architecture Digest

Jan 22, 2026 · Artificial Intelligence

Unlock AI-Powered Document Search with WeKnora: A Hands‑On Guide

WeKnora is an open‑source LLM‑driven framework that transforms complex, multi‑format documents into searchable semantic knowledge, offering features such as Agent mode, hybrid retrieval, secure private deployment, and an easy‑to‑use web UI, with step‑by‑step installation instructions and demo screenshots.

LLM · WeKnora · ai

0 likes · 7 min read

Woodpecker Software Testing

Jan 21, 2026 · Backend Development

Building a Daily News Summarizer: Design, Implementation, and Automation (Part 4)

This article walks through the complete design and implementation of a daily news summarizer, covering source selection, web‑scraping with BeautifulSoup, database schema with SQLModel, LLM‑based summarization, FastAPI endpoints, front‑end layout, category/date browsing, and a scheduled update loop.

FastAPI · LLM · News Summarization

0 likes · 22 min read

DeWu Technology

Jan 21, 2026 · Artificial Intelligence

Breaking the Recommendation Feedback Loop with LLM‑Powered Dynamic User Knowledge Graphs

By integrating large language models to dynamically construct user knowledge graphs and applying two‑hop reasoning, the authors enhance serendipity in a large‑scale e‑commerce community recommendation system, achieving significant online gains in diversity, novelty, and user engagement metrics.

Industrial Deployment · LLM · Serendipity

0 likes · 17 min read

AI Frontier Lectures

Jan 21, 2026 · Artificial Intelligence

How AP2O‑Coder Cuts LLM Code Errors by Up to 3% with Adaptive Preference Optimization

The paper introduces AP2O‑Coder, an adaptive progressive preference optimization framework that systematically captures error types, progressively refines LLM code generation, and dynamically adapts training data, achieving up to a 3% pass@k improvement across multiple open‑source models while reducing data requirements.

AP2O-Coder · LLM · Preference Optimization

0 likes · 11 min read

Alibaba Cloud Infrastructure

Jan 21, 2026 · Artificial Intelligence

Boost LLM Performance: Deploy Qwen3‑235B with PD‑Separation, MoE, SGLang & RBG

This article details how to deploy the 235‑billion‑parameter Qwen3‑235B model using PD‑separation and MoE techniques, explains the associated challenges, and demonstrates a production‑grade solution built on the high‑performance SGLang inference engine and the RoleBasedGroup (RBG) orchestration framework, complete with benchmark results and best‑practice YAML examples.

Inference · Kubernetes · LLM

0 likes · 21 min read

Data Party THU

Jan 21, 2026 · Artificial Intelligence

What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM

Analyzing recent DeepSeek flashmla repository commits, the article uncovers that the mysterious Model1 likely corresponds to DeepSeek‑V4, detailing architectural shifts to a 512‑dimensional head, full support for NVIDIA Blackwell GPUs, token‑level sparse MLA, and new mechanisms such as Value Vector Position Awareness and Engram.

DeepSeek · DeepSeek-V4 · GPU Optimization

0 likes · 6 min read

Su San Talks Tech

Jan 21, 2026 · Artificial Intelligence

Turn PDFs into Smart Search Engines with WeKnora’s Open‑Source LLM Framework

WeKnora is an open‑source Tencent framework that leverages large language models, multimodal parsing and hybrid retrieval to let users query PDFs, Word files, images and other complex documents with natural language, offering a web UI, API and secure private‑cloud deployment options.

Docker · LLM · RAG

0 likes · 6 min read

Java Backend Technology

Jan 21, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM‑Powered Retrieval Framework

WeKnora is an open‑source, LLM‑driven document understanding and semantic search framework that extracts structured content from PDFs, Word files, and images, builds a unified knowledge graph, and enables natural‑language queries through a modular RAG architecture with flexible deployment options.

LLM · RAG · Search

0 likes · 7 min read

Zhihu Tech Column

Jan 20, 2026 · Artificial Intelligence

How AI‑Powered Agentic Workflows Cut Costs and Boosted R&D Efficiency by Over 30% – A Real‑World Case Study

This article details a multi‑year, data‑driven transformation in which a product‑research team leveraged large‑model AI and agentic workflows to automate repetitive coding, streamline hot‑topic discussion creation, and replace a seven‑person outsourcing crew, achieving up to 38.6% project‑time reduction, a 22.5‑25 PD weekly capacity gain, and a dramatic drop in marginal costs.

Cost reduction · Google ADK · LLM

0 likes · 29 min read

PaperAgent

Jan 20, 2026 · Artificial Intelligence

How Intrinsic Self‑Critique Boosts LLM Planning Accuracy to 89%

Google DeepMind's new "Intrinsic Self‑Critique" method lets large language models iteratively self‑evaluate and rewrite their plans, raising Blocksworld planning accuracy from 49.8% to 89.3% and setting new records across multiple planning benchmarks.

AI research · LLM · Planning

0 likes · 5 min read

AI Tech Publishing

Jan 20, 2026 · Artificial Intelligence

10 Core Architecture Patterns for Scalable LLM Skills and Context Engineering

The article presents a ten‑step architecture for implementing scalable LLM Skills, covering a meta‑tool pattern to avoid tool explosion, progressive three‑level loading to save tokens, script execution outside the LLM context, Redis‑based storage with pub/sub updates, version locking, dynamic addition, batch loading, and file‑system strategies.

Context Engineering · LLM · Meta-Tool

0 likes · 10 min read

Data Party THU

Jan 19, 2026 · Artificial Intelligence

How VersatileFFN Cuts Memory Use While Boosting LLM Performance

The article introduces Huawei's VersatileFFN, an adaptive wide‑and‑deep feed‑forward design for large language models that reuses parameters to slash memory consumption while delivering stronger inference, detailing its dual‑system inspiration, technical mechanisms, experimental gains, and implications for efficient LLM deployment.

Adaptive Computation · LLM · Transformer

0 likes · 8 min read

PaperAgent

Jan 19, 2026 · Artificial Intelligence

How Reinforcement Learning Can Boost LLM Reasoning by Shaping Token Distributions

Recent research shows that applying reinforcement learning to large language models can dramatically improve reasoning performance, but its effectiveness depends on the token distribution produced during pre‑training, prompting a novel rewrite of cross‑entropy as a single‑step policy gradient with controllable entropy parameters.

LLM · Model Optimization · RL

0 likes · 6 min read

AI Engineering

Jan 18, 2026 · Artificial Intelligence

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

The BU Browser Use team open‑sourced bu‑agent‑sdk, a minimal LLM agent framework that treats the agent as a simple for‑loop and adds explicit done tools, context compression, ephemeral messages, and a unified LLM interface, enabling flexible, low‑overhead AI applications.

Agent Framework · LLM · Python

0 likes · 7 min read

MaGe Linux Operations

Jan 18, 2026 · Artificial Intelligence

How to Deploy Scalable LLM Inference on Kubernetes with GPU Autoscaling

This guide walks through building a production‑grade Kubernetes GPU cluster for large language model inference, covering hardware sizing, GPU resource scheduling, model storage options, automated scaling with HPA, health checks, monitoring, troubleshooting, and multi‑model deployment strategies.

Docker · GPU · Inference

0 likes · 49 min read

PaperAgent

Jan 17, 2026 · Artificial Intelligence

Hypergraphs Turn LLMs into Reliable Material Discovery Agents

This article explains how representing multi‑component scientific knowledge as hyperedges, rather than traditional triples, enables large language models to traverse complex material interactions, reduce hallucinations, and generate verifiable experimental designs, demonstrated through a large hypergraph built from thousands of scaffold papers.

AI reasoning · Hypergraph · LLM

0 likes · 7 min read

AI Engineering

Jan 17, 2026 · Artificial Intelligence

Can Tiny LLMs Compute Accurately? WorldModel‑Qwen Inference‑Time WASM Execution

The article details how the small Qwen‑0.6B model was adapted to generate and run WebAssembly code during inference, achieving deterministic calculations and revealing both the promise and current limitations of integrating world‑model reasoning into tiny LLMs.

Inference · LLM · Qwen-0.6B

0 likes · 5 min read

macrozheng

Jan 16, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

WeKnora is an open‑source Tencent framework that combines large language models with retrieval‑augmented generation to enable fast, accurate semantic search and question answering across heterogeneous documents such as PDFs, Word files, and images, offering a modular, extensible architecture and easy Docker‑based deployment.

LLM · RAG · WeKnora

0 likes · 7 min read

php Courses

Jan 16, 2026 · Artificial Intelligence

From Coding to Validation: How AI Is Redefining the Developer’s Role

The rise of large language models has shifted software development from manual coding to AI‑generated drafts, making verification, security, and business alignment the core responsibilities of modern engineers, and outlining the skills, workflows, and challenges needed to thrive in this new paradigm.

LLM · ai · code-generation

0 likes · 11 min read

Ops Development & AI Practice

Jan 15, 2026 · Artificial Intelligence

Why Rapid Experimentation Beats Token‑Saving in LLM Development

The article explains how AI development with large language models differs from traditional software engineering, why developers feel abstract and uncertain, and offers actionable strategies—such as micro‑prototyping, tiered model usage, simple evaluation sheets, and embracing throwaway code—to accelerate learning despite token costs.

LLM · Rapid Prototyping · token management

0 likes · 7 min read

Tencent Tech

Jan 15, 2026 · Artificial Intelligence

How TCAR Redefines Enterprise Multi‑Agent Routing with Reason‑First Decision Making

The article explains how Tencent Cloud's open‑source TCAR router, a 4‑billion‑parameter model, tackles the limitations of traditional single‑label routers by first reasoning and then selecting agents, enabling cross‑domain, conflict‑aware, and adaptable task coordination in enterprise AI systems.

LLM · Multi-Agent · ai

0 likes · 7 min read

PaperAgent

Jan 15, 2026 · Artificial Intelligence

How GAG Enables Zero‑Retrieval, Single‑Token Private Knowledge Injection in LLMs

The article presents GAG, a third‑generation framework that injects proprietary domain knowledge into frozen large language models using a single token, eliminating retrieval, avoiding base model updates, and maintaining constant inference budget while delivering strong performance on private QA and public benchmarks.

AI Alignment · GAG · LLM

0 likes · 8 min read

HyperAI Super Neural

Jan 15, 2026 · Artificial Intelligence

97% Accuracy: MOFSeq‑LMM Uses LLMs to Efficiently Predict MOF Synthesizability

A joint Princeton and Colorado School of Mines team introduced MOFSeq‑LMM, a large‑language‑model‑based framework that leverages a million‑scale MOF dataset and a novel string representation to predict free energy with an MAE of 0.789 kJ/mol and synthesizability with a 97% F1 score, dramatically accelerating high‑throughput MOF screening.

LLM · MOFs · Materials Informatics

0 likes · 15 min read

AI Large Model Application Practice

Jan 15, 2026 · Artificial Intelligence

Why Transformers Need Positional Embeddings and How They Work

This article explains the order‑blindness of Transformer self‑attention, why naïvely adding raw position indices harms semantics, and walks through sinusoidal, learnable, and rotary positional encodings together with PI and YaRN techniques for extending sequence length.

Deep Learning · LLM · Positional Embedding

0 likes · 12 min read

Sohu Tech Products

Jan 14, 2026 · Artificial Intelligence

Build a Zero‑Cost Open‑Source RAG Smart Document Q&A System from Scratch

This guide walks through building an open‑source Retrieval‑Augmented Generation (RAG) system that indexes local files with Everything, uses hybrid BM25‑vector search via Elasticsearch, and answers questions with a local LLM, covering architecture, core techniques, deployment steps, performance tweaks, and common pitfalls.

Elasticsearch · LLM · Python

0 likes · 11 min read

Aikesheng Open Source Community

Jan 14, 2026 · Artificial Intelligence

NL2SQL Datasets REEF & text2SQL4PM: Causal Analysis Meets Process Mining

This article introduces two recent NL2SQL benchmark datasets—REEF, a synthetic e‑commerce database for end‑to‑end causal analysis, and text2SQL4PM, a bilingual process‑mining dataset—detailing their construction, evaluation results, and research implications for large language models.

Causal Analysis · Dataset · LLM

0 likes · 8 min read

Alibaba Cloud Developer

Jan 14, 2026 · Artificial Intelligence

How DataAgent Turns AI into a Virtual Data Analyst for Enterprise Insights

DataAgent, built on Spring AI Alibaba, tackles the "last mile" of AI data analysis by combining deterministic workflow orchestration with large‑model reasoning, offering human‑in‑the‑loop feedback, dynamic prompt configuration, hybrid retrieval, containerized Python execution, streaming SSE, multi‑model scheduling, multi‑source connectivity, and secure API‑key management to deliver instant, insight‑rich reports for business users.

Analytics · DataAgent · LLM

0 likes · 11 min read

PMTalk Product Manager Community

Jan 14, 2026 · Product Management

From Docs to Evals: Essential AI Skills for Modern Product Managers

AI product managers are shifting from static PRDs to dynamic evaluation frameworks—Evals—that define product quality through automated tests, golden conversations, and LLM judges, enabling continuous iteration, error-driven requirement discovery, and architecture decisions in complex AI systems.

LLM · ai · evals

0 likes · 7 min read

Network Intelligence Research Center (NIRC)

Jan 14, 2026 · Artificial Intelligence

From Black‑Box Guessing to Quantitative Deconstruction: Unveiling the Mystery Inside Large Language Models

At EMNLP 2025, the BUPT NIRC team presented a paper that introduces the ARR metric to quantitatively separate latent reasoning from factual shortcuts in LLMs, using Logit Lens and Attention Knockout to reveal distinct internal pathways. The article also shares the team's conference experience.

ARR metric · Attention Knockout · EMNLP2025

0 likes · 6 min read

Data Party THU

Jan 13, 2026 · Artificial Intelligence

How Engram’s ‘Lookup‑Compute Separation’ Boosts LLM Performance

DeepSeek’s newly open‑sourced Engram module introduces a scalable lookup‑based memory that separates knowledge retrieval from computation, enabling O(1) deterministic access and significantly improving large language model performance on knowledge‑heavy, reasoning, code, and math tasks without extra FLOPs.

LLM · Lookup · Memory Architecture

0 likes · 10 min read

AI Insight Log

Jan 12, 2026 · Artificial Intelligence

Goodbye H100: How DeepSeek’s Engram Uses CPU Memory to Scale LLM Knowledge Bases

DeepSeek’s Engram architecture adds a deterministic dictionary lookup to Transformers, storing massive N‑gram tables in cheap CPU DRAM, which reduces GPU memory use and boosts both knowledge‑heavy and reasoning benchmarks while keeping the added inference latency under 3%.

CPU memory · Deterministic Lookup · Engram

0 likes · 7 min read

AI Tech Publishing

Jan 12, 2026 · Artificial Intelligence

Ralph Loop: Engineering Continuous Iteration for AI Agents

Ralph Loop introduces an externalized iterative loop that forces AI agents to keep working until objective completion criteria are met, dramatically extending effective runtime from hours to a full day or more and shifting human‑agent collaboration from frequent supervision to efficient delegation.

AI Agent · Iterative Automation · LLM

0 likes · 17 min read

Design Hub

Jan 12, 2026 · Artificial Intelligence

Visual AI Prompt Editor Eliminates ‘Spell’ Anxiety, Tweaks Like Ordering Food

The article introduces a visual AI prompt editor that transforms lengthy, complex prompt strings into modular, editable Chinese sections, demonstrating the workflow with two examples—converting a “California girl” portrait to an Asian style and re‑imagining a cinematic skyscraper scene—while detailing step‑by‑step usage and JSON export options.

AI prompt engineering · JSON export · LLM

0 likes · 11 min read

Bighead's Algorithm Notes

Jan 11, 2026 · Artificial Intelligence

FinRpt: A Multi‑Agent Framework for Automatic Generation and Evaluation of Stock Research Reports

FinRpt introduces a novel multi‑agent pipeline that builds a high‑quality equity research report (ERR) dataset from six financial data sources, defines a comprehensive 11‑metric evaluation suite, and demonstrates that supervised‑fine‑tuned and reinforcement‑learned LLM agents significantly outperform single LLM baselines in both accuracy and efficiency.

Dataset · FinRpt · LLM

0 likes · 14 min read

Architect's Alchemy Furnace

Jan 10, 2026 · Artificial Intelligence

Build and Test a Multi‑Agent AI System with MetaGPT

This guide walks through the MetaGPT framework—explaining its multi‑agent architecture, core concepts, predefined roles, team setup, environment preparation, installation, configuration, and troubleshooting steps—so you can quickly build, run, and validate a collaborative AI software‑company simulation.

AI agents · LLM · MetaGPT

0 likes · 14 min read

AI Engineering

Jan 10, 2026 · Artificial Intelligence

Teaching LLMs to Manage Memory Autonomously, Dropping Manual Rules

Alibaba's new AgeMem framework turns long‑term and short‑term memory management for large language model agents into a learnable reinforcement‑learning task, replacing handcrafted rules with a three‑stage training process and achieving significant benchmark gains.

AgeMem · GRPO · LLM

0 likes · 9 min read

JD Tech Talk

Jan 9, 2026 · Artificial Intelligence

How JoyCode Agent Scored 74.6% Pass@1 on SWE‑bench Verified with a Patch‑Test Co‑generation Loop

JoyCode Agent leverages a patch‑test co‑generation and iterative validation framework to achieve a 74.6% Pass@1 score on the SWE‑bench Verified benchmark, reducing resource consumption by 30‑50% and introducing a closed‑loop multi‑agent pipeline that integrates testing, patch generation, trajectory compression, similarity retrieval, and decision arbitration.

LLM · Multi-Agent · SWE-bench

0 likes · 41 min read

PaperAgent

Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

LLM · Multi-hop QA · RAG

0 likes · 7 min read

AI Insight Log

Jan 9, 2026 · Industry Insights

Did AI Doom Tailwind? 75% Layoffs and a Founder’s Threat to Developers

The article analyzes how the rise of AI coding tools led Tailwind CSS founder Adam Wathan to reject a community PR adding an llms.txt file, trigger a 75% staff cut, and expose the collapse of the open‑source‑plus‑services business model in the AI era.

Business Model · LLM · Layoffs

0 likes · 7 min read

Meituan Technology Team

Jan 8, 2026 · Artificial Intelligence

Must‑Read AAAI 2026 Papers: Efficient Reasoning, Annealing, Multimodal Diffusion & More

This article curates eight AAAI 2026 papers authored by the Meituan research team, covering verifiable stepwise rewards for LLM reasoning, annealing strategies in large‑scale training, process reward models, competence‑difficulty sampling, high‑fidelity visual text rendering, counterfactual fusion, compress‑then‑rank reranking, and cross‑modal quantization for generative recommendation, with direct PDF links for each work.

AAAI2026 · Counterfactual · LLM

0 likes · 14 min read

Kuaishou Tech

Jan 8, 2026 · Artificial Intelligence

Top 12 Kuaishou Papers Accepted at AAAI 2026: Breakthroughs in Recommendation, Video Generation, and LLM Research

Kuaishou secured 12 papers at AAAI 2026, covering advances in search and recommendation systems, multi‑camera video generation, multimodal understanding, generative model fundamentals, video large language models, experimental design, and LLM latent‑space reasoning, with three papers highlighted as oral presentations.

LLM · Video Generation · ai

0 likes · 22 min read

Alibaba Cloud Developer

Jan 8, 2026 · Artificial Intelligence

How to Build Human‑In‑The‑Loop (HITL) Capabilities into ReactAgent

This article explains how to integrate a Human‑In‑The‑Loop (HITL) mechanism into ReactAgent, detailing the motivation, design of interaction, tool description, XML‑based UI rendering, Redis‑driven waiting loop, and the broader architectural parallels with design patterns and other agent frameworks.

Design Patterns · HITL · Human-in-the-Loop

0 likes · 14 min read

AndroidPub

Jan 8, 2026 · Artificial Intelligence

Unlocking Anthropic’s Agent Skill: Build Reusable AI Task Assistants in 3 Steps

This article explains Anthropic’s open‑standard Agent Skill, how it serves as a reusable task specification for Claude, walks through creating a skill with metadata, instructions, and advanced Reference/Script features, and compares Skill with MCP to help developers choose the right tool.

AI automation · Agent Skill · Anthropic

0 likes · 11 min read

Sohu Tech Products

Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

Knowledge Base · LLM · RAG

0 likes · 14 min read

21CTO

Jan 7, 2026 · Fundamentals

Can LLMs Build a Garbage‑Collector‑Free System Language? Inside Steve Klabnik’s Rue Project

Steve Klabnik, a veteran of the Rust and Ruby on Rails communities, explores a new systems programming language called Rue that aims for memory safety without garbage collection, leveraging Anthropic’s Claude AI for rapid development and discussing its design trade‑offs, progress, and future prospects.

Claude · LLM · Memory Safety

0 likes · 8 min read

DaTaobao Tech

Jan 7, 2026 · Artificial Intelligence

5 Design Patterns to Control LLM Output in Generative AI Applications

The article presents five design patterns—Logits Masking, Grammar, Style Transfer, Reverse Neutralization, and Content Optimization—for steering the output of generative AI models, compares their suitable scenarios, advantages, drawbacks, and anti‑patterns, and provides concrete implementation steps, code snippets, and flowcharts to help developers reliably enforce style, format, and compliance constraints.

LLM · Prompt engineering · generative AI

0 likes · 20 min read

Tencent Cloud Developer

Jan 7, 2026 · Artificial Intelligence

How Context Engineering Powers the Next Generation of AI Agents

Transitioning from simple chatbots to sophisticated agents, this article explains how expanding context becomes a core variable, detailing the evolution from prompt engineering to context engineering, the challenges of managing growing context, and practical solutions like structured context, tool integration, and the MCP framework for reliable AI systems.

LLM · Reliability · agent

0 likes · 20 min read

Wuming AI

Jan 6, 2026 · Artificial Intelligence

Top LLM Leaderboards Explained: How to Choose the Right Model

This article surveys the most popular large‑language‑model leaderboards—including lmarena, Artificial Analysis, SuperCLUE, and llm‑stats—detailing their evaluation methods, coverage areas, URLs, and practical usage tips, while warning readers that rankings are only a reference and real‑world performance may vary.

AI benchmarking · Artificial Intelligence · LLM

0 likes · 5 min read

Bighead's Algorithm Notes

Jan 6, 2026 · Artificial Intelligence

FinRS: A Risk‑Sensitive Trading Framework for Real‑World Financial Markets

FinRS integrates hierarchical market analysis, dual decision agents, and multi‑time‑scale reward feedback to enable risk‑aware multi‑stage trading, achieving higher cumulative returns, better Sharpe ratios, and lower maximum drawdowns than existing LLM‑based and reinforcement‑learning baselines across diverse stocks.

FinRS · LLM · Reinforcement Learning

0 likes · 14 min read

PMTalk Product Manager Community

Jan 6, 2026 · Industry Insights

Strategic Comparison of Dify, n8n, and ComfyUI for AI Applications and Automation

This article provides a multi‑dimensional strategic analysis of three representative AI‑focused platforms—Dify, n8n, and ComfyUI—examining their product positioning, architecture, interaction models, commercialization strategies, and agent capabilities, and offers concrete recommendations for product managers on choosing the right tool based on ease of use, control, scalability, and total cost of ownership.

AI Platforms · LLM · Product Comparison

0 likes · 35 min read

macrozheng

Jan 6, 2026 · Artificial Intelligence

Getting Started with AgentScope Java: Build Multi‑Agent LLM Applications Quickly

This guide introduces AgentScope, a multi‑agent framework for Java that brings ReAct reasoning, tool calling, memory management, RAG, and serverless capabilities to LLM‑powered applications, and provides step‑by‑step code examples for basic and advanced usage.

AgentScope · LLM · MCP

0 likes · 12 min read

PaperAgent

Jan 6, 2026 · Artificial Intelligence

How Ontology‑Driven GraphRAG Eliminates Noise in AI Knowledge Graphs

This article examines the shortcomings of naïve GraphRAG implementations on clinical data and explains how an ontology‑driven, zero‑noise GraphRAG architecture can create self‑improving, conflict‑free knowledge graphs for AI applications.

Data Quality · GraphRAG · LLM

0 likes · 3 min read

AI Insight Log

Jan 5, 2026 · Artificial Intelligence

Free Access to NVIDIA GLM‑4.7 and Minimax‑M2.1 with a Step‑by‑Step NIM Tutorial

This guide shows how to obtain a free NVIDIA NIM API key, verify a Chinese phone number, and call the hidden GLM‑4.7 and Minimax‑M2.1 large‑language models using provided Python or curl snippets, all without owning a GPU.

API · GLM-4.7 · LLM

0 likes · 5 min read

PaperAgent

Jan 5, 2026 · Artificial Intelligence

How QuCo‑RAG Replaces Model Confidence with Objective Evidence to Cut Hallucinations

QuCo‑RAG introduces a dynamic retrieval‑augmented generation framework that quantifies uncertainty using pre‑training corpus statistics, replacing unreliable model confidence with objective frequency and co‑occurrence evidence, achieving millisecond‑level hallucination detection, superior multi‑hop QA performance, and cross‑model transferability across various LLMs.

Dynamic Retrieval · LLM · Retrieval Augmented Generation

0 likes · 9 min read

AI Insight Log

Jan 4, 2026 · Artificial Intelligence

Agent Skills for Context Engineering: 4K Stars, Powering Cursor & Codex

The open‑source ‘Agent Skills for Context Engineering’ project, which amassed over 4,100 stars in a week, demonstrates why managing a model’s attention budget—through foundational, operational, and development‑methodology skills—is essential as context windows grow, and provides platform‑agnostic instructions for Claude Code, Cursor and other AI tools.

Agent Skills · Claude Code · Context Engineering

0 likes · 7 min read

Bighead's Algorithm Notes

Jan 4, 2026 · Artificial Intelligence

How VTA Combines Large‑Model Reasoning for Precise and Explainable Stock Time‑Series Forecasting

The VTA framework integrates large language model reasoning with textual annotation of technical indicators, employs a Time‑GRPO reinforcement‑learning objective and multi‑stage joint conditional training, and achieves state‑of‑the‑art accuracy and expert‑rated interpretability on US, Chinese and European stock datasets.

LLM · Reinforcement Learning · Stock Prediction

0 likes · 19 min read

AI Insight Log

Jan 4, 2026 · Artificial Intelligence

How Playwright + AI Powers a Fully Automated Xianyu Treasure Hunt

The article examines the open‑source ai‑goofish‑monitor project, which combines Playwright‑driven browsing with large‑language‑model analysis to continuously scan Xianyu listings, filter out junk, and highlight high‑quality items, while also discussing its AI‑generated code, benefits, limitations, and security risks.

LLM · Playwright · Web Scraping

0 likes · 7 min read

PaperAgent

Jan 4, 2026 · Artificial Intelligence

How Sophia’s System 3 Turns LLM Agents into Persistent Learners

The article presents Sophia, a System 3‑enabled persistent agent framework that adds a meta‑cognitive layer to LLM‑based agents, enabling identity continuity, self‑scheduled learning, real‑time self‑checks, and autonomous task generation, and validates its benefits through a 24‑hour continuous‑run experiment.

AI agents · Autonomous Agents · LLM

0 likes · 7 min read

Architect

Jan 3, 2026 · Artificial Intelligence

Unlocking AI Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys the emerging field of AI agent memory, presenting a three‑dimensional taxonomy of memory forms, detailing functional categories such as factual, experiential, and working memory, and outlining dynamic processes of formation, evolution, and retrieval, while also highlighting benchmarks, open‑source frameworks, and future research directions.

AI agents · LLM · Memory Architecture

0 likes · 7 min read

AI Architecture Hub

Jan 2, 2026 · Artificial Intelligence

How Manifold-Constrained Hyper-Connections Boost LLM Performance with Minimal Overhead

DeepSeek's new mHC architecture projects residual connections onto a manifold, adding only 6.7% to the training cost of 27B models while delivering significant stability and downstream performance gains over traditional residual and hyper‑connection designs.

Deep Learning · LLM · Manifold Optimization

0 likes · 13 min read

NetEase LeiHuo Testing Center

Jan 2, 2026 · Artificial Intelligence

From ChatGPT to LLM‑Native: Building Intelligent AI Agents and Workflows with LangChain

The article explains why traditional chat‑based AI tools are limited to advice, introduces next‑generation LLM‑native applications that can understand, plan, and act, and provides a step‑by‑step guide on designing AI workflows, autonomous agents, hybrid architectures, and the Model Context Protocol (MCP) using LangChain.

AI agents · LLM · LangChain

0 likes · 36 min read

IT Services Circle

Jan 2, 2026 · Artificial Intelligence

Top Open‑Source NotebookLM Alternatives: AI‑Powered Docs, Podcasts & Research Tools

This article surveys the most popular open‑source replacements for Google NotebookLM, detailing each project's star count, supported AI models, multimodal input capabilities, Docker deployment options, and unique features such as multi‑speaker podcast generation, semantic search, and collaborative knowledge‑base integration.

Docker · LLM · Multimodal

0 likes · 8 min read

Code Mala Tang

Dec 31, 2025 · Artificial Intelligence

Can TOON Replace JSON for LLMs? A Token‑Efficient Data Format Explained

The article introduces Token‑Oriented Object Notation (TOON), a compact alternative to JSON designed for large language models, and demonstrates how its reduced syntax cuts token usage by up to 60%, speeds up parsing, and remains human‑readable.

LLM · Token efficiency · ai

0 likes · 7 min read

AI Architecture Hub

Dec 31, 2025 · Artificial Intelligence

Why LangGraph Is the Next‑Generation Framework for LLM Agent Orchestration

This article explains the motivation behind LangGraph, walks through a quick start, details its core syntax and state management, demonstrates conditional branching, parallel execution, tool integration, multi‑agent orchestration, and real‑time monitoring, and finally discusses future directions for the framework.

LLM · LangGraph · Parallel Execution

0 likes · 32 min read

Architect's Alchemy Furnace

Dec 30, 2025 · Artificial Intelligence

Run AgenticSeek Locally: Complete Guide to a Private AI Assistant

This guide walks you through installing, configuring, and running AgenticSeek—a fully local, privacy‑focused AI assistant—by setting up prerequisites, cloning the repository, adjusting environment files, launching Docker services or CLI mode, and troubleshooting common issues.

AgenticSeek · Docker · LLM

0 likes · 21 min read

Aikesheng Open Source Community

Dec 30, 2025 · Databases

Year-in-Review: Open-Source SQL LLM Benchmark, SQLE Updates, and Top DB Articles

This community roundup reviews the 2025 release of the SCALE open‑source LLM‑SQL benchmark, SQLE platform updates, curated video playlists, and the year's ten best database articles, and provides reference links for further exploration.

LLM · OpenSource · benchmark

0 likes · 10 min read

Data Party THU

Dec 29, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: A Deep Dive into Forms, Functions, and Dynamics

This article reviews the survey "Memory in the Age of AI Agents," presenting a comprehensive taxonomy that classifies agent memory by its forms, functions, and dynamic mechanisms, and explores future directions such as generative memory, reinforcement‑learning‑driven management, multimodal storage, and trustworthy handling.

AI agents · Agent Architecture · Future AI

0 likes · 14 min read

Alibaba Cloud Developer

Dec 29, 2025 · Artificial Intelligence

How Alibaba’s Tair KVCache Manager Revolutionizes Enterprise‑Level LLM Cache Management

This article details the architecture and implementation of Tair KVCache Manager, an enterprise‑grade service that centralises KVCache metadata, decouples inference engines from storage, provides elastic scaling, multi‑tenant isolation, high availability, and performance‑optimised cache management for large‑scale LLM inference workloads.

Cache Management · KVCache · LLM

0 likes · 28 min read

AI Large Model Application Practice

Dec 29, 2025 · Artificial Intelligence

Integrating Anthropic‑Style Skills into LangChain DeepAgents: A Step‑by‑Step Guide

This article explains how to bring Anthropic's Skills concept into the open‑source LangChain DeepAgents framework by detailing the discovery, system‑prompt injection, progressive loading, and execution phases, and provides a complete code‑driven example using a web‑research Skill.

Agent Skills · DeepAgents · LLM

0 likes · 14 min read

MaGe Linux Operations

Dec 27, 2025 · Artificial Intelligence

How to Deploy and Optimize Enterprise‑Scale LLM Inference Services: A Practical Guide

This guide walks you through deploying large language models such as ChatGLM and Llama in production, covering environment setup, model quantization, dynamic batching, service configuration, Nginx load balancing, monitoring, troubleshooting, and best‑practice recommendations for high‑performance, cost‑effective AI inference.

GPU · Inference · LLM

0 likes · 48 min read

AI Architecture Hub

Dec 27, 2025 · Artificial Intelligence

How GraphRAG Turns Knowledge Graphs into Smarter Retrieval for LLMs

GraphRAG extends traditional Retrieval‑Augmented Generation by building a knowledge graph from documents, extracting entities and relationships, performing community detection, and supporting both local and global searches, offering detailed step‑by‑step guidance, code examples, configuration tips, and a comparison with classic RAG approaches.

GraphRAG · LLM · Neo4j

0 likes · 28 min read

Alibaba Cloud Native

Dec 27, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: Short‑Term vs Long‑Term Strategies and Framework Integration

This article explains how AI agents overcome context window limits by using memory systems, distinguishes short‑term (session) and long‑term (cross‑session) memory, compares implementations in Google ADK, LangChain and AgentScope, and outlines context‑engineering techniques, core components, challenges, and emerging trends.

AI memory · Agent Frameworks · Context Engineering

0 likes · 20 min read

Alibaba Cloud Developer

Dec 26, 2025 · Artificial Intelligence

How AutoContextMemory Cuts LLM Costs by 70% in Long Conversations

This article explains the challenges of token explosion in long‑running AI agent dialogues and introduces AutoContextMemory, a Java component that automatically compresses, offloads, and summarizes conversation history to dramatically reduce token usage, speed up responses, and preserve critical information.

AgentScope · LLM · context management

0 likes · 12 min read

360 Tech Engineering

Dec 26, 2025 · Artificial Intelligence

15 Chunking Strategies to Supercharge Retrieval‑Augmented Generation

This article presents fifteen practical chunking techniques—ranging from line‑by‑line and fixed‑size chunking to semantic and hierarchical methods—explaining their principles, ideal use‑cases, concrete input examples, chunk outputs, and key advantages or cautions for improving Retrieval‑Augmented Generation with large language models.

Data Retrieval · LLM · RAG

0 likes · 28 min read

Alibaba Cloud Developer

Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

LLM · Python · RAG

0 likes · 15 min read

Architect

Dec 25, 2025 · Artificial Intelligence

How GraphRAG Boosts Retrieval Accuracy with Knowledge Graphs – A Complete Guide

This article explains why traditional RAG suffers from hallucinations, introduces GraphRAG’s knowledge‑graph‑based approach, walks through its indexing and query pipelines—including text splitting, entity‑relation extraction, graph construction, community detection, and local vs. global retrieval—provides practical setup commands, Neo4j visualization steps, and compares its performance with classic RAG.

Embedding · GraphRAG · LLM

0 likes · 27 min read

360 Tech Engineering

Dec 25, 2025 · Artificial Intelligence

Why LangChain 1.0 Makes AI Agent Development Faster, Safer, and More Scalable

LangChain 1.0 replaces fragmented agent code with a production‑ready framework that unifies model outputs, simplifies tool integration, introduces content_blocks for consistent response handling, and adds a middleware system for privacy, summarization, and human‑in‑the‑loop safety, dramatically improving developer efficiency and reliability.

LLM · LangChain · Python

0 likes · 13 min read

AI Architecture Hub

Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Embedding · LLM

0 likes · 9 min read

Zhuanzhuan Tech

Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR · Knowledge Base · LLM

0 likes · 11 min read
