Tagged articles

LLM

2301 articles · Page 2 of 24
Alibaba Cloud Developer
Jun 11, 2026 · Artificial Intelligence

Building an AI‑Native Multi‑Agent Digital Human Architecture on Cloud Native

The article details how a cloud‑native platform called AgentTeams enables AI‑Native multi‑agent digital‑human teams to replace manual incident response, automate end‑to‑end development workflows, and securely integrate LLMs and internal services through declarative orchestration and fine‑grained permission models.

AI-native · AgentTeams · Automation
0 likes · 24 min read
Su San Talks Tech
Jun 11, 2026 · Artificial Intelligence

Why MarkItDown Is Dominating GitHub Trending: An In‑Depth AI‑Ready Document Converter

MarkItDown, the Microsoft‑backed open‑source tool that converts PDFs, Word, PPT, images and more into LLM‑friendly Markdown, has surged to over 150k GitHub stars; this article explains its architecture, installation, advanced features, strengths, limitations, and how it fits into RAG and AI workflows.

AI preprocessing · LLM · MCP
0 likes · 20 min read
Machine Learning Algorithms & Natural Language Processing
Jun 10, 2026 · Artificial Intelligence

Beyond Orchestrating Workflows: How UnityMAS-O Trains LLM-Based Multi‑Agent Systems

UnityMAS‑O introduces a general reinforcement‑learning framework that converts predefined LLM multi‑agent workflows into trainable tasks, enabling credit assignment across roles, supporting parameter‑sharing configurations, and demonstrating significant F1 and test‑pass improvements on QA and code‑generation benchmarks.

LLM · Multi-Agent Reinforcement Learning · PPO
0 likes · 12 min read
Machine Heart
Jun 10, 2026 · Artificial Intelligence

MiniAppBench Reveals Only 1 in 6 AI‑Generated Apps Meet Real User Needs

MiniAppBench, the first benchmark that evaluates large language models' ability to generate fully functional interactive HTML applications, shows an average pass rate of just 17% across 16 top models—with the strongest model, GPT‑5.2, achieving only 45%—highlighting a substantial gap between current capabilities and real‑world user requirements.

AI evaluation · LLM · MiniAppBench
0 likes · 16 min read
Lao Guo's Learning Space
Jun 10, 2026 · Artificial Intelligence

2026 Top 10 Local LLMs Ranked by Real Downloads, GPU Fit, and License Risks

The article analyzes why local large‑language‑model deployment is essential for privacy, offline use, and cost control, then ranks the ten most popular models in 2026 using Ollama download counts, GitHub stars, benchmark scores, and hardware requirements, and finally provides a GPU‑based selection guide, deployment‑tool comparison, license‑risk table, decision‑tree and quick‑start instructions.

GPU · LLM · benchmark
0 likes · 19 min read
PaperAgent
Jun 10, 2026 · Artificial Intelligence

Agent Era Information Retrieval: A Denoising-First Perspective (SIGIR 2026 Review)

The SIGIR 2026 review argues that as large language models become the primary consumers of retrieved results, information retrieval must shift its core objective from pure recall to denoising, presenting a five‑stage pipeline, controlled experiments, and a detailed attribution framework for noise sources.

Agent · Denoising · Information Retrieval
0 likes · 11 min read
Alibaba Cloud Developer
Jun 10, 2026 · Artificial Intelligence

Layered Knowledge Base Architecture: From RAG to Agent‑Native Knowledge Context Layer

The article analyses the structural shortcomings of naive Retrieval‑Augmented Generation (RAG), compares four knowledge‑base paradigms, proposes a five‑layer pyramid knowledge context that supports role‑aware navigation and incremental sync, and presents evaluation results showing the pyramid‑plus‑RAG approach significantly outperforms plain RAG.

AI · Knowledge Base · Knowledge Graph
0 likes · 22 min read
James' Growth Diary
Jun 9, 2026 · Artificial Intelligence

How Hermes’s Three‑Way Adapter Unifies Anthropic, Gemini, and Codex APIs

This article explains how Hermes uses three dedicated adapters—anthropic_adapter.py, gemini_native_adapter.py, and codex_responses_adapter.py—to translate the wildly different request and response schemas of Anthropic Messages, Gemini generateContent, and Codex Responses into a single OpenAI‑style chat.completions interface, covering message formats, system prompts, tool calls, reasoning signatures, lazy SDK loading, pure‑function design, and defensive validation.

API integration · Adapter Pattern · Anthropic
0 likes · 24 min read
Machine Heart
Jun 9, 2026 · Artificial Intelligence

OneReason: When Recommendation Systems Learn to Reason

The OneReason report details how Kuaishou’s recommendation team injects reasoning into large‑scale recommender models through a four‑level pre‑training pipeline, chain‑of‑thought (CoT) fine‑tuning, and specialized reinforcement learning, achieving significant offline gains and a 10.33% exposure lift in a live A/B test.

CoT · Industry · LLM
0 likes · 31 min read
DataFunSummit
Jun 9, 2026 · Artificial Intelligence

From Poor RAG Performance to Production‑Ready Systems: A Deep Technical Walkthrough

The article dissects why early RAG deployments suffer from low recall, hallucinations and runaway costs, then presents a step‑by‑step diagnostic framework, hybrid search architecture, knowledge‑engineering tricks, caching and routing strategies, and explores advanced GraphRAG and Agentic RAG techniques to build reliable, enterprise‑grade solutions.

Agentic RAG · GraphRAG · Hybrid Search
0 likes · 20 min read
Golang Shines
Jun 9, 2026 · Artificial Intelligence

Essential AI Agent Design Patterns and Frameworks Every Ops Engineer Should Know

The article explains seven AI agent design patterns—workflow, routing, parallel, loop, aggregation, network, and hierarchy—illustrates their use with concrete examples and code, compares agent frameworks such as AutoGPT, Dify, AutoGen, CrewAI and LangGraph, and shows why multi‑agent architectures outperform traditional workflows in complex operational tasks.

AI Agent · LLM · Operations
0 likes · 12 min read
PaperAgent
Jun 9, 2026 · Artificial Intelligence

Defining Standard Answers for Agent‑Era LLMs: A Rubrics Survey

The survey from RUC‑Gaoling AI Institute reviews Rubrics for large language models, explaining why they are needed for open‑ended, high‑risk tasks, how they are constructed, and how they can be applied to policy and reward model training as well as multi‑dimensional evaluation across general and domain‑specific scenarios.

Agent · Evaluation · LLM
0 likes · 14 min read
Qborfy AI
Jun 9, 2026 · Artificial Intelligence

Deep Dive into Core LLM API Parameters

While many newcomers think using an LLM API is as simple as picking a model and feeding a prompt, the real control lies in parameters such as temperature, top‑p, top‑k, max_tokens, penalties, stop, and stream, each of which dramatically influences output quality, length, cost, and behavior.

API · LLM · Prompt Engineering
0 likes · 21 min read
AI Engineer Programming
Jun 8, 2026 · Artificial Intelligence

Parse vs Extract: When to Use Full Document Parsing vs Targeted Data Extraction for AI

The article explains the fundamental difference between parsing—converting documents into AI‑friendly formats that preserve structure and context—and extraction—pulling predefined fields into structured outputs—while offering concrete scenarios, decision criteria, and example implementations with LlamaParse and LlamaExtract.

AI · Document Parsing · LLM
0 likes · 10 min read
Coder Trainee
Jun 8, 2026 · Artificial Intelligence

Rapidly Build AI Agents with LangChain: A Hands‑On Tutorial

This article walks through why LangChain is the leading framework for AI agents, compares it with low‑level implementations, and provides step‑by‑step code examples for installation, prompt templates, LCEL pipelines, memory modules, RAG, custom tools, and a complete customer‑service agent, concluding with a concise feature comparison.

AI Agents · LLM · LangChain
0 likes · 14 min read
Machine Learning Algorithms & Natural Language Processing
Jun 8, 2026 · Artificial Intelligence

DecodeBatch Load Imbalance in LLM Inference: Request Length Differences Amplify

During LLM decoding, the DecodeBatch stage can suffer severe load imbalance because differing historical token lengths (kv_len) cause uneven attention task distribution across GPU SMs, a problem explored through detailed analysis of task granularity, SplitKV heuristics, FlashInfer’s batch‑size thresholds, and FA3’s dynamic scheduling and split strategies.

DecodeBatch · FA3 · FlashInfer
0 likes · 29 min read
James' Growth Diary
Jun 8, 2026 · Artificial Intelligence

7‑Level Multi‑Provider Fallback: Keeping the Agent Alive When a Model Fails

Hermes Agent’s auxiliary_client.py implements a seven‑level provider fallback chain that ensures auxiliary tasks keep running even if the main LLM crashes, runs out of credits, or hits rate limits, by prioritizing the user’s primary provider, cycling through alternative providers, and handling protocol quirks.

AI Agents · Fallback · Hermes
0 likes · 14 min read
Programmer XiaoFu
Jun 8, 2026 · Artificial Intelligence

Why Smart LLMs Still Struggle to Deploy Agents in Production

Although large language models have become more capable, deploying AI agents in production remains difficult because their probabilistic nature leads to error accumulation, testing challenges, fragile real‑world interactions, and a lack of deterministic controls, requiring strict workflows, schema validation, mock testing, and human oversight.

AI Agents · LLM · Production
0 likes · 8 min read
CodePath
Jun 8, 2026 · Artificial Intelligence

Run Your First Pi‑AI Agent in Under 10 Minutes

This tutorial walks you through preparing the environment, initializing a Node.js project, writing the first Pi‑AI agent code, using both simple and streaming calls, swapping providers with a single parameter change, and building a continuous‑conversation CLI—all in less than ten minutes.

LLM · Model Switching · Node.js
0 likes · 11 min read
AgentGuide
Jun 8, 2026 · Artificial Intelligence

Agentic RAG vs Regular RAG: Key Differences, Trade‑offs, and Interview‑Ready Answer

This article explains what Agentic RAG is, contrasts it with ordinary RAG by detailing its dynamic decision‑making, multi‑step retrieval loop, higher cost and latency, and suitable scenarios, and outlines two implementation patterns—single‑agent and multi‑agent—plus a concise interview response.

AI Agents · Agentic RAG · LLM
0 likes · 5 min read
Machine Learning Algorithms & Natural Language Processing
Jun 7, 2026 · Artificial Intelligence

Does AI Have Consciousness? Ted Chiang’s 10,000‑Word Rebuttal to Hinton’s Claim

The article examines recent industry moves to study AI consciousness, critiques Anthropic’s emotion‑vector findings, contrasts Hinton’s claim that AI is conscious with Ted Chiang’s extensive argument that large language models lack subjective experience, and warns that the AGI race cannot afford to delay this debate.

AGI · AI consciousness · Anthropic
0 likes · 13 min read
Machine Learning Algorithms & Natural Language Processing
Jun 7, 2026 · Artificial Intelligence

22 Agentic Engineering Hacks to Turbocharge Your AI Projects

This guide walks through 22 practical Agentic Engineering techniques—from planning with /ce-plan and voice‑to‑LLM input to multi‑agent loops, remote session control, and turning everyday tasks into reusable skills—showing how to feed context, automate workflows, and avoid common pitfalls.

AI workflow · Agentic Engineering · Automation
0 likes · 15 min read
AI Engineering
Jun 7, 2026 · Artificial Intelligence

How a Four-Layer Configuration Stops Claude Code from Fabricating Answers

Claude Code often fabricates functions, imports, and test results, but by adding a four‑layer system—honesty rules in CLAUDE.md, a verification protocol, post‑write hooks, and a fact‑checking sub‑agent—developers can force the model to provide evidence, avoid false claims, and improve reliability in production.

Claude · Hooks · LLM
0 likes · 12 min read
DataFunSummit
Jun 7, 2026 · Artificial Intelligence

How Qichacha Uses Large Language Models for Field‑Level Data Lineage

This article details Qichacha's technical journey of applying large language models to resolve field‑level data lineage challenges in a complex, multi‑source data environment, describing the motivation, architecture, practical implementation, engineering trade‑offs, and measurable outcomes.

AI · Big Data · Data Governance
0 likes · 11 min read
PaperAgent
Jun 7, 2026 · Artificial Intelligence

How 100 Samples Let LLMs Master New Domains – The DOMINO Agent Breakthrough

The article explains how the DOMINO method lets large language models learn a domain from just dozens of real examples instead of hand‑written prompts, describes its trainable "domain switch" architecture, and shows experimental gains on time‑varying code tasks, highlighting more robust and diverse data synthesis.

DOMINO · Data Synthesis · Domain Adaptation
0 likes · 8 min read
AI Engineer Programming
Jun 7, 2026 · Artificial Intelligence

Why Intent Recognition Is the Decision Hub of Agentic AI Systems

The article explains how intent recognition has evolved from simple keyword matching to a central decision hub in Agentic AI, covering basic concepts, LLM and small‑model solutions, hybrid architectures, clarification and out‑of‑scope handling, multi‑turn challenges, routing, evaluation methods, and best‑practice recommendations.

Agentic AI · Clarification · Evaluation
0 likes · 14 min read
Code Mala Tang
Jun 6, 2026 · Operations

How lowfat Cuts 91% of Command‑Line Noise Before Feeding LLMs

lowfat, a 289‑star Rust CLI tool, strips unnecessary prompts, help text, and formatting from command‑line outputs—reducing token counts by up to 97% (e.g., git log from 3350 to ~100 tokens)—and integrates with Claude Code, Shell, and OpenCode to save AI‑agent token costs.

AI Agents · CLI · LLM
0 likes · 9 min read
Old Zhang's AI Learning
Jun 6, 2026 · Artificial Intelligence

How to Build a Personal Knowledge Base with My Custom web‑pack Skill

This article explains how to construct a personal knowledge base using the author’s open‑source web‑pack Skill, which automates raw material collection, image localization, link expansion, and structured output, addressing the limitations of Obsidian’s Web Clipper and aligning with Karpathy’s LLM Wiki three‑layer architecture.

AI Agents · Automation · Knowledge Management
0 likes · 9 min read
James' Growth Diary
Jun 6, 2026 · Artificial Intelligence

How Honcho’s Dialectic User Model Lets Agents Learn Your Preferences Over Time

The article explains how Honcho transforms scattered conversation facts into a structured user model through a dialectic reasoning loop, detailing memory vs. user model differences, tool architecture, recall modes, prefetch caching, cost‑control mechanisms, peer cards, and common pitfalls for building ever‑more personalized AI agents.

Agent · Cost Control · Dialectic Reasoning
0 likes · 15 min read
CodePath
Jun 6, 2026 · Artificial Intelligence

What Is PI‑Agent? Embracing a Minimalist Philosophy for Building AI Agents

The article describes the overwhelming complexity of existing AI agent frameworks, presents PI‑Agent's subtraction philosophy and modular toolchain, outlines a twelve‑day hands‑on series with prerequisites, and aims to help readers build a focused AI agent without unnecessary bloat.

AI Agent · Agent framework · LLM
0 likes · 6 min read
AI Engineer Programming
Jun 6, 2026 · Artificial Intelligence

How Query Rewriting Boosts Retrieval in RAG Systems

In RAG applications, ambiguous user queries often hinder retrieval effectiveness, so rewriting queries before search—through normalization, synonym expansion, linguistic rules, LLM‑based generation, query decomposition, and multi‑view strategies—can improve relevance, but must avoid over‑expansion, semantic drift, and added latency.

Information Retrieval · LLM · Prompt Engineering
0 likes · 11 min read
360 Zhihui Cloud Developer
Jun 5, 2026 · Artificial Intelligence

From Skill to Ontology: Building a Trustworthy Data Agent Semantic Layer

The article analyzes why expanding the Skill system with an ontology‑based semantic layer is essential for Data Agents, comparing metric‑centric and ontology‑centric approaches, outlining technical evolution from NL2SQL to NL2LF2SQL, and proposing a step‑by‑step implementation roadmap for enterprises.

AI · Data Agent · Data Infrastructure
0 likes · 16 min read
DataFunTalk
Jun 5, 2026 · Artificial Intelligence

How Xiaomi’s DataAgent Harness Secured Third Place in the Global Text‑to‑SQL BIRD Benchmark

The article discusses Xiaomi DataAgent's third‑place ranking on the global BIRD Text‑to‑SQL benchmark, analyzes challenges such as model hallucination, lack of business knowledge, and complex multi‑table joins, and explains how a semantic harness addresses these problems to enable reliable enterprise data querying.

BIRD benchmark · DataAgent · Enterprise AI
0 likes · 13 min read
Machine Heart
Jun 5, 2026 · Artificial Intelligence

Do LLMs Need Sleep? CMU Paper Shows Memory Consolidation Improves Reasoning

Researchers from CMU and collaborators propose a ‘sleep’ phase for transformer‑based LLMs that repeatedly re‑processes accumulated context to update fast weights in a state‑space module, enabling memory consolidation that reduces KV‑cache pressure and markedly improves performance on long‑context, multi‑step reasoning benchmarks.

LLM · Long Context · SSM
0 likes · 10 min read
PaperAgent
Jun 5, 2026 · Artificial Intelligence

The Most Systematic 102‑Page Review of Agent Harnesses

This article provides a comprehensive overview of the "Code as Agent Harness" paradigm, detailing its three‑layer architecture, the roles of code in reasoning, acting, and environment modeling, the mechanisms that enable reliable long‑term execution, and how multi‑agent systems scale the harness through shared code and feedback loops.

Agent Harness · Code as Agent · LLM
0 likes · 10 min read
AgentGuide
Jun 5, 2026 · Artificial Intelligence

RAG vs Fine‑Tuning vs Long Context: Choosing the Right Technique for AI Agents

The article explains why Retrieval‑Augmented Generation (RAG) addresses the static knowledge limitation of large models, contrasts its role of “what to say” with fine‑tuning’s focus on “how to say,” compares costs and performance against long‑context models, and offers a practical hierarchy (Prompt → RAG → LoRA/QLoRA fine‑tuning → Distillation) plus best‑practice combinations.

AI Agents · LLM · Long Context
0 likes · 9 min read
SuanNi
Jun 4, 2026 · Artificial Intelligence

Bernini: An Open‑Source AI Model that Masterfully Handles Diverse Video Editing Tasks

Bernini combines a multimodal large language model with a diffusion renderer, uses a semantic planner‑renderer architecture, segment‑aware 3D position encoding and chain‑of‑thought reasoning, and achieves state‑of‑the‑art results on a 300‑case benchmark, outperforming closed‑source competitors.

Bernini · LLM · Multimodal AI
0 likes · 11 min read
Didi Tech
Jun 4, 2026 · Artificial Intelligence

Designing a Multi‑Language, Multi‑Business LLM‑Powered Customer Service QA System

Didi's International Business Group built an LLM‑driven quality‑inspection platform for Spanish and Portuguese support across ride‑hailing, food delivery, and finance, using three pipelines—intent verification, compliance assessment, and VOC trend analysis—that boosted intent accuracy to 86%, compliance accuracy above 90%, and cut manual reporting time from hours to minutes.

LLM · VOC analysis · compliance assessment
0 likes · 11 min read
360 Zhihui Cloud Developer
Jun 4, 2026 · Artificial Intelligence

How Data Agents Transform Data Querying: Semantic Layer Integration and Decision‑Making (Part 1)

This article details the engineering journey of building enterprise‑grade Data Agents, covering the semantic‑layer integration that resolves NL‑to‑SQL inconsistencies, the skill‑based architecture that enables query, attribution, forecasting and cash‑flow actions, and the final multiplication formula that defines success in deep‑water AI‑driven decision making.

AI Agent · Data Agent · Data Governance
0 likes · 22 min read
Machine Heart
Jun 4, 2026 · Artificial Intelligence

Defining Token Economics: A New Paradigm for LLM Agent Resource Allocation

The article introduces a systematic "Token Economics" framework that treats tokens as production factors, exchange media, and accounting units, and presents a four‑dimensional analysis of single‑agent to multi‑agent resource allocation, highlighting sustainability challenges and future research directions for LLM agents.

AI economics · Agent · LLM
0 likes · 6 min read
Top Architecture Tech Stack
Jun 4, 2026 · Artificial Intelligence

Why OpenHuman’s Architecture Beats Its 118 Integrations

OpenHuman’s Memory Tree architecture separates hot and cold data paths, uses content‑addressed IDs, and builds layered summaries, offering low‑latency queries and robust idempotency for AI agents that need continuous background learning.

Content Addressing · LLM · Layered Summaries
0 likes · 7 min read
DaTaobao Tech
Jun 3, 2026 · Artificial Intelligence

A Comprehensive Survey of Agent Memory: Benchmarks, Evaluation Frameworks, and System Designs

This article systematically reviews the state of agent long‑term memory by covering three core dimensions—benchmark datasets such as MUSE and LOCOMO, evaluation frameworks like MemoryAgentBench, LONGMEMEVAL and MemBench, and representative memory system implementations (THEANINE, RMM, M3‑Agent, Mem0)—while highlighting key capabilities, performance gaps, and future research directions.

Agent · Evaluation · LLM
0 likes · 25 min read
James' Growth Diary
Jun 2, 2026 · Artificial Intelligence

Cross‑Session Retrieval with SQLite FTS5 and LLM Summaries – Hermes Agent’s Four‑Layer Architecture

This article dissects Hermes Agent’s four‑layer cross‑session retrieval system, covering persistent storage, dual‑table FTS5 indexing for CJK and English, a three‑path search strategy, intelligent truncation for LLM prompts, structured summarisation, and a holographic retrieval layer that blends FTS5, Jaccard similarity and HRR vector algebra.

Cross-Session Retrieval · FTS5 · HRR
0 likes · 25 min read
Linyb Geek Road
Jun 2, 2026 · Artificial Intelligence

From Toy to Productivity: Real‑World Insights into AI Agent Harness Engineering

The article explains why large‑model AI agents need a dedicated Harness engineering layer—beyond prompt tricks—to become reliable collaborators in enterprise pipelines, illustrates the concept with the Aegis project, outlines common pitfalls, and shows how engineers can shift from writing code to steering and validating AI‑driven workflows.

AI Agent · Enterprise AI · Harness Engineering
0 likes · 26 min read
DaTaobao Tech
Jun 1, 2026 · Artificial Intelligence

Designing LLM‑Friendly Architecture: What Truly Makes an AI‑Friendly System?

The article analyzes how traditional deterministic engineering architectures clash with the probabilistic, semantic, and dynamic nature of LLM‑driven AI, proposing three paradigm shifts and detailing an AI‑Friendly stack—including Multi‑Agent, Context Engineering, and observability—that achieved 95.7% audit accuracy and over 80% efficiency gains in real‑world marketing scenarios.

AI Architecture · LLM · Observability
0 likes · 25 min read
IoT Full-Stack Technology
Jun 1, 2026 · Artificial Intelligence

How Front‑End Developers Can Transition to AI Agent Engineering by 2026: A Complete Guide

This article analyses why front‑end engineers face shrinking opportunities by 2026, explains the rise of AI Agent technology, compares the required skill sets, outlines realistic salary expectations, and provides a step‑by‑step roadmap for a successful career shift into AI Agent development.

AI Agent · LLM · Prompt Engineering
0 likes · 20 min read
AI Waka
Jun 1, 2026 · Artificial Intelligence

Why Claude Code Skills Fail to Activate and How to Achieve 100% Reliability

The article investigates why Claude Code skills activate only about half the time, describes a systematic series of 650 automated tests across description variants and environment conditions, and shows that an imperative SKILL.md description with a negative constraint reliably yields 100% activation.

Claude · Docker · Experimental Design
0 likes · 11 min read
Machine Learning Algorithms & Natural Language Processing
May 31, 2026 · Artificial Intelligence

MetaAgent-X Enables Self‑Evolving Agents for Native Collaboration

MetaAgent-X tackles the limitation of fixed‑executor multi‑agent systems by jointly training a Designer that creates lightweight Python‑based collaboration scripts and an Executor that runs them, using hierarchical rollouts and stagewise co‑evolution to improve both design and execution across math and code benchmarks.

LLM · MetaAgent-X · Multi-Agent Systems
0 likes · 13 min read
DeepHub IMBA
May 31, 2026 · Artificial Intelligence

Chunking Strategies for Video RAG: Pause‑Based, Sliding‑Window, and LLM‑Driven Methods

The article examines how to chunk transcribed video text for Retrieval‑Augmented Generation, comparing pause‑based, overlapping‑window, length‑based fallback, and LLM‑driven topic chunking methods, and shows how combining fine‑grained and thematic chunks yields a multi‑layered pipeline that improves context coverage for both precise and broad queries.

Chunking · LLM · RAG
0 likes · 8 min read
IT Services Circle
May 31, 2026 · Backend Development

Why Hand‑Crafted HTTP Calls to LLMs Are a Pitfall and How Spring AI Solves It

The article analyzes the hidden dangers of writing raw HTTP calls for large language models in Java projects—hard‑coded keys, fragile request bodies, missing retries, no observability—and demonstrates how Spring AI’s unified abstractions, built‑in resilience, streaming, function calling, and seamless Spring integration eliminate these issues while enabling effortless model switching and production‑grade AI services.

AI integration · Function Calling · Java
0 likes · 20 min read
Smart Workplace Lab
May 30, 2026 · Artificial Intelligence

Why Too Many AI “Perfect” Options Paralyze Decisions—and a 3‑Step Constraint Framework to Fix It

The article explains how an overload of AI‑generated options overwhelms human working memory, then presents a three‑step framework—hard‑constraint prompts, decision‑protection checklist, and overdue‑circuit‑breaker routing—that narrows choices, speeds decisions from days to hours, and improves execution certainty.

AI decision making · Decision Automation · LLM
0 likes · 6 min read
DataFunTalk
May 30, 2026 · Artificial Intelligence

Deep Dive into Agent Harness: Dissecting the Architecture of AI Agents

This article breaks down the concept of an Agent Harness—a complete software infrastructure that surrounds large language models—covering its definition, three engineering layers, twelve core components, step‑by‑step execution flow, and the trade‑offs that determine production‑grade performance.

Agent Harness · Context Management · LLM
0 likes · 19 min read
Machine Heart
May 30, 2026 · Artificial Intelligence

Beyond Single-Agent: Survey of Collaboration, Attribution, and Self‑Evolution in LLM Multi‑Agents

This survey introduces the LIFE framework for LLM‑based multi‑agent systems, outlining four stages—from individual agent capabilities through collaborative structures, failure attribution, to systemic self‑evolution—while analyzing how role design, communication, and scheduling affect performance, error propagation, and adaptive improvement.

AI Survey · Failure Attribution · LLM
0 likes · 10 min read
Machine Heart
May 30, 2026 · Artificial Intelligence

Can MIT’s Attention Matching Cut LLM Memory 50× Without Accuracy Loss?

MIT researchers introduce Attention Matching, a latent‑space KV‑cache compaction technique that reduces large‑language‑model memory usage up to 50‑fold with negligible accuracy loss, outperforming token‑pruning, summarization, and prior compaction methods across benchmarks like QuALITY, LongHealth, and AIME‑2025.

Attention Matching · KV cache · LLM
0 likes · 13 min read
AI Engineer Programming
May 29, 2026 · Artificial Intelligence

How to Build a Reliable RAG Test Dataset

The article explains why a structured test set is essential for Retrieval‑Augmented Generation systems, outlines failure modes, describes layered evaluation of retrieval and generation, details infrastructure like chunk IDs and manifests, and provides a complete annotation pipeline with cold‑start and adversarial strategies.

Evaluation · LLM · RAG
0 likes · 24 min read
Architect's Ambition
May 29, 2026 · Artificial Intelligence

Enterprise Agent Deployment: Model Selection, Scenario Trade‑offs, and Platformization

This article breaks down the complete logic for rolling out enterprise‑grade AI agents, explaining the core definition, comparing autonomous planning versus workflow‑based models, outlining four Multi‑Agent collaboration patterns, and detailing a step‑by‑step optimization and platformization roadmap to avoid common pitfalls.

AI Agents · Enterprise AI · LLM
0 likes · 14 min read
Machine Learning Algorithms & Natural Language Processing
May 28, 2026 · Artificial Intelligence

Solo Development of GQLA: Challenging DeepSeek’s MLA and DSA

This article presents GQLA, a single‑author variant of MLA that eliminates three hardware‑related drawbacks of MLA, demonstrates how it achieves balanced compute‑memory performance on both high‑end H100 and more modest H20 GPUs, and details conversion methods (TransGQLA) and sparse extensions with concrete benchmark results.

GQLA · LLM · MLA
0 likes · 16 min read
ZhiKe AI
May 28, 2026 · Artificial Intelligence

Why Your LLM Skill Gets Ignored and 5 Proven Design Patterns to Make Agents Work

Many LLM agents ignore a Skill even after its author has spent hours crafting it, and the automation fails. This article analyzes why and presents five validated design patterns (linear flow, decision tree with lazy loading, iterative loops, baton passing, and multi‑stage checkpoints), plus concrete examples and a minimal Skill template for reliable, production‑grade agent behavior.

Agent · Automation · LLM
0 likes · 12 min read
DeepHub IMBA
May 28, 2026 · Artificial Intelligence

AutoGen Multi‑Agent Demo: Coder, Reviewer, and Executor Automatically Complete a Code Review

The article explains how Microsoft’s AutoGen framework enables a Planner‑Executor‑Critic loop and a three‑agent GroupChat workflow, providing step‑by‑step Python code that configures AssistantAgent, UserProxyAgent, and ReviewerAgent to generate, review, and execute code automatically, and discusses the system’s advantages, scalability, and real‑world deployments.

AutoGen · GroupChat · LLM
0 likes · 13 min read
Machine Heart
May 28, 2026 · Artificial Intelligence

Why Google’s AI Can’t Count the Letters in Its Own Name

The article examines why the newly AI‑powered Google Search fails at simple letter‑count questions like “how many P’s are in Google,” tracing the issue to token‑based language models, illustrating it with examples, and discussing both short‑term prompts and long‑term architectural solutions such as byte‑level models.

Google Search · Jagged Intelligence · LLM
0 likes · 13 min read
James' Growth Diary
May 28, 2026 · Artificial Intelligence

Mastering Prompt Engineering: Few‑Shot, Chain‑of‑Thought, and Self‑Consistency Techniques

This article breaks down three core prompt‑engineering techniques—Few‑Shot prompting for output format stability, Chain‑of‑Thought for multi‑step reasoning, and Self‑Consistency for answer robustness—showing when to use each, how to combine them in LangChain, and providing concrete code examples, performance data, and common pitfalls.

Chain-of-Thought · Dynamic Routing · Few-shot
0 likes · 30 min read
Architect's Guide
May 28, 2026 · Artificial Intelligence

How Claude Code Prompt Caching Cuts AI Costs by Up to 90% and Boosts Efficiency

Prompt Caching in Anthropic's Claude Code replaces repeated processing of identical prompt prefixes with a prefix‑hash cache, slashing input‑token costs by up to 90%, reducing first‑token latency by 79%, and improving throughput, while preserving model output exactly as if no cache were used.

AI Engineering · Cache Invalidation · Cache Metrics
0 likes · 30 min read
Big Data Tech Team
May 28, 2026 · Artificial Intelligence

Boosting Data Warehouse Productivity with AI: Practical Strategies and Use Cases

The article outlines how large language models can automate repetitive data‑warehouse tasks—from natural‑language SQL generation and standardized modeling to automated code review, metadata management, multimodal data handling, and self‑service analytics—presenting a three‑phase implementation roadmap for measurable efficiency gains.

AI · ChatBI · Data Governance
0 likes · 9 min read
SuanNi
May 27, 2026 · Artificial Intelligence

Can Agent Skills Be Trained Like Neural Networks? SkillOpt Demonstrates Success

SkillOpt treats an agent’s Skill document as a trainable external state, applying classic deep‑learning tools such as epochs, batch size, learning rate and validation gating, and in experiments across 52 benchmark units it lifts GPT‑5.5 performance by an average of 23.5 points while enabling cross‑model and cross‑environment transfer with no additional inference cost.

Agent Skill · Deep Learning Optimization · LLM
0 likes · 11 min read
Data Party THU
May 27, 2026 · Artificial Intelligence

How Bengio’s TBA Decouples Sampling and Learning to Speed Up LLM RL by 50×

The article explains how large‑language‑model post‑training suffers from rollout bottlenecks, introduces the Trajectory Balance with Asynchrony (TBA) framework that separates a Searcher from a Trainer, reuses off‑policy trajectories via a Trajectory Balance objective, and demonstrates up to 50× speed‑ups while preserving or improving performance on math reasoning, preference fine‑tuning, and automated red‑team tasks.

Asynchronous Training · LLM · Off-Policy
0 likes · 9 min read
Ximalaya Technology Team
May 27, 2026 · Artificial Intelligence

Ximalaya’s LLM‑Powered Interactive Recommendation System: Architecture and Results

The article details Ximalaya’s three‑layer interactive recommendation architecture—PBox for parameter control, an LLM‑driven Agent for intent understanding, and the iSUG interface—showing how natural‑language‑based parameter tuning shifts the paradigm from one‑way push to two‑way dialogue and significantly improves recommendation efficiency and user retention.

FunctionCalling · Interactive · LLM
0 likes · 17 min read
Bilibili Tech
May 27, 2026 · Artificial Intelligence

How to Use A2UI + Vue to Enable Large Models to Generate Interactive Interfaces

This article details how a unified AI assistant framework built for Bilibili's advertising business evolves from plain text output to generating fully interactive UI by leveraging Google’s A2UI protocol, a custom Vue renderer, double‑validation mechanisms, SSE dual‑channel streaming, and a wrapper component system, providing concrete examples and architectural diagrams.

A2UI · Agent · Generative UI
0 likes · 17 min read
James' Growth Diary
May 27, 2026 · Operations

Detecting Agent Silent Killers: Early Alerts for Latency Spikes, Token Explosions, and Infinite Loops

The article presents a three‑layer monitoring system—LangSmith tracing, Prometheus metrics, and Alertmanager alerts—together with concrete metric definitions, alert rules, and code examples to proactively detect latency spikes, token overuse, and dead‑loop cycles in production LLM agents, while also outlining common pitfalls and best‑practice recommendations.

Agent · CostAlert · LLM
0 likes · 18 min read
AI Step-by-Step
May 27, 2026 · Artificial Intelligence

Why Agent Context Management Prioritizes Information Over Shortening Prompts

The article breaks down the multi‑layered context of LLM agents, explains four management dimensions—capacity, content, structure, lifecycle—illustrates common failure scenarios, proposes four practical baselines, and maps maturity levels from free‑form heaps to full‑lifecycle orchestration.

Agent · Context Management · LLM
0 likes · 15 min read
Su San Talks Tech
May 27, 2026 · Artificial Intelligence

Why Switch from Hand‑Written HTTP Calls to Spring AI for Large‑Model Integration?

The article analyzes the drawbacks of manually coding HTTP calls to large language models—hard‑coded keys, fragile request construction, missing retries, and poor observability—and demonstrates how Spring AI’s layered abstraction, unified configuration, built‑in resilience, function calling, RAG support, and seamless Spring ecosystem integration solve these problems for production‑grade Java applications.

Function Calling · Java · LLM
0 likes · 24 min read
James' Growth Diary
May 26, 2026 · Artificial Intelligence

Curator Daemon: Managing the Birth, Aging, and Death of Hermes Agent Skills

The article dissects Hermes' Curator daemon—a lightweight forked agent that runs asynchronously after each dialogue to combat skill‑library entropy by identifying stale, redundant, or obsolete skills, applying a three‑state lifecycle, LLM‑driven merge decisions, provenance‑based archiving, and offering debugging tips.

AI Agent · Curator · Hermes
0 likes · 12 min read
Yunqi AI+
May 26, 2026 · Artificial Intelligence

How AI‑Native Products Bring Software Closer to the Business Frontline

The article analyzes how AI‑native products reshape traditional software by processing unstructured data with LLMs, adding a semantic layer that understands, calls, outputs, and learns from business context, thereby turning rapid business changes into traceable, reusable system capabilities.

AI-native · LLM · Semantic Layer
0 likes · 18 min read
Machine Heart
May 26, 2026 · Artificial Intelligence

Beyond Simple Map APIs: How Spatial‑Agent Enables LLMs to Build Executable Geo‑Analysis Workflows

Spatial‑Agent introduces a GeoFlow Graph middle layer that transforms natural‑language map queries into verifiable, step‑by‑step geospatial analysis workflows, showing significant accuracy gains on MapEval‑API and MapQA benchmarks and highlighting the importance of GIScience concepts for reliable LLM‑driven spatial reasoning.

GIScience · GeoFlow Graph · LLM
0 likes · 12 min read
Tencent Cloud Developer
May 26, 2026 · Artificial Intelligence

How TencentDB Agent Memory Cuts Tokens by 61% and Boosts Success Rate 52% with Mermaid Infinite Canvas and Context Offloading

The article presents a technical deep‑dive into TencentDB Agent Memory’s short‑term memory compression, which combines context offloading and a Mermaid‑based infinite canvas to reduce token usage by up to 61% while improving task success rates by over 50% across multiple long‑session benchmarks.

Agent · Context Offloading · LLM
0 likes · 45 min read
Tencent Cloud Developer
May 26, 2026 · Artificial Intelligence

What Hidden Secrets Does the Agent’s System Prompt Code Reveal?

This article dissects OpenClaw's agent architecture, detailing how the System Prompt, Skill modules, and Agent Loop interact, explaining PromptMode variations, safety rules, tool definitions, skill loading pipelines, heartbeat handling, sub‑agent spawning, silent replies, and the context engine that assembles messages for LLMs.

Agent Loop · Context Engine · LLM
0 likes · 17 min read
AI Step-by-Step
May 26, 2026 · Artificial Intelligence

How Prompt Caching Works in LLMs and How to Write More Efficient Prompts

The article explains that LLM prompt caching reuses internal KV states rather than full answers, compares provider implementations, quantifies cost and latency savings, and provides concrete guidelines for structuring prompts to maximize cache hits, along with monitoring signals and a practical evaluation checklist.

AI inference · LLM · Prompt Engineering
0 likes · 13 min read
The Dominant Programmer
May 26, 2026 · Artificial Intelligence

Spring AI ChatMemory: Concepts, Practical Setup, and Common Issues

This guide explains how Spring AI abstracts LLM conversation memory using a three‑layer architecture, demonstrates configuring MessageWindowChatMemory with a sliding‑window strategy, shows two ways to register the memory advisor, and provides complete Maven, YAML, and Java code examples with test screenshots.

ChatMemory · Conversation Memory · Java
0 likes · 9 min read
Baidu Geek Talk
May 25, 2026 · Artificial Intelligence

RenderFlow: Agentic Code Delivery for Baidu’s Vertical Search Rendering Service

The article presents RenderFlow, a system that integrates LLM‑generated code into Baidu’s search result rendering pipeline by building a generate‑execute‑feedback‑repair‑publish loop, detailing its architecture, multi‑round repair mechanism, quality safeguards, and the resulting reduction of delivery cycles from days to minutes across nearly a thousand scenarios.

LLM · agentic delivery · code generation
0 likes · 23 min read
Linyb Geek Road
May 25, 2026 · Artificial Intelligence

Designing a Claude Code Harness for Production‑Grade Java Microservices

The article presents a detailed, production‑focused harness for Claude Code that structures prompts, rules, skills, and external hooks to compensate for LLM shortcomings in Java microservice development, preventing hallucinations, concurrency bugs, and false completions while ensuring reliable code delivery.

Java · LLM · Microservices
0 likes · 20 min read
AI Architecture Path
May 25, 2026 · Artificial Intelligence

Turn Any Codebase into an Interactive, Searchable Knowledge Graph with Claude‑Optimized Understand‑Anything

New developers often drown in massive legacy codebases, struggling to map dependencies and understand architecture. Understand‑Anything uses Claude, Tree‑sitter, and multi‑agent pipelines to generate a searchable, visual knowledge graph with onboarding tours, semantic QA, incremental diff analysis, and cross‑language support. The article also compares it against competing tools and provides installation and usage guidance.

AI Agents · Claude Code · Knowledge Graph
0 likes · 15 min read
Machine Heart
May 24, 2026 · Artificial Intelligence

Can CODA Enable LLMs and Beginners to Write Lightning‑Fast Transformer Kernels?

CODA rewrites Transformer blocks as GEMM‑epilogue programs, exposing five primitive building blocks that let both AI‑generated code and human programmers fuse memory‑intensive operations into the GEMM epilogue, eliminating costly tensor moves and achieving up to 1.8× speed‑ups on H100 GPUs for RMSNorm, SwiGLU, RoPE and other components, while preserving numerical accuracy.

CODA · CUDA · GEMM
0 likes · 11 min read
Data Party THU
May 24, 2026 · Artificial Intelligence

How Graphify Builds Codebase Knowledge Graphs and Replaces Vector Search with Graph Traversal

Graphify is a Python tool and Claude Code skill that creates a persistent, queryable knowledge graph of code, documentation, and media, cutting token usage by up to 71.5× compared with raw file reads, and it does so through a three‑pass pipeline that combines deterministic AST extraction, optional local audio transcription, and AI‑driven semantic extraction.

Claude Code · Knowledge Graph · LLM
0 likes · 13 min read
Java Companion
May 24, 2026 · Artificial Intelligence

How a Chinese Open‑Source AI Code Auditor with 6K Stars Uncovered 49 CVEs

DeepAudit, a 6K‑star open‑source AI code‑audit system, uses a four‑agent architecture and sandboxed PoC verification to automatically discover and confirm 49 high‑severity CVEs across popular projects, while offering both deep audit and instant analysis modes, but it faces model dependency, cost, and sandbox limitations.

AI code audit · CVE · LLM
0 likes · 11 min read