Tagged articles
Woodpecker Software Testing
Mar 4, 2026 · Artificial Intelligence

Practical Testing of AI Agents: From ChatOps Assistants to Autonomous Driving Bots

The article examines the 2024 shift to dynamic AI agents, outlines why traditional testing falls short, and presents three real‑world case studies—ChatOps IT assistant, multi‑agent e‑commerce risk platform, and embodied inspection robot—detailing novel testing frameworks and measurable improvements.

AI agents · ChatOps · Hybrid Testing
0 likes · 8 min read
Woodpecker Software Testing
Mar 4, 2026 · Artificial Intelligence

Optimizing Prompt Performance: A Must‑Read Guide for Test Engineers

In the era of LLM‑driven intelligent testing, prompts act as test cases whose latency, token usage, retry rate, context retention, and determinism must be measured and optimized, and this article provides a concrete five‑metric framework and a four‑step practical method backed by real‑world data.

AI testing · LLM · Performance Testing
0 likes · 8 min read
Tencent Cloud Developer
Mar 4, 2026 · Artificial Intelligence

How OpenClaw Uses a Multi‑Layer Defense System to Prevent LLM Context Overflow

The article provides a detailed technical walkthrough of OpenClaw's three‑stage context‑management framework—including pre‑emptive pruning, LLM‑driven compaction, and overflow‑recovery truncation—showing how each layer protects long‑running AI agent sessions from exceeding token windows while preserving essential information.

LLM · OpenClaw · cache optimization
0 likes · 27 min read
AI Tech Publishing
Mar 4, 2026 · Artificial Intelligence

AI Agent Context Management: Comparing Six Major Companies' Approaches

The article analyzes how six leading AI‑agent providers—Manus, Cursor, Anthropic, OpenAI, Google, and LangChain—tackle the fundamental problem of when and how a large language model should see information, detailing each solution, a cross‑company comparison matrix, consensus points, controversies, and open research questions.

AI agents · LLM · Memory
0 likes · 19 min read
Open Source Tech Hub
Mar 4, 2026 · Artificial Intelligence

Building AI Agents: From Basics to OpenAI-Compatible LLM Calls

This article explains the fundamental concepts of AI agents, their perception‑reasoning‑action loop, the evolution from rule‑based bots to LLM‑driven agents, and provides step‑by‑step Python and PHP code for invoking a large language model via the OpenAI‑compatible API.

AI · LLM · OpenAI
0 likes · 11 min read
Machine Learning Algorithms & Natural Language Processing
Mar 3, 2026 · Artificial Intelligence

Can ROM‑Based LLM Accelerators Reach 20,000 tokens/s and End the GPU Era?

The article analyzes the ROMA and TOM architectures that embed large‑language‑model weights in on‑chip ROM + SRAM, achieving up to 20,000 tokens/s inference speed, compares them with GPU and Taalas solutions, and discusses their impact on edge AI, embodied intelligence, extreme environments, and privacy.

AI accelerator · Edge Computing · LLM
0 likes · 11 min read
Tencent Cloud Developer
Mar 3, 2026 · Artificial Intelligence

Why AI Coding Agents Are Just Loops + Context Engineering (And How to Build One)

The article explains that AI coding agents operate as a simple while‑loop driven by context engineering, details their core control flow, compares various tools, and provides a step‑by‑step Python implementation demonstrating how to define tools, system prompts, and the ReAct loop for practical use.

AI Coding · LLM · Python implementation
0 likes · 17 min read
AI Explorer
Mar 2, 2026 · Operations

Huawei Team’s LLM‑Enhanced Algorithm Wins CVRP Challenge, Redefining Optimization Design

A joint Huawei and City University of Hong Kong team combined large language models with evolutionary computation to solve the capacitated vehicle routing problem (CVRP), winning the CVRPLib BKS Global Challenge and demonstrating how AI can automate and transform algorithm design, heralding a new paradigm for operations optimization.

AI for Science · CVRP · Evolutionary Algorithms
0 likes · 7 min read
Radish, Keep Going!
Mar 2, 2026 · Artificial Intelligence

Why Do Your AI Agents Forget Over Time? A 3‑Layer Memory Architecture to Keep Them Sharp

This article explains why AI agents lose recall after prolonged use, analyzes three core flaws in current markdown‑based memory designs, reviews recent research, and presents a deterministic, zero‑cost three‑layer architecture—including short‑term, daily, and long‑term storage, a lightweight knowledge graph, and active forgetting mechanisms—to maintain reliable agent memory.

Knowledge Graph · LLM · OpenClaw
0 likes · 16 min read
Woodpecker Software Testing
Mar 2, 2026 · Artificial Intelligence

Practical AI Agent Testing: From LLMs to Quality Control Breakthrough

The article recounts a fintech AI advisor project where a four‑layer testing pyramid—intent parsing, planning, tool integration, and end‑to‑end scenarios—was built to overcome the shortcomings of traditional input‑output tests for AI agents, achieving a 76% drop in P0 incidents and a 92.4% task‑completion rate.

AI Agent · FinTech · LLM
0 likes · 8 min read
Baobao Algorithm Notes
Mar 2, 2026 · Artificial Intelligence

How “Skills” Turn LLM Prompts into Portable, Engineered Workflows

This article dissects the evolution of LLM prompts into structured, version‑controlled skill packages, explains the AgentSkills specification, details OpenClaw’s implementation, compares prompts, memory, MCP and skills, and provides end‑to‑end examples with code, flowcharts and best‑practice recommendations.

Agent Skills · LLM · OpenClaw
0 likes · 40 min read
PaperAgent
Mar 1, 2026 · Artificial Intelligence

How On-Policy Context Distillation Enables LLMs to Retain Experience Forever

On-Policy Context Distillation (OPCD) compresses transient in‑context knowledge into LLM parameters, allowing models to permanently retain problem‑solving experience without ground‑truth labels; the article details the OPCD framework, training steps, teacher‑student configurations, and experimental results on math, games, and system‑prompt tasks, highlighting its advantages over traditional context distillation.

LLM · OPCD · artificial intelligence
0 likes · 8 min read
Machine Learning Algorithms & Natural Language Processing
Feb 28, 2026 · Artificial Intelligence

From Prompt Learning to SIPDO: The Closed‑Loop Evolution Driving Continuous Innovation

The article traces how prompt optimization has mirrored the historical evolution of parameter learning, outlines four development phases—from evolutionary search to beyond‑first‑order methods—and explains how SIPDO’s synthetic‑data feedback and difficulty‑progression create a closed‑loop system that yields consistent performance gains across LLM benchmarks.

AI · Closed Loop Learning · LLM
0 likes · 18 min read
AI Explorer
Feb 28, 2026 · Artificial Intelligence

Explore the Awesome LLM Apps Repository: Hands‑On RAG and AI Agent Examples

The article presents the “Awesome LLM Apps” GitHub repository—over 98 000 stars and hundreds of open‑source LLM projects that showcase Retrieval‑Augmented Generation, AI agents, and multi‑agent collaborations across diverse use‑cases, and offers step‑by‑step guidance on browsing, cloning, configuring, and running these examples for developers, product managers, students, and AI enthusiasts.

AI agents · GitHub · LLM
0 likes · 6 min read
Old Zhang's AI Learning
Feb 28, 2026 · Artificial Intelligence

How to Build a Private AI‑Powered RSS Reading Knowledge Base

The article details a fully automated workflow that fetches 92 top‑tech blogs via RSS, cleans the content into Markdown, uses a MiniMax‑M2.5 LLM to generate concise Chinese summaries, and delivers them through Bark and a Telegram bot, all stored for seamless integration with Obsidian.

AI · Bark · LLM
0 likes · 10 min read
DataFunSummit
Feb 27, 2026 · Artificial Intelligence

How Large Language Models Are Revolutionizing Ad Recommendation and Solving Cold‑Start Problems

This article explains how advertising recommendation is evolving from traditional feature‑engineered models to LLM‑driven pipelines, detailing data‑infrastructure challenges, semantic upgrades with multimodal embeddings, case studies in short‑video ads, user cold‑start prompt engineering, and future directions for generative recommendation systems.

Ad Tech · LLM · Multimodal
0 likes · 12 min read
Data Party THU
Feb 27, 2026 · Artificial Intelligence

How “Vibe Coding” Is Redefining Software Development in 2026

Vibe coding, introduced by Andrej Karpathy, lets developers describe software functionality in natural language and have large language models generate the complete code. The article reviews the concept, three leading 2026 tools (Cursor, Replit, Windsurf), a step‑by‑step workflow, advantages, drawbacks, and future trends.

AI Coding · LLM · Vibe Coding
0 likes · 10 min read
ByteDance SE Lab
Feb 27, 2026 · Artificial Intelligence

How to Build Secure, Scalable LLM Agent Tools: Best Practices & Real-World Cases

This article explains why robust Agent Tools are essential for LLM agents, outlines a five‑stage lifecycle with concrete design principles such as type safety, LLM‑friendly interfaces, OpenAPI integration, self‑healing error handling, human‑in‑the‑loop safeguards, and performance optimizations, and demonstrates their impact through retail and fintech case studies.

Agent Tools · Industry Cases · LLM
0 likes · 20 min read
Alibaba Cloud Developer
Feb 27, 2026 · Information Security

How ABACI AI Agent Automates Linux Kernel Fuzzing, Bug Attribution, and Patch Generation

The article presents ABACI, an AI agent for kernel defects that automates the entire lifecycle of Linux kernel fuzzing, from test deployment and crash analysis to root‑cause bisection, fix bisection, and LLM‑generated patches, dramatically reducing manual effort and accelerating vulnerability remediation.

LLM · Patch Generation · fuzzing
0 likes · 23 min read
AI Tech Publishing
Feb 27, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Building OpenClaw: A Persistent AI Assistant with Sessions, Tools, and Multi‑Agent Support

This tutorial walks through constructing OpenClaw from scratch, covering persistent JSONL sessions, SOUL.md persona files, tool definitions and an agent loop, permission checks, gateway architecture, context compression, long‑term memory, command queuing, scheduled heartbeats, and multi‑agent routing, all with concrete Python code examples.

AI agents · LLM · Multi-Agent
0 likes · 38 min read
Mingyi World Elasticsearch
Feb 26, 2026 · Artificial Intelligence

How RAG Gives Large Language Models Their Own Knowledge Base – Illustrated with Easysearch

The article explains why Retrieval‑Augmented Generation (RAG) is needed to overcome large language models' knowledge cut‑off and hallucination issues, details the offline indexing and online retrieval‑generation workflow, compares RAG with fine‑tuning, and shows how Easysearch’s hybrid search makes an effective RAG backbone.

Easysearch · Fine-tuning · Hybrid Search
0 likes · 10 min read
Tencent Cloud Developer
Feb 26, 2026 · Artificial Intelligence

Building a Minimalist AI Agent Framework: Theory, Architecture, and Code Walkthrough

This article explains the fundamentals of AI agents, compares major frameworks, introduces the ReAct, Plan‑and‑Execute, and Reflection paradigms, and provides a step‑by‑step Python implementation of a lightweight agent loop with LLM calls, tool execution, and context engineering, complete with usage examples and references.

AI Agent · Context Engineering · LLM
0 likes · 28 min read
Machine Learning Algorithms & Natural Language Processing
Feb 26, 2026 · Artificial Intelligence

Why Longer Token Chains Don't Mean Better Reasoning: Google's Deep Thinking Ratio

Google’s recent study shows that the length of a model’s token chain is negatively correlated with inference accuracy, and introduces the Deep Thinking Ratio (DTR) metric to identify truly reasoning tokens, enabling the Think@n strategy to halve compute cost without sacrificing performance.

Deep Thinking Ratio · Inference · LLM
0 likes · 6 min read
SuanNi
Feb 25, 2026 · Artificial Intelligence

How SkillsBench Reveals the Real Impact of Agent Skills on LLM Performance

The SkillsBench benchmark systematically evaluates how professionally crafted Skills boost large language model agents across 84 complex tasks, revealing significant performance gains, domain‑specific effects, and the trade‑offs of skill size and model scale.

Agent Skills · LLM · SkillsBench
0 likes · 11 min read
DataFunSummit
Feb 25, 2026 · Artificial Intelligence

Why RAG Fails in Production and How to Fix It: Expert Insights

This article summarizes a DataFun‑hosted roundtable where leading AI experts dissect the gap between RAG’s promise and real‑world deployment, exposing low recall, hallucinations, and cost overruns, then present systematic diagnostics, evaluation metrics, hybrid search, and engineering best practices to reliably operationalize RAG in enterprise settings.

Enterprise AI · Hybrid Search · LLM
0 likes · 18 min read
Data STUDIO
Feb 25, 2026 · Artificial Intelligence

Build a Large Language Model from Scratch with PyTorch—No Libraries, No Shortcuts

This guide walks you through building, training, and fine‑tuning a Transformer‑based large language model entirely from scratch using PyTorch, covering tokenization, self‑attention, multi‑head attention, positional encoding, model architecture, data preparation, training loops, and fine‑tuning on custom lyrics.

Fine-tuning · GPT · LLM
0 likes · 43 min read
AI Insight Log
Feb 25, 2026 · Artificial Intelligence

How an Open‑Source Plugin Solves Claude Code’s Session‑Memory Loss

Claude Code forgets all prior context each new session because large language models only see the current window, but the open‑source claude‑mem plugin records project actions, compresses them into semantic summaries, and injects the relevant history back into Claude, dramatically reducing re‑explanation overhead.

AI Assistant · Claude Code · LLM
0 likes · 8 min read
Machine Learning Algorithms & Natural Language Processing
Feb 24, 2026 · Artificial Intelligence

From Traditional RL to LLM‑RL: Theory Derivation and Engineering Improvements

The article walks through the fundamentals of traditional policy‑gradient reinforcement learning, derives the Reinforce objective, maps its concepts to large‑language‑model RL, and then discusses practical engineering solutions such as GRPO, async rollout, importance‑sampling corrections, and token‑flow management for industrial‑scale training.

Async Rollout · GRPO · Importance Sampling
0 likes · 10 min read
DataFunSummit
Feb 24, 2026 · Artificial Intelligence

How Large Language Models Are Redefining Search Ranking at Tencent

This article details Tencent Search's exploration of large‑model‑driven ranking, covering the evolution from traditional keyword retrieval to RAG‑based AI search, the multi‑stage AI ranking architecture (L0‑L5), model training pipelines, distillation, synthetic data generation, and future research directions.

LLM · RAG · ranking architecture
0 likes · 21 min read
AI Product Manager Community
Feb 24, 2026 · Artificial Intelligence

Mastering AI Agents: 100 Essential Questions Across 5 Stages

This comprehensive guide walks you through five development stages of AI agents—core concepts, advanced planning, memory management, tool integration, and enterprise deployment—answering 100 practical questions that reveal definitions, architectures, best‑practice patterns, safety measures, and performance‑optimisation techniques for production‑grade agents.

AI agents · Agent Architecture · Enterprise Deployment
0 likes · 34 min read
AI Waka
Feb 24, 2026 · Artificial Intelligence

Stop Fragmenting Docs: How Tree‑Based PageIndex Improves RAG Accuracy and Efficiency

The article explains why breaking documents into countless semantic fragments harms retrieval‑augmented generation, introduces PageIndex’s tree‑structured, inference‑driven approach as a superior alternative, and provides detailed setup, usage, and integration instructions for both local and production environments.

AI · Document Search · LLM
0 likes · 9 min read
Alibaba Cloud Developer
Feb 24, 2026 · Artificial Intelligence

Master ReAct Agents: From Observation to Action with Real Code Examples

This article introduces the ReAct agent paradigm—combining reasoning and acting—explains its observation‑think‑act loop, showcases a step‑by‑step weather‑and‑clothing example, outlines essential components, provides pseudo‑code for the execution flow, and links to the Lynxe Func‑Agent framework on GitHub.

LLM · React · ToolCalling
0 likes · 11 min read
Black & White Path
Feb 23, 2026 · Information Security

PentAGI: AI‑Powered Penetration Testing Platform Integrates 20+ Tools to Redefine Security Assessments

PentAGI is an open‑source, AI‑driven penetration testing platform released by VXControl in early 2025 that automatically orchestrates over twenty security tools—including Nmap, Metasploit, sqlmap—and generates comprehensive reports within isolated Docker environments, offering advanced agent architecture, real‑time intelligence gathering, and scalable deployment options.

AI penetration testing · Docker · LLM
0 likes · 5 min read
AI Tech Publishing
Feb 23, 2026 · Artificial Intelligence

Final Lesson: Build a Fully Working RSS News Brief Agent

In this final lesson of a nine‑day Agent engineering series, the author integrates the full Agent Loop, tools, MCP, skills, RAG, context handling, multi‑turn dialogue, and multi‑agent coordination to create a runnable RSS news‑briefing Agent that fetches feeds in parallel, filters content with LLMs, summarizes articles, and outputs a markdown report.

Agent Coordination · LLM · Multi-Agent
0 likes · 12 min read
AI Tech Publishing
Feb 22, 2026 · Artificial Intelligence

Mastering Multi‑Agent Collaboration: Handoff Mode and Coordination

This lesson explains how to extend a single‑agent system with multi‑agent collaboration, covering context isolation, Handoff and Router patterns, flat coordinator architecture, code examples, task decomposition, and practical run‑time demos for building complex AI workflows.

AI · Coordinator · Handoff
0 likes · 20 min read

Why the App Store Model Is Obsolete: Karpathy’s Radical Call for On‑Demand App Creation

Karpathy argues that as LLM agents can instantly generate highly customized software, the traditional App Store model of discrete downloadable apps is becoming outdated, sparking debate over AI‑native services, sensor APIs, and the future of on‑demand, temporary applications.

AI agents · AI-native CLI · App Store
0 likes · 8 min read
PaperAgent
Feb 21, 2026 · Artificial Intelligence

Why Millions of LLM Agents Still Fail to Form a Real Society

An in‑depth analysis of the Moltbook platform shows that even with 2.6 million autonomous LLM agents interacting for months, large‑scale interaction does not automatically lead to genuine social structures, revealing three layers of socialization failure and offering a three‑dimensional diagnostic framework for AI societies.

AI agents · AI society · Diagnostic framework
0 likes · 9 min read
Architect
Feb 20, 2026 · Artificial Intelligence

How Agent Loops Give AI Agents a Personality: Engineering Secrets Revealed

This article explains how the Agent Loop—an engineered while‑loop that repeatedly calls an LLM, decides when to use tools, executes them, and feeds results back—creates persistence, style, memory, judgment, and safety boundaries that together make an AI agent feel like it has its own personality.

AI Agent Engineering · Agent Loop · LLM
0 likes · 24 min read
Open Source Tech Hub
Feb 20, 2026 · Artificial Intelligence

How to Build AI Agents in PHP with the Model Context Protocol (MCP)

Learn how to connect PHP-based AI agents to the Model Context Protocol (MCP) using the open‑source Neuron AI framework, covering MCP fundamentals, server setup, tool integration, and example code for creating custom agents that can invoke external APIs, databases, and web content.

AI agents · LLM · MCP
0 likes · 12 min read
21CTO
Feb 19, 2026 · Fundamentals

Why Compilers Still Matter: Debunking Musk’s ‘Code‑Free’ Future

The article traces Grace Hopper’s pioneering compiler work, critiques Elon Musk’s claim that AI will eliminate coding, explains how modern compilers transform source code through multiple deterministic stages, and argues that source code remains essential despite advances in large language models.

Code Optimization · LLM · Software Engineering
0 likes · 17 min read
Black & White Path
Feb 19, 2026 · Information Security

How AI Cracks AWS in Under 8 Minutes, Rendering Cloud Defenses Useless

A Sysdig report shows that attackers using large language models can steal credentials, elevate privileges, move laterally across 19 AWS accounts, hijack Amazon Bedrock models, and abuse GPU resources—all within eight minutes, leaving traditional cloud defenses with virtually no response window.

AI · AWS · GPU abuse
0 likes · 6 min read
Machine Learning Algorithms & Natural Language Processing
Feb 18, 2026 · Artificial Intelligence

Microsoft’s 671B LLM Unifies Offline Ad Tasks—Can It Cut Compute Costs?

Microsoft’s AdNanny replaces a forest of specialized offline models with a single 671 B LLM, using a three‑stage data factory to generate reasoning‑rich corpora, dynamic task re‑weighting, RL‑based metric alignment, and a hybrid 31‑pipeline‑parallel architecture that halves compute cost while boosting performance on core ad‑ranking tasks.

AdNanny · LLM · Large Model
0 likes · 9 min read
AI Engineering
Feb 17, 2026 · Artificial Intelligence

Claude Sonnet 4.6: Million‑Token Context, Human‑Level Computer Skills, Near‑Opus Performance

Claude Sonnet 4.6, Anthropic’s latest model, introduces a beta‑stage million‑token window and markedly better coding, computer‑use and long‑context reasoning, scoring 72.5% on OSWorld versus 14.9% for Sonnet 3.5, while offering Excel connectors, dynamic search filtering, stronger prompt‑injection resistance, and a pricing tier that makes it a strong alternative to Opus for many workloads.

AI Coding · API · Claude
0 likes · 4 min read
AI Insight Log
Feb 17, 2026 · Artificial Intelligence

Qwen 3.5 Launches on New Year’s Eve as DeepSeek Only Sends a Holiday Greeting

On Chinese New Year's Eve, Alibaba's Qwen 3.5 open‑source model—featuring a 397 billion‑parameter backbone with a 17 billion‑parameter active set, hybrid linear attention, and sparse MoE—was released under Apache 2.0, delivering 8.6‑19× faster inference, top‑tier agent, code and multimodal scores, and rapid integration across major AI platforms.

Apache 2.0 · LLM · Multimodal
0 likes · 11 min read
Black & White Path
Feb 17, 2026 · Information Security

AI-Generated Malware Exploits React2Shell to Attack Docker: A Low‑Barrier Threat Surge

A Darktrace‑detected campaign shows AI‑generated malware leveraging the React2Shell vulnerability to compromise an intentionally exposed Docker daemon, download LLM‑crafted payloads, and install XMRig mining software, highlighting a new low‑skill threat vector that evades traditional signature defenses.

AI-generated malware · Docker · LLM
0 likes · 5 min read
AI Cyberspace
Feb 16, 2026 · Artificial Intelligence

Unlocking Claude’s Power: A Deep Dive into Agent Skills and Their Architecture

This article explains the concept, design, implementation, and best‑practice guidelines of Anthropic’s Claude Agent Skills, compares them with the MCP protocol, and provides practical instructions for creating, installing, and using Skills to extend large‑language‑model capabilities efficiently.

Agent Skills · Claude · LLM
0 likes · 18 min read
AI Tech Publishing
Feb 15, 2026 · Artificial Intelligence

Mastering Agent Tool Use: Adding Search, Time, and Calculator Functions

This tutorial extends a minimal LLM Agent loop by introducing Tool Use (function calling) to give the agent actionable capabilities—searching the web, retrieving the current datetime, and performing mathematical calculations—while explaining the BaseTool architecture, registration process, system‑prompt adjustments, and practical execution examples.

AI Agent · BaseTool · Function Calling
0 likes · 15 min read
PaperAgent
Feb 15, 2026 · Artificial Intelligence

How MiniCPM‑SALA Merges Sparse and Linear Attention to Break Long‑Context Limits

MiniCPM‑SALA introduces a hybrid sparse‑linear attention architecture that reduces quadratic compute and memory costs, achieves state‑of‑the‑art performance on long‑context benchmarks, and delivers up to 3.5× faster inference than full‑attention models on sequences up to 1 million tokens.

LLM · Linear Attention · Model architecture
0 likes · 17 min read
AI Insight Log
Feb 14, 2026 · Artificial Intelligence

ByteDance Unveils Doubao 2.0 Pro: A Domestic Model Taking on GPT‑5.2

ByteDance's Seed 2.0 Pro (Doubao 2.0) showcases industry‑leading performance on math, vision, document, long‑video, and code benchmarks, dramatically lowers inference cost, and is now available in the Doubao app and Trae IDE, positioning it as a serious challenger to GPT‑5.2 and other top LLMs.

AI · ByteDance · Code Generation
0 likes · 7 min read
DataFunTalk
Feb 14, 2026 · Artificial Intelligence

Memory‑Based Self‑Evolution: Enabling AI Agents to Learn Like Humans

This article explores a new agent‑optimization paradigm—Memory‑Based Self‑Evolution—detailing how dynamic memory systems such as Dynamic Cheatsheet, ReasoningBank, ACE, and MemGen transform LLM agents from static, parameter‑only models into continuously learning entities that can adapt to real‑world data, with a focus on insurance industry applications.

Agent Memory · Insurance AI · LLM
0 likes · 13 min read
AI Engineering
Feb 14, 2026 · Artificial Intelligence

DeepSeek‑V4‑Lite‑285B Hits 100% Recall in 256K Token Tests – A Needle‑in‑a‑Haystack Benchmark

Community testing of DeepSeek's rumored V4‑Lite‑285B model on OpenAI's 8‑needle MRCR benchmark shows perfect 1.0000 scores on several 128K‑token samples and a 256K‑token sample, achieving 100% recall within the native 256K context, while recall at longer contexts drops to about 60%; the article notes that the "needle‑in‑a‑haystack" method may be exploitable by DSA mechanisms.

DeepSeek · LLM · long context
0 likes · 3 min read
Alibaba Cloud Developer
Feb 14, 2026 · Artificial Intelligence

Revamping AliGo’s AI Travel Assistant: Multi‑Agent Architecture & Prompt Engineering

The AliGo travel platform upgraded its AI assistant by replacing a single‑agent workflow with a modular multi‑agent system, introducing dynamic prompt generation, real‑time reasoning chains, context sharing, observability, and a knowledge base, which dramatically improved accuracy, stability, and user experience.

AI Architecture · AgentScope · Knowledge Base
0 likes · 19 min read
PaperAgent
Feb 12, 2026 · Artificial Intelligence

How GLM-5 Turns LLMs into System‑Architect Agents: A Deep Technical Review

An in‑depth analysis shows how GLM‑5 surpasses traditional code‑generation LLMs by autonomously designing, implementing, and debugging complex multi‑agent systems, from a fireworks HTML demo to a 35,000‑line TrustGraph refactor, highlighting its architecture, tool integration, and cost‑effective advantages.

AI Coding · Backend Development · LLM
0 likes · 9 min read
DataFunTalk
Feb 11, 2026 · Artificial Intelligence

Why Most RAG Deployments Fail and How to Build a Production‑Ready RAG System

This round‑table dissects the gap between RAG’s hype and real‑world production, exposing common pitfalls such as low recall, hallucinations and cost overruns, and then delivers a systematic diagnostic framework, hybrid search strategies, fine‑tuning rules, and practical best‑practice roadmaps for building reliable enterprise RAG solutions.

Agentic RAG · Fine-tuning · Hybrid Search
0 likes · 20 min read
DaTaobao Tech
Feb 9, 2026 · Artificial Intelligence

Boosting Trustworthiness in Retrieval‑Augmented Generation: The Trustworthy Generation Design Pattern

This article presents the Trustworthy Generation design pattern for Retrieval‑Augmented Generation (RAG) systems, analyzes four root causes of low trustworthiness—retrieval errors, content reliability, pre‑retrieval reasoning mistakes, and model hallucinations—and proposes layered solutions, citation techniques, CRAG and Self‑RAG architectures, guardrails, and practical trade‑offs.

AI Safety · Generation · LLM
0 likes · 16 min read
Data Party THU
Feb 9, 2026 · Artificial Intelligence

Aligning Collaborative Filtering with LLM Token Generation: The TCA4Rec Breakthrough

This paper introduces the TCA4Rec framework that directly aligns item‑level collaborative‑filtering preferences with token‑level objectives of large language models, presenting novel modules, extensive experiments, and analysis that demonstrate significant performance gains in generative recommendation tasks.

Generative Recommendation · LLM · Recommendation Systems
0 likes · 9 min read
PaperAgent
Feb 9, 2026 · Artificial Intelligence

Can Online Evaluation Unlock AI Assistants' Long-Term Memory? Inside AMemGym

AMemGym introduces an on‑policy, interactive benchmark that evaluates and trains AI assistants' long‑term memory by structuring state evolution, diagnosing memory failures, and enabling agents to self‑evolve, revealing that selective memory writing outperforms passive approaches across various LLM and agent architectures.

AI memory · LLM · agent
0 likes · 8 min read
Shuge Unlimited
Feb 9, 2026 · Artificial Intelligence

Build Agent Workflows in 3 Minutes with Refly’s Natural‑Language Builder

Refly is an open‑source Agent Skills Builder that lets you create production‑grade AI workflows in minutes using natural language, offering versioned, reusable skills, runtime intervention, extensive tool integrations, and export options that outperform traditional automation platforms.

AI automation · Comparison · LLM
0 likes · 16 min read
Data Party THU
Feb 8, 2026 · Artificial Intelligence

How LangGraph Turns Multi‑Agent Workflows into Editable Graphs

This article explains LangGraph's graph‑based design, runtime behavior, state management, checkpoint persistence, and flexible workflow modifications, providing concrete code examples and patterns that illustrate why the framework is well‑suited for complex multi‑agent AI systems.

AI · LLM · LangGraph
0 likes · 14 min read
AI Tech Publishing
Feb 8, 2026 · Artificial Intelligence

Why Bigger Context Windows Fail and How Structured Graphs Deliver Precise Fact Retrieval

The article argues that large language models struggle with exact factual answers and that extending context windows often degrades performance, while knowledge graphs provide structured, traceable retrieval; it proposes a unified graph monograph and small, focused context slices to empower LLMs with accurate information.

Context Retrieval · Knowledge Graph · LLM
0 likes · 10 min read
AI Tech Publishing
Feb 6, 2026 · Artificial Intelligence

2026 Large Model Engineering Roadmap: From Foundations to Production

This roadmap outlines a step‑by‑step learning path for building, optimizing, and safely deploying large language model systems, covering fundamentals, vector stores, RAG, advanced techniques, fine‑tuning, inference speed, deployment, observability, agents, and production safeguards.

Deployment · Fine-tuning · Inference
0 likes · 5 min read
Instant Consumer Technology Team
Feb 6, 2026 · Artificial Intelligence

How AI‑Powered Agentic Labeling Transforms Customer Conversation Tagging

This article details an end‑to‑end AI system that replaces manual, error‑prone tagging of customer dialogues with a large‑language‑model‑driven, vector‑based pipeline that automatically discovers, clusters, and iteratively refines business‑level tags, dramatically cutting cycle time and improving coverage.

Agentic AI · HDBSCAN · LLM
0 likes · 33 min read
PaperAgent
Feb 6, 2026 · Artificial Intelligence

How xMemory Cuts Tokens by 30% While Boosting Agent QA Scores Over 10 Points

The paper introduces xMemory, a hierarchical "split‑aggregate‑retrieve" framework that reduces token usage by up to 30% and improves QA performance by more than 10 points in long‑range agent conversations, outperforming traditional RAG across multiple LLMs.

Agent Memory · Hierarchical Retrieval · LLM
0 likes · 8 min read
Data STUDIO
Feb 6, 2026 · Artificial Intelligence

Building a Basic Chatbot with LangGraph: Step‑by‑Step AI Agent Tutorial

This article walks through building AI agents with LangGraph in Python, starting with a simple GCD workflow and then creating a memory‑enabled chatbot using GPT‑4o, covering state management, nodes, edges, conditional loops, recursion limits, and visual debugging.

AI agents · Chatbot · LLM
0 likes · 18 min read
JD Tech
Feb 5, 2026 · Artificial Intelligence

How OxygenREC Marries Fast and Slow Thinking to Revolutionize E‑commerce Recommendations

OxygenREC presents a fast‑slow thinking, instruction‑following generative framework that overcomes latency, reasoning, and multi‑scene scalability challenges in e‑commerce recommendation, delivering unified training, low‑latency inference, and significant business impact across JD.com scenarios.

LLM · e‑commerce · generative AI
0 likes · 13 min read
AI Tech Publishing
Feb 5, 2026 · Artificial Intelligence

From Java Backend to AI Agent Engineer: Essential Knowledge for the Transition

This comprehensive guide walks Java backend developers through the fundamentals of AI agents, comparing agents with traditional workflows, detailing core components such as LLMs, tools, and memory, and exploring practical patterns, frameworks, and code examples to help them successfully shift into AI agent development.

AI agents · Agent Frameworks · LLM
0 likes · 35 min read
Alibaba Cloud Native
Feb 4, 2026 · Artificial Intelligence

Boost Java Agent Performance with End‑to‑End Online Training Using Trinity‑RFT

This article explains how to overcome the training‑deployment gap for Java‑based AI agents by introducing a cloud‑native, low‑intrusion online training pipeline built on AgentScope Java and Trinity‑RFT, detailing architecture, configuration, custom selection and reward strategies, and showing measurable accuracy gains on a SQL‑Agent benchmark.

Java · LLM · OnlineTraining
0 likes · 21 min read
Alibaba Cloud Developer
Feb 4, 2026 · Artificial Intelligence

Progressive Disclosure: Making Multi‑Skill LLM Agents Efficient and Scalable

This article examines the core challenge of giving large‑language‑model agents many abilities while keeping context size limited, compares three common loading strategies, introduces a progressive‑disclosure skill mechanism with three loading layers, and details its implementation, benefits, limitations, and suitable use cases in AgentScope‑Java.

Java · LLM · Progressive Disclosure
0 likes · 17 min read
JD Cloud Developers
Feb 4, 2026 · Artificial Intelligence

How Deep Research Transforms LLMs into Autonomous AI Researchers

This article examines Deep Research, an AI system that adds autonomous planning and deep reasoning to large language models, enabling them to browse the web, perform long‑chain reasoning, and generate professional, citation‑rich reports for complex tasks such as industry trend analysis and technical competitive research.

AI research · Autonomous Agents · LLM
0 likes · 22 min read
JD Tech Talk
Feb 4, 2026 · Artificial Intelligence

How Deep Research Turns LLMs into Autonomous AI Researchers

This article explains the background, core features, underlying ReAct‑based architecture, and engineering solutions of Deep Research—a system that equips large language models with autonomous planning, long‑chain reasoning, and professional report generation to tackle complex information‑intensive tasks.

AI research · Autonomous Agents · LLM
0 likes · 21 min read
Wu Shixiong's Large Model Academy
Feb 4, 2026 · Artificial Intelligence

Why LLM Agents Rush to Call Tools and How to Stop Them

The article explains that premature tool calls in LLM agents stem from a data‑distribution bias in fine‑tuning, and it presents practical fixes such as adding non‑tool samples, enforcing a Thought chain, and using negative sampling to teach the model when to think before acting.

LLM · Thought Chain · Tool Calling
0 likes · 10 min read
Wuming AI
Feb 3, 2026 · Artificial Intelligence

How Short‑Term vs Long‑Term Memory Works in LLM‑Powered Autonomous Agents

This article demystifies short‑term and long‑term memory in LLM‑driven autonomous agents, explaining their mechanisms, limitations, and practical implementations such as sliding windows, summarization, and vector‑based retrieval, while illustrating each concept with concrete Cherry Studio examples and relevant research references.

Autonomous Agents · Cherry Studio · LLM
0 likes · 7 min read
Amap Tech
Feb 3, 2026 · Artificial Intelligence

Building a Scalable AI Agent Smart Task Framework for Offline & Event‑Driven Use

As LLM adoption moved into its harder, deep‑water phase, developers realized that agents must go beyond passive Q&A to support asynchronous, long‑running, and subscribable tasks. This article details the design, architecture, and engineering challenges of the “Xiao Gao Teacher AI Agent” smart‑task system, from event‑driven logic to fault‑tolerant deployment.

AI Agent · Event-Driven Architecture · LLM
0 likes · 19 min read
Data STUDIO
Feb 3, 2026 · Artificial Intelligence

Build a Self‑Thinking AI Agent with LangGraph: A Step‑by‑Step Guide

This tutorial explains how LangGraph adds explicit control‑flow, cycles, and shared state to LLM applications, and walks through building a Strava‑based intelligent training coach with Python code, node definitions, state design, graph assembly, and GitHub Actions deployment.

AI agents · LLM · LangGraph
0 likes · 12 min read
PaperAgent
Feb 3, 2026 · Artificial Intelligence

Relink: Turning GraphRAG into a Dynamic, Query‑Driven Knowledge Graph

Relink introduces a ‘reason‑and‑construct’ paradigm that builds knowledge‑graph paths during inference, combining a high‑precision factual graph with a high‑recall potential‑relation pool, using query‑driven dynamic path expansion and contrastive alignment to markedly improve multi‑hop QA performance and robustness to sparse knowledge.

Dynamic Retrieval · GraphRAG · Knowledge Graph
0 likes · 8 min read
Wu Shixiong's Large Model Academy
Feb 3, 2026 · Artificial Intelligence

Why Loss Masking Is the Hidden Key to Effective LLM Fine‑Tuning

The article explains how loss masking in supervised fine‑tuning of large language models prevents the model from learning irrelevant tokens such as user inputs, system prompts, tool outputs, and padding, thereby focusing training on the assistant’s responses and improving performance and generalization.

AI training · Fine-tuning · LLM
0 likes · 10 min read
Java Architecture Diary
Feb 2, 2026 · Artificial Intelligence

Why a 10‑Year‑Old Java JSON Library Is Now Targeting LLMs with TOON

json-io, a decade‑old Java JSON library known for zero‑config setup, circular‑reference support, and a small footprint, has added full read/write support for TOON (Token‑Oriented Object Notation), a token‑efficient format designed for LLMs that can cut serialization costs by 30‑60%. The library also integrates with Spring Boot and Spring AI.

AI · Java · LLM
0 likes · 9 min read
Alibaba Cloud Developer
Feb 2, 2026 · Artificial Intelligence

Boosting A/B Experiment Automation: Prompt Engineering Achieves 80% Accuracy

This article details how a production‑grade prompt system powered by large language models was designed to replace manual A/B experiment inspection, introducing a six‑level priority decision tree, robust data preprocessing, and systematic bad‑case analysis that lifted automation accuracy from 68% to over 80% while providing clear, explainable recommendations.

A/B testing · LLM · Prompt engineering
0 likes · 46 min read
AI Waka
Feb 1, 2026 · Artificial Intelligence

Boost LLM Inference Speed: Precision Tricks, Quantization, and Multi‑GPU Strategies

This article reviews practical techniques for accelerating large language model inference, including reduced‑precision formats, post‑training quantization, adapter‑based fine‑tuning, pruning, continuous batching, and multi‑GPU deployment, and provides concrete code examples, benchmark results, and guidance on selecting the right approach for production workloads.

GPU · Inference · LLM
0 likes · 20 min read
Data Party THU
Feb 1, 2026 · Artificial Intelligence

How AutoLink Turns Schema Linking into an Interactive Database Exploration

AutoLink introduces an autonomous, iterative schema‑linking approach for Text‑to‑SQL that treats schema discovery as a progressive, agent‑driven exploration, dramatically improving recall while cutting token costs, and outperforms existing database‑level and element‑level methods on large benchmarks such as Spider 2.0‑Lite and BIRD.

AutoLink · Database Exploration · LLM
0 likes · 19 min read
Architecture and Beyond
Feb 1, 2026 · Artificial Intelligence

5 High‑ROI Strategies to Supercharge RAG Retrieval Performance

This article outlines five practical engineering strategies—multi‑vector retrieval, manual splitting and labeling, scalar enhancement, context augmentation, and dense‑sparse vector integration—that together address common RAG retrieval bottlenecks and dramatically improve recall stability and answer quality.

BM25 · Engineering · LLM
0 likes · 17 min read
AI Waka
Jan 31, 2026 · Artificial Intelligence

Build a 2026‑Ready LangGraph AI Agent: A Step‑by‑Step Guide

This tutorial walks you through constructing a LangGraph‑based AI agent for automated Strava training plans, covering core concepts like state, nodes, and edges, detailed workflow steps, Python code examples, conditional graph routing, testing, and deployment via GitHub Actions.

AI Agent · LLM · LangGraph
0 likes · 18 min read
Data Party THU
Jan 31, 2026 · Artificial Intelligence

Can LLMs Learn While Being Tested? Inside the TTT-Discover Breakthrough

The article examines the Test‑Time Training to Discover (TTT‑Discover) approach, which applies reinforcement learning during inference to let large language models continuously improve on single test problems, and reports strong results across mathematics, GPU kernel optimization, algorithm design, and biology.

AI research · LLM · Scientific Discovery
0 likes · 9 min read
DaTaobao Tech
Jan 30, 2026 · Artificial Intelligence

Human‑like LLM Replies for Live Digital Hosts: ASR‑Based Style Transfer and Reward Modeling

This article proposes an ASR‑driven pipeline that creates high‑quality AI‑reply vs. human‑like reply pairs, trains a rewrite model and a reward model, and uses GRPO reinforcement learning to generate natural, helpful, and less AI‑sounding responses in digital‑human live streaming, achieving 92% accuracy and 97% helpfulness while improving user experience.

ASR data · LLM · Qwen
0 likes · 20 min read