Tagged articles
2015 articles
DataFunTalk
Sep 5, 2025 · Artificial Intelligence

Inside Ant Group’s Ragent: Building Scalable AI Agents on Ray

This article introduces Ant Group’s Ray‑based distributed agent framework Ragent, outlines its background, motivation, and design, and details the four essential modules—Profile, Memory, Planning, and Action—that power large‑language‑model agents in large‑scale AI serving.

AI agents · Ant Group · Distributed Systems
5 min read
Instant Consumer Technology Team
Sep 5, 2025 · Artificial Intelligence

How Context Engineering Transforms Dify Agents: Boost Efficiency by 10×

This article explains how Context Engineering (CE) extends Prompt Engineering by integrating seven core elements—system prompts, user input, short‑term memory, long‑term memory, retrieval, tools, and structured output—using the open‑source Dify platform to build dynamic, multimodal agents that cut inference costs tenfold and raise complex‑task success rates by 40%.

AI Agent Development · Dify · LLM
16 min read
Alibaba Cloud Developer
Sep 5, 2025 · Artificial Intelligence

How Browser-Use Leverages LLMs to Transform Browser Automation

This article explores Browser-Use, an AI‑driven browser automation framework that combines large language models, visual perception, and DOM analysis to enable intelligent, multi‑step web tasks such as registration, price comparison, form filling, and monitoring, while detailing its architecture, historical context, core modules, and future challenges.

AI agents · Browser Automation · LLM
26 min read
Data Party THU
Sep 4, 2025 · Artificial Intelligence

How MXFP4 Quantization Lets a 120‑Billion‑Parameter LLM Run on a Single 80 GB GPU

This article analyzes the memory bottleneck of massive language models, explains the mathematical modeling of memory requirements, evaluates traditional sharding limits, and details how GPT‑OSS’s MXFP4 quantization combined with Mixture‑of‑Experts reduces memory, bandwidth, and compute demands enough to fit a 120‑billion‑parameter model onto an 80 GB GPU with minimal accuracy loss.

FP4 · LLM · MXFP4
11 min read
Data Party THU
Sep 4, 2025 · Artificial Intelligence

Unraveling PPO Variants: From GRPO to DAPO and GSPO – A Deep Dive

This article provides a comprehensive technical analysis of PPO‑based reinforcement learning methods for large language models, detailing the evolution from the original PPO algorithm through GRPO, DAPO, and GSPO, and explaining their motivations, mathematical formulations, advantages, and practical challenges such as entropy collapse and importance‑sampling variance.

DAPO · GRPO · GSPO
30 min read
Tencent Cloud Developer
Sep 4, 2025 · Artificial Intelligence

Why Youtu-Agent Sets a New Standard for Open‑Source AI Agents

Youtu-Agent, an open‑source agent framework released by Tencent Youtu Lab, combines minimalist design with high performance, delivers strong benchmark results without training or proprietary models, and offers flexible, cost‑effective, automated agent generation for researchers, developers, and AI enthusiasts.

AI agents · Framework · LLM
12 min read
Alibaba Cloud Developer
Sep 4, 2025 · Artificial Intelligence

Why Context Engineering Is the New Frontier in LLM Development

This article explores the rise of Context Engineering as an essential discipline for large language models, comparing it to Prompt Engineering, detailing its definition, classifications, common pitfalls such as poisoning and distraction, and presenting best‑practice strategies and an LLM‑OS analogy for building robust AI agents.

LLM · LLM OS · Memory Management
27 min read
Aikesheng Open Source Community
Sep 4, 2025 · Artificial Intelligence

How GPT‑5, DeepSeek‑V3.1 and SQLShift Stack Up in the August 2025 SQL LLM Benchmark

The August 2025 SCALE benchmark evaluates new AI models—including the GPT‑5 family, DeepSeek‑V3.1, and the SQLShift tool—across SQL understanding, optimization, and dialect conversion, revealing distinct strengths, weaknesses, and the growing advantage of specialized tools over generic large language models.

AI · DeepSeek · GPT-5
15 min read
Sohu Tech Products
Sep 3, 2025 · Artificial Intelligence

How GRPO Revolutionizes RLHF for Large Language Models

This article explains the motivation, mathematical foundations, implementation details, advantages, experimental results, and future directions of Group Relative Policy Optimization (GRPO), a novel reinforcement‑learning algorithm that replaces PPO’s value network with efficient group‑wise relative evaluation for large language models.
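
As a rough illustration of the group‑wise relative evaluation mentioned in the summary above, the sketch below scores each sampled response against the mean and standard deviation of its own group, so no learned value network is needed to produce an advantage signal. The function name, the use of the population standard deviation, and the `eps` guard are assumptions made for this example, not details taken from the article.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    (group standard deviation + eps). The eps term is an assumed guard
    against division by zero when every reward in the group is identical.
    """
    if not rewards:
        raise ValueError("rewards must contain at least one value")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: two judged correct, two incorrect.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that beat their group's average receive a positive advantage and the rest a negative one; a group whose rewards are all equal yields zero advantages and therefore no learning signal.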

Artificial Intelligence · GRPO · LLM
17 min read
DataFunSummit
Sep 3, 2025 · Artificial Intelligence

Demystifying MCP: A Simple Guide to Building LLM Tool Integration Servers

This article explains the Model Context Protocol (MCP), its three‑layer architecture, its core advantages, and step‑by‑step development of an MCP server in TypeScript (with Python and C++ examples), showing how LLMs can invoke tools for tasks like Unreal Engine code analysis.

LLM · MCP · Python
16 min read
37 Interactive Technology Team
Sep 3, 2025 · Artificial Intelligence

How AI is Revolutionizing Web Scraping: Tools, Techniques, and Best Practices

Discover how AI, especially large language models, transforms traditional web scraping by introducing semantic understanding, dynamic adaptability, and automated extraction, with in‑depth reviews of emerging tools like Crawl4AI and Browser‑use, practical code examples, best‑practice guidelines, and deployment tips for modern data collection.

AI · Browser Use · Crawl4AI
17 min read
Baobao Algorithm Notes
Sep 3, 2025 · Artificial Intelligence

How Atom-Searcher Boosts LLM Reasoning with Atomic Thought Rewards

Atom-Searcher introduces an atomic‑thought reinforcement‑learning framework that decomposes complex reasoning into fine‑grained units, uses a Reasoning Reward Model to assign step‑wise rewards, dynamically balances process and result incentives, and achieves state‑of‑the‑art performance on multiple LLM benchmarks.

Agentic Research · Atomic Thought · LLM
12 min read
Cognitive Technology Team
Sep 3, 2025 · Artificial Intelligence

How to Build AI Agents that Auto‑Generate Helm Charts: Strategies, Pitfalls, and Best Practices

This article chronicles the author's hands‑on journey of designing AI agents to automatically generate Helm charts for open‑source applications, exploring agent role definition, behavior paradigms like ReAct and plan‑and‑execute, prompt engineering challenges, structured workflows, multi‑agent collaboration, and practical lessons for reliable, production‑grade automation.

AI agents · Agent Frameworks · Helm chart automation
29 min read
Architects Research Society
Sep 2, 2025 · Artificial Intelligence

What Really Sets True Agentic AI Apart from Pseudo‑Agent Systems?

The article contrasts pseudo‑agent AI—such as simple LLM chatbots, RPA scripts, and RAG systems—with genuine agentic AI architectures that combine large language models, orchestrators, memory stores, tool‑calling, planning modules, and multi‑agent collaboration, highlighting key capabilities like autonomous planning, feedback loops, and dynamic tool coordination.

Autonomous Planning · LLM · Orchestrator
3 min read
DataFunSummit
Sep 2, 2025 · Artificial Intelligence

How Ant Group’s Ray‑Powered Ragent Redefines LLM‑Based AI Agents

This article introduces Ant Group’s Ray‑based distributed agent framework Ragent, outlines its background, motivation, and design, and breaks down the four essential modules—Profile, Memory, Planning, and Action—that enable large‑language‑model agents to operate in real‑world scenarios.

Ant Group · Distributed Systems · LLM
5 min read
Coder Circle
Sep 2, 2025 · Artificial Intelligence

Unlocking the New Era of AI Development: Exploring Spring AI Core Classes

This article walks through Spring AI’s three core classes—Message, Prompt, and ChatModel—explaining their roles, showing concrete code examples for constructing messages, building prompts, and invoking a large language model via a REST controller, and provides a complete demo repository.

ChatModel · LLM · Message
3 min read
Data Party THU
Sep 1, 2025 · Artificial Intelligence

Why Intermediate Tokens Make LLMs Reason Better: Insights from Denny Zhou

The article analyzes Denny Zhou's Stanford CS25 lecture on large language model reasoning, explaining how intermediate token generation, chain‑of‑thought prompting, self‑consistency, reinforcement‑learning fine‑tuning, and answer aggregation together unlock powerful reasoning capabilities beyond traditional greedy decoding.

AI research · Chain-of-Thought · LLM
17 min read
Bighead's Algorithm Notes
Aug 31, 2025 · Artificial Intelligence

Paper Review: AlphaEval – A Comprehensive, Efficient Framework for Evaluating Alpha Mining

AlphaEval is a unified, parallelizable evaluation framework that assesses Alpha mining models across predictive ability, time stability, market‑perturbation robustness, financial logic, and diversity without backtesting, matching full backtest results while offering higher efficiency and open‑source reproducibility.

Alpha Mining · Evaluation Framework · LLM
10 min read
JD Retail Technology
Aug 29, 2025 · Artificial Intelligence

Turning a General LLM into an E‑commerce Risk‑Detection Expert: A Step‑by‑Step Prompt Engineering Guide

The article recounts how a risk‑control algorithm engineer transformed a generic large language model into a specialized e‑commerce fraud detector by iteratively designing prompts, injecting business rules, structuring I/O, and introducing a dual‑hypothesis decision framework to achieve accurate, automated risk analysis.

Artificial Intelligence · LLM · Risk Detection
11 min read
Bighead's Algorithm Notes
Aug 28, 2025 · Artificial Intelligence

Key AI-Driven Quantitative Finance Papers from KDD2025

This article summarizes recent AI research on quantitative finance, covering AlphaAgent's LLM-driven alpha mining, UMI's multi‑level irrationality factors, PDU's progressive dependency learning for stock ranking, SSPT's stock‑specific pretraining transformer, and Enhancer's distribution‑aware meta‑learning framework, all of which demonstrate improved stock prediction and resistance to alpha decay.

Alpha Mining · Financial AI · LLM
9 min read
IT Services Circle
Aug 28, 2025 · Artificial Intelligence

Why DeepSeek V3.1 Keeps Spitting the ‘Extreme’ Token and How to Fix It

Developers using DeepSeek V3.1's API have reported that the model intermittently inserts the Chinese character “极” (or its variants) into generated code, a bug that spreads across multiple platforms and threatens high‑precision code generation, prompting community workarounds and speculation about its root causes.

AI model bug · DeepSeek · LLM
6 min read
Aikesheng Open Source Community
Aug 28, 2025 · Artificial Intelligence

How Does DeepSeek‑V3.1 Perform on Professional SQL Tasks? A Detailed Benchmark

This report objectively evaluates DeepSeek‑V3.1 on professional‑grade SQL tasks, presenting its balanced strengths in understanding, optimization, and dialect conversion, highlighting its top scores in syntax error detection and Chinese database conversion while exposing weaknesses in execution‑plan analysis and large‑SQL transformations.

Artificial Intelligence · DeepSeek · LLM
8 min read
Fun with Large Models
Aug 28, 2025 · Artificial Intelligence

A Deep Dive into LangGraph: Understanding the New Graph‑Based AI Agent Framework

The article compares LangGraph with LangChain, explains why a graph‑based architecture offers greater flexibility than linear chains, and outlines LangGraph’s three‑layer core architecture and its ecosystem tools, including LangSmith, LangGraph Studio, the CLI, and Agent Chat UI. It also notes LangGraph’s reliance on LangChain and the need for a VPN when using the CLI.

AI agents · Graph Workflow · LLM
11 min read
Alibaba Cloud Native
Aug 27, 2025 · Artificial Intelligence

How LoongSuite Enables Full‑Stack Observability for LLM Applications

The article explains the rapid evolution of the AI application ecosystem, outlines the challenges of end‑to‑end observability for large‑language‑model services, and details how the open‑source LoongSuite suite—through non‑intrusive instrumentation for Python and Go agents and tight integration with the Dify platform—provides comprehensive, cloud‑native monitoring, tracing, and metric collection across the entire AI stack.

AI · Cloud Native · Dify
19 min read
Wuming AI
Aug 26, 2025 · Artificial Intelligence

A Layered Overview of Agentic AI: From LLM Foundations to Multi‑Agent Systems

This article presents a hierarchical breakdown of Agentic AI, detailing the foundational large language models, the capabilities of AI agents, the coordination mechanisms of multi‑agent systems, and the supporting infrastructure needed for reliability, scalability, and security.

AI agents · Agentic AI · Infrastructure
5 min read
DataFunSummit
Aug 25, 2025 · Artificial Intelligence

Building Xiaomi’s Vertical Domain QA Agent: From RAG to Real‑World Deployment

This article explains how Xiaomi designed and deployed a vertical‑domain question‑answering assistant for product and car queries, covering business background, a four‑module RAG‑plus‑LLM architecture, knowledge‑base construction, custom chunking strategies, dynamic signal handling, and the challenges overcome to achieve reliable real‑time voice interactions.

Agent Architecture · LLM · RAG
22 min read
DataFunSummit
Aug 24, 2025 · Artificial Intelligence

Unlocking LLM Efficiency: Asymmetry, Token Compression, and Quantization Insights

This article examines the core mechanisms of large language models, revealing asymmetric token behaviors, novel token‑compression techniques, scaling‑law theory, and mixed‑precision quantization methods that together boost inference efficiency while dramatically reducing model size.

Artificial Intelligence · LLM · Token Compression
26 min read
Data Party THU
Aug 22, 2025 · Artificial Intelligence

How BAML Turns a 25% Success Rate into 99%+ for Knowledge‑Graph Extraction with Small LLMs

This article presents a systematic study of extracting knowledge graphs from unstructured news articles using small quantized LLMs, exposing the brittleness of LangChain's JSON‑based pipelines, evaluating prompt‑engineering fixes, and introducing the BAML framework whose fuzzy parsing and concise schema raise extraction success from roughly 25% to over 99% on a 344‑document benchmark.

BAML · GraphRAG · LLM
33 min read
Ctrip Technology
Aug 22, 2025 · Artificial Intelligence

How AI Can Auto‑Generate Test Cases from PRDs and Cut Design Time by Up to 70%

This article explains how an AIGC‑driven solution uses large language models, prompt engineering, and a layered architecture built on Flask and LangChain to automatically transform product requirement documents into structured, BDD‑style test cases, achieving 89% adoption and up to 70% time reduction.

AI testing · AIGC · Flask
9 min read
Data Thinking Notes
Aug 21, 2025 · Artificial Intelligence

Why Intermediate Tokens Matter: Denny Zhou’s Deep Insights into LLM Reasoning

This article distills Denny Zhou’s Stanford CS25 lecture, explaining how large language models achieve reasoning through intermediate token generation, chain‑of‑thought prompting, self‑consistency, reinforcement‑learning fine‑tuning, and answer aggregation, while highlighting theoretical foundations and practical breakthroughs.

Chain-of-Thought · LLM · Reinforcement Learning
18 min read
21CTO
Aug 21, 2025 · Artificial Intelligence

Why Most AI Agent Projects Fail and How to Benchmark Their Capabilities

The article analyzes why AI agent initiatives often flop compared to traditional software, explains the fundamental differences in development approaches, and introduces a three‑step Agent Capability Benchmark Testing framework with concrete evaluation criteria and a practical weekly‑report agent example.

AI agents · LLM · agent development
12 min read
Volcano Engine Developer Services
Aug 21, 2025 · Artificial Intelligence

Why Prompt Engineering Isn’t Enough: The Rise of Context Engineering and RAG

Since last year, the debate over “Prompt Engineering” has split between practitioners who favor “Context Engineering” for building scalable agent systems and scholars who treat Prompt Engineering as a broad umbrella term, highlighting the need to dynamically construct and manage context for reliable, extensible AI applications.

AI agents · LLM · RAG
33 min read
Alibaba Cloud Native
Aug 21, 2025 · Cloud Native

How Higress AI Gateway Optimizes LLM Load Balancing with Global, Prefix, and GPU‑Aware Algorithms

This article explains why traditional load‑balancing methods fall short for large language model services and introduces Higress AI Gateway's three specialized algorithms—global minimum‑request, prefix‑matching, and GPU‑aware load balancing—detailing their design, Redis‑based implementation, deployment steps, and performance gains.

GPU · LLM · load balancing
11 min read
Alibaba Cloud Developer
Aug 21, 2025 · Artificial Intelligence

Why Your AI Defect Deduplication Returns Mixed Data and How to Fix It

This article details the challenges of building an AI‑powered defect deduplication system using Retrieval‑Augmented Generation, explains why LLMs produce composite (spliced) results, diagnoses the root cause as information loss in the RAG pipeline, and presents a step‑by‑step solution that restores atomicity of records for reliable duplicate detection.

AI debugging · Knowledge Base · LLM
14 min read
JD Tech
Aug 20, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy with J‑Schema, Iterative DPO, and Self‑Consistency

This article examines the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, applies iterative DPO training and self-consistency voting, and demonstrates how these techniques raise execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD benchmark · Iterative DPO · J-Schema
11 min read
Alibaba Cloud Big Data AI Platform
Aug 20, 2025 · Artificial Intelligence

How DeepSearch Elevates RAG: From RAG 1.0 to a Multi‑Agent AI Search Engine

This article explains how the LLM version of Alibaba Cloud OpenSearch evolved from RAG 1.0 to RAG 2.0, introducing the DeepSearch multi‑agent architecture that combines offline data processing and online query handling with planning, clarification, search, and summarization agents to deliver more accurate answers to complex questions.

AI search · DeepSearch · LLM
10 min read
Volcano Engine Developer Services
Aug 20, 2025 · Artificial Intelligence

What Is Vibe Coding? Exploring the AI‑Driven Programming Paradigm

This article introduces Vibe Coding, a new AI‑centric programming paradigm introduced by Andrej Karpathy that replaces traditional code‑centric development with natural‑language‑driven interaction: developers act as product‑focused guides while large language models generate the code. It also discusses tools, workflows, benefits, challenges, and future trends.

AI Coding · LLM · Vibe Coding
26 min read
Data Party THU
Aug 20, 2025 · Artificial Intelligence

How Large-Scale Corpus Rewriting is Shaping LLM Training: A Deep Dive into K2, WRAP, and Beyond

This article surveys recent large‑scale corpus rewriting techniques for LLM pre‑training, covering K2’s token‑utilization strategies, domain‑specific methods like SwallowMath/Code, reStructured pretraining, the WRAP pipeline, Nemotron‑CC filtering, Pro‑X noise removal, and the MAGA multi‑style expansion, while highlighting challenges, experimental findings, and open research questions.

LLM · corpus rewriting · data synthesis
20 min read
Instant Consumer Technology Team
Aug 19, 2025 · Artificial Intelligence

Mastering Document Chunking for RAG: Strategies, Code & Best Practices

This article explores why proper document chunking is crucial for Retrieval‑Augmented Generation, explains core concepts like context windows and signal‑to‑noise, compares various chunking strategies—from simple fixed‑size splits to semantic and hybrid approaches—and provides practical Python code examples to help you build more effective RAG pipelines.
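
To make the simplest strategy named in the summary above concrete, here is a minimal sketch of fixed‑size splitting with an optional overlap between neighboring chunks. The function name, the character‑based sizing, and the parameter names are assumptions made for this example; the article's own code may differ.

```python
def chunk_fixed(text, size, overlap=0):
    """Split text into chunks of at most `size` characters.

    Hypothetical helper: consecutive chunks share `overlap` characters so a
    sentence cut at a boundary still appears whole in one of the chunks.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


chunks = chunk_fixed("abcdefghij", size=4, overlap=1)
```

A larger overlap preserves more context across boundaries but stores and embeds more redundant text, which is the trade‑off that semantic and hybrid strategies aim to handle more selectively.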

LLM · RAG · Text Splitting
24 min read
Volcano Engine Developer Services
Aug 19, 2025 · Artificial Intelligence

How to Strengthen LLM System Prompts for Safer AI Agents

This guide explains how to reinforce system prompts for AI agents by optimizing their content and structure, using active defense, role‑based, and format constraints, providing practical examples, measurement methods, and experimental results that demonstrate up to 90% reduction in unsafe behavior.

AI Safety · LLM · System Prompt
13 min read
Data Party THU
Aug 19, 2025 · Artificial Intelligence

Why RL Fine‑Tuning Fails to Extend LLM Reasoning Limits: Entropy Collapse Explained

This article examines how reinforcement learning fine‑tuning influences large language model reasoning, revealing that RL primarily amplifies pre‑trained capabilities, suffers from entropy collapse, and fails to push the model’s reasoning boundary, supported by extensive experiments on scaling laws, entropy analysis, and mitigation techniques.

LLM · RL · RLVR
24 min read
Tencent Cloud Developer
Aug 19, 2025 · Artificial Intelligence

Demystifying LLMs: From Transformers to Agents, Prompts, and Function Calling

This article explains the fundamentals of large language models, covering transformer self‑attention, prompt engineering, API usage with temperature and tool parameters, function calling, agent architectures, the Model Context Protocol (MCP), Agent‑to‑Agent (A2A) communication, and future AI programming roles.

A2A · AI agents · Function Calling
11 min read
Kuaishou Tech
Aug 18, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO Optimization

The Klear‑Reasoner model, built on Qwen3‑8B‑Base and powered by the novel Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm, surpasses same‑size open‑source baselines on challenging math (AIME) and code (LiveCodeBench) benchmarks, while revealing key insights on data quality, reward design, and clipping strategies for large‑language‑model reasoning.

GPPO · LLM · Reinforcement Learning
11 min read
Qborfy AI
Aug 16, 2025 · Artificial Intelligence

Mastering LLM Tokens: How They Work, Cost, and Choose the Right Model

This article explains what tokens are in large language models, how they are counted and priced, compares tokenization methods across major models, and provides practical guidelines and code examples for optimizing token usage and selecting the appropriate model for different scenarios.

AI · Cost Optimization · LLM
8 min read
DaTaobao Tech
Aug 15, 2025 · Mobile Development

How to Eliminate Text Lag in iOS LLM Chat Apps with Smart Buffering and Typewriter Animation

This article explains how to eliminate stuttering text output in iOS chat applications powered by local LLMs running on the MNN framework. It introduces a three‑layer optimization (smart stream buffering, UI update throttling with batch processing, and a typewriter‑style animation) that achieves smooth output with responsiveness close to that of an online service.

LLM · MNN · Swift
16 min read
Instant Consumer Technology Team
Aug 15, 2025 · Artificial Intelligence

Why Building Enterprise AI Agents Feels Like Building a Distributed Brain

An engineer recounts the hard‑earned lessons from moving beyond RAG to enterprise‑level AI agents, exposing three critical challenges—scheduling, memory management, and tool integration—and proposes architectural patterns that turn fragile prototypes into robust, observable, and secure AI systems.

AI agents · Agentic Engineering · Enterprise AI
9 min read
Baobao Algorithm Notes
Aug 15, 2025 · Artificial Intelligence

Unlocking LLM Performance: Classic Deep RL Tricks Reimagined for Modern Training

This article systematically adapts classic deep reinforcement‑learning techniques—such as multi‑step returns, TD(λ)/GAE, V‑trace corrections, uncertainty‑aware weighting, safety constraints, distribution‑robust optimization, and value‑guided decoding—to improve large language model training and inference, providing concrete formulas, implementation tips, and empirical results.

Deep RL · GAE · LLM
17 min read
Alibaba Cloud Developer
Aug 15, 2025 · Artificial Intelligence

Mastering AI Agents: Prompt Engineering, Workflows, and RAG Strategies

This article systematically explains how to build reliable, high‑performance AI agents by focusing on the core components—LLM, prompts, workflows, RAG, and tools—while covering prompt engineering techniques, DSL‑based workflow design, vector‑database knowledge bases, security against prompt injection, and practical project planning.

AI Agent · LLM · RAG
15 min read
Tencent Technical Engineering
Aug 14, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Causes, Risks, and Multi‑Dimensional Solutions

This article systematically examines the root causes of hallucinations in large language models, evaluates their pros and cons, and presents a comprehensive set of optimization techniques—including prompt engineering, RAG, sampling tweaks, supervised fine‑tuning, LoRA, RLHF, chain‑of‑thought reasoning, and agent/workflow designs—to build more reliable and trustworthy AI applications.

AI · LLM · LoRA
29 min read
Data Party THU
Aug 14, 2025 · Artificial Intelligence

How FilterLLM Turns One LLM Pass into Billion‑User Cold‑Start Recommendations

The article analyzes the FilterLLM approach, which augments a frozen LLM with billions of learnable user tokens to predict a full‑user interaction probability distribution in a single forward pass, dramatically speeding up cold‑start recommendation while preserving recommendation quality across multiple benchmarks.

AI · FilterLLM · LLM
8 min read
JD Cloud Developers
Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article presents a comprehensive study on improving Text-to-SQL performance by introducing J‑Schema for structured schema representation, applying iterative Direct Preference Optimization (DPO) training, and leveraging self‑consistency voting mechanisms, achieving up to a 12% accuracy gain on the BIRD benchmark.

Database QA · Iterative DPO · J-Schema
10 min read
JD Retail Technology
Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article surveys the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, details an iterative DPO training pipeline with hyper‑parameter tuning, and demonstrates how self‑consistency voting boosts execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD dataset · Iterative DPO · LLM
0 likes · 14 min read
Youzan Coder
Aug 13, 2025 · Artificial Intelligence

Understanding AI Agents: Core Modules, Planning Strategies, and Evaluation

This article explains what an AI agent is, outlines its four core modules—perception, memory, planning, and action—describes the role of large language models, compares software development generations, discusses memory implementations, planning methods like ReAct and Plan‑and‑Solve, and covers evaluation, cost analysis, and differences between agents and workflows.

AI · Agent · LLM
0 likes · 15 min read
Zhongtong Tech
Aug 13, 2025 · Artificial Intelligence

Unlock Seamless AI‑Tool Interaction with the Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open protocol that standardizes how large language models interact with external data sources and tools. It acts as a USB‑C‑like universal connector for AI applications, with built‑in session management, security, and flexible HTTP/SSE transport.

AI integration · LLM · MCP
0 likes · 7 min read
Data Party THU
Aug 12, 2025 · Artificial Intelligence

Unlocking Chain-of-Thought: How AI Reasoning Boosts Accuracy Across Domains

Chain‑of‑Thought (CoT) prompting enables large language models to solve complex tasks by breaking problems into sequential reasoning steps, improving accuracy in mathematics, commonsense reasoning, code generation, business strategy, and medical diagnosis. The article covers its principles, advantages, challenges, and future prospects.

Chain-of-Thought · LLM · Prompt Design
0 likes · 13 min read
Qborfy AI
Aug 12, 2025 · Artificial Intelligence

What Powers Large Language Models? A Deep Dive into LLM Architecture and Scaling

This article explains how massive Transformer‑based large language models compress text data into mathematical representations, why scale, self‑attention, and training paradigms enable emergent general intelligence, and walks through tokenization, embedding, multi‑layer attention, architecture choices, energy costs, and hallucination mitigation.

AI · Embedding · LLM
0 likes · 6 min read
Huolala Tech
Aug 12, 2025 · Information Security

Can AI Boost Traditional SAST to Detect Complex Logic Bugs?

This article explores a hybrid approach that combines traditional static application security testing (SAST) with large language models (LLMs) to automatically detect business‑logic vulnerabilities, detailing the methodology, implementation stages, experimental results, and the challenges of integrating AI into code security analysis.

AI · LLM · SAST
0 likes · 15 min read
Liangxu Linux
Aug 11, 2025 · Artificial Intelligence

Four Must‑Try Open‑Source AI Tools: Gemini CLI, XiaoZhi Bot, AI Hub, GPT‑Pilot

This article introduces four notable open‑source AI projects: Google's Gemini CLI, the voice‑interactive XiaoZhi chatbot, the comprehensive AI Engineering Hub, and the GPT‑Pilot programming companion. It covers their key features, free quotas, star counts, and supported hardware, and provides a GitHub repository link for each.

AI · Chatbot · Gemini CLI
0 likes · 5 min read
Alibaba Cloud Developer
Aug 11, 2025 · Artificial Intelligence

How Fine‑Tuning Large Models Solves Code Upgrade Challenges and Boosts Stable Module Matching

This article details an approach that uses supervised fine‑tuning of a large model to overcome the instability of code RAG and code agents during open‑source repository upgrades. It addresses domain‑specific terminology and code style differences while improving recall, accuracy, and deployment efficiency.

AI agents · Fine-tuning · LLM
0 likes · 11 min read
Data Party THU
Aug 11, 2025 · Artificial Intelligence

What Sets the Latest LLMs Apart? A Deep Dive into V3, OLMo, Gemma, Mistral, Llama 4 and More

This article systematically compares the architectures of recent large language models—including DeepSeek V3/R1, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen 3, SmolLM 3 and Kimi 2—highlighting innovations such as MLA, MoE, post‑norm, sliding‑window attention, NoPE and optimizer choices, with diagrams and code examples to illustrate their impact on efficiency and performance.

Comparison · LLM · MLA
0 likes · 12 min read
AI Large Model Application Practice
Aug 11, 2025 · Artificial Intelligence

How to Build an LLM-Powered Smart Resume Screening System

This article presents a detailed design and implementation of an LLM‑based intelligent resume matching system that combines semantic vector retrieval, structured rule filtering, multi‑dimensional weighted scoring, and natural‑language interaction to create a fast, quantifiable, and explainable hiring pipeline.

AI Recruitment · LLM · RAG
0 likes · 18 min read
Wuming AI
Aug 11, 2025 · Industry Insights

Why LLMs Overthink and How Developers Can Control Inference Depth

Developers notice that large language models often enter an "overthinking" mode that slows down simple coding tasks, prompting calls for adjustable inference depth controls so models can switch between quick checks and deep analysis based on task risk level.

AI usability · Developer Experience · LLM
0 likes · 5 min read
Data Party THU
Aug 10, 2025 · Artificial Intelligence

Can Evolutionary Algorithms Auto-Design Training-Free Vision-Language Model Adaptations?

This study introduces EvoVLMA, an evolutionary framework that automatically searches for training-free vision-language model (VLM) adaptation algorithms using a two-stage LLM-guided evolution. It reports superior performance, including a 1.91% accuracy gain on 8-shot image classification, and the code is publicly released.

Evolutionary Algorithms · LLM · Model Adaptation
0 likes · 5 min read
Data Party THU
Aug 10, 2025 · Artificial Intelligence

Can LLMs Predict Multiple Tokens at Once? A Deep Dive into Multi‑Token Generation

This article evaluates whether autoregressive large language models can generate several tokens in a single inference step, describing a mask‑based multi‑token prediction framework, gated LoRA adaptation, experimental results on Tulu‑3‑8B showing up to 5.2× speedup, and discusses implications for future research.

AI efficiency · LLM · Multi-token generation
0 likes · 13 min read
Sohu Smart Platform Tech Team
Aug 9, 2025 · Artificial Intelligence

Deploying Large Language Models Offline on Mobile Devices: A Practical Guide

This article explains the challenges of running large language models on mobile devices, reviews recent industry efforts, and provides a step‑by‑step guide—including code snippets—for integrating a distilled GPT‑2 model with Sohu's Hybrid AI Engine using TensorFlow Lite and Keras‑NLP for on‑device inference.

Hybrid AI · Keras · LLM
0 likes · 10 min read
Alibaba Cloud Big Data AI Platform
Aug 8, 2025 · Artificial Intelligence

Can GitOps Power Low‑Cost LLM Agents? A Hands‑On Exploration

This article examines how the Manus sandbox and CodeAct mechanisms inspire a GitOps‑based approach to building LLM agents, detailing the design of planner and executor components, workflow steps, advantages such as RAG and observability, and the potential for low‑cost, scalable intelligent agent development.

AI agents · GitOps · Intelligent agents
0 likes · 12 min read
Tencent Technical Engineering
Aug 8, 2025 · Artificial Intelligence

Which Open‑Source Deep‑Research Agent Framework Is Best? A Comprehensive Comparison

This article systematically compares major open‑source deep‑research agent frameworks—including DeerFlow, SmolAgents, LangChainAI, SkyworkAI, and Researcher—detailing their architectures, best practices, and commercial alternatives, to help developers and users choose the most suitable tool for automated research workflows.

AI automation · Deep Research · LLM
0 likes · 27 min read
Tencent Cloud Developer
Aug 8, 2025 · Artificial Intelligence

Mastering AI Agents: A Practical Guide to Building Effective Workflows and Tools

This comprehensive guide explains when to use AI agents, presents core design patterns such as prompt chains, routing, parallelization, orchestrator‑worker and eval‑optimize loops, and offers concrete implementation advice and tool‑prompt engineering techniques for building reliable, high‑quality agent systems.

LLM · prompt engineering · tool engineering
0 likes · 24 min read
Amap Tech
Aug 7, 2025 · Artificial Intelligence

Boosting Codebase Upgrades with Code RAG and Agent‑Driven Fine‑Tuning

This article describes how the Amap (Gaode) terminal team tackled large‑scale repository upgrades by building a code‑RAG and code‑Agent tool, addressing recall and stability issues, then fine‑tuning a small LLM (Qwen3‑4B) with LoRA and custom datasets to achieve reliable, low‑cost, on‑device code‑query performance.

Code Agent · LLM · LoRA
0 likes · 11 min read
Architect's Alchemy Furnace
Aug 4, 2025 · Artificial Intelligence

How RAG and Long‑Term Memory Turn AI into a Truly Remembering Assistant

This article explains how Retrieval‑Augmented Generation (RAG) and long‑term memory systems like MemoBase enable large language models to overcome short‑term memory limits, dynamically retrieve up‑to‑date knowledge, and personalize interactions, with practical Dify implementation steps and real‑world use cases across industries.

AI · Dify · Knowledge Base
0 likes · 18 min read
Baidu Maps Tech Team
Jul 31, 2025 · Artificial Intelligence

How Baidu’s AI Voice Assistant Turns Speech into Precise Navigation Commands

This article explains how the Baidu Maps AI voice assistant converts spoken commands into precise navigation actions. It covers the speech‑to‑text pipeline, intent parsing, template‑based and generative approaches, tool‑calling mechanisms, memory and reflection capabilities, and future directions for intelligent agents.

AI · Intent Parsing · LLM
0 likes · 14 min read
Data Party THU
Jul 31, 2025 · Industry Insights

How mini‑SWE‑agent Solves 65% of SWE‑bench Bugs with Only 100 Lines of Code

The mini‑SWE‑agent, a lightweight open‑source software‑engineering agent built by the original SWE‑bench team, fixes about 65% of SWE‑bench bugs using roughly 100 lines of Python. It relies on minimal dependencies, shell‑based execution, a linear history, and support for various container environments, offering a fast, extensible alternative to the full‑featured SWE‑agent.

AI Agent · LLM · SWE-bench
0 likes · 8 min read
Alibaba Cloud Developer
Jul 31, 2025 · Artificial Intelligence

Why Post‑Training Matters: Scaling Laws, Fine‑Tuning, and RL Strategies for LLMs

This article explores the importance of post‑training for large language models, explains scaling laws for pre‑training and post‑training, details common fine‑tuning methods (full fine‑tuning, PEFT, and LoRA), outlines alignment techniques such as RLHF, DPO, and PPO, and presents practical workflows using Llama 3 and DeepSeek‑R1, while also discussing test‑time reasoning optimizations.

Alignment · Fine-tuning · LLM
0 likes · 19 min read
AsiaInfo Technology: New Tech Exploration
Jul 30, 2025 · Artificial Intelligence

How MCP‑RAG Overcomes Prompt Inflation for Massive LLM Service Calls

This article analyzes the prompt‑inflation bottleneck that arises when large language models (LLMs) must handle thousands of Model Context Protocol (MCP) services, and introduces the MCP‑RAG architecture—a retrieval‑augmented generation solution that builds a metadata knowledge base and intelligent retrieval layer to enable precise, efficient MCP service discovery at scale.

AI · LLM · MCP
0 likes · 21 min read
Ops Development Stories
Jul 29, 2025 · Artificial Intelligence

Master AI Agents with LangGraph: Build Adaptive RAG, Translation, and ReAct Agents

This comprehensive guide explains what an AI Agent is, its core capabilities and design patterns, and walks through step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph, complete with code samples, workflow diagrams, and practical tips for building personal ops knowledge‑base agents.

LLM · LangGraph · RAG
0 likes · 64 min read
Alibaba Cloud Developer
Jul 29, 2025 · Artificial Intelligence

How to Transform Chaotic AI Prompts into Robust System Designs

This article examines the pitfalls of rule‑heavy prompt engineering, introduces a systematic four‑layer architecture for AI prompts, outlines six practical compilation principles, and demonstrates how to rewrite a tangled prompt into a clear, maintainable, and scalable system blueprint.

AI Architecture · LLM · System Design
0 likes · 84 min read
Data Thinking Notes
Jul 27, 2025 · Databases

How Dify Turns Natural Language into SQL: Building Scalable Text2SQL Apps

This article explains how Text2SQL technology converts natural language queries into executable SQL using large language models, and demonstrates how the open‑source Dify platform’s visual workflow and component‑based development dramatically lower the barrier for building, validating, and deploying secure, low‑code Text2SQL applications.

AI · Data-driven · Dify
0 likes · 13 min read
Architecture and Beyond
Jul 27, 2025 · Artificial Intelligence

Why Context Engineering Is the Secret to Powerful AI Agents

This article explains how AI agents work through perception, planning, and action, describes the four supporting systems—memory, tools, safety, and evaluation—and shows how the evolution from prompt engineering to context engineering, with strategies like selective saving, retrieval, compression, and modularization, addresses the core challenges of managing large‑scale context for reliable, efficient agent performance.

AI agents · Context Engineering · LLM
0 likes · 17 min read
DaTaobao Tech
Jul 23, 2025 · Artificial Intelligence

How Alibaba’s New Distributed Agent Framework Solves 2C AI Challenges

Alibaba introduces the ali‑langengine‑dflow framework, a hybrid distributed‑agent architecture that moves core intelligence to the cloud while keeping execution reachable on heterogeneous client devices, addressing the data‑isolation, latency, and security issues of existing cloud‑VM and local‑agent solutions for consumer‑facing (2C) internet services.

AI · Agent · Distributed Systems
0 likes · 21 min read
FunTester
Jul 23, 2025 · Artificial Intelligence

Mastering Prompt Iteration: A Step‑by‑Step Guide to Effective LLM Collaboration

This article explains why a perfect answer from a large language model requires iterative prompt design, outlines a six‑step spiral loop for refining prompts, and offers practical tips such as starting with a minimal prompt, focusing on one improvement at a time, and preserving version history.

Artificial Intelligence · Iterative Design · LLM
0 likes · 5 min read
Go Programming World
Jul 23, 2025 · Artificial Intelligence

Directing Code with AI: How Vibe Coding Turns Natural Language into Software

Vibe Coding, a term introduced by Andrej Karpathy in 2025, lets developers describe software goals in natural language while large language models generate the code. This article examines how it reshapes the developer’s role and covers the workflow, tools, risks, and future prospects of this AI‑driven programming paradigm.

AI-driven development · LLM · Vibe Coding
0 likes · 6 min read
Code Mala Tang
Jul 22, 2025 · Artificial Intelligence

Convert Any PDF to Clean Markdown with a Local LLM (Gemma 3)

Learn how to transform any PDF—including scanned documents—into well‑structured Markdown using a local LLM (Gemma 3 via Ollama), Python, PyMuPDF and Pillow, without cloud APIs or API keys, by converting pages to images, prompting the model, and saving the output.

Gemma · LLM · Ollama
0 likes · 12 min read