Tagged articles

2015 articles

Sep 5, 2025 · Artificial Intelligence

Inside Ant Group’s Ragent: Building Scalable AI Agents on Ray

This article introduces Ant Group’s Ray‑based distributed agent framework Ragent, outlines its background, motivation, and design, and details the four essential modules—Profile, Memory, Planning, and Action—that power large‑language‑model agents in large‑scale AI serving.

AI agents · Ant Group · Distributed Systems

0 likes · 5 min read

Instant Consumer Technology Team

Sep 5, 2025 · Artificial Intelligence

How Context Engineering Transforms Dify Agents: Boost Efficiency by 10×

This article explains how Context Engineering (CE) extends Prompt Engineering by integrating seven core elements—system prompts, user input, short‑term memory, long‑term memory, retrieval, tools, and structured output—using the open‑source Dify platform to build dynamic, multimodal agents that cut inference costs tenfold and raise complex‑task success rates by 40%.

AI Agent Development · Dify · LLM

0 likes · 16 min read

Alibaba Cloud Developer

Sep 5, 2025 · Artificial Intelligence

How Browser-Use Leverages LLMs to Transform Browser Automation

This article explores Browser-Use, an AI‑driven browser automation framework that combines large language models, visual perception, and DOM analysis to enable intelligent, multi‑step web tasks such as registration, price comparison, form filling, and monitoring, while detailing its architecture, historical context, core modules, and future challenges.

AI agents · Browser Automation · LLM

0 likes · 26 min read

Volcano Engine Developer Services

Sep 4, 2025 · Backend Development

How to Build a Multi‑Agent LLM Flow in Go with Eino – Deer‑Go Deep Dive

This article explains how to re‑implement ByteDance's DeerFlow deep‑research framework in Go (Deer‑Go), covering the multi‑agent architecture, control‑hand‑off, interrupt & checkpoint mechanisms, integration with the Hertz SSE server, and step‑by‑step deployment instructions.

Checkpoint · DeerFlow · Eino

0 likes · 16 min read

Data Party THU

Sep 4, 2025 · Artificial Intelligence

How MXFP4 Quantization Lets a 120‑Billion‑Parameter LLM Run on a Single 80 GB GPU

This article analyzes the memory bottleneck of massive language models, explains the mathematical modeling of memory requirements, evaluates traditional sharding limits, and details how GPT‑OSS’s MXFP4 quantization combined with Mixture‑of‑Experts reduces memory, bandwidth, and compute demands enough to fit a 120‑billion‑parameter model onto an 80 GB GPU with minimal accuracy loss.

FP4 · LLM · MXFP4

0 likes · 11 min read

Data Party THU

Sep 4, 2025 · Artificial Intelligence

Unraveling PPO Variants: From GRPO to DAPO and GSPO – A Deep Dive

This article provides a comprehensive technical analysis of PPO‑based reinforcement learning methods for large language models, detailing the evolution from the original PPO algorithm through GRPO, DAPO, and GSPO, and explaining their motivations, mathematical formulations, advantages, and practical challenges such as entropy collapse and importance‑sampling variance.

DAPO · GRPO · GSPO

0 likes · 30 min read

Tencent Cloud Developer

Sep 4, 2025 · Artificial Intelligence

Why Youtu-Agent Sets a New Standard for Open‑Source AI Agents

Youtu-Agent, an open‑source agent framework released by Tencent Youtu Lab, combines minimalist design with high performance, delivers strong benchmark results without training or proprietary models, and offers flexible, cost‑effective, automated agent generation for researchers, developers, and AI enthusiasts.

AI agents · Framework · LLM

0 likes · 12 min read

Alibaba Cloud Developer

Sep 4, 2025 · Artificial Intelligence

Why Context Engineering Is the New Frontier in LLM Development

This article explores the rise of Context Engineering as an essential discipline for large language models, comparing it to Prompt Engineering, detailing its definition, classifications, common pitfalls such as poisoning and distraction, and presenting best‑practice strategies and an LLM‑OS analogy for building robust AI agents.

LLM · LLM OS · Memory Management

0 likes · 27 min read

Aikesheng Open Source Community

Sep 4, 2025 · Artificial Intelligence

How GPT‑5, DeepSeek‑V3.1 and SQLShift Stack Up in the August 2025 SQL LLM Benchmark

The August 2025 SCALE benchmark evaluates new AI models—including the GPT‑5 family, DeepSeek‑V3.1, and the SQLShift tool—across SQL understanding, optimization, and dialect conversion, revealing distinct strengths, weaknesses, and the growing advantage of specialized tools over generic large language models.

AI · DeepSeek · GPT-5

0 likes · 15 min read

Sohu Tech Products

Sep 3, 2025 · Artificial Intelligence

How GRPO Revolutionizes RLHF for Large Language Models

This article explains the motivation, mathematical foundations, implementation details, advantages, experimental results, and future directions of Group Relative Policy Optimization (GRPO), a novel reinforcement‑learning algorithm that replaces PPO’s value network with efficient group‑wise relative evaluation for large language models.

Artificial Intelligence · GRPO · LLM

0 likes · 17 min read

DataFunSummit

Sep 3, 2025 · Artificial Intelligence

Demystifying MCP: A Simple Guide to Building LLM Tool Integration Servers

This article explains the Model Context Protocol (MCP), its three‑layer architecture, its core advantages, and step‑by‑step development of an MCP server in TypeScript (with Python and C++ examples), showing how LLMs can invoke tools for tasks like Unreal Engine code analysis.

LLM · MCP · Python

0 likes · 16 min read

37 Interactive Technology Team

Sep 3, 2025 · Artificial Intelligence

How AI is Revolutionizing Web Scraping: Tools, Techniques, and Best Practices

Discover how AI, especially large language models, transforms traditional web scraping by introducing semantic understanding, dynamic adaptability, and automated extraction, with in‑depth reviews of emerging tools like Crawl4AI and Browser‑use, practical code examples, best‑practice guidelines, and deployment tips for modern data collection.

AI · Browser Use · Crawl4AI

0 likes · 17 min read

Baobao Algorithm Notes

Sep 3, 2025 · Artificial Intelligence

How Atom-Searcher Boosts LLM Reasoning with Atomic Thought Rewards

Atom-Searcher introduces an atomic‑thought reinforcement‑learning framework that decomposes complex reasoning into fine‑grained units, uses a Reasoning Reward Model to assign step‑wise rewards, dynamically balances process and result incentives, and achieves state‑of‑the‑art performance on multiple LLM benchmarks.

Agentic Research · Atomic Thought · LLM

0 likes · 12 min read

Cognitive Technology Team

Sep 3, 2025 · Artificial Intelligence

How to Build AI Agents that Auto‑Generate Helm Charts: Strategies, Pitfalls, and Best Practices

This article chronicles the author's hands‑on journey of designing AI agents to automatically generate Helm charts for open‑source applications, exploring agent role definition, behavior paradigms like ReAct and plan‑and‑execute, prompt engineering challenges, structured workflows, multi‑agent collaboration, and practical lessons for reliable, production‑grade automation.

AI agents · Agent Frameworks · Helm chart automation

0 likes · 29 min read

Architects Research Society

Sep 2, 2025 · Artificial Intelligence

What Really Sets True Agentic AI Apart from Pseudo‑Agent Systems?

The article contrasts pseudo‑agent AI—such as simple LLM chatbots, RPA scripts, and RAG systems—with genuine agentic AI architectures that combine large language models, orchestrators, memory stores, tool‑calling, planning modules, and multi‑agent collaboration, highlighting key capabilities like autonomous planning, feedback loops, and dynamic tool coordination.

Autonomous Planning · LLM · Orchestrator

0 likes · 3 min read

DataFunSummit

Sep 2, 2025 · Artificial Intelligence

How Ant Group’s Ray‑Powered Ragent Redefines LLM‑Based AI Agents

This article introduces Ant Group’s Ray‑based distributed agent framework Ragent, outlines its background, motivation, and design, and breaks down the four essential modules—Profile, Memory, Planning, and Action—that enable large‑language‑model agents to operate in real‑world scenarios.

Ant Group · Distributed Systems · LLM

0 likes · 5 min read

Coder Circle

Sep 2, 2025 · Artificial Intelligence

Unlocking the New Era of AI Development: Exploring Spring AI Core Classes

This article walks through Spring AI’s three core classes—Message, Prompt, and ChatModel—explaining their roles, showing concrete code examples for constructing messages, building prompts, and invoking a large language model via a REST controller, and provides a complete demo repository.

ChatModel · LLM · Message

0 likes · 3 min read

AI Large Model Application Practice

Sep 2, 2025 · Artificial Intelligence

Building a Multimodal Hybrid Retrieval Agent on an Integrated AI Data Layer

This article explores why many enterprise AI projects fail to deliver value, analyzes the complexity of real‑world AI use cases, and presents a step‑by‑step demo that combines vector, keyword, numeric, and spatial queries using OceanBase as a unified multimodal data store.

Enterprise AI · Hybrid Search · LLM

0 likes · 15 min read

Data Party THU

Sep 1, 2025 · Artificial Intelligence

Why Intermediate Tokens Make LLMs Reason Better: Insights from Denny Zhou

The article analyzes Denny Zhou's Stanford CS25 lecture on large language model reasoning, explaining how intermediate token generation, chain‑of‑thought prompting, self‑consistency, reinforcement‑learning fine‑tuning, and answer aggregation together unlock powerful reasoning capabilities beyond traditional greedy decoding.

AI research · Chain-of-Thought · LLM

0 likes · 17 min read

Bighead's Algorithm Notes

Aug 31, 2025 · Artificial Intelligence

Paper Review: AlphaEval – A Comprehensive, Efficient Framework for Evaluating Alpha Mining

AlphaEval is a unified, parallelizable evaluation framework that assesses Alpha mining models across predictive ability, time stability, market‑perturbation robustness, financial logic, and diversity without backtesting, matching full backtest results while offering higher efficiency and open‑source reproducibility.

Alpha Mining · Evaluation Framework · LLM

0 likes · 10 min read

Architecture and Beyond

Aug 31, 2025 · Artificial Intelligence

Managing LLM Agent Context: Insights from OpenManus, Manus, Claude Code & Gemini-cli

This article examines why context management is critical for LLM agents, compares the strategies of OpenManus, Manus, Claude Code, and Gemini-cli, and extracts practical lessons on token limits, compression techniques, and engineering trade‑offs for building efficient, cost‑effective AI systems.

AI · LLM · compression

0 likes · 14 min read

Tencent Technical Engineering

Aug 29, 2025 · Artificial Intelligence

How Retrieval‑Augmented Generation Evolves into Autonomous AI Agents

This article examines the limitations of large language models' internal knowledge, explains how retrieval‑augmented generation (RAG) and tool‑augmented generation address these limits, and traces the evolution from simple retrieve‑then‑generate pipelines to autonomous, multi‑modal AI agents.

AI agents · LLM · RAG

0 likes · 20 min read

JD Retail Technology

Aug 29, 2025 · Artificial Intelligence

Turning a General LLM into an E‑commerce Risk‑Detection Expert: A Step‑by‑Step Prompt Engineering Guide

The article recounts how a risk‑control algorithm engineer transformed a generic large language model into a specialized e‑commerce fraud detector by iteratively designing prompts, injecting business rules, structuring I/O, and introducing a dual‑hypothesis decision framework to achieve accurate, automated risk analysis.

Artificial Intelligence · LLM · Risk Detection

0 likes · 11 min read

Bighead's Algorithm Notes

Aug 28, 2025 · Artificial Intelligence

Key AI-Driven Quantitative Finance Papers from KDD2025

This article summarizes recent AI research on quantitative finance, covering AlphaAgent's LLM-driven alpha mining, UMI's multi‑level irrationality factors, PDU's progressive dependency learning for stock ranking, SSPT's stock‑specific pretraining transformer, and Enhancer's distribution‑aware meta‑learning framework, all of which demonstrate improved stock prediction and resistance to alpha decay.

Alpha Mining · Financial AI · LLM

0 likes · 9 min read

IT Services Circle

Aug 28, 2025 · Artificial Intelligence

Why DeepSeek V3.1 Keeps Spitting the ‘Extreme’ Token and How to Fix It

Developers using DeepSeek V3.1's API have reported that the model intermittently inserts the Chinese character “极” (or its variants) into generated code, a bug that spreads across multiple platforms and threatens high‑precision code generation, prompting community workarounds and speculation about its root causes.

AI model bug · DeepSeek · LLM

0 likes · 6 min read

Aikesheng Open Source Community

Aug 28, 2025 · Artificial Intelligence

How Does DeepSeek‑V3.1 Perform on Professional SQL Tasks? A Detailed Benchmark

This report objectively evaluates DeepSeek‑V3.1 on professional‑grade SQL tasks, presenting its balanced strengths in understanding, optimization, and dialect conversion, highlighting its top scores in syntax error detection and Chinese database conversion while exposing weaknesses in execution‑plan analysis and large‑SQL transformations.

Artificial Intelligence · DeepSeek · LLM

0 likes · 8 min read

Fun with Large Models

Aug 28, 2025 · Artificial Intelligence

A Deep Dive into LangGraph: Understanding the New Graph‑Based AI Agent Framework

The article compares LangGraph with LangChain, explains why a graph‑based architecture offers greater flexibility than linear chains, outlines LangGraph’s three‑layer core architecture and its ecosystem tools—including LangSmith, LangGraph Studio, CLI, and Agent Chat UI—while noting its reliance on LangChain and the need for VPN for CLI usage.

AI agents · Graph Workflow · LLM

0 likes · 11 min read

Alibaba Cloud Native

Aug 27, 2025 · Artificial Intelligence

How LoongSuite Enables Full‑Stack Observability for LLM Applications

The article explains the rapid evolution of the AI application ecosystem, outlines the challenges of end‑to‑end observability for large‑language‑model services, and details how the open‑source LoongSuite suite—through non‑intrusive instrumentation for Python and Go agents and tight integration with the Dify platform—provides comprehensive, cloud‑native monitoring, tracing, and metric collection across the entire AI stack.

AI · Cloud Native · Dify

0 likes · 19 min read

AI Large Model Application Practice

Aug 26, 2025 · Information Security

Secure Your MCP‑Based LLM Apps with Google OAuth 2.0: A Step‑by‑Step Python Demo

This guide walks you through the fundamentals of OAuth 2.0, explains how the MCP protocol supports it, and provides a complete Python example that integrates Google OAuth 2.0 into an MCP server and client, covering configuration, code implementation, and testing procedures.

Authentication · Google · LLM

0 likes · 16 min read

Wuming AI

Aug 26, 2025 · Artificial Intelligence

A Layered Overview of Agentic AI: From LLM Foundations to Multi‑Agent Systems

This article presents a hierarchical breakdown of Agentic AI, detailing the foundational large language models, the capabilities of AI agents, the coordination mechanisms of multi‑agent systems, and the supporting infrastructure needed for reliability, scalability, and security.

AI agents · Agentic AI · Infrastructure

0 likes · 5 min read

DataFunSummit

Aug 25, 2025 · Artificial Intelligence

Building Xiaomi’s Vertical Domain QA Agent: From RAG to Real‑World Deployment

This article explains how Xiaomi designed and deployed a vertical‑domain question‑answering assistant for product and car queries, covering business background, a four‑module RAG‑plus‑LLM architecture, knowledge‑base construction, custom chunking strategies, dynamic signal handling, and the challenges overcome to achieve reliable real‑time voice interactions.

Agent Architecture · LLM · RAG

0 likes · 22 min read

DataFunSummit

Aug 24, 2025 · Artificial Intelligence

Unlocking LLM Efficiency: Asymmetry, Token Compression, and Quantization Insights

This article examines the core mechanisms of large language models, revealing asymmetric token behaviors, novel token‑compression techniques, scaling‑law theory, and mixed‑precision quantization methods that together boost inference efficiency while dramatically reducing model size.

Artificial Intelligence · LLM · Token Compression

0 likes · 26 min read

Data Party THU

Aug 22, 2025 · Artificial Intelligence

How BAML Turns a 25% Success Rate into 99%+ for Knowledge‑Graph Extraction with Small LLMs

This article presents a systematic study of extracting knowledge graphs from unstructured news articles using small quantized LLMs, exposing the brittleness of LangChain's JSON‑based pipelines, evaluating prompt‑engineering fixes, and introducing the BAML framework whose fuzzy parsing and concise schema raise extraction success from roughly 25% to over 99% on a 344‑document benchmark.

BAML · GraphRAG · LLM

0 likes · 33 min read

Ctrip Technology

Aug 22, 2025 · Artificial Intelligence

How AI Can Auto‑Generate Test Cases from PRDs and Cut Design Time by Up to 70%

This article explains how an AIGC‑driven solution uses large language models, prompt engineering, and a layered architecture built on Flask and LangChain to automatically transform product requirement documents into structured, BDD‑style test cases, achieving 89% adoption and up to 70% time reduction.

AI testing · AIGC · Flask

0 likes · 9 min read

Data Thinking Notes

Aug 21, 2025 · Artificial Intelligence

Why Intermediate Tokens Matter: Denny Zhou’s Deep Insights into LLM Reasoning

This article distills Denny Zhou’s Stanford CS25 lecture, explaining how large language models achieve reasoning through intermediate token generation, chain‑of‑thought prompting, self‑consistency, reinforcement‑learning fine‑tuning, and answer aggregation, while highlighting theoretical foundations and practical breakthroughs.

Chain-of-Thought · LLM · Reinforcement Learning

0 likes · 18 min read

21CTO

Aug 21, 2025 · Artificial Intelligence

Why Most AI Agent Projects Fail and How to Benchmark Their Capabilities

The article analyzes why AI agent initiatives often flop compared to traditional software, explains the fundamental differences in development approaches, and introduces a three‑step Agent Capability Benchmark Testing framework with concrete evaluation criteria and a practical weekly‑report agent example.

AI agents · LLM · agent development

0 likes · 12 min read

Volcano Engine Developer Services

Aug 21, 2025 · Artificial Intelligence

Why Prompt Engineering Isn’t Enough: The Rise of Context Engineering and RAG

Since last year, the debate over “Prompt Engineering” has split between practitioners who favor “Context Engineering” for building scalable agent systems and scholars who treat Prompt Engineering as a broad umbrella term, highlighting the need to dynamically construct and manage context for reliable, extensible AI applications.

AI agents · LLM · RAG

0 likes · 33 min read

Alibaba Cloud Native

Aug 21, 2025 · Cloud Native

How Higress AI Gateway Optimizes LLM Load Balancing with Global, Prefix, and GPU‑Aware Algorithms

This article explains why traditional load‑balancing methods fall short for large language model services and introduces Higress AI Gateway's three specialized algorithms—global minimum‑request, prefix‑matching, and GPU‑aware load balancing—detailing their design, Redis‑based implementation, deployment steps, and performance gains.

GPU · LLM · load balancing

0 likes · 11 min read

Instant Consumer Technology Team

Aug 21, 2025 · Artificial Intelligence

How Data‑Juicer Supercharges LLM Training with High‑Quality Multimodal Data

Data‑Juicer is an open‑source, one‑stop multimodal data processing system that provides fine‑grained operators, scalable pipelines, and ready‑made recipes to deliver high‑quality, diverse, and model‑friendly data for large language model pre‑training, fine‑tuning, and multimodal applications.

AI · LLM · Multimodal

0 likes · 22 min read

Alibaba Cloud Developer

Aug 21, 2025 · Artificial Intelligence

Why Your AI Defect Deduplication Returns Mixed Data and How to Fix It

This article details the challenges of building an AI‑powered defect deduplication system using Retrieval‑Augmented Generation, explains why LLMs produce composite (spliced) results, diagnoses the root cause as information loss in the RAG pipeline, and presents a step‑by‑step solution that restores atomicity of records for reliable duplicate detection.

AI debugging · Knowledge Base · LLM

0 likes · 14 min read

JD Tech

Aug 20, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy with J‑Schema, Iterative DPO, and Self‑Consistency

This article examines the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, applies iterative DPO training and self-consistency voting, and demonstrates how these techniques raise execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD benchmark · Iterative DPO · J-Schema

0 likes · 11 min read

Alibaba Cloud Big Data AI Platform

Aug 20, 2025 · Artificial Intelligence

How DeepSearch Elevates RAG: From RAG 1.0 to a Multi‑Agent AI Search Engine

This article explains how the LLM edition of Alibaba Cloud OpenSearch evolved from RAG 1.0 to RAG 2.0, introducing the DeepSearch multi‑agent architecture that combines offline data processing and online query handling with planning, clarification, search, and summarization agents to answer complex questions more accurately.

AI search · DeepSearch · LLM

0 likes · 10 min read

Volcano Engine Developer Services

Aug 20, 2025 · Artificial Intelligence

What Is Vibe Coding? Exploring the AI‑Driven Programming Paradigm

This article introduces Vibe Coding, an AI‑centric programming paradigm introduced by Andrej Karpathy that replaces traditional code‑centric development with natural‑language‑driven interaction: developers act as product‑focused guides while large language models generate the code. It also covers tools, workflows, benefits, challenges, and future trends.

AI Coding · LLM · Vibe Coding

0 likes · 26 min read

Data Party THU

Aug 20, 2025 · Artificial Intelligence

How Large-Scale Corpus Rewriting is Shaping LLM Training: A Deep Dive into K2, WRAP, and Beyond

This article surveys recent large‑scale corpus rewriting techniques for LLM pre‑training, covering K2’s token‑utilization strategies, domain‑specific methods like SwallowMath/Code, reStructured pretraining, the WRAP pipeline, Nemotron‑CC filtering, Pro‑X noise removal, and the MAGA multi‑style expansion, while highlighting challenges, experimental findings, and open research questions.

LLM · corpus rewriting · data synthesis

0 likes · 20 min read

Rare Earth Juejin Tech Community

Aug 20, 2025 · Artificial Intelligence

How to Build an AI-Powered Code Review System with Node.js and LLMs

This article walks through designing and implementing an AI code review tool using Node.js, GitLab webhooks, and large language models, covering prompt engineering, diff augmentation, token management, response parsing, and automated comment posting to streamline the review process.

AI · Code review · GitLab

0 likes · 25 min read

Instant Consumer Technology Team

Aug 19, 2025 · Artificial Intelligence

Mastering Document Chunking for RAG: Strategies, Code & Best Practices

This article explores why proper document chunking is crucial for Retrieval‑Augmented Generation, explains core concepts like context windows and signal‑to‑noise, compares various chunking strategies—from simple fixed‑size splits to semantic and hybrid approaches—and provides practical Python code examples to help you build more effective RAG pipelines.

LLM · RAG · Text Splitting

0 likes · 24 min read

Volcano Engine Developer Services

Aug 19, 2025 · Artificial Intelligence

How to Strengthen LLM System Prompts for Safer AI Agents

This guide explains how to reinforce system prompts for AI agents by optimizing their content and structure, using active defense, role‑based, and format constraints, providing practical examples, measurement methods, and experimental results that demonstrate up to 90% reduction in unsafe behavior.

AI Safety · LLM · System Prompt

0 likes · 13 min read

Data Party THU

Aug 19, 2025 · Artificial Intelligence

Why RL Fine‑Tuning Fails to Extend LLM Reasoning Limits: Entropy Collapse Explained

This article examines how reinforcement learning fine‑tuning influences large language model reasoning, revealing that RL primarily amplifies pre‑trained capabilities, suffers from entropy collapse, and fails to push the model’s reasoning boundary, supported by extensive experiments on scaling laws, entropy analysis, and mitigation techniques.

LLM · RL · RLVR

0 likes · 24 min read

Tencent Cloud Developer

Aug 19, 2025 · Artificial Intelligence

Demystifying LLMs: From Transformers to Agents, Prompts, and Function Calling

This article explains the fundamentals of large language models, covering transformer self‑attention, prompt engineering, API usage with temperature and tool parameters, function calling, agent architectures, the Model Context Protocol (MCP), Agent‑to‑Agent (A2A) communication, and future AI programming roles.

A2A · AI agents · Function Calling

0 likes · 11 min read

Kuaishou Tech

Aug 18, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO Optimization

The Klear‑Reasoner model, built on Qwen3‑8B‑Base and powered by the novel Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm, surpasses same‑size open‑source baselines on challenging math (AIME) and code (LiveCodeBench) benchmarks, while revealing key insights on data quality, reward design, and clipping strategies for large‑language‑model reasoning.

GPPO · LLM · Reinforcement Learning

0 likes · 11 min read

Data Party THU

Aug 17, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Unpacking the Probabilistic Roots and Fixes

Large language models often generate confident but false statements—a phenomenon called hallucination—because they predict the next token based on statistical patterns rather than factual understanding, and this article explains the underlying mechanisms and practical mitigation strategies.

LLM · RLHF · Retrieval Augmented Generation

0 likes · 11 min read

Qborfy AI

Aug 16, 2025 · Artificial Intelligence

Mastering LLM Tokens: How They Work, What They Cost, and How to Choose the Right Model

This article explains what tokens are in large language models, how they are counted and priced, compares tokenization methods across major models, and provides practical guidelines and code examples for optimizing token usage and selecting the appropriate model for different scenarios.

AI · Cost Optimization · LLM

0 likes · 8 min read

DaTaobao Tech

Aug 15, 2025 · Mobile Development

How to Eliminate Text Lag in iOS LLM Chat Apps with Smart Buffering and Typewriter Animation

This article explains how to eliminate stuttering text output in iOS chat applications powered by local LLMs running on the MNN framework. It introduces a three‑layer optimization (smart stream buffering, UI update throttling with batch processing, and a typewriter‑style animation) that makes on‑device output feel nearly as smooth as an online service.

LLM · MNN · Swift

0 likes · 16 min read

Instant Consumer Technology Team

Aug 15, 2025 · Artificial Intelligence

Why Building Enterprise AI Agents Feels Like Building a Distributed Brain

An engineer recounts the hard‑earned lessons from moving beyond RAG to enterprise‑level AI agents, exposing three critical challenges—scheduling, memory management, and tool integration—and proposes architectural patterns that turn fragile prototypes into robust, observable, and secure AI systems.

AI agents · Agentic Engineering · Enterprise AI

0 likes · 9 min read

Baobao Algorithm Notes

Aug 15, 2025 · Artificial Intelligence

Unlocking LLM Performance: Classic Deep RL Tricks Reimagined for Modern Training

This article systematically adapts classic deep reinforcement‑learning techniques—such as multi‑step returns, TD(λ)/GAE, V‑trace corrections, uncertainty‑aware weighting, safety constraints, distribution‑robust optimization, and value‑guided decoding—to improve large language model training and inference, providing concrete formulas, implementation tips, and empirical results.

Deep RL · GAE · LLM

0 likes · 17 min read

Alibaba Cloud Developer

Aug 15, 2025 · Artificial Intelligence

Mastering AI Agents: Prompt Engineering, Workflows, and RAG Strategies

This article systematically explains how to build reliable, high‑performance AI agents by focusing on the core components—LLM, prompts, workflows, RAG, and tools—while covering prompt engineering techniques, DSL‑based workflow design, vector‑database knowledge bases, security against prompt injection, and practical project planning.

AI Agent · LLM · RAG

0 likes · 15 min read

Tencent Technical Engineering

Aug 14, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Causes, Risks, and Multi‑Dimensional Solutions

This article systematically examines the root causes of hallucinations in large language models, evaluates their pros and cons, and presents a comprehensive set of optimization techniques—including prompt engineering, RAG, sampling tweaks, supervised fine‑tuning, LoRA, RLHF, chain‑of‑thought reasoning, and agent/workflow designs—to build more reliable and trustworthy AI applications.

AI · LLM · LoRA

0 likes · 29 min read

Data Party THU

Aug 14, 2025 · Artificial Intelligence

How FilterLLM Turns One LLM Pass into Billion‑User Cold‑Start Recommendations

The article analyzes the FilterLLM approach, which augments a frozen LLM with billions of learnable user tokens to predict a full‑user interaction probability distribution in a single forward pass, dramatically speeding up cold‑start recommendation while preserving recommendation quality across multiple benchmarks.

AI · FilterLLM · LLM

0 likes · 8 min read

Huolala Tech

Aug 14, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Natural Language to SQL for Intelligent Data Queries

This article explores how large language models break the natural‑language‑to‑SQL barrier, outlines the challenges of NLP‑driven data retrieval, compares Text2SQL and Text2DSL approaches, and proposes a unified data service and metric platform to power enterprise‑grade ChatBI solutions.

AI · ChatBI · LLM

0 likes · 22 min read

JD Cloud Developers

Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article presents a comprehensive study on improving Text-to-SQL performance by introducing J‑Schema for structured schema representation, applying iterative Direct Preference Optimization (DPO) training, and leveraging self‑consistency voting mechanisms, achieving up to a 12% accuracy gain on the BIRD benchmark.

Database QA · Iterative DPO · J-Schema

0 likes · 10 min read

JD Retail Technology

Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article surveys the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, details an iterative DPO training pipeline with hyper‑parameter tuning, and demonstrates how self‑consistency voting boosts execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD dataset · Iterative DPO · LLM

0 likes · 14 min read

Youzan Coder

Aug 13, 2025 · Artificial Intelligence

Understanding AI Agents: Core Modules, Planning Strategies, and Evaluation

This article explains what an AI agent is, outlines its four core modules—perception, memory, planning, and action—describes the role of large language models, compares software development generations, discusses memory implementations, planning methods like ReAct and Plan‑and‑Solve, and covers evaluation, cost analysis, and differences between agents and workflows.

AI · Agent · LLM

0 likes · 15 min read

Zhongtong Tech

Aug 13, 2025 · Artificial Intelligence

Unlock Seamless AI‑Tool Interaction with the Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open‑source interface that standardizes how large language models interact with external data sources and tools, offering a USB‑C‑like universal connector for AI applications, with built‑in session management, security, and flexible HTTP/SSE transport for seamless real‑world integration.

AI integration · LLM · MCP

0 likes · 7 min read

Data Party THU

Aug 12, 2025 · Artificial Intelligence

How SWE‑Swiss Enables a 32B Model to Match Larger LLMs on Software Engineering Tasks

Researchers from Peking University, ByteDance Seed, and Hong Kong University present SWE‑Swiss, a 32‑billion‑parameter model that, through a two‑stage training recipe and enhanced self‑consistency, achieves 60.2% accuracy on SWE‑bench Verified, matching larger models while remaining fully open‑source.

LLM · SWE‑Swiss

0 likes · 8 min read

Data Party THU

Aug 12, 2025 · Artificial Intelligence

Unlocking Chain-of-Thought: How AI Reasoning Boosts Accuracy Across Domains

Chain‑of‑Thought (CoT) enables large language models to solve complex tasks by breaking problems into sequential reasoning steps, improving accuracy in mathematics, commonsense, code generation, business strategy, and medical diagnosis, while highlighting its principles, advantages, challenges, and future prospects.

Chain-of-Thought · LLM · Prompt Design

0 likes · 13 min read

Qborfy AI

Aug 12, 2025 · Artificial Intelligence

What Powers Large Language Models? A Deep Dive into LLM Architecture and Scaling

This article explains how massive Transformer‑based large language models compress text data into mathematical representations, why scale, self‑attention, and training paradigms enable emergent general intelligence, and walks through tokenization, embedding, multi‑layer attention, architecture choices, energy costs, and hallucination mitigation.

AI · Embedding · LLM

0 likes · 6 min read

Huolala Tech

Aug 12, 2025 · Information Security

Can AI Boost Traditional SAST to Detect Complex Logic Bugs?

This article explores a hybrid approach that combines traditional static application security testing (SAST) with large language models (LLM) to automatically detect business‑logic vulnerabilities, detailing the methodology, implementation stages, experimental results, and the challenges of integrating AI into code security analysis.

AI · LLM · SAST

0 likes · 15 min read

Liangxu Linux

Aug 11, 2025 · Artificial Intelligence

Four Must‑Try Open‑Source AI Tools: Gemini CLI, XiaoZhi Bot, AI Hub, GPT‑Pilot

This article introduces four notable open‑source AI projects—Google's Gemini CLI, the voice‑interactive XiaoZhi chatbot, the comprehensive AI Engineering Hub, and the GPT‑Pilot programming companion—detailing their key features, generous free quotas, star counts, supported hardware, and providing direct GitHub repository links for each.

AI · Chatbot · Gemini CLI

0 likes · 5 min read

Alibaba Cloud Developer

Aug 11, 2025 · Artificial Intelligence

How Fine‑Tuning Large Models Solves Code Upgrade Challenges and Boosts Stable Module Matching

This article details an innovative approach that uses large‑model supervised fine‑tuning to overcome the instability of code RAG and code agents during open‑source repository upgrades, addressing domain‑specific terminology, code style differences, and improving recall, accuracy, and deployment efficiency.

AI agents · Fine-tuning · LLM

0 likes · 11 min read

Data Party THU

Aug 11, 2025 · Artificial Intelligence

What Sets the Latest LLMs Apart? A Deep Dive into V3, OLMo, Gemma, Mistral, Llama 4 and More

This article systematically compares the architectures of recent large language models (DeepSeek V3/R1, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen 3, SmolLM 3 and Kimi K2), highlighting innovations such as MLA, MoE, post‑norm, sliding‑window attention, NoPE and optimizer choices, with diagrams and code examples to illustrate their impact on efficiency and performance.

Comparison · LLM · MLA

0 likes · 12 min read

AI Large Model Application Practice

Aug 11, 2025 · Artificial Intelligence

How to Build an LLM-Powered Smart Resume Screening System

This article presents a detailed design and implementation of an LLM‑based intelligent resume matching system that combines semantic vector retrieval, structured rule filtering, multi‑dimensional weighted scoring, and natural‑language interaction to create a fast, quantifiable, and explainable hiring pipeline.

AI Recruitment · LLM · RAG

0 likes · 18 min read

Wuming AI

Aug 11, 2025 · Industry Insights

Why LLMs Overthink and How Developers Can Control Inference Depth

Developers notice that large language models often enter an "overthinking" mode that slows down simple coding tasks, prompting calls for adjustable inference depth controls so models can switch between quick checks and deep analysis based on task risk level.

AI usability · Developer Experience · LLM

0 likes · 5 min read

Data Party THU

Aug 10, 2025 · Artificial Intelligence

Can Evolutionary Algorithms Auto-Design Training-Free Vision-Language Model Adaptations?

This study introduces EvoVLMA, an evolutionary vision-language model adaptation framework that automatically searches training-free VLM adaptation algorithms using a two-stage LLM-guided evolution, demonstrating superior performance, such as a 1.91% accuracy gain on 8-shot image classification, and releasing the code publicly.

Evolutionary Algorithms · LLM · Model Adaptation

0 likes · 5 min read

Data Party THU

Aug 10, 2025 · Artificial Intelligence

Can LLMs Predict Multiple Tokens at Once? A Deep Dive into Multi‑Token Generation

This article evaluates whether autoregressive large language models can generate several tokens in a single inference step, describing a mask‑based multi‑token prediction framework, gated LoRA adaptation, experimental results on Tulu‑3‑8B showing up to 5.2× speedup, and discusses implications for future research.

AI efficiency · LLM · Multi-token generation

0 likes · 13 min read

Sohu Smart Platform Tech Team

Aug 9, 2025 · Artificial Intelligence

Deploying Large Language Models Offline on Mobile Devices: A Practical Guide

This article explains the challenges of running large language models on mobile devices, reviews recent industry efforts, and provides a step‑by‑step guide—including code snippets—for integrating a distilled GPT‑2 model with Sohu's Hybrid AI Engine using TensorFlow Lite and Keras‑NLP for on‑device inference.

Hybrid AI · Keras · LLM

0 likes · 10 min read

Alibaba Cloud Big Data AI Platform

Aug 8, 2025 · Artificial Intelligence

Can GitOps Power Low‑Cost LLM Agents? A Hands‑On Exploration

This article examines how the Manus sandbox and CodeAct mechanisms inspire a GitOps‑based approach to building LLM agents, detailing the design of planner and executor components, workflow steps, advantages such as RAG and observability, and the potential for low‑cost, scalable intelligent agent development.

AI agents · GitOps · Intelligent agents

0 likes · 12 min read

Tencent Technical Engineering

Aug 8, 2025 · Artificial Intelligence

Which Open‑Source Deep‑Research Agent Framework Is Best? A Comprehensive Comparison

This article systematically compares major open‑source deep‑research agent frameworks—including DeerFlow, SmolAgents, LangChainAI, SkyworkAI, and Researcher—detailing their architectures, best practices, and commercial alternatives, to help developers and users choose the most suitable tool for automated research workflows.

AI automation · Deep Research · LLM

0 likes · 27 min read

Tencent Cloud Developer

Aug 8, 2025 · Artificial Intelligence

Mastering AI Agents: A Practical Guide to Building Effective Workflows and Tools

This comprehensive guide explains when to use AI agents, presents core design patterns such as prompt chains, routing, parallelization, orchestrator‑worker and eval‑optimize loops, and offers concrete implementation advice and tool‑prompt engineering techniques for building reliable, high‑quality agent systems.

LLM · prompt engineering · tool engineering

0 likes · 24 min read

Amap Tech

Aug 7, 2025 · Artificial Intelligence

Boosting Codebase Upgrades with Code RAG and Agent‑Driven Fine‑Tuning

This article describes how the Gaode terminal team tackled large‑scale repository upgrades by building a code‑RAG and code‑Agent tool, addressing recall and stability issues, then fine‑tuning a small LLM (Qwen3‑4B) with LoRA and custom datasets to achieve reliable, low‑cost, on‑device code‑query performance.

Code Agent · LLM · LoRA

0 likes · 11 min read

Full-Stack Cultivation Path

Aug 7, 2025 · Artificial Intelligence

Getting Started with MCP: From Core Concepts to Building Server and Client

This article explains why the Model Context Protocol (MCP) is needed for LLMs, describes its client‑server architecture, data and transport layers, and provides step‑by‑step Python examples for creating both an MCP server and a client using FastMCP and low‑level APIs.

LLM · MCP · Model Context Protocol

0 likes · 18 min read

Alibaba Cloud Big Data AI Platform

Aug 6, 2025 · Artificial Intelligence

How to Build an Enterprise‑Grade Vector Search Q&A Bot with Milvus and n8n

This article explains how to combine Alibaba Cloud Milvus, a high‑performance vector database, with the low‑code workflow platform n8n to create an enterprise‑level, domain‑specific intelligent Q&A system, covering challenges, architecture, setup, workflow configuration, and verification steps.

LLM · Milvus · n8n

0 likes · 18 min read

Architect's Alchemy Furnace

Aug 4, 2025 · Artificial Intelligence

How RAG and Long‑Term Memory Turn AI into a Truly Remembering Assistant

This article explains how Retrieval‑Augmented Generation (RAG) and long‑term memory systems like Memobase enable large language models to overcome short‑term memory limits, dynamically retrieve up‑to‑date knowledge, and personalize interactions, with practical Dify implementation steps and real‑world use cases across industries.

AI · Dify · Knowledge Base

0 likes · 18 min read

Go Programming World

Aug 1, 2025 · Fundamentals

Should Go Developers Learn Python? Avoid the Syntax‑Translation Pitfall

In the AI‑driven era, Go programmers face a dilemma about mastering Python, and this article explains when Python is essential, how to prevent naïve code translation, and provides concrete examples of writing truly Pythonic code versus direct Go‑style conversions.

AI · Code Translation · Go

0 likes · 7 min read

Ctrip Technology

Aug 1, 2025 · Artificial Intelligence

How Semantic Search Transforms Hotel Booking: From Entity Recognition to Vector Retrieval

This article explores how Ctrip leverages advanced AI techniques—including named entity recognition, entity linking, large language models, and vector search—to replace traditional keyword queries with semantic search, dramatically improving hotel recommendation accuracy and user experience.

AI · LLM · Vector Retrieval

0 likes · 14 min read

Baidu Maps Tech Team

Jul 31, 2025 · Artificial Intelligence

How Baidu’s AI Voice Assistant Turns Speech into Precise Navigation Commands

This article explains how the Baidu Maps AI voice assistant converts spoken commands into precise navigation actions by detailing the speech‑to‑text pipeline, intent parsing, template and generative approaches, tool‑calling mechanisms, memory and reflection capabilities, and future directions for intelligent agents.

AI · Intent Parsing · LLM

0 likes · 14 min read

Data Party THU

Jul 31, 2025 · Industry Insights

How mini‑SWE‑agent Solves 65% of SWE‑bench Bugs with Only 100 Lines of Code

The mini‑SWE‑agent, a lightweight open‑source software‑engineering AI built by the original SWE‑bench team, achieves about 65% bug‑fix success on the SWE‑bench benchmark using roughly 100 lines of Python, thanks to its minimal dependencies, shell‑based execution, linear history, and support for various container environments, offering a fast, extensible alternative to the full‑featured SWE‑agent.

AI Agent · LLM · SWE-bench

0 likes · 8 min read

Alibaba Cloud Developer

Jul 31, 2025 · Artificial Intelligence

Why Post‑Training Matters: Scaling Laws, Fine‑Tuning, and RL Strategies for LLMs

This article explores the importance of post‑training for large language models, explains scaling laws for pre‑ and post‑training, details common fine‑tuning methods (full, PEFT, LoRA), outlines alignment techniques such as RLHF, DPO, PPO, and presents practical workflows using Llama 3 and DeepSeek‑R1, while also discussing test‑time reasoning optimizations.

Alignment · Fine-tuning · LLM

0 likes · 19 min read

Data Party THU

Jul 30, 2025 · Artificial Intelligence

When Metrics Mislead: Uncovering Simpson’s, Accuracy, and Goodhart Paradoxes in LLMs

The article examines three classic paradoxes—Simpson’s paradox, the accuracy paradox, and Goodhart’s law—showing how they arise in business intelligence and large language model contexts, and offers practical guidelines to detect and mitigate their misleading effects on data‑driven decisions.

Goodhart's law · LLM · Metrics

0 likes · 12 min read

AsiaInfo Technology: New Tech Exploration

Jul 30, 2025 · Artificial Intelligence

How MCP‑RAG Overcomes Prompt Inflation for Massive LLM Service Calls

This article analyzes the prompt‑inflation bottleneck that arises when large language models (LLMs) must handle thousands of Model Context Protocol (MCP) services, and introduces the MCP‑RAG architecture—a retrieval‑augmented generation solution that builds a metadata knowledge base and intelligent retrieval layer to enable precise, efficient MCP service discovery at scale.

AI · LLM · MCP

0 likes · 21 min read

Java Architecture Diary

Jul 30, 2025 · Artificial Intelligence

What’s New in LangChain4j 1.2.0? Key AI Features and Enhancements

LangChain4j 1.2.0 introduces a suite of stable modules, advanced inference and thinking capabilities, streaming tool calls, and extensive AI service enhancements, offering developers finer control, lower latency, and richer debugging for LLM‑driven applications.

AI · Inference · LLM

0 likes · 7 min read

Ops Development Stories

Jul 29, 2025 · Artificial Intelligence

Master AI Agents with LangGraph: Build Adaptive RAG, Translation, and ReAct Agents

This comprehensive guide explains what an AI Agent is, its core capabilities and design patterns, and walks through step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph, complete with code samples, workflow diagrams, and practical tips for building personal ops knowledge‑base agents.

LLM · LangGraph · RAG

0 likes · 64 min read

Ops Development & AI Practice

Jul 29, 2025 · Artificial Intelligence

Building a Retrieval‑Augmented Generation QA Bot to Keep LLMs Up‑to‑Date

This article explains how to create a RAG‑based intelligent QA system that fetches the latest documentation (e.g., PlantUML) before querying Gemini, detailing knowledge‑base creation, embedding, vector store management, LangChain integration, and deployment tips.

AI Assistant · Embedding · Gemini

0 likes · 8 min read

Alibaba Cloud Developer

Jul 29, 2025 · Artificial Intelligence

How to Transform Chaotic AI Prompts into Robust System Designs

This article examines the pitfalls of rule‑heavy prompt engineering, introduces a systematic four‑layer architecture for AI prompts, outlines six practical compilation principles, and demonstrates how to rewrite a tangled prompt into a clear, maintainable, and scalable system blueprint.

AI Architecture · LLM · System Design

0 likes · 84 min read

Data Thinking Notes

Jul 27, 2025 · Databases

How Dify Turns Natural Language into SQL: Building Scalable Text2SQL Apps

This article explains how Text2SQL technology converts natural language queries into executable SQL using large language models, and demonstrates how the open‑source Dify platform’s visual workflow and component‑based development dramatically lower the barrier for building, validating, and deploying secure, low‑code Text2SQL applications.

AI · Data-driven · Dify

0 likes · 13 min read

Architecture and Beyond

Jul 27, 2025 · Artificial Intelligence

Why Context Engineering Is the Secret to Powerful AI Agents

This article explains how AI agents work through perception, planning, and action, describes the four supporting systems—memory, tools, safety, and evaluation—and shows how the evolution from prompt engineering to context engineering, with strategies like selective saving, retrieval, compression, and modularization, addresses the core challenges of managing large‑scale context for reliable, efficient agent performance.

AI agents · Context Engineering · LLM

0 likes · 17 min read

Full-Stack Cultivation Path

Jul 26, 2025 · Artificial Intelligence

Step-by-Step Local Deployment Guide for Coze Studio: Launch Your Low-Code AI Agent Development

This article provides a comprehensive, hands‑on tutorial for installing Ollama, Docker, and the open‑source Coze Studio on a local machine, configuring various LLM services such as Qwen 3, DeepSeek‑V3, and OpenRouter, and running the platform via Docker Compose to create and test AI agents.

Coze Studio · Docker · LLM

0 likes · 7 min read

DaTaobao Tech

Jul 23, 2025 · Artificial Intelligence

How Alibaba’s New Distributed Agent Framework Solves 2C AI Challenges

Alibaba introduces the ali‑langengine‑dflow framework, a hybrid distributed‑agent architecture that moves core intelligence to the cloud while keeping execution reachable on heterogeneous client devices, addressing data‑isolation, latency and security issues of existing cloud‑VM and local‑agent solutions for 2C internet services.

AI · Agent · Distributed Systems

0 likes · 21 min read

FunTester

Jul 23, 2025 · Artificial Intelligence

Mastering Prompt Iteration: A Step‑by‑Step Guide to Effective LLM Collaboration

This article explains why a perfect answer from a large language model requires iterative prompt design, outlines a six‑step spiral loop for refining prompts, and offers practical tips such as starting with a minimal prompt, focusing on one improvement at a time, and preserving version history.

Artificial Intelligence · Iterative Design · LLM

0 likes · 5 min read

Go Programming World

Jul 23, 2025 · Artificial Intelligence

Directing Code with AI: How Vibe Coding Turns Natural Language into Software

Vibe Coding, introduced by Andrej Karpathy in 2025, lets developers describe software goals in natural language while large language models generate the code, reshaping the developer’s role, outlining the workflow, discussing tools, risks, and future prospects of this AI‑driven programming paradigm.

AI-driven development · LLM · Vibe Coding

0 likes · 6 min read

Code Mala Tang

Jul 22, 2025 · Artificial Intelligence

Convert Any PDF to Clean Markdown with a Local LLM (Gemma 3)

Learn how to transform any PDF—including scanned documents—into well‑structured Markdown using a local LLM (Gemma 3 via Ollama), Python, PyMuPDF and Pillow, without cloud APIs or API keys, by converting pages to images, prompting the model, and saving the output.

Gemma · LLM · Ollama

0 likes · 12 min read
