Tagged articles
2014 articles
BirdNest Tech Talk
Nov 18, 2025 · Industry Insights

A Practical Guide to Major LLM Services: URLs, Docs, and API Tips

This article compiles the entry points, documentation links, pricing details, and hands‑on API examples for several leading large‑language‑model providers—including DeepSeek, Alibaba Cloud, Baidu Qianfan, ByteDance Volcengine, OpenRouter, and Google Gemini—while comparing their usability, free‑tier offers, and developer experience.

API · Cloud AI · Comparison
0 likes · 13 min read
Wu Shixiong's Large Model Academy
Nov 18, 2025 · Artificial Intelligence

How to Make LLM Agents’ Function Calls Stable and Accurate: 5 Proven Strategies

This article breaks down why function‑call reliability is the biggest bottleneck for LLM agents and presents a systematic five‑step loop—schema quality, prompt context, sampling, training data, and runtime defenses—plus concrete optimization techniques such as dynamic tool routing, plan‑execute, validation layers, memory injection, and log‑driven tuning, illustrated with real‑world cases.

LLM · Tool Routing · agent
0 likes · 12 min read
JakartaEE China Community
Nov 18, 2025 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation (RAG) System with LangChain4j and Ollama 3

This article explains why Retrieval‑Augmented Generation improves LLM accuracy, outlines the key LangChain4j and Ollama 3 components, and provides a step‑by‑step Java example (including Maven setup, document ingestion, embedding, similarity search, prompt creation, and response generation) to demonstrate a functional RAG pipeline.

Embedding · LLM · LangChain4j
0 likes · 8 min read
Wu Shixiong's Large Model Academy
Nov 14, 2025 · Artificial Intelligence

How to Engineer Reliable Function Calls for LLM Agents: An End‑to‑End Framework

This article explains why function‑call accuracy is critical for LLM agents, identifies four common failure causes, and presents a systematic, five‑step engineering framework—including dynamic routing, chain‑of‑thought planning, result validation, memory injection, and log‑driven optimization—backed by concrete examples and quantitative improvements.

Function Calling · Interview Preparation · LLM
0 likes · 10 min read
Programmer DD
Nov 14, 2025 · Artificial Intelligence

Can TOON Format Cut LLM Token Costs by Up to 60%?

This article explains how the TOON data‑serialization format reduces token usage and improves accuracy for large language model calls compared with traditional JSON, provides benchmark results, outlines scenarios where TOON is advantageous or unsuitable, and shows Java integration examples.

LLM · TOON · Token Optimization
0 likes · 6 min read
AI Tech Publishing
Nov 13, 2025 · Artificial Intelligence

Claude’s Prompt Engineering Best Practices: A Step‑by‑Step Guide

This guide outlines the Claude team’s best practices for prompt engineering, covering core techniques such as clear instructions, background context, specificity, examples, and advanced methods like pre‑filled responses, chain‑of‑thought, output formatting, and prompt chaining, with concrete examples and code snippets.

AI prompting · Chain-of-Thought · Claude
0 likes · 18 min read
Wu Shixiong's Large Model Academy
Nov 12, 2025 · Artificial Intelligence

Agent Memory Modules Explained: Short‑Term vs Long‑Term Strategies for LLM Agents

This article breaks down the memory systems behind LLM‑based agents, explaining why persistent memory is needed, the differences between short‑term context buffers and long‑term vector stores, practical implementation choices, maintenance strategies, and how to articulate these concepts effectively in technical interviews.

LLM · agent · retrieval
0 likes · 14 min read
Alibaba Cloud Developer
Nov 12, 2025 · Artificial Intelligence

How Self‑Programming AI Agents Are Built: From LLM Brain to Dynamic Code Execution

This article explains how a self‑programming AI Agent is constructed by extending large language models as the brain, designing a multi‑area architecture, implementing memory layers, prompt engineering with segment mechanisms, and enabling code generation and execution through a Python‑Java bridge, while sharing practical insights and future directions.

AI Agent · Code Execution · LLM
0 likes · 34 min read
HyperAI Super Neural
Nov 11, 2025 · Artificial Intelligence

How DeepSeek-OCR Achieves SOTA Using Ultra‑Low Visual Token Counts

DeepSeek-OCR leverages a visual‑compression approach, combining DeepEncoder and the DeepSeek3B‑MoE‑A570M decoder, to represent document text with far fewer visual tokens, achieving up to 97% OCR accuracy and surpassing GOT‑OCR2.0 and MinerU2.0 on OmniDocBench. The article also offers a one‑click deployment tutorial.

DeepEncoder · LLM · OCR
0 likes · 6 min read
Old Meng AI Explorer
Nov 10, 2025 · Mobile Development

How Cactus Turns Any Smartphone into a Powerful Offline AI Assistant

Cactus is a lightweight, open‑source mobile AI framework that runs large language models locally on iOS and Android without internet, offering chat, image recognition, and text‑to‑speech while consuming low resources, supporting older phones, and providing simple demo apps and Flutter integration for developers.

AI · Flutter · LLM
0 likes · 10 min read
Data Party THU
Nov 9, 2025 · Artificial Intelligence

Mastering Chunking Strategies for Effective RAG: Fixed, Recursive, Semantic, Structured, and Delayed

This article walks through the core RAG pipeline, explains why chunking is the linchpin of retrieval quality, and provides detailed definitions, trade‑offs, and implementation examples for five chunking techniques—fixed, recursive, semantic, structure‑aware, and delayed—so you can choose the right approach for any document‑heavy AI application.

AI · LLM · RAG
0 likes · 10 min read
DataFunSummit
Nov 8, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World AI Solutions with RAG and Agents

This article examines Tencent's large language model deployments across diverse business scenarios, detailing core use cases such as content generation, intelligent customer service, and role‑playing, while deep‑diving into the RAG, GraphRAG, and Agent technologies that enable smarter, more reliable AI applications.

AI · LLM · RAG
0 likes · 4 min read
AI Product Manager Community
Nov 8, 2025 · Artificial Intelligence

Why Prompt Engineering Fails: Embracing Context Engineering for Smarter LLMs

The article explains that prompt engineering alone cannot guarantee reliable AI responses because models lack situational awareness, and introduces context engineering as a systematic approach that structures memory, manages context flow, and integrates RAG and evaluation to make large language models truly useful in real‑world applications.

AI · Context Engineering · LLM
0 likes · 7 min read
21CTO
Nov 7, 2025 · Artificial Intelligence

any-llm 1.0: Seamlessly Switch Between Cloud and Local LLMs with One Python Library

Mozilla.ai's any-llm v1.0 is an open‑source Python library that unifies access to multiple large language model providers, enabling developers to move between cloud‑based and on‑premise LLMs without rewriting code, while offering async‑first APIs, reusable connections, and extensive compatibility features.

AI Development · LLM · Python
0 likes · 4 min read
DataFunSummit
Nov 7, 2025 · Artificial Intelligence

How Close Are Agents to AGI? Insights from Experiments and Benchmarks

Through a series of experiments, benchmark analyses, and theoretical discussions, this article explores the limits of current AI agents, their underlying mechanisms, performance gaps to human-level intelligence, and the challenges that remain on the path from agents to true AGI.

AGI · LLM · Prompt engineering
0 likes · 26 min read
DataFunSummit
Nov 7, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Content Creation, Smart Service, and Game NPCs

This article examines Tencent’s large language model deployments across content generation, intelligent customer service, and game role‑playing, and explains the underlying technologies—Supervised Fine‑Tuning, Retrieval‑Augmented Generation, and Agent systems—highlighting how they enhance performance, explainability, and multi‑step reasoning in real‑world business scenarios.

AI · LLM · RAG
0 likes · 4 min read
Ele.me Technology
Nov 7, 2025 · Artificial Intelligence

LLM‑SM Hybrid Strategies: Boosting Decision Optimization and Store Design

Recent advances in large language models (LLMs) have sparked interest in their decision‑making capabilities, yet challenges remain; this article explores classic prediction‑optimization pipelines, introduces emerging LLM‑as‑Predictor/Ranker/Optimizer paradigms, and details practical case studies on delivery‑price optimization and intelligent store‑decoration recommendation using LLM‑SM hybrid systems.

Decision Optimization · Hybrid Modeling · LLM
0 likes · 30 min read
JD Tech
Nov 6, 2025 · Artificial Intelligence

LLMs Revolutionize Recommendation Systems: From Generative Models to Production

This article surveys the evolution of generative recommendation systems powered by large language models, detailing their technical foundations, engineering challenges, recent breakthroughs, and future research directions, while highlighting why the paradigm shift is occurring now.

AI Engineering · Generative Recommendation · LLM
0 likes · 30 min read
Tencent Cloud Developer
Nov 6, 2025 · Artificial Intelligence

From Prompt to Multi‑Agent: How LLMs Evolve into Autonomous Agents

Since ChatGPT's debut, the LLM landscape has progressed through four stages (prompt engineering, chain orchestration, autonomous agents, and multi‑agent systems), each enhancing intelligence and automation. This article details their evolution, advantages, and drawbacks, with practical implementation examples in Go.

Go · LLM · Multi-Agent
0 likes · 24 min read
AI Tech Publishing
Nov 5, 2025 · Artificial Intelligence

Why AI Agents Should Be Positioned as Assistants, Not Replacements

The article explains that marketing AI agents as human replacements leads to poor performance, professional resistance, and hallucination risks, and argues that repositioning them as assistants with human‑in‑the‑loop verification improves efficiency and acceptance.

AI Agent · BI Engineer · Data Agent
0 likes · 3 min read
Kuaishou Tech
Nov 5, 2025 · Artificial Intelligence

How HiPO Gives LLMs a Smart Thinking Switch to Cut Costs and Boost Accuracy

This article explains the overthinking problem of large language models, introduces the HiPO framework with hybrid data cold‑start and reinforcement‑learning reward mechanisms that let models decide when to think deeply or answer directly, and shows experimental results demonstrating significant efficiency gains and accuracy improvements across multiple benchmarks.

Hybrid Policy Optimization · LLM · Reinforcement Learning
0 likes · 13 min read
Data Party THU
Nov 5, 2025 · Artificial Intelligence

How to Give LLM Agents Memory, Reflection, and Goal Tracking

This article explains why current LLM agents lose context after each conversation and presents a practical architecture—using SQLite for structured storage, a vector database for semantic retrieval, and LLM‑driven reflection—to add persistent memory, self‑evaluation, and goal‑tracking capabilities that turn agents into learning partners.

Goal Tracking · LLM · Memory
0 likes · 10 min read
Code Mala Tang
Nov 5, 2025 · Backend Development

How to Build a Production-Ready Async LLM API with FastAPI

Learn how to design and deploy a high‑performance, production‑grade LLM API using FastAPI, covering async routing, type‑safe Pydantic models, streaming via SSE/WebSockets, middleware, caching, rate limiting, observability, retries, and cost‑control strategies for robust AI services.

Async · FastAPI · LLM
0 likes · 12 min read
Wu Shixiong's Large Model Academy
Nov 5, 2025 · Artificial Intelligence

Why Production-Ready RAG Is Ten Times Harder Than a Simple Demo

Building a Retrieval‑Augmented Generation (RAG) system may be straightforward in code, but making it reliable, accurate, and scalable in production involves challenges across data preparation, vector retrieval, query rewriting, generation control, and system integration, turning a demo into a truly useful AI service.

AI · LLM · Prompt engineering
0 likes · 8 min read
JavaGuide
Nov 5, 2025 · Artificial Intelligence

Cursor Goes Beyond the IDE with Agent Mode and Its Own Composer LLM

Cursor, once hailed as the leading AI‑enhanced IDE, has shifted its focus by making Agent mode the default and launching its own large model, Composer, which the vendor claims runs four times faster than comparable models, though real‑world performance remains to be validated.

AI IDE · Claude · Codex
0 likes · 4 min read
AI2ML AI to Machine Learning
Nov 4, 2025 · Artificial Intelligence

Common Debugging Signals for Large Language Models

This article outlines the end‑to‑end workflow for large‑model training, highlights typical debugging challenges such as out‑of‑memory (OOM) errors, performance bottlenecks, and gradient issues, and provides concrete strategies, tools (DeepSpeed, Megatron, TorchTitan, veScale), and best‑practice checklists to help engineers diagnose and resolve problems efficiently.

DeepSpeed · LLM · Megatron
0 likes · 12 min read
DataFunTalk
Nov 4, 2025 · Artificial Intelligence

Can LLMs Trade Crypto Profitably? Inside the Alpha Arena Competition

Alpha Arena’s first season pitted six leading large language models against real crypto markets with $10,000 each, revealing stark differences in trading bias, risk management, and sensitivity to prompts, as Qwen3‑Max and DeepSeek outperformed GPT‑5, while detailed case studies expose model vulnerabilities and future research directions.

AI agents · Alpha Arena · LLM
0 likes · 12 min read
Data STUDIO
Nov 4, 2025 · Artificial Intelligence

How to Build a Memory-Enabled AI Agent with SQLite and Vector Search

This article explains how to give AI agents persistent memory, reflection, and goal‑tracking by storing interaction summaries in SQLite, embedding them for semantic retrieval with a vector database, and using LLM‑generated prompts to recall, reflect, and manage objectives across sessions.

AI Agent · Goal Tracking · LLM
0 likes · 10 min read
dbaplus Community
Nov 3, 2025 · Artificial Intelligence

How RAG Turns Natural Language Queries into Accurate SQL for Data Platforms

This article explains how Retrieval‑Augmented Generation (RAG) combines vector databases with large language models to let non‑technical users ask natural‑language questions and receive precise SQL statements, detailing the workflow, architecture, chunking methods, performance gains, and remaining challenges.

Data Platform · LLM · RAG
0 likes · 17 min read
DataFunSummit
Nov 3, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World AI: From RAG to Agents

This article examines Tencent's large language model applications across diverse business scenarios, detailing core use cases such as content generation, intelligent customer service, and role‑playing, and explains the three key technologies—Supervised Fine‑Tuning, Retrieval‑Augmented Generation, and Agents—that enable these capabilities.

AI applications · LLM · RAG
0 likes · 4 min read
Meituan Technology Team
Nov 3, 2025 · Artificial Intelligence

Introducing VitaBench: A Real-World Agent Benchmark That Reveals a 30% Success Gap

VitaBench, a new open‑source benchmark from Meituan’s LongCat team, evaluates LLM‑driven agents across three realistic life‑service scenarios—food ordering, restaurant dining, and travel planning—using 66 tools and quantifying reasoning, tool, and interaction complexities, exposing a mere 30% success rate on complex cross‑scene tasks.

AI · LLM · Tool Use
0 likes · 14 min read
Goodme Frontend Team
Nov 3, 2025 · Artificial Intelligence

Unlock AI Power with Model Context Protocol (MCP): Build LLM‑Enabled Servers in Minutes

This article introduces the Model Context Protocol (MCP) and large language models (LLMs), explains their core concepts, transport mechanisms, lifecycle, and essential modules, and provides step‑by‑step code examples for creating an MCP server, adding tools, resources, and prompts, and debugging workflows to accelerate AI‑driven development.

AI · LLM · MCP
0 likes · 15 min read
Data Party THU
Nov 2, 2025 · Artificial Intelligence

From RNN to LLM: How Transformers Power Modern Language Models

This article explains the evolution from RNNs through Encoder‑Decoder models to Transformers, detailing self‑attention, multi‑head attention, and masked attention, and then describes what Large Language Models are, their key components, capabilities, limitations, and common applications.

AI · Deep Learning · LLM
0 likes · 9 min read
Data Party THU
Nov 1, 2025 · Artificial Intelligence

How to Blend Process‑Oriented and Agent‑Centric AI into a Hybrid Intelligent Pipeline

This article analyzes two contrasting AI agent design paradigms—process‑driven workflow orchestration and autonomous agent intelligence—examines their strengths and limitations, and proposes a hybrid architecture that fuses deterministic pipelines with dynamic planning, tool use, and memory mechanisms to achieve both reliability and adaptability.

AI · Hybrid · LLM
0 likes · 15 min read
Wu Shixiong's Large Model Academy
Nov 1, 2025 · Artificial Intelligence

Turn a Basic RAG Demo into a High‑Impact Interview Project

This guide shows how to evolve a simple Retrieval‑Augmented Generation prototype into a production‑grade system by strengthening data ingestion, optimizing retrieval with hybrid and reranking techniques, adding query rewriting, long‑context handling, reinforcement learning, and multimodal support, so candidates can demonstrate real engineering depth in interviews.

AI · LLM · RAG
0 likes · 7 min read
Bighead's Algorithm Notes
Oct 31, 2025 · Artificial Intelligence

Weekly Quantitative Paper Digest (Oct 25‑31 2025)

This article summarizes six recent arXiv papers that explore how large language models, graph‑theoretic methods, generative frameworks, hypergraph multimodal architectures, GroupSHAP‑enhanced forecasting, and multi‑agent LLM workflows can improve financial signal extraction, portfolio optimization, and stock‑price prediction, providing empirical results on S&P 500 data.

Financial AI · LLM · Multimodal Learning
0 likes · 13 min read
Bilibili Tech
Oct 31, 2025 · Artificial Intelligence

RIVAL: Adversarial RL Framework Elevates Conversational Subtitle Translation

RIVAL (Reinforcement Learning with Iterative and Adversarial Optimization) introduces an adversarial game between a reward model and a translation LLM, combining qualitative preference rewards with quantitative metrics like BLEU, to overcome distribution shift in RLHF and achieve superior performance on conversational subtitle and WMT translation tasks.

BLEU · LLM · Reinforcement Learning
0 likes · 13 min read
Baobao Algorithm Notes
Oct 31, 2025 · Artificial Intelligence

Unlocking LLM RL Scaling: The Best Practices from Meta’s New Study

Meta’s recent paper reveals a sigmoid‑shaped scaling law for LLM reinforcement learning, presents extensive 40‑k GPU‑hour experiments, compares various RL designs such as PPO‑off‑policy‑k and Pipeline‑RL‑k, and distills the findings into a practical “ScaleRL” recipe that improves performance and efficiency.

LLM · RL Optimization · Reinforcement Learning
0 likes · 10 min read
Alibaba Cloud Developer
Oct 31, 2025 · Artificial Intelligence

Why AI Agents Fail and 10 Proven Ways to Make Them Reliable

This article shares the practical lessons learned from building Alibaba Cloud’s digital employee "YunXiaoEr Aivis", explaining why large‑language‑model agents often miss expectations and presenting ten concrete strategies—ranging from clear prompt design to memory management—that dramatically improve multi‑agent reliability.

AI agents · Agent Optimization · Context Engineering
0 likes · 29 min read
BirdNest Tech Talk
Oct 30, 2025 · Artificial Intelligence

How to Build Multimodal Prompts with LangChain: A Step‑by‑Step Guide

Learn how LangChain enables multimodal interactions by preparing inputs, constructing prompts, invoking models like GPT‑4o, and processing responses, with a complete example that demonstrates image‑question answering, code walkthrough, environment setup, and key considerations for API keys and image URLs.

LLM · LangChain · Multimodal
0 likes · 9 min read
BirdNest Tech Talk
Oct 30, 2025 · Artificial Intelligence

Mastering LangChain Tools: Define, Build, and Optimize Agent Functions

This guide explains what LangChain tools are, why clear descriptions matter, and walks through three ways to create them—using the @tool decorator, StructuredTool with Pydantic models, and custom BaseTool subclasses—plus examples of built‑in tools and reference links.

LLM · LangChain · PromptEngineering
0 likes · 7 min read
Baobao Algorithm Notes
Oct 30, 2025 · Artificial Intelligence

Why LLM RL Training Crashes While SFT Stays Stable: Insights & Tricks

The article examines the fundamental similarity between SFT and RL loss functions for large language models, explains why RL training is prone to instability, discusses infrastructure and data quality challenges, and reviews practical tricks and reward‑model considerations for more reliable RL fine‑tuning.

AI · LLM · Reinforcement Learning
0 likes · 11 min read
Aikesheng Open Source Community
Oct 29, 2025 · Artificial Intelligence

What Makes BiomedSQL and LogicCat the Toughest Text‑to‑SQL Benchmarks for LLMs?

BiomedSQL and LogicCat are two newly released Text‑to‑SQL datasets that challenge large language models with complex biomedical reasoning, multi‑step logical inference, and domain‑specific knowledge, offering detailed analyses of query types, scientific reasoning categories, and performance gaps that highlight current LLM limitations.

Biomedical · Dataset · LLM
0 likes · 9 min read
DeWu Technology
Oct 29, 2025 · Artificial Intelligence

Why Chunking Can Make or Break Your RAG System – Practical Strategies & Code

This article explains how proper document chunking—choosing the right chunk size, overlap, and structure‑aware boundaries—directly impacts the relevance, factuality, and efficiency of Retrieval‑Augmented Generation pipelines, and provides multiple Python implementations ranging from simple fixed‑length splits to semantic and hybrid approaches.

Embedding · LLM · RAG
0 likes · 29 min read
Tencent Cloud Developer
Oct 29, 2025 · Artificial Intelligence

How Tasking AI and Dify Redefine LLM‑Powered AI Application Development

This article analyzes the architecture, core capabilities, and workflow orchestration of LLM‑native application platforms Tasking AI and Dify, comparing their microservice designs, plugin management, multi‑tenant isolation, and GraphEngine execution to highlight strengths, trade‑offs, and future development trends.

AI Platform · Dify · LLM
0 likes · 21 min read
DataFunSummit
Oct 28, 2025 · Artificial Intelligence

How Bilibili Uses LLMs to Tame Massive Data Platform Failures

Exploring Bilibili’s large‑scale data platform, this article details its five‑layer, storage‑compute separated architecture, the massive daily workload of offline and real‑time tasks, common failure and slowdown causes, and how an LLM‑powered intelligent assistant is being developed to help engineers troubleshoot efficiently.

Bilibili · Intelligent Assistant · LLM
0 likes · 5 min read
Data Party THU
Oct 28, 2025 · Artificial Intelligence

Can Low‑Quality Data Cause Irreversible ‘Brain Rot’ in Large Language Models?

Researchers from Texas A&M and UT Austin demonstrate that prolonged pre‑training on low‑quality, short‑form web content causes large language models to suffer irreversible cognitive decline—manifested as attention loss, broken reasoning chains, and personality distortion—highlighting data quality as a critical training‑time safety issue.

Artificial Intelligence · Cognitive Safety · Data Quality
0 likes · 7 min read
JD Tech Talk
Oct 27, 2025 · Artificial Intelligence

How Large Language Models Are Revolutionizing Generative Recommendation Systems

Over the past year, generative recommendation has made substantial progress by leveraging large language models' powerful sequence modeling and reasoning abilities, introducing a new paradigm that replaces complex handcrafted features and addresses traditional recommendation bottlenecks. This article outlines the evolution, core technologies, engineering challenges, and future directions of LLM‑based recommendation systems.

AI Engineering · Encoder-Decoder · LLM
0 likes · 29 min read
Bilibili Tech
Oct 27, 2025 · Artificial Intelligence

How Bilibili’s LLM-Powered System Cuts Game Localization Costs by 80%

Bilibili’s game algorithm team built a four‑layer, LLM‑based translation platform that automates terminology extraction, retrieval‑augmented generation, and quality assessment, dramatically reducing localization cycles by over 85% and costs by up to 80% while supporting ten languages and ensuring consistent, culturally‑accurate game text.

LLM · RAG · game localization
0 likes · 20 min read
Wu Shixiong's Large Model Academy
Oct 27, 2025 · Artificial Intelligence

Designing Effective Generation Modules for RAG: Prompt Engineering, Multi‑Document Fusion, and Hallucination Control

This article explains how to design and optimize the generation module of Retrieval‑Augmented Generation systems by building robust prompts, merging multi‑source information, controlling answer formats, and applying post‑generation verification to reduce hallucinations and improve enterprise‑grade performance.

AI · Generation Module · Hallucination Control
0 likes · 9 min read
KooFE Frontend Team
Oct 26, 2025 · Artificial Intelligence

Master Zero-Shot Prompting: Advanced Techniques to Boost LLM Performance

Zero-shot prompting lets large language models perform tasks without examples. Building on principles of clarity and structured instructions, advanced strategies such as emotion prompting, zero-shot chain-of-thought, RE2 re-reading, Rephrase-and-Respond, role-play, and System-2 Attention can significantly improve accuracy and response quality across translation, reasoning, and QA tasks.

AI reasoning · LLM · Large Language Models
0 likes · 13 min read
dbaplus Community
Oct 26, 2025 · Artificial Intelligence

How MCP Turns AI into a Universal Plug‑In: A Deep Dive into Model Context Protocol

This article explains the Model Context Protocol (MCP) – an open, universal standard that lets large language models seamlessly interact with external tools and data – covering its core architecture, why it’s needed, underlying principles, tool‑selection mechanics, a step‑by‑step Python server implementation, and practical usage tips.

AI integration · LLM · MCP
0 likes · 20 min read
Bighead's Algorithm Notes
Oct 24, 2025 · Artificial Intelligence

Weekly AI‑Finance Paper Digest (Oct 18‑24 2025)

This digest presents seven recent arXiv papers that explore large‑language‑model‑driven portfolio scoring, hybrid ResNet‑RMT covariance denoising for crypto, LLM‑enhanced financial causal analysis, multilingual news alignment for stock returns, three‑step bubble prediction with news and macro data, multimodal volatility forecasting, and news‑aware reinforcement trading, each with reported performance gains.

Financial AI · LLM · Multimodal Learning
0 likes · 15 min read
AI2ML AI to Machine Learning
Oct 24, 2025 · Artificial Intelligence

Beyond RAG: Three Emerging Knowledge‑Engineering Strategies (ICL, Online Learning, SLM)

The article outlines three post‑RAG knowledge‑engineering approaches—In‑Context Learning with dynamic few‑shot selection, Online Learning encompassing Meta‑Learning and Lifelong Learning to quickly adapt to new tasks, and the Small Language Model path that combines fine‑tuned task‑specific experts with LLM‑SLM collaboration for efficient, privacy‑preserving inference.

In-Context Learning · Knowledge Engineering · LLM
0 likes · 4 min read
360 Zhihui Cloud Developer
Oct 24, 2025 · Artificial Intelligence

7 Essential Agent Design Patterns for Building Autonomous AI Systems

This article explains the fundamental differences between workflows and agents, introduces seven core design patterns—including three workflow patterns and four agent patterns—provides Python examples using Ollama, and shows how to combine these patterns to create robust, autonomous AI applications.

AI agents · Design Patterns · LLM
0 likes · 30 min read
Wu Shixiong's Large Model Academy
Oct 24, 2025 · Artificial Intelligence

Can Large Language Models Truly Plan? Unpacking Agent Frameworks

This article explains why most LLM‑based agents only perform pseudo‑planning through prompts or hard‑coded loops, outlines when to rely on prompt‑driven versus program‑driven planning, compares popular frameworks such as ReAct, MRKL, BabyAGI and AutoGPT, and clarifies what true autonomous planning would require.

Artificial Intelligence · AutoGPT · LLM
0 likes · 12 min read
Wu Shixiong's Large Model Academy
Oct 22, 2025 · Artificial Intelligence

Mastering LLM Training: A Step‑by‑Step Blueprint from Data to Alignment

This guide walks through the complete end‑to‑end process of training a large language model from scratch, covering data collection, cleaning, tokenization, pre‑training objectives and engineering, post‑training alignment methods, scaling laws, over‑fitting mitigation, and gradient‑stability techniques.

Alignment · LLM · gradient stability
0 likes · 9 min read
Instant Consumer Technology Team
Oct 21, 2025 · Artificial Intelligence

Boost LLM Originality: Master Temperature Scaling & Top‑K Sampling

This tutorial revisits a simple text‑generation function, explains how temperature scaling and top‑K sampling reshape token probability distributions, demonstrates their effects with PyTorch code and visualizations, and shows how to integrate both techniques into an improved generation routine for more diverse and human‑like outputs.

LLM · PyTorch · Text Generation
0 likes · 13 min read
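The two techniques named in this summary can be sketched in a few lines. The snippet below is a minimal illustration, not the article's code: it uses only the Python standard library rather than PyTorch so that it stays self-contained, and the function names `softmax`, `top_k_filter`, and `sample_next_token` are assumptions chosen for this example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(logits, k):
    """Keep the k largest logits and mask the rest with -inf.

    Ties at the cutoff value are all kept, so slightly more than k
    entries can survive when logits are equal.
    """
    cutoff = sorted(logits, reverse=True)[k - 1]
    return [x if x >= cutoff else float("-inf") for x in logits]


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token index after optional top-k masking."""
    if top_k is not None:
        logits = top_k_filter(logits, top_k)
    probs = softmax(logits, temperature)  # masked entries get probability 0
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

A temperature below 1 sharpens the distribution toward the highest-scoring token, a temperature above 1 flattens it, and top-K masking guarantees that low-ranked tokens are never drawn regardless of temperature.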
Baidu Tech Salon
Oct 21, 2025 · Artificial Intelligence

Cut Data Integration Time from Months to Days with LLM-Powered Intelligent Ingestion

An LLM-driven intelligent data-ingestion framework replaces manual, months-long integration with an automated code-generation and execution loop that auto-recognizes schemas, maps structures, extracts quality rules, and builds deployment packages, cutting onboarding time from three months to three days while eliminating manual effort.

LLM · automated ETL · code-generation
0 likes · 19 min read
Data STUDIO
Oct 21, 2025 · Artificial Intelligence

Building a Self‑Learning LangGraph Memory System with Feedback Loops and Dynamic Prompts

This article walks through the design and implementation of a two‑layer memory architecture for LangGraph agents, covering short‑term and long‑term stores, various storage back‑ends, prompt engineering, utility functions, node definitions, human‑in‑the‑loop interrupt handling, and how user feedback is captured and used to continuously update the agent’s behavior.

Feedback Loop · Human-in-the-Loop · LLM
0 likes · 43 min read
AI2ML AI to Machine Learning
Oct 20, 2025 · Artificial Intelligence

nanochat Source Code Deep Dive: Data Prep, Model Design, Training & Evaluation

This article revisits nanochat's core components, detailing the preparation of diverse training datasets, the scaling calculations for tokens and parameters, the model's MQA and KV‑cache design, the full training pipeline with gradient accumulation and mixed‑precision, cost breakdown, inference optimizations, evaluation tasks, and identified limitations with suggested improvements.

KV cache · LLM · MQA
0 likes · 9 min read
Data Party THU
Oct 20, 2025 · Artificial Intelligence

Fine-Tuning LLMs on TPU with Tunix: A Step‑by‑Step QLoRA Guide

This article introduces Google’s Tunix library for JAX‑based LLM post‑training, explains its core features such as supervised fine‑tuning, reinforcement learning and knowledge distillation, and provides detailed installation steps and a complete TPU‑accelerated QLoRA fine‑tuning workflow on the Gemma 2B model, including code snippets and inference testing.

AI · Fine-tuning · JAX
0 likes · 8 min read
Data Party THU
Oct 20, 2025 · Artificial Intelligence

How Agentic RL Enables a 14B LLM to Outperform Giant Models – Inside rStar2‑Agent

This article analyzes the rStar2‑Agent paper, revealing how Agentic Reinforcement Learning, the GRPO‑RoC algorithm, a high‑throughput code‑execution service, and a three‑stage training recipe let a modest 14‑billion‑parameter model surpass much larger LLMs on challenging math benchmarks.

AI research · Artificial Intelligence · LLM
0 likes · 18 min read
AI Large Model Application Practice
Oct 20, 2025 · Artificial Intelligence

Build a Local End‑to‑End DeepResearch Agent with Alibaba’s 30B MoE Model Using LangGraph

This guide walks through deploying Alibaba's open‑source Tongyi‑DeepResearch 30B MoE model locally, configuring FastAPI and A2A interfaces, implementing a ReAct‑style agent with LangGraph, setting up research tools, and testing the full UI‑API‑Agent pipeline via CLI and Streamlit.

A2A · DeepResearch · Deployment
0 likes · 14 min read
Bighead's Algorithm Notes
Oct 19, 2025 · Artificial Intelligence

QuantAgent Unveiled: A Multi‑Agent LLM Framework for High‑Frequency Trading (Code Open)

QuantAgent introduces a multi‑agent LLM framework that replaces text‑based inputs with raw OHLC price signals, decomposes trading decisions into Indicator, Pattern, Trend, Risk, and Decision agents, and achieves substantially higher direction accuracy and returns across ten financial assets in zero‑shot HFT experiments.

Financial AI · LLM · Multi-Agent System
0 likes · 10 min read
AI2ML AI to Machine Learning
Oct 19, 2025 · Artificial Intelligence

Deep Dive into nanochat: Source Code, Model Size Calculations, and Optimization Techniques

This article provides a thorough analysis of nanochat’s source code, detailing transformer component differences, precise parameter‑size formulas, FlashNorm and ReLU² innovations, scaling‑law insights, memory‑usage estimations, and the distributed optimizer and training pipelines used to build the model.

Distributed Training · LLM · Transformer
0 likes · 20 min read
High Availability Architecture
Oct 17, 2025 · Artificial Intelligence

Unlock Autonomous AI Agents with Spring AI Alibaba: Scheduling, Human‑in‑the‑Loop, and Real‑World Use Cases

This article explores how Spring AI Alibaba enables the development of autonomous AI agents that run on schedules, interact with humans when needed, and handle tasks such as periodic business automation, batch processing, emergency response, and long‑cycle data analysis, illustrated with Java code examples.

LLM · autonomous scheduling · java
0 likes · 12 min read
DataFunSummit
Oct 16, 2025 · Artificial Intelligence

How Chat BI Transforms Data Warehousing with AI: Unlock Real‑Time Insights

This presentation by iQIYI’s Technical Director Zhang Xiaoming details the evolution of BI systems, introduces the Chat BI framework, explains its three‑step implementation, outlines architectural design, data‑warehouse integration, performance optimizations, and user‑operation strategies, revealing how AI and RAG empower smarter data analytics.

AI · BI · ChatBI
0 likes · 18 min read
Baidu Geek Talk
Oct 15, 2025 · Artificial Intelligence

Can LLMs Automate Data Ingestion and Cut Integration Time from Months to Days?

This article presents an LLM‑driven intelligent data platform ingestion solution that automates schema recognition, mapping, quality rule extraction, and package building, reducing integration cycles from three months to three days while eliminating manual effort and enhancing scalability and control.

AI · Data Platform · LLM
0 likes · 21 min read
AI Cyberspace
Oct 15, 2025 · Artificial Intelligence

Why MCP Is Poised to Replace Function Calling for LLM Agents

The Model Context Protocol (MCP) introduced by Anthropic addresses the scalability, integration, and context‑transfer limitations of traditional Function Calling by offering a standardized, bidirectional, and context‑aware communication layer that simplifies tool discovery, security, and workflow orchestration for LLM‑driven agents.

AI integration · Function Calling · LLM
0 likes · 24 min read
Alibaba Cloud Developer
Oct 15, 2025 · Artificial Intelligence

Mastering Structured Output in Large Language Models: Techniques, Challenges, and Future Trends

Large language models are evolving from free‑form text generators to reliable data providers by mastering structured output through prompt engineering, validation frameworks, constrained decoding, supervised fine‑tuning, reinforcement learning, and API‑level capabilities, enabling seamless integration with software systems while addressing hallucinations and format reliability.

API · LLM · Prompt engineering
0 likes · 28 min read
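One of the approaches this summary lists, a validation layer around the model, can be sketched as a parse-check-retry loop. The example below is a minimal illustration under stated assumptions: `call_model` is a hypothetical callable that takes a prompt string and returns raw model text, and the `required` mapping of field names to Python types stands in for a real schema. It is not the article's implementation or any particular framework's API.

```python
import json


def validate_output(raw, required):
    """Parse raw model text as JSON and check required fields and types.

    Returns (data, errors); errors is empty when the output is usable.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]
    errors = []
    for name, expected in required.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            errors.append(f"field {name} should be {expected.__name__}")
    return data, errors


def generate_structured(call_model, prompt, required, max_attempts=3):
    """Call the (hypothetical) model until its reply passes validation."""
    for _ in range(max_attempts):
        raw = call_model(prompt)
        data, errors = validate_output(raw, required)
        if not errors:
            return data
        # Feed the validation errors back so the next attempt can correct them.
        prompt = (
            f"{prompt}\nYour last reply was rejected: "
            f"{'; '.join(errors)}. Reply with JSON only."
        )
    raise ValueError("no valid structured output after retries")
```

Validation after the fact is the least invasive option because it needs no access to the decoder; constrained decoding and API-level structured output prevent malformed replies instead of repairing them.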
Bighead's Algorithm Notes
Oct 14, 2025 · Artificial Intelligence

How TS‑Agent Uses LLMs and Reflective Feedback to Automate Financial Time‑Series Modeling

TS‑Agent is a modular LLM‑driven framework that formalizes financial time‑series modeling as a three‑stage iterative decision process, leveraging structured knowledge bases, dynamic memory, and a feedback‑driven code‑editing loop to outperform AutoML baselines in accuracy, robustness, and auditability.

AutoML · Feedback Loop · Knowledge Base
0 likes · 12 min read
Volcano Engine Developer Services
Oct 14, 2025 · Artificial Intelligence

How CollabLLM Redefines LLM Collaboration with Multi‑Turn Training

CollabLLM tackles the limitations of large language models in everyday multi‑turn dialogues by introducing a user‑centric, multi‑turn training framework that leverages simulated interactions, multi‑round reward modeling, and veRL toolchain support, achieving superior performance over single‑turn baselines.

LLM · Reinforcement Learning · collaborative training
0 likes · 13 min read
AntTech
Oct 13, 2025 · Artificial Intelligence

How dInfer Accelerates Diffusion LLM Inference Over 10× Faster Than Fast‑dLLM

Ant Group's open‑source dInfer framework dramatically speeds up diffusion language model inference—achieving more than a ten‑fold boost over Fast‑dLLM, surpassing autoregressive baselines, and delivering 1011 tokens per second on HumanEval—by tackling computational cost, KV‑cache invalidation, and parallel decoding challenges through modular system‑level innovations.

AI Performance · Diffusion Language Model · Inference Optimization
0 likes · 11 min read
AI Large Model Application Practice
Oct 13, 2025 · Artificial Intelligence

How to Tame LLM Agents: Proven Strategies to Reduce Uncertainty and Boost Reliability

This article outlines practical techniques—including prompt engineering, domain fine‑tuning, retrieval‑augmented generation, structured outputs, workflow constraints, model parameter control, behavior rules, risk‑based AI participation, and comprehensive governance—to curb the unpredictability of large language model agents in enterprise settings.

AI Agent · AI Governance · LLM
0 likes · 18 min read
Bighead's Algorithm Notes
Oct 12, 2025 · Artificial Intelligence

Trading-R1: Open-Source LLM Framework for Explainable Financial Trading

This article reviews Trading‑R1, an open‑source LLM reasoning framework that integrates multimodal financial data with three‑stage supervised fine‑tuning and reinforcement learning to generate structured investment arguments and risk‑adjusted trade decisions, achieving superior Sharpe ratio and drawdown performance on real‑world stock and ETF tests.

Dataset · Financial Trading · LLM
0 likes · 11 min read
DataFunSummit
Oct 12, 2025 · Artificial Intelligence

How Kuaishou Uses Large Models to Supercharge Ad Targeting with COPE and LEARN

This article reviews Kuaishou's two‑year exploration of multimodal large‑model techniques for advertising, outlining challenges in content‑domain ad estimation, the COPE unified product representation framework, and the LEARN LLM knowledge‑transfer approach that together improve ad system performance.

Advertising · Kuaishou · LLM
0 likes · 6 min read
Architecture and Beyond
Oct 12, 2025 · Artificial Intelligence

How Do AI Agents Know When to Stop? Strategies and Real-World Implementations

This article explores the essential stop‑condition designs for AI agents, detailing hard limits, task‑completion checks, explicit termination tools, loop detection, error accumulation, and user interruption, and then examines concrete implementations in OpenManus and Gemini CLI with code examples and multi‑layer safeguards.

AI Agent · Gemini CLI · LLM
0 likes · 17 min read
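Several of the stop conditions listed in this summary (hard step limits, an explicit termination tool, loop detection, and error accumulation) can be combined in one guard object. The sketch below is an assumption-laden illustration, not code from OpenManus or Gemini CLI: `StopGuard`, `run_agent`, the `"terminate"` action name, and the `next_action` callable are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class StopGuard:
    max_steps: int = 20    # hard limit on total steps
    max_repeats: int = 3   # identical consecutive actions that count as a loop
    max_errors: int = 3    # consecutive failures tolerated
    steps: int = 0
    errors: int = 0
    recent: list = field(default_factory=list)

    def check(self, action, failed=False):
        """Record one step and return a stop reason, or None to continue."""
        self.steps += 1
        self.errors = self.errors + 1 if failed else 0
        self.recent = (self.recent + [action])[-self.max_repeats:]
        if action == "terminate":  # explicit termination tool
            return "terminate_tool"
        if self.steps >= self.max_steps:
            return "max_steps"
        if self.errors >= self.max_errors:
            return "too_many_errors"
        if len(self.recent) == self.max_repeats and len(set(self.recent)) == 1:
            return "loop_detected"
        return None


def run_agent(next_action, guard):
    """Drive a hypothetical agent until the guard reports a stop reason.

    next_action receives the number of steps taken so far and returns
    an (action, failed) pair.
    """
    while True:
        action, failed = next_action(guard.steps)
        reason = guard.check(action, failed)
        if reason:
            return reason, guard.steps
```

Checking the explicit termination signal first lets a well-behaved agent exit cleanly, while the remaining checks act as layered safeguards for the cases where it does not.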
Architect's Alchemy Furnace
Oct 12, 2025 · Artificial Intelligence

How to Upgrade Dify to 1.9.1 and Resolve LLM Iterator Errors

This guide walks you through upgrading Dify using Docker Compose or source code deployment, running required migration commands, backing up data, and fixing the "Invalid context structure" error caused by iterator output changes in version 1.9.1, with detailed code snippets and troubleshooting steps.

Dify · Docker · LLM
0 likes · 8 min read
BirdNest Tech Talk
Oct 11, 2025 · Artificial Intelligence

How to Load Documents into LangChain: From Files to APIs

Learn how to use LangChain's Document Loaders to import data from files, web pages, databases, and APIs, understand the Document object structure, compare load() versus lazy_load(), and follow a step‑by‑step Python example that demonstrates loading, inspecting, and optionally processing documents with an LLM.

Data Integration · Document Loader · LLM
0 likes · 12 min read
DataFunTalk
Oct 11, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World Apps with RAG, GraphRAG & Agents

This article explores Tencent’s large language model deployments across diverse business scenarios—content generation, intelligent customer service, and role‑playing—detailing the underlying RAG, GraphRAG, and Agent technologies, their principles, practical implementations, and the advantages they bring to enterprise AI solutions.

AI · LLM · RAG
0 likes · 5 min read
Alibaba Cloud Developer
Oct 11, 2025 · Artificial Intelligence

Unlock Autonomous AI Agents with Spring AI Alibaba: Scheduling & Real-World Cases

Spring AI Alibaba (SAA) provides a robust framework for building autonomous, scheduled AI agents that can operate independently, respond to events, and involve human oversight, enabling use cases such as automated business reporting, batch data processing, emergency response, and sentiment analysis, with detailed code examples and deployment guidance.

AI agents · Enterprise Automation · LLM
0 likes · 13 min read
Data Party THU
Oct 11, 2025 · Artificial Intelligence

From Transformers to LLaMA 4: A Journey Through the Biggest LLMs

This article surveys the most influential large language models released since 2017, detailing the core innovations of Transformer, BERT, GPT series, T5, Retrieval‑Augmented Generation, and the latest LLaMA and Meta models, while highlighting their architectures, training paradigms, and impact on NLP research.

LLM · Large Language Models · Model Scaling
0 likes · 21 min read