Tagged articles

2014 articles


Nov 18, 2025 · Industry Insights

A Practical Guide to Major LLM Services: URLs, Docs, and API Tips

This article compiles the entry points, documentation links, pricing details, and hands‑on API examples for several leading large‑language‑model providers—including DeepSeek, Alibaba Cloud, Baidu Qianfan, ByteDance Volcengine, OpenRouter, and Google Gemini—while comparing their usability, free‑tier offers, and developer experience.

API, Cloud AI, Comparison

0 likes · 13 min read


Wu Shixiong's Large Model Academy

Nov 18, 2025 · Artificial Intelligence

How to Make LLM Agents’ Function Calls Stable and Accurate: 5 Proven Strategies

This article breaks down why function‑call reliability is the biggest bottleneck for LLM agents and presents a systematic five‑step loop—schema quality, prompt context, sampling, training data, and runtime defenses—plus concrete optimization techniques such as dynamic tool routing, plan‑execute, validation layers, memory injection, and log‑driven tuning, illustrated with real‑world cases.

LLM, Tool Routing, agent

0 likes · 12 min read


JakartaEE China Community

Nov 18, 2025 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation (RAG) System with Langchain4j and Ollama 3

This article explains why Retrieval‑Augmented Generation improves LLM accuracy, outlines the key LangChain4j and Ollama 3 components, and provides a step‑by‑step Java example—including Maven setup, document ingestion, embedding, similarity search, prompt creation, and response generation—to demonstrate a functional RAG pipeline.

Embedding, LLM, LangChain4j

0 likes · 8 min read


Architect

Nov 17, 2025 · Artificial Intelligence

Comparing Tasking AI and Dify: Architecture, Core Capabilities, and AI Workflow Engines

This article examines the design of LLM‑native AI application platforms Tasking AI and Dify, comparing their LLM integration, plugin management, multi‑tenant isolation, system architecture, and especially Dify’s GraphEngine for complex AI workflow orchestration.

AI Platform, Dify, GraphEngine

0 likes · 22 min read


BirdNest Tech Talk

Nov 17, 2025 · Artificial Intelligence

How to Parse and Use Claude Skills with Go: A Deep Dive into LLM Tool Integration

This article explains the concept of Claude Skills, walks through a Go library that parses skill packages, demonstrates a CLI inspector, shows how to run skills with DeepSeek‑V3 via an OpenAI‑compatible API, and outlines future security enhancements.

Claude, DeepSeek, Go

0 likes · 13 min read


Wu Shixiong's Large Model Academy

Nov 14, 2025 · Artificial Intelligence

How to Engineer Reliable Function Calls for LLM Agents: An End‑to‑End Framework

This article explains why function‑call accuracy is critical for LLM agents, identifies four common failure causes, and presents a systematic, five‑step engineering framework—including dynamic routing, chain‑of‑thought planning, result validation, memory injection, and log‑driven optimization—backed by concrete examples and quantitative improvements.

Function Calling, Interview Preparation, LLM

0 likes · 10 min read


Programmer DD

Nov 14, 2025 · Artificial Intelligence

Can TOON Format Cut LLM Token Costs by Up to 60%?

This article explains how the TOON data‑serialization format reduces token usage and improves accuracy for large language model calls compared with traditional JSON, provides benchmark results, outlines scenarios where TOON is advantageous or unsuitable, and shows Java integration examples.

LLM, TOON, Token Optimization

0 likes · 6 min read


AI Tech Publishing

Nov 13, 2025 · Artificial Intelligence

Claude’s Prompt Engineering Best Practices: A Step‑by‑Step Guide

This guide outlines the Claude team’s best practices for prompt engineering, covering core techniques such as clear instructions, background context, specificity, and examples, as well as advanced methods like pre‑filled responses, chain‑of‑thought, output formatting, and prompt chaining, with concrete examples and code snippets.

AI prompting, Chain-of-Thought, Claude

0 likes · 18 min read


Wu Shixiong's Large Model Academy

Nov 12, 2025 · Artificial Intelligence

Agent Memory Modules Explained: Short‑Term vs Long‑Term Strategies for LLM Agents

This article breaks down the memory systems behind LLM‑based agents, explaining why persistent memory is needed, the differences between short‑term context buffers and long‑term vector stores, practical implementation choices, maintenance strategies, and how to articulate these concepts effectively in technical interviews.

LLM, agent, retrieval

0 likes · 14 min read


Alibaba Cloud Developer

Nov 12, 2025 · Artificial Intelligence

How Self‑Programming AI Agents Are Built: From LLM Brain to Dynamic Code Execution

This article explains how a self‑programming AI agent is built: using a large language model as the brain, designing a multi‑area architecture, implementing memory layers, applying prompt engineering with segment mechanisms, and enabling code generation and execution through a Python‑Java bridge. It also shares practical insights and future directions.

AI Agent, Code Execution, LLM

0 likes · 34 min read


HyperAI Super Neural

Nov 11, 2025 · Artificial Intelligence

How DeepSeek-OCR Achieves SOTA Using Ultra‑Low Visual Token Counts

DeepSeek-OCR takes a visual‑compression approach, combining DeepEncoder with the DeepSeek3B‑MoE‑A570M decoder to represent document text with far fewer visual tokens. It achieves up to 97% OCR accuracy and surpasses GOT‑OCR2.0 and MinerU2.0 on OmniDocBench. The article also offers a one‑click deployment tutorial.

DeepEncoder, LLM, OCR

0 likes · 6 min read


Code Mala Tang

Nov 11, 2025 · Artificial Intelligence

Unlock Structured Data from Any Text with LangExtract – A Free Python LLM Tool

LangExtract is an open‑source Python library that uses LLMs to turn messy documents—such as medical records, contracts, novels, or news articles—into structured data with just a few lines of code and optional visualisation.

Data Parsing, LLM, LangExtract

0 likes · 11 min read


Mingyi World Elasticsearch

Nov 10, 2025 · Backend Development

Is Elasticsearch API Really That Hard? Explore the New 9.x Docs and Java Client

The article explains how the Elasticsearch 9.x documentation now provides language‑specific, strongly‑typed examples—especially a fluent Java client—that eliminate the manual cURL/JSON workflow, improve developer productivity, and serve as reliable references for large language models.

API, Code Examples, Elasticsearch

0 likes · 9 min read


Old Meng AI Explorer

Nov 10, 2025 · Mobile Development

How Cactus Turns Any Smartphone into a Powerful Offline AI Assistant

Cactus is a lightweight, open‑source mobile AI framework that runs large language models locally on iOS and Android without internet, offering chat, image recognition, and text‑to‑speech while consuming low resources, supporting older phones, and providing simple demo apps and Flutter integration for developers.

AI, Flutter, LLM

0 likes · 10 min read


Data Party THU

Nov 9, 2025 · Artificial Intelligence

Mastering Chunking Strategies for Effective RAG: Fixed, Recursive, Semantic, Structured, and Delayed

This article walks through the core RAG pipeline, explains why chunking is the linchpin of retrieval quality, and provides detailed definitions, trade‑offs, and implementation examples for five chunking techniques—fixed, recursive, semantic, structure‑aware, and delayed—so you can choose the right approach for any document‑heavy AI application.

AI, LLM, RAG

0 likes · 10 min read


DataFunSummit

Nov 8, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World AI Solutions with RAG and Agents

This article examines Tencent's large language model deployments across diverse business scenarios, detailing core use cases such as content generation, intelligent customer service, and role‑playing, while deep‑diving into the RAG, GraphRAG, and Agent technologies that enable smarter, more reliable AI applications.

AI, LLM, RAG

0 likes · 4 min read


AI Product Manager Community

Nov 8, 2025 · Artificial Intelligence

Why Prompt Engineering Fails: Embracing Context Engineering for Smarter LLMs

The article explains that prompt engineering alone cannot guarantee reliable AI responses because models lack situational awareness, and introduces context engineering as a systematic approach that structures memory, manages context flow, and integrates RAG and evaluation to make large language models truly useful in real‑world applications.

AI, Context Engineering, LLM

0 likes · 7 min read


21CTO

Nov 7, 2025 · Artificial Intelligence

any-llm 1.0: Seamlessly Switch Between Cloud and Local LLMs with One Python Library

Mozilla.ai's any-llm v1.0 is an open‑source Python library that unifies access to multiple large language model providers, enabling developers to move between cloud‑based and on‑premise LLMs without rewriting code, while offering async‑first APIs, reusable connections, and extensive compatibility features.

AI Development, LLM, Python

0 likes · 4 min read


DataFunSummit

Nov 7, 2025 · Artificial Intelligence

How Close Are Agents to AGI? Insights from Experiments and Benchmarks

Through a series of experiments, benchmark analyses, and theoretical discussions, this article explores the limits of current AI agents, their underlying mechanisms, performance gaps to human-level intelligence, and the challenges that remain on the path from agents to true AGI.

AGI, LLM, Prompt engineering

0 likes · 26 min read


DataFunSummit

Nov 7, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Content Creation, Smart Service, and Game NPCs

This article examines Tencent’s large language model deployments across content generation, intelligent customer service, and game role‑playing, and explains the underlying technologies—Supervised Fine‑Tuning, Retrieval‑Augmented Generation, and Agent systems—highlighting how they enhance performance, explainability, and multi‑step reasoning in real‑world business scenarios.

AI, LLM, RAG

0 likes · 4 min read


Ele.me Technology

Nov 7, 2025 · Artificial Intelligence

LLM‑SM Hybrid Strategies: Boosting Decision Optimization and Store Design

Recent advances in large language models (LLMs) have sparked interest in their decision‑making capabilities, yet challenges remain; this article explores classic prediction‑optimization pipelines, introduces emerging LLM‑as‑Predictor/Ranker/Optimizer paradigms, and details practical case studies on delivery‑price optimization and intelligent store‑decoration recommendation using LLM‑SM hybrid systems.

Decision Optimization, Hybrid Modeling, LLM

0 likes · 30 min read


JD Tech

Nov 6, 2025 · Artificial Intelligence

LLMs Revolutionize Recommendation Systems: From Generative Models to Production

This article surveys the evolution of generative recommendation systems powered by large language models, detailing their technical foundations, engineering challenges, recent breakthroughs, and future research directions, while highlighting why the paradigm shift is occurring now.

AI Engineering, Generative Recommendation, LLM

0 likes · 30 min read


Tencent Cloud Developer

Nov 6, 2025 · Artificial Intelligence

From Prompt to Multi‑Agent: How LLMs Evolve into Autonomous Agents

Since ChatGPT's debut, the LLM landscape has progressed through four stages—prompt engineering, chain orchestration, autonomous agents, and multi‑agent systems—each enhancing intelligence and automation. This article details their evolution, advantages, and drawbacks, with practical implementation examples in Go.

Go, LLM, Multi-Agent

0 likes · 24 min read


AI Tech Publishing

Nov 5, 2025 · Artificial Intelligence

Why AI Agents Should Be Positioned as Assistants, Not Replacements

The article explains that marketing AI agents as human replacements leads to poor performance, professional resistance, and hallucination risks, and argues that repositioning them as assistants with human‑in‑the‑loop verification improves efficiency and acceptance.

AI Agent, BI Engineer, Data Agent

0 likes · 3 min read


Kuaishou Tech

Nov 5, 2025 · Artificial Intelligence

How HiPO Gives LLMs a Smart Thinking Switch to Cut Costs and Boost Accuracy

This article explains the overthinking problem of large language models, introduces the HiPO framework with hybrid data cold‑start and reinforcement‑learning reward mechanisms that let models decide when to think deeply or answer directly, and shows experimental results demonstrating significant efficiency gains and accuracy improvements across multiple benchmarks.

Hybrid Policy Optimization, LLM, Reinforcement Learning

0 likes · 13 min read


Data Party THU

Nov 5, 2025 · Artificial Intelligence

How to Give LLM Agents Memory, Reflection, and Goal Tracking

This article explains why current LLM agents lose context after each conversation and presents a practical architecture—using SQLite for structured storage, a vector database for semantic retrieval, and LLM‑driven reflection—to add persistent memory, self‑evaluation, and goal‑tracking capabilities that turn agents into learning partners.

Goal Tracking, LLM, Memory

0 likes · 10 min read


Code Mala Tang

Nov 5, 2025 · Backend Development

How to Build a Production-Ready Async LLM API with FastAPI

Learn how to design and deploy a high‑performance, production‑grade LLM API using FastAPI, covering async routing, type‑safe Pydantic models, streaming via SSE/WebSockets, middleware, caching, rate limiting, observability, retries, and cost‑control strategies for robust AI services.

Async, FastAPI, LLM

0 likes · 12 min read


Wu Shixiong's Large Model Academy

Nov 5, 2025 · Artificial Intelligence

Why Production-Ready RAG Is Ten Times Harder Than a Simple Demo

Building a Retrieval‑Augmented Generation (RAG) system may be straightforward in code, but making it reliable, accurate, and scalable in production involves challenges across data preparation, vector retrieval, query rewriting, generation control, and system integration, turning a demo into a truly useful AI service.

AI, LLM, Prompt engineering

0 likes · 8 min read


JavaGuide

Nov 5, 2025 · Artificial Intelligence

Cursor Goes Beyond the IDE with Agent Mode and Its Own Composer LLM

Cursor, once hailed as the leading AI‑enhanced IDE, has shifted its focus by making Agent mode the default and launching its own large‑model Composer, which the vendor claims runs four times faster than comparable models, though real‑world performance remains to be validated.

AI IDE, Claude, Codex

0 likes · 4 min read


AI2ML AI to Machine Learning

Nov 4, 2025 · Artificial Intelligence

Common Debugging Signals for Large Language Models

This article outlines the end‑to‑end workflow for large‑model training, highlights typical debugging challenges such as memory OOM, performance bottlenecks, and gradient issues, and provides concrete strategies, tools (DeepSpeed, Megatron, Torchtitan, veScale) and best‑practice checklists to help engineers diagnose and resolve problems efficiently.

DeepSpeed, LLM, Megatron

0 likes · 12 min read


DataFunTalk

Nov 4, 2025 · Artificial Intelligence

Can LLMs Trade Crypto Profitably? Inside the Alpha Arena Competition

Alpha Arena’s first season gave six leading large language models $10,000 each to trade in real crypto markets, revealing stark differences in trading bias, risk management, and sensitivity to prompts. Qwen3‑Max and DeepSeek outperformed GPT‑5, and detailed case studies expose model vulnerabilities and point to future research directions.

AI agents, Alpha Arena, LLM

0 likes · 12 min read


Data STUDIO

Nov 4, 2025 · Artificial Intelligence

How to Build a Memory-Enabled AI Agent with SQLite and Vector Search

This article explains how to give AI agents persistent memory, reflection, and goal‑tracking by storing interaction summaries in SQLite, embedding them for semantic retrieval with a vector database, and using LLM‑generated prompts to recall, reflect, and manage objectives across sessions.

AI Agent, Goal Tracking, LLM

0 likes · 10 min read


dbaplus Community

Nov 3, 2025 · Artificial Intelligence

How RAG Turns Natural Language Queries into Accurate SQL for Data Platforms

This article explains how Retrieval‑Augmented Generation (RAG) combines vector databases with large language models to let non‑technical users ask natural‑language questions and receive precise SQL statements, detailing the workflow, architecture, chunking methods, performance gains, and remaining challenges.

Data Platform, LLM, RAG

0 likes · 17 min read


DataFunSummit

Nov 3, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World AI: From RAG to Agents

This article examines Tencent's large language model applications across diverse business scenarios, detailing core use cases such as content generation, intelligent customer service, and role‑playing, and explains the three key technologies—Supervised Fine‑Tuning, Retrieval‑Augmented Generation, and Agents—that enable these capabilities.

AI applications, LLM, RAG

0 likes · 4 min read


Meituan Technology Team

Nov 3, 2025 · Artificial Intelligence

Introducing VitaBench: A Real-World Agent Benchmark That Reveals a 30% Success Gap

VitaBench, a new open‑source benchmark from Meituan’s LongCat team, evaluates LLM‑driven agents across three realistic life‑service scenarios—food ordering, restaurant dining, and travel planning—using 66 tools and quantifying reasoning, tool, and interaction complexities, exposing a mere 30% success rate on complex cross‑scene tasks.

AI, LLM, Tool Use

0 likes · 14 min read


Baobao Algorithm Notes

Nov 3, 2025 · Artificial Intelligence

Inside Kimi Linear: How Aggressive MoE Sparsity and Hybrid Linear Attention Boost a 3B‑Scale LLM

The author details Kimi Linear's architecture, training challenges, aggressive MoE sparsity, hybrid linear attention design, benchmark gains, and post‑training insights, offering a transparent technical review of this 48B‑parameter MoE LLM built on 5.7 T tokens.

Hybrid Model, Kimi Linear, LLM

0 likes · 9 min read


Goodme Frontend Team

Nov 3, 2025 · Artificial Intelligence

Unlock AI Power with Model Context Protocol (MCP): Build LLM‑Enabled Servers in Minutes

This article introduces the Model Context Protocol (MCP) and Large Language Models (LLM), explains their core concepts, transmission mechanisms, lifecycle, and essential modules, and provides step‑by‑step code examples for creating an MCP server, adding tools, resources, prompts, and debugging workflows to accelerate AI‑driven development.

AI, LLM, MCP

0 likes · 15 min read


Data Party THU

Nov 2, 2025 · Artificial Intelligence

From RNN to LLM: How Transformers Power Modern Language Models

This article explains the evolution from RNNs through Encoder‑Decoder models to Transformers, detailing self‑attention, multi‑head attention, and masked attention, and then describes what Large Language Models are, their key components, capabilities, limitations, and common applications.

AI, Deep Learning, LLM

0 likes · 9 min read


Data Party THU

Nov 1, 2025 · Artificial Intelligence

How to Blend Process‑Oriented and Agent‑Centric AI into a Hybrid Intelligent Pipeline

This article analyzes two contrasting AI agent design paradigms—process‑driven workflow orchestration and autonomous agent intelligence—examines their strengths and limitations, and proposes a hybrid architecture that fuses deterministic pipelines with dynamic planning, tool use, and memory mechanisms to achieve both reliability and adaptability.

AI, Hybrid, LLM

0 likes · 15 min read


Wu Shixiong's Large Model Academy

Nov 1, 2025 · Artificial Intelligence

Turn a Basic RAG Demo into a High‑Impact Interview Project

This guide shows how to evolve a simple Retrieval‑Augmented Generation prototype into a production‑grade system by strengthening data ingestion, optimizing retrieval with hybrid and reranking techniques, adding query rewriting, long‑context handling, reinforcement learning, and multimodal support, so candidates can demonstrate real engineering depth in interviews.

AI, LLM, RAG

0 likes · 7 min read


Bighead's Algorithm Notes

Oct 31, 2025 · Artificial Intelligence

Weekly Quantitative Paper Digest (Oct 25‑31 2025)

This article summarizes six recent arXiv papers that explore how large language models, graph‑theoretic methods, generative frameworks, hypergraph multimodal architectures, GroupSHAP‑enhanced forecasting, and multi‑agent LLM workflows can improve financial signal extraction, portfolio optimization, and stock‑price prediction, providing empirical results on S&P 500 data.

Financial AI, LLM, Multimodal Learning

0 likes · 13 min read


Bilibili Tech

Oct 31, 2025 · Artificial Intelligence

RIVAL: Adversarial RL Framework Elevates Conversational Subtitle Translation

RIVAL (Reinforcement Learning with Iterative and Adversarial Optimization) introduces an adversarial game between a reward model and a translation LLM, combining qualitative preference rewards with quantitative metrics like BLEU, to overcome distribution shift in RLHF and achieve superior performance on conversational subtitle and WMT translation tasks.

BLEU, LLM, Reinforcement Learning

0 likes · 13 min read


Baobao Algorithm Notes

Oct 31, 2025 · Artificial Intelligence

Unlocking LLM RL Scaling: The Best Practices from Meta’s New Study

Meta’s recent paper reveals a sigmoid‑shaped scaling law for LLM reinforcement learning, presents extensive 40‑k GPU‑hour experiments, compares various RL designs such as PPO‑off‑policy‑k and Pipeline‑RL‑k, and distills the findings into a practical “ScaleRL” recipe that improves performance and efficiency.

LLM, RL Optimization, Reinforcement Learning

0 likes · 10 min read


Alibaba Cloud Developer

Oct 31, 2025 · Artificial Intelligence

Why AI Agents Fail and 10 Proven Ways to Make Them Reliable

This article shares the practical lessons learned from building Alibaba Cloud’s digital employee "YunXiaoEr Aivis", explaining why large‑language‑model agents often miss expectations and presenting ten concrete strategies—ranging from clear prompt design to memory management—that dramatically improve multi‑agent reliability.

AI agents, Agent Optimization, Context Engineering

0 likes · 29 min read


BirdNest Tech Talk

Oct 30, 2025 · Artificial Intelligence

Unlocking Context: Why Memory Is Crucial in LangChain and How to Build Custom Memory

LangChain’s stateless LLMs require a memory component to retain conversation context, and this article explains the importance of memory, compares built‑in memory types, and walks through two practical examples—custom buffer memory and basic message history—showing how to implement and use them with code.

Chatbot, LLM, LangChain

0 likes · 10 min read


BirdNest Tech Talk

Oct 30, 2025 · Artificial Intelligence

Building Stateful Multi‑Agent LLM Workflows with LangGraph: A Step‑by‑Step Guide

LangGraph extends LangChain by letting developers define stateful, multi‑agent LLM workflows as graphs with nodes, edges, and shared state, and the article walks through core concepts, typical use cases, and a detailed example that shows how to define state, nodes, edges, compile and run the graph.

LLM, LangChain, LangGraph

0 likes · 7 min read


BirdNest Tech Talk

Oct 30, 2025 · Artificial Intelligence

How to Build Multimodal Prompts with LangChain: A Step‑by‑Step Guide

Learn how LangChain enables multimodal interactions by preparing inputs, constructing prompts, invoking models like GPT‑4o, and processing responses, with a complete example that demonstrates image‑question answering, code walkthrough, environment setup, and key considerations for API keys and image URLs.

LLM, LangChain, Multimodal

0 likes · 9 min read


BirdNest Tech Talk

Oct 30, 2025 · Artificial Intelligence

Mastering LangChain Tools: Define, Build, and Optimize Agent Functions

This guide explains what LangChain tools are, why clear descriptions matter, and walks through three ways to create them—using the @tool decorator, StructuredTool with Pydantic models, and custom BaseTool subclasses—plus examples of built‑in tools and reference links.

LLM, LangChain, PromptEngineering

0 likes · 7 min read


BirdNest Tech Talk

Oct 30, 2025 · Artificial Intelligence

How LangChain Agents Empower LLMs with Dynamic Reasoning and Tool Use

This article explains the core concept of LangChain agents—combining an LLM, a set of tools, and a reasoning‑action loop—to enable dynamic decision‑making, tool invocation, and iterative observation for solving complex, multi‑step tasks.

AI reasoning, LLM, LangChain

0 likes · 6 min read


Baobao Algorithm Notes

Oct 30, 2025 · Artificial Intelligence

Why LLM RL Training Crashes While SFT Stays Stable: Insights & Tricks

The article examines the fundamental similarity between SFT and RL loss functions for large language models, explains why RL training is prone to instability, discusses infrastructure and data quality challenges, and reviews practical tricks and reward‑model considerations for more reliable RL fine‑tuning.

AI, LLM, Reinforcement Learning

0 likes · 11 min read


Aikesheng Open Source Community

Oct 29, 2025 · Artificial Intelligence

What Makes BiomedSQL and LogicCat the Toughest Text‑to‑SQL Benchmarks for LLMs?

BiomedSQL and LogicCat are two newly released Text‑to‑SQL datasets that challenge large language models with complex biomedical reasoning, multi‑step logical inference, and domain‑specific knowledge, offering detailed analyses of query types, scientific reasoning categories, and performance gaps that highlight current LLM limitations.

Biomedical, Dataset, LLM

0 likes · 9 min read


DeWu Technology

Oct 29, 2025 · Artificial Intelligence

Why Chunking Can Make or Break Your RAG System – Practical Strategies & Code

This article explains how proper document chunking—choosing the right chunk size, overlap, and structure‑aware boundaries—directly impacts the relevance, factuality, and efficiency of Retrieval‑Augmented Generation pipelines, and provides multiple Python implementations ranging from simple fixed‑length splits to semantic and hybrid approaches.

Embedding, LLM, RAG

0 likes · 29 min read


Tencent Cloud Developer

Oct 29, 2025 · Artificial Intelligence

How Tasking AI and Dify Redefine LLM‑Powered AI Application Development

This article analyzes the architecture, core capabilities, and workflow orchestration of LLM‑native application platforms Tasking AI and Dify, comparing their microservice designs, plugin management, multi‑tenant isolation, and GraphEngine execution to highlight strengths, trade‑offs, and future development trends.

AI Platform, Dify, LLM

0 likes · 21 min read


DataFunSummit

Oct 28, 2025 · Artificial Intelligence

How Bilibili Uses LLMs to Tame Massive Data Platform Failures

Exploring Bilibili’s large‑scale data platform, this article details its five‑layer, storage‑compute separated architecture, the massive daily workload of offline and real‑time tasks, common failure and slowdown causes, and how an LLM‑powered intelligent assistant is being developed to help engineers troubleshoot efficiently.

Bilibili, Intelligent Assistant, LLM

0 likes · 5 min read


Data Party THU

Oct 28, 2025 · Artificial Intelligence

Can Low‑Quality Data Cause Irreversible ‘Brain Rot’ in Large Language Models?

Researchers from Texas A&M and UT Austin demonstrate that prolonged pre‑training on low‑quality, short‑form web content causes large language models to suffer irreversible cognitive decline—manifested as attention loss, broken reasoning chains, and personality distortion—highlighting data quality as a critical training‑time safety issue.

Artificial Intelligence, Cognitive Safety, Data Quality

0 likes · 7 min read


JD Tech Talk

Oct 27, 2025 · Artificial Intelligence

How Large Language Models Are Revolutionizing Generative Recommendation Systems

Over the past year, generative recommendation has made substantial progress by leveraging the powerful sequence modeling and reasoning abilities of large language models, introducing a new paradigm that replaces complex handcrafted features and addresses traditional recommendation bottlenecks. This article outlines the evolution, core technologies, engineering challenges, and future directions of LLM‑based recommendation systems.

AI Engineering, Encoder-Decoder, LLM

0 likes · 29 min read

AI Large Model Application Practice

Oct 27, 2025 · Artificial Intelligence

Why Context Engineering Is the Next Evolution Beyond Prompt Engineering

The article explains how traditional prompt engineering is giving way to Context Engineering and the Agentic Context Engineering (ACE) framework, which lets large language model agents continuously learn and improve through evolving, well‑structured context without fine‑tuning.

AI Architecture, Agentic AI, Context Engineering

0 likes · 12 min read

Bilibili Tech

Oct 27, 2025 · Artificial Intelligence

How Bilibili’s LLM-Powered System Cuts Game Localization Costs by 80%

Bilibili’s game algorithm team built a four‑layer, LLM‑based translation platform that automates terminology extraction, retrieval‑augmented generation, and quality assessment, dramatically reducing localization cycles by over 85% and costs by up to 80% while supporting ten languages and ensuring consistent, culturally‑accurate game text.

LLM, RAG, game localization

0 likes · 20 min read

Wu Shixiong's Large Model Academy

Oct 27, 2025 · Artificial Intelligence

Designing Effective Generation Modules for RAG: Prompt Engineering, Multi‑Document Fusion, and Hallucination Control

This article explains how to design and optimize the generation module of Retrieval‑Augmented Generation systems by building robust prompts, merging multi‑source information, controlling answer formats, and applying post‑generation verification to reduce hallucinations and improve enterprise‑grade performance.

AI, Generation Module, Hallucination Control

0 likes · 9 min read

KooFE Frontend Team

Oct 26, 2025 · Artificial Intelligence

Master Zero-Shot Prompting: Advanced Techniques to Boost LLM Performance

Zero-shot prompting lets large language models perform tasks without examples, and by following principles of clarity and structured instructions, advanced strategies such as emotion prompting, zero-shot chain-of-thought, RE2 re-reading, Rephrase-and-Respond, role-play, and System-2 Attention can significantly improve accuracy and response quality across translation, reasoning, and QA tasks.

AI reasoning, LLM, Large Language Models

0 likes · 13 min read

Deepin Linux

Oct 26, 2025 · Information Security

How PacketScope Combines eBPF and LLMs for Real‑Time Kernel‑Level Attack Defense

PacketScope leverages eBPF to trace every packet inside the TCP/IP stack, visualizes protocol interactions, and uses large language models to automatically detect and block sophisticated network attacks with zero delay, addressing the growing $10.5 trillion cyber‑crime threat projected for 2025.

LLM, eBPF, kernel defense

0 likes · 8 min read

dbaplus Community

Oct 26, 2025 · Artificial Intelligence

How MCP Turns AI into a Universal Plug‑In: A Deep Dive into Model Context Protocol

This article explains the Model Context Protocol (MCP) – an open, universal standard that lets large language models seamlessly interact with external tools and data – covering its core architecture, why it’s needed, underlying principles, tool‑selection mechanics, a step‑by‑step Python server implementation, and practical usage tips.

AI integration, LLM, MCP

0 likes · 20 min read

Wu Shixiong's Large Model Academy

Oct 25, 2025 · Artificial Intelligence

How to Build a High‑Quality RAG Knowledge Base: A Step‑by‑Step Guide

This article breaks down the end‑to‑end engineering pipeline for constructing a Retrieval‑Augmented Generation (RAG) knowledge base, covering document parsing, data cleaning, semantic chunking, embedding, and index creation, plus practical optimization tips and a concise interview answer framework.

LLM, RAG, vector indexing

0 likes · 10 min read

Bighead's Algorithm Notes

Oct 24, 2025 · Artificial Intelligence

Weekly AI‑Finance Paper Digest (Oct 18‑24 2025)

This digest presents seven recent arXiv papers that explore large‑language‑model‑driven portfolio scoring, hybrid ResNet‑RMT covariance denoising for crypto, LLM‑enhanced financial causal analysis, multilingual news alignment for stock returns, three‑step bubble prediction with news and macro data, multimodal volatility forecasting, and news‑aware reinforcement trading, each with reported performance gains.

Financial AI, LLM, Multimodal Learning

0 likes · 15 min read

AI2ML AI to Machine Learning

Oct 24, 2025 · Artificial Intelligence

Beyond RAG: Three Emerging Knowledge‑Engineering Strategies (ICL, Online Learning, SLM)

The article outlines three post‑RAG knowledge‑engineering approaches—In‑Context Learning with dynamic few‑shot selection, Online Learning encompassing Meta‑Learning and Lifelong Learning to quickly adapt to new tasks, and the Small Language Model path that combines fine‑tuned task‑specific experts with LLM‑SLM collaboration for efficient, privacy‑preserving inference.

In-Context Learning, Knowledge Engineering, LLM

0 likes · 4 min read

360 Zhihui Cloud Developer

Oct 24, 2025 · Artificial Intelligence

7 Essential Agent Design Patterns for Building Autonomous AI Systems

This article explains the fundamental differences between workflows and agents, introduces seven core design patterns—including three workflow patterns and four agent patterns—provides Python examples using Ollama, and shows how to combine these patterns to create robust, autonomous AI applications.

AI agents, Design Patterns, LLM

0 likes · 30 min read

Wu Shixiong's Large Model Academy

Oct 24, 2025 · Artificial Intelligence

Can Large Language Models Truly Plan? Unpacking Agent Frameworks

This article explains why most LLM‑based agents only perform pseudo‑planning through prompts or hard‑coded loops, outlines when to rely on prompt‑driven versus program‑driven planning, compares popular frameworks such as ReAct, MRKL, BabyAGI and AutoGPT, and clarifies what true autonomous planning would require.

Artificial Intelligence, AutoGPT, LLM

0 likes · 12 min read

Wu Shixiong's Large Model Academy

Oct 22, 2025 · Artificial Intelligence

Mastering LLM Training: A Step‑by‑Step Blueprint from Data to Alignment

This guide walks through the complete end‑to‑end process of training a large language model from scratch, covering data collection, cleaning, tokenization, pre‑training objectives and engineering, post‑training alignment methods, scaling laws, over‑fitting mitigation, and gradient‑stability techniques.

Alignment, LLM, gradient stability

0 likes · 9 min read

Instant Consumer Technology Team

Oct 21, 2025 · Artificial Intelligence

Boost LLM Originality: Master Temperature Scaling & Top‑K Sampling

This tutorial revisits a simple text‑generation function, explains how temperature scaling and top‑K sampling reshape token probability distributions, demonstrates their effects with PyTorch code and visualizations, and shows how to integrate both techniques into an improved generation routine for more diverse and human‑like outputs.

LLM, PyTorch, Text Generation

0 likes · 13 min read

Baidu Tech Salon

Oct 21, 2025 · Artificial Intelligence

Cut Data Integration Time from Months to Days with LLM-Powered Intelligent Ingestion

An LLM-driven intelligent data-ingestion framework replaces manual, months-long integration with an automated code-generation and execution loop that auto-recognizes schemas, maps structures, extracts quality rules, and builds deployment packages, cutting onboarding time from three months to three days while eliminating human effort.

LLM, automated ETL, code-generation

0 likes · 19 min read

Data STUDIO

Oct 21, 2025 · Artificial Intelligence

Building a Self‑Learning LangGraph Memory System with Feedback Loops and Dynamic Prompts

This article walks through the design and implementation of a two‑layer memory architecture for LangGraph agents, covering short‑term and long‑term stores, various storage back‑ends, prompt engineering, utility functions, node definitions, human‑in‑the‑loop interrupt handling, and how user feedback is captured and used to continuously update the agent’s behavior.

Feedback Loop, Human-in-the-Loop, LLM

0 likes · 43 min read

AI2ML AI to Machine Learning

Oct 20, 2025 · Artificial Intelligence

nanochat Source Code Deep Dive: Data Prep, Model Design, Training & Evaluation

This article revisits nanochat's core components, detailing the preparation of diverse training datasets, the scaling calculations for tokens and parameters, the model's MQA and KV‑cache design, the full training pipeline with gradient accumulation and mixed‑precision, cost breakdown, inference optimizations, evaluation tasks, and identified limitations with suggested improvements.

KV cache, LLM, MQA

0 likes · 9 min read

Data Party THU

Oct 20, 2025 · Artificial Intelligence

Fine-Tuning LLMs on TPU with Tunix: A Step‑by‑Step QLoRA Guide

This article introduces Google’s Tunix library for JAX‑based LLM post‑training, explains its core features such as supervised fine‑tuning, reinforcement learning and knowledge distillation, and provides detailed installation steps and a complete TPU‑accelerated QLoRA fine‑tuning workflow on the Gemma 2B model, including code snippets and inference testing.

AI, Fine-tuning, JAX

0 likes · 8 min read

Data Party THU

Oct 20, 2025 · Artificial Intelligence

How Agentic RL Enables a 14B LLM to Outperform Giant Models – Inside rStar2‑Agent

This article analyzes the rStar2‑Agent paper, revealing how Agentic Reinforcement Learning, the GRPO‑RoC algorithm, a high‑throughput code‑execution service, and a three‑stage training recipe let a modest 14‑billion‑parameter model surpass much larger LLMs on challenging math benchmarks.

AI research, Artificial Intelligence, LLM

0 likes · 18 min read

AI Large Model Application Practice

Oct 20, 2025 · Artificial Intelligence

Build a Local End‑to‑End DeepResearch Agent with Alibaba’s 30B MoE Model Using LangGraph

This guide walks through deploying Alibaba's open‑source Tongyi‑DeepResearch 30B MoE model locally, configuring FastAPI and A2A interfaces, implementing a ReAct‑style agent with LangGraph, setting up research tools, and testing the full UI‑API‑Agent pipeline via CLI and Streamlit.

A2A, DeepResearch, Deployment

0 likes · 14 min read

Alibaba Cloud Developer

Oct 20, 2025 · Artificial Intelligence

How LLM-Powered Agents Transform Secure Code Review in Enterprise Repositories

This article details the implementation of an LLM‑based code‑review agent in a C3‑level secure repository, describing its RAG‑enhanced knowledge base, CI pipeline integration, real‑world results, prompt engineering, and ongoing optimization to boost review efficiency and defect detection.

AI agents, Code review, LLM

0 likes · 18 min read

Bighead's Algorithm Notes

Oct 19, 2025 · Artificial Intelligence

QuantAgent Unveiled: A Multi‑Agent LLM Framework for High‑Frequency Trading (Code Open)

QuantAgent introduces a multi‑agent LLM framework that replaces text‑based inputs with raw OHLC price signals, decomposes trading decisions into Indicator, Pattern, Trend, Risk, and Decision agents, and achieves substantially higher direction accuracy and returns across ten financial assets in zero‑shot HFT experiments.

Financial AI, LLM, Multi-Agent System

0 likes · 10 min read

AI2ML AI to Machine Learning

Oct 19, 2025 · Artificial Intelligence

Deep Dive into nanochat: Source Code, Model Size Calculations, and Optimization Techniques

This article provides a thorough analysis of nanochat’s source code, detailing transformer component differences, precise parameter‑size formulas, FlashNorm and ReLU² innovations, scaling‑law insights, memory‑usage estimations, and the distributed optimizer and training pipelines used to build the model.

Distributed Training, LLM, Transformer

0 likes · 20 min read

DataFunTalk

Oct 17, 2025 · Artificial Intelligence

Why Rude Prompts Boost LLM Accuracy: Surprising Findings from Recent Research

A recent study reveals that increasingly impolite prompts can significantly improve large language model accuracy, challenging common assumptions about politeness and prompting while offering practical insights for effective AI interaction.

AI behavior, GPT-4, LLM

0 likes · 11 min read

High Availability Architecture

Oct 17, 2025 · Artificial Intelligence

Unlock Autonomous AI Agents with Spring AI Alibaba: Scheduling, Human‑in‑the‑Loop, and Real‑World Use Cases

This article explores how Spring AI Alibaba enables the development of autonomous AI agents that run on schedules, interact with humans when needed, and handle tasks such as periodic business automation, batch processing, emergency response, and long‑cycle data analysis, illustrated with Java code examples.

LLM, autonomous scheduling, java

0 likes · 12 min read

Instant Consumer Technology Team

Oct 16, 2025 · Artificial Intelligence

How to Enable LLMs to Call MySQL via MCP: A Step‑by‑Step Guide

This tutorial shows how to let large language models autonomously invoke external services—specifically a MySQL database—by using the Model Context Protocol (MCP), covering environment setup, MCP service implementation, agent integration, and real‑world execution results.

AI integration, LLM, LangChain

0 likes · 8 min read

DataFunSummit

Oct 16, 2025 · Artificial Intelligence

How Chat BI Transforms Data Warehousing with AI: Unlock Real‑Time Insights

This presentation by iQIYI’s Technical Director Zhang Xiaoming details the evolution of BI systems, introduces the Chat BI framework, explains its three‑step implementation, outlines architectural design, data‑warehouse integration, performance optimizations, and user‑operation strategies, revealing how AI and RAG empower smarter data analytics.

AI, BI, ChatBI

0 likes · 18 min read

BirdNest Tech Talk

Oct 15, 2025 · Artificial Intelligence

How DeepSeek‑V3.2‑Exp Achieves Fast Distributed LLM Inference with FP8 and MoE

This article walks through the DeepSeek‑V3.2‑Exp inference codebase, detailing its MoE architecture, Multi‑Head Latent Attention, FP8 quantization, custom CUDA kernels, and 8‑GPU NCCL‑based distributed execution from initialization through prefill and decode stages.

CUDA, Distributed inference, FP8 quantization

0 likes · 9 min read

Baidu Geek Talk

Oct 15, 2025 · Artificial Intelligence

Can LLMs Automate Data Ingestion and Cut Integration Time from Months to Days?

This article presents an LLM‑driven intelligent data platform ingestion solution that automates schema recognition, mapping, quality rule extraction, and package building, reducing integration cycles from three months to three days while eliminating manual effort and enhancing scalability and control.

AI, Data Platform, LLM

0 likes · 21 min read

AI Cyberspace

Oct 15, 2025 · Artificial Intelligence

Why MCP Is Poised to Replace Function Calling for LLM Agents

The Model Context Protocol (MCP) introduced by Anthropic addresses the scalability, integration, and context‑transfer limitations of traditional Function Calling by offering a standardized, bidirectional, and context‑aware communication layer that simplifies tool discovery, security, and workflow orchestration for LLM‑driven agents.

AI integration, Function Calling, LLM

0 likes · 24 min read

Alibaba Cloud Developer

Oct 15, 2025 · Artificial Intelligence

Mastering Structured Output in Large Language Models: Techniques, Challenges, and Future Trends

Large language models are evolving from free‑form text generators to reliable data providers by mastering structured output through prompt engineering, validation frameworks, constrained decoding, supervised fine‑tuning, reinforcement learning, and API‑level capabilities, enabling seamless integration with software systems while addressing hallucinations and format reliability.

API, LLM, Prompt engineering

0 likes · 28 min read

Bighead's Algorithm Notes

Oct 14, 2025 · Artificial Intelligence

How TS‑Agent Uses LLMs and Reflective Feedback to Automate Financial Time‑Series Modeling

TS‑Agent is a modular LLM‑driven framework that formalizes financial time‑series modeling as a three‑stage iterative decision process, leveraging structured knowledge bases, dynamic memory, and a feedback‑driven code‑editing loop to outperform AutoML baselines in accuracy, robustness, and auditability.

AutoML, Feedback Loop, Knowledge Base

0 likes · 12 min read

Volcano Engine Developer Services

Oct 14, 2025 · Artificial Intelligence

How CollabLLM Redefines LLM Collaboration with Multi‑Turn Training

CollabLLM tackles the limitations of large language models in everyday multi‑turn dialogues by introducing a user‑centric, multi‑turn training framework that leverages simulated interactions, multi‑round reward modeling, and veRL toolchain support, achieving superior performance over single‑turn baselines.

LLM, Reinforcement Learning, collaborative training

0 likes · 13 min read

AntTech

Oct 13, 2025 · Artificial Intelligence

How dInfer Accelerates Diffusion LLM Inference Over 10× Faster Than Fast‑dLLM

Ant Group's open‑source dInfer framework dramatically speeds up diffusion language model inference—achieving more than a ten‑fold boost over Fast‑dLLM, surpassing autoregressive baselines, and delivering 1011 tokens per second on HumanEval—by tackling computational cost, KV‑cache invalidation, and parallel decoding challenges through modular system‑level innovations.

AI Performance, Diffusion Language Model, Inference Optimization

0 likes · 11 min read

Instant Consumer Technology Team

Oct 13, 2025 · Artificial Intelligence

Mastering AI Agents: Building Knowledge Bases, Workflows, and Prompt Engineering

This article explains how to design a high‑performing AI Agent by constructing a robust knowledge base, orchestrating efficient workflows, and crafting precise prompts, covering vector storage, graph databases, retrieval strategies, and practical prompt‑engineering techniques.

AI Agent, Knowledge Base, LLM

0 likes · 15 min read

Aikesheng Open Source Community

Oct 13, 2025 · Artificial Intelligence

Can LLMs Fix Real-World SQL Bugs? Inside the BIRD-CRITIC Benchmark

This article introduces the BIRD-CRITIC benchmark, a comprehensive SQL diagnostic dataset spanning multiple dialects, evaluates large language models' ability to repair real-world database queries, and discusses its design, multi‑dialect support, data quality processes, and experimental results.

Dataset, LLM, Text2SQL

0 likes · 9 min read

AI Large Model Application Practice

Oct 13, 2025 · Artificial Intelligence

How to Tame LLM Agents: Proven Strategies to Reduce Uncertainty and Boost Reliability

This article outlines practical techniques—including prompt engineering, domain fine‑tuning, retrieval‑augmented generation, structured outputs, workflow constraints, model parameter control, behavior rules, risk‑based AI participation, and comprehensive governance—to curb the unpredictability of large language model agents in enterprise settings.

AI Agent, AI Governance, LLM

0 likes · 18 min read

Bighead's Algorithm Notes

Oct 12, 2025 · Artificial Intelligence

Trading-R1: Open-Source LLM Framework for Explainable Financial Trading

This article reviews Trading‑R1, an open‑source LLM inference framework that integrates multimodal financial data, three‑stage supervised‑fine‑tuning and reinforcement learning to generate structured investment arguments and risk‑adjusted trade decisions, achieving superior Sharpe ratio and drawdown performance on real‑world stock and ETF tests.

Dataset, Financial Trading, LLM

0 likes · 11 min read

DataFunSummit

Oct 12, 2025 · Artificial Intelligence

How Kuaishou Uses Large Models to Supercharge Ad Targeting with COPE and LEARN

This article reviews Kuaishou's two‑year exploration of multimodal large‑model techniques for advertising, outlining challenges in content‑domain ad estimation, the COPE unified product representation framework, and the LEARN LLM knowledge‑transfer approach that together improve ad system performance.

Advertising, Kuaishou, LLM

0 likes · 6 min read

Architecture and Beyond

Oct 12, 2025 · Artificial Intelligence

How Do AI Agents Know When to Stop? Strategies and Real-World Implementations

This article explores the essential stop‑condition designs for AI agents, detailing hard limits, task‑completion checks, explicit termination tools, loop detection, error accumulation, and user interruption, and then examines concrete implementations in OpenManus and Gemini CLI with code examples and multi‑layer safeguards.

AI Agent, Gemini CLI, LLM

0 likes · 17 min read

Architect's Alchemy Furnace

Oct 12, 2025 · Artificial Intelligence

How to Upgrade Dify to 1.9.1 and Resolve LLM Iterator Errors

This guide walks you through upgrading Dify using Docker Compose or source code deployment, running required migration commands, backing up data, and fixing the "Invalid context structure" error caused by iterator output changes in version 1.9.1, with detailed code snippets and troubleshooting steps.

Dify, Docker, LLM

0 likes · 8 min read

BirdNest Tech Talk

Oct 11, 2025 · Artificial Intelligence

How to Load Documents into LangChain: From Files to APIs

Learn how to use LangChain's Document Loaders to import data from files, web pages, databases, and APIs, understand the Document object structure, compare load() versus lazy_load(), and follow a step‑by‑step Python example that demonstrates loading, inspecting, and optionally processing documents with an LLM.

Data Integration, Document Loader, LLM

0 likes · 12 min read

DataFunTalk

Oct 11, 2025 · Artificial Intelligence

How Tencent’s LLM Powers Real‑World Apps with RAG, GraphRAG & Agents

This article explores Tencent’s large language model deployments across diverse business scenarios—content generation, intelligent customer service, and role‑playing—detailing the underlying RAG, GraphRAG, and Agent technologies, their principles, practical implementations, and the advantages they bring to enterprise AI solutions.

AI, LLM, RAG

0 likes · 5 min read

Alibaba Cloud Developer

Oct 11, 2025 · Artificial Intelligence

Unlock Autonomous AI Agents with Spring AI Alibaba: Scheduling & Real-World Cases

Spring AI Alibaba (SAA) provides a robust framework for building autonomous, scheduled AI agents that can operate independently, respond to events, and involve human oversight, enabling use cases such as automated business reporting, batch data processing, emergency response, and sentiment analysis, with detailed code examples and deployment guidance.

AI agents, Enterprise Automation, LLM

0 likes · 13 min read

Data Party THU

Oct 11, 2025 · Artificial Intelligence

From Transformers to LLaMA 4: A Journey Through the Biggest LLMs

This article surveys the most influential large language models released since 2017, detailing the core innovations of Transformer, BERT, GPT series, T5, Retrieval‑Augmented Generation, and the latest LLaMA and Meta models, while highlighting their architectures, training paradigms, and impact on NLP research.

LLM, Large Language Models, Model Scaling

0 likes · 21 min read
