Tagged articles
2014 articles
Baidu Geek Talk
Dec 24, 2025 · Artificial Intelligence

Context Parallelism Slashes TTFT by 80% for 128K-Token LLMs

The article explains how Baidu’s Baige team integrated a Context Parallelism (CP) strategy into DeepSeek V3.2, detailing the DSA architecture, the limitations of traditional tensor and sequence parallelism, and how CP distributes computation and memory across GPUs to cut time‑to‑first‑token (TTFT) latency by up to 80% for ultra‑long 128K‑token contexts.

Context Parallelism · DeepSeek · LLM
0 likes · 9 min read
Tencent Technical Engineering
Dec 24, 2025 · Artificial Intelligence

Build a Mini LLM from Scratch: Step‑by‑Step Guide to Tokenizer, Attention, and Transformer

This article walks through building a miniature large language model from the ground up, covering model architecture, tokenization methods, BPE vocabulary building, embedding, positional encoding, attention mechanisms, multi‑head attention, transformer blocks, training pipelines, inference, and sampling strategies, all with runnable Python code.

Deep Learning · LLM · Python
0 likes · 34 min read
Baidu Intelligent Cloud Tech Hub
Dec 24, 2025 · Artificial Intelligence

How Context Parallelism Slashes LLM First‑Token Latency by 80% for 128K Tokens

The article explains how the newly merged Context Parallelism (CP) technique in SGLang, combined with DeepSeek V3.2's Sparse Attention architecture, reduces first‑token latency by up to 80% and alleviates memory pressure for ultra‑long 128K‑token sequences, detailing both algorithmic innovations and engineering solutions.

AI Infrastructure · Context Parallelism · Distributed inference
0 likes · 10 min read
Bighead's Algorithm Notes
Dec 23, 2025 · Artificial Intelligence

How H3M‑SSMoEs Combines Hypergraph Multimodal Learning and LLM Reasoning to Predict Stock Direction

The paper introduces H3M‑SSMoEs, a framework that integrates a multi‑context hypergraph for fine‑grained spatio‑temporal dynamics with a frozen Llama‑3.2‑1B LLM adapter, and a style‑structured expert mixture to jointly model stock relationships, multimodal semantics, and market regimes, achieving superior accuracy and investment returns on DJIA, NASDAQ‑100, and S&P‑100 benchmarks.

Financial AI · Hypergraph · LLM
0 likes · 14 min read
Alibaba Cloud Infrastructure
Dec 22, 2025 · Artificial Intelligence

Boost LLM Inference with KV‑Cache‑Aware Routing on Alibaba Cloud ACK GIE

This article explains why KV‑Cache hit rate is critical for large‑model inference, describes vLLM's automatic prefix caching, outlines the distributed cache challenges, and provides a step‑by‑step guide to deploying Alibaba Cloud ACK Gateway with Inference Extension's precise‑mode prefix‑cache‑aware routing, backed by benchmark results.

Alibaba Cloud · Inference · KV cache
0 likes · 18 min read
AsiaInfo Technology: New Tech Exploration
Dec 22, 2025 · Artificial Intelligence

How Advanced RAG Techniques Are Redefining Enterprise Knowledge Services

This article examines four cutting‑edge Retrieval‑Augmented Generation frameworks—Adaptive RAG, Agentic RAG, OG‑RAG, and OAG—detailing their definitions, core mechanisms, performance gains, and practical selection guidance for complex enterprise scenarios, while highlighting future research directions.

Agentic AI · Enterprise Knowledge · LLM
0 likes · 21 min read
JD Tech
Dec 22, 2025 · Artificial Intelligence

Build Flexible Multi‑Agent Systems Like LEGO with OxyGent – New Features Unveiled

The OxyGent 1.0.8 release introduces multimodal messaging, fine‑grained control, MCP reconnection, and front‑end streaming, while detailing its stateless AOP architecture, execution lifecycle, four data scopes, real‑world use cases, community feedback, and a step‑by‑step tutorial for rapid adoption.

AI · Framework · LLM
0 likes · 11 min read
Alibaba Cloud Developer
Dec 22, 2025 · Artificial Intelligence

Turning Real‑Time Hotspot Detection into AI‑Powered E‑Commerce Recommendations

Traditional recommendation systems lag behind fast‑moving external trends, missing the freshness and surprise users crave. This article details an end‑to‑end AI pipeline that perceives, understands, and reacts to hotspots within hours, automatically generating high‑quality product selections and continuously optimizing through feedback loops.

AI recommendation · LLM · Multimodal
0 likes · 25 min read
Architect's Alchemy Furnace
Dec 21, 2025 · Artificial Intelligence

Deploy and Explore Open WebUI: A Feature‑Rich Self‑Hosted AI Platform

Open WebUI is a self‑hosted, extensible AI platform that runs fully offline, supports multiple LLM back‑ends such as Ollama and OpenAI‑compatible APIs, offers built‑in RAG, role‑based access, multi‑model chat, markdown/LaTeX, image generation, and provides detailed Docker, pip, and Kubernetes installation guides with ready‑to‑run commands.

AI Platform · Docker · LLM
0 likes · 11 min read
Advanced AI Application Practice
Dec 20, 2025 · Artificial Intelligence

Master System, User, Assistant Roles to Get Precise AI Testing Answers from LLMs

This article explains how the System, User, and Assistant roles in large-language-model chat APIs shape response quality, demonstrates their impact with concrete Python code examples, compares outcomes with and without System prompts, and offers practical tips for crafting effective prompts to achieve concise, relevant AI testing guidance.

AI testing · Assistant Role · LLM
0 likes · 14 min read
Design Hub
Dec 20, 2025 · Artificial Intelligence

Must-Read: K's 2025 AI Review – 6 Paradigm Shifts Reshaping Our World

The article reviews six 2025 paradigm shifts in large language models—from the rise of verifiable‑reward reinforcement learning and the emergence of AI "ghosts" to new "Cursor for X" middle layers, local agents like Claude Code, Vibe Coding that lets users program by conversation, and visual interaction driven by Gemini Nano Banana—highlighting their technical impact and design implications.

AI agents · LLM · Local AI
0 likes · 12 min read
PaperAgent
Dec 20, 2025 · Industry Insights

What 2025 Tells Us About the Future of Large Language Models

The 2025 LLM year‑in‑review highlights paradigm shifts such as RLVR training, uneven “saw‑tooth” intelligence, the rise of Cursor‑style applications, Claude Code agents running locally, Vibe Coding, and the Nano Banana GUI revolution, concluding that current models exploit only about 10% of their potential.

AI agents · LLM · Nano Banana
0 likes · 10 min read
Bighead's Algorithm Notes
Dec 19, 2025 · Artificial Intelligence

Quantitative Finance Paper Digest: Dec 13‑19 2025 Highlights

This digest presents recent arXiv papers (Dec 13‑19 2025) on AI‑driven quantitative finance, covering LLM‑based portfolio recommendation, reinforcement‑learning deep hedging, hybrid SV‑LSTM volatility forecasting, dynamic stacking ensembles, GA‑optimized SVR forecasting, and interpretable deep learning asset pricing, each with abstracts and key findings.

Deep Learning · LLM · Quantitative Finance
0 likes · 16 min read
Alibaba Cloud Native
Dec 19, 2025 · Artificial Intelligence

What Enterprises Are Learning from the State of Agent Engineering Report

The recent LangChain "State of Agent Engineering" report, combined with data from the AI‑Native Application Architecture whitepaper, reveals rapid production adoption of AI agents, persistent quality challenges, widespread observability, multi‑model strategies, and evolving evaluation practices across organizations of all sizes.

AI agents · LLM · enterprise adoption
0 likes · 10 min read
Bilibili Tech
Dec 19, 2025 · Artificial Intelligence

SABER: Switchable and Balanced Training for Efficient LLM Reasoning

SABER introduces a reinforcement‑learning framework that lets large language models dynamically switch among four token‑budgeted reasoning modes, dramatically cutting inference length while preserving or improving accuracy across math, code, and logic tasks.

Budgeted Computation · Chain-of-Thought · Efficient Reasoning
0 likes · 13 min read
PaperAgent
Dec 18, 2025 · Artificial Intelligence

Can Ontology‑Aware KG‑RAG Double Table QA Performance on Industrial Standards?

This article presents an ontology‑aware knowledge‑graph RAG framework that transforms complex, hierarchical industrial standard documents into a graph of sections, atomic propositions, and refined triples, achieving nearly double F1 scores on table‑based QA tasks and robust performance on long documents.

LLM · Ontology · RAG
0 likes · 6 min read
21CTO
Dec 17, 2025 · Artificial Intelligence

Can a New Language Make LLMs Write Code with 100% Accuracy? Meet Sui

Japanese data scientist Takato Honda introduces Sui, an open‑source programming language designed to eliminate syntax and spelling errors and to let large language models generate code with claimed 100% accuracy, offering token‑efficiency optimizations for AI‑assisted programming.

AI · LLM · Programming Language
0 likes · 4 min read
PaperAgent
Dec 17, 2025 · Artificial Intelligence

Unlocking Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys over 200 recent papers on AI agent memory, introducing a three‑dimensional framework of form, function, and dynamics, classifying memory into token‑level, parametric, and latent types, outlining their roles, lifecycle operations, benchmark datasets, open‑source frameworks, and seven emerging research directions.

AI agents · LLM · knowledge management
0 likes · 6 min read
Architects' Tech Alliance
Dec 17, 2025 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: From Theory to Scalable Deployment

This guide explains how Retrieval‑Augmented Generation (RAG) overcomes LLM knowledge staleness, hallucination, and domain‑adaptation challenges by combining external knowledge bases with real‑time retrieval, and provides detailed architecture, optimization techniques, engineering practices, monitoring, cost‑control, and future trends for building production‑grade RAG systems.

AI · Cloudflare · LLM
0 likes · 15 min read
PaperAgent
Dec 16, 2025 · Artificial Intelligence

Open Notebook: The Open‑Source, Privacy‑First Alternative to Google Notebook LM

Open Notebook is a fully local, open‑source AI notebook that rivals Google Notebook LM by supporting over 16 LLM providers, handling multimodal content, and enabling advanced multi‑speaker podcast generation while giving users complete data sovereignty and flexible deployment options.

AI Notebook · LLM · Multimodal
0 likes · 4 min read
Fighter's World
Dec 16, 2025 · Artificial Intelligence

Boosting Large Language Model Domain Expertise with Claude Skills

The article analyzes why generic LLMs struggle with domain‑specific reasoning, critiques fine‑tuning, RAG and prompt engineering, and presents Claude Skills—using progressive disclosure, Pydantic validation, and state‑machine control—to encode expert constraints as executable rules, illustrated with finance compliance and legal reasoning case studies and backed by Anthropic research.

Claude · Domain-specific · LLM
0 likes · 20 min read
JakartaEE China Community
Dec 16, 2025 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) System with Langchain4j and Ollama 3

This guide walks through the importance of Retrieval‑Augmented Generation, outlines the core Langchain4j and Ollama 3 components, and provides a complete Java example—including Maven setup, document ingestion, embedding creation, similarity search, prompt construction, and response generation—to demonstrate a functional RAG pipeline.

Embedding · LLM · LangChain4j
0 likes · 9 min read
PaperAgent
Dec 16, 2025 · Artificial Intelligence

Do LLMs Have Emotional Chains? Unveiling the Chain‑of‑Affective Across 8 Model Families

This article analyzes recent research by East China Normal University and Fudan University on whether eight major LLM families exhibit a systematic “Chain-of-Affective,” revealing how internal emotional structures influence model outputs, multi‑agent interactions, and user experience, and offering practical guidelines for mitigating emotional loops in AI systems.

AI Safety · Chain-of-Affective · Emotion
0 likes · 8 min read
Qborfy AI
Dec 16, 2025 · Artificial Intelligence

Mastering AI Function Calling: Turn LLMs into Actionable Assistants

Function Calling lets large language models invoke external tools or APIs during a conversation, transforming them from passive responders into proactive assistants; this guide explains the concept, workflow, and practical implementations with weather, parallel queries, and stock price examples using OpenAI’s Python SDK.

AI Function Calling · Chatbot · LLM
0 likes · 9 min read
Alibaba Cloud Developer
Dec 16, 2025 · Artificial Intelligence

How We Built an AI‑Powered Data Agent to Automate Data Retrieval at Scale

This article details the design and implementation of Matra, an AI‑driven data assistant for a large e‑commerce platform, covering the challenges of legacy data assets, knowledge‑base construction, GraphRAG integration, multi‑stage agent frameworks, practical results, and future plans for continuous improvement.

AI · Data Retrieval · LLM
0 likes · 22 min read
AI Large Model Application Practice
Dec 16, 2025 · Artificial Intelligence

Recreating NotebookLM’s PPT Generation with a Low‑Code Workflow

This guide shows how to use the open‑source BISHENG low‑code platform, ByteDance’s Seed‑1.6 and Seedream‑4.5 models, and a custom MCP server to build a workflow that uploads documents, performs RAG, generates structured PPT outlines with LLMs, creates page images via text‑to‑image models, and assembles a downloadable PDF, all while incorporating human‑in‑the‑loop controls.

BISHENG · HITL · LLM
0 likes · 17 min read
Old Meng AI Explorer
Dec 15, 2025 · Artificial Intelligence

Unlock Multi‑Model AI Decision Power with LLM Council – A Hands‑On Guide

LLM Council, an open‑source platform created by former OpenAI researcher Andrej Karpathy, lets users simultaneously query top LLMs such as GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5 and Grok 4, anonymously peer‑review their answers, and synthesize a final report, dramatically improving accuracy for research, tech selection and learning while remaining easy to install and run locally.

AI tool · LLM · Open-source
0 likes · 11 min read
Architect
Dec 15, 2025 · Artificial Intelligence

Demystifying LLM Architecture: From Transformers to Modern MoE Designs

This comprehensive guide explains the fundamentals of large language model (LLM) architectures, covering the original Transformer, tokenization, embeddings, positional encoding, attention mechanisms, feed‑forward networks, layer stacking, a step‑by‑step translation example, and the latest open‑source and hybrid LLM designs shaping the field.

Embedding · LLM · MoE
0 likes · 41 min read
Baidu Intelligent Cloud Tech Hub
Dec 15, 2025 · Artificial Intelligence

Baidu Baige’s Breakthrough: Orchestrating Giant LLM Inference with Silent Instances

The article details Baidu Baige’s next‑generation distributed inference platform for trillion‑parameter LLMs, explaining how automated orchestration, the FedDeployment abstraction, SplitService unified view, Adaptive HPA predictive scaling, Silent Instances for second‑level activation, and the Staggered Batched Scheduler eliminate scaling limits, reduce TTFT by 30‑40%, boost throughput by up to 20%, and achieve cost‑effective, elastic AI compute.

Distributed inference · Kubernetes · LLM
0 likes · 23 min read
Wu Shixiong's Large Model Academy
Dec 15, 2025 · Artificial Intelligence

Mastering Text2SQL: From Schema Design to Secure Multi‑Step LLM Pipelines

This article explains how Text2SQL works by teaching LLMs to understand a closed‑world database schema, constructing tightly constrained prompts, validating generated SQL, handling execution errors, and using a second LLM call to translate results into natural language, while highlighting common pitfalls and engineering best practices.

LLM · SQL Validation · Text2SQL
0 likes · 9 min read
Network Intelligence Research Center (NIRC)
Dec 15, 2025 · Artificial Intelligence

Turning LLM-Generated Network Configurations into Verified, Safe Updates with Artanis

The paper introduces Artanis, an intent‑based network configuration update framework that combines large‑language‑model generation with a verification‑feedback loop and reinforcement‑learning optimization, addressing hallucination‑induced errors and ensuring safe, policy‑compliant deployments across diverse network scales.

Configuration Management · Intent-based Networking · LLM
0 likes · 9 min read
Architect's Alchemy Furnace
Dec 13, 2025 · Artificial Intelligence

Explore 100+ Open‑Source LLM Apps and How to Run Them Locally

This guide presents a curated collection of over a hundred open‑source large language model applications—including AI agents, RAG pipelines, and domain‑specific tools—explains their categories, showcases example projects, and provides step‑by‑step instructions to clone and run them on your own machine.

AI agents · GitHub · LLM
0 likes · 8 min read
PaperAgent
Dec 12, 2025 · Artificial Intelligence

How BookRAG Redefines Long-Document Retrieval with Hierarchical Indexing

BookRAG introduces a hierarchical, structure‑aware indexing method that combines tree‑based document representation with graph‑based entity linking and an agent‑driven retrieval pipeline, achieving up to 71.2% recall improvement on multimodal long‑document benchmarks while cutting token usage and latency dramatically.

Agent Retrieval · Hierarchical Indexing · LLM
0 likes · 7 min read
Xiaohongshu Tech REDtech
Dec 11, 2025 · Artificial Intelligence

Fine‑Grained Activation Offloading: Cutting Memory Use While Preserving LLM Throughput

The article introduces a fine‑grained activation offloading technique implemented in Megatron‑Core that offloads module‑level activations to CPU, overlaps transfer with computation, and remains compatible with pipeline and virtual pipeline parallelism, dramatically reducing peak GPU memory for large language models while incurring minimal throughput loss.

LLM · Megatron · Memory Optimization
0 likes · 18 min read
Wu Shixiong's Large Model Academy
Dec 11, 2025 · Artificial Intelligence

Why Reward Models Need Reasoning: From Scalar Scores to RM‑R1

Interviewers increasingly ask why modern reward models must go beyond scalar scores to incorporate reasoning, and this article explains the limitations of traditional scalar reward models, the benefits of the RM‑R1 framework, and how reasoning‑based rewards improve alignment, stability, and task performance in large language model training.

AI Alignment · LLM · RLHF
0 likes · 11 min read
Sohu Tech Products
Dec 10, 2025 · Artificial Intelligence

Build a Next.js Chatbot Quickly with Vercel AI SDK

This guide explains how to integrate large language models into modern web applications using Vercel AI SDK, covering core modules, package responsibilities, when to choose each package, installation steps, example code for both backend API routes and React front‑end, and a complete quick‑start workflow.

AI integration · LLM · Next.js
0 likes · 12 min read
Baidu Intelligent Cloud Tech Hub
Dec 10, 2025 · Artificial Intelligence

Accelerate LLM Deployment on Baidu Kunlun XPU with the Open‑Source vLLM‑Kunlun Plugin

The vLLM‑Kunlun Plugin, built on the vLLM hardware‑plugin RFC, lets developers deploy any major large language model on Baidu's Kunlun XPU instantly without modifying vLLM core code, dramatically shortening migration time, providing high‑performance fusion operators, and offering open‑source tools for precision verification and profiling.

Inference · Kunlun · LLM
0 likes · 8 min read
BirdNest Tech Talk
Dec 9, 2025 · Artificial Intelligence

How BettaFish Uses Multi‑Agent AI to Break the Information Filter Bubble

BettaFish is a Go‑based, AI‑driven multi‑agent opinion analysis platform that tackles information silos, overload, and bias by aggregating data from diverse sources, iteratively refining results through reflection loops, and delivering visualized, actionable reports for scientific decision‑making.

AI · Data visualization · Go
0 likes · 24 min read
PaperAgent
Dec 9, 2025 · Artificial Intelligence

How Code Graph Model (CGM) Redefines Repository‑Level Code Understanding

The Code Graph Model (CGM) introduced by Ant's multimodal code team integrates repository‑level graph structures into open‑source LLMs, achieving a 44% solve rate on SWE‑bench Lite, eliminating agent dependence, and demonstrating a novel graph‑enhanced code model through multi‑granular graph construction, dual‑modal alignment, and a lightweight GraphRAG framework.

AI · Code Graph · GraphRAG
0 likes · 9 min read
DeWu Technology
Dec 8, 2025 · Artificial Intelligence

Unlocking Model Context Protocol (MCP): A Deep Dive into AI‑Database Integration

This article provides a comprehensive technical overview of the Model Context Protocol (MCP), an open‑standard JSON‑RPC 2.0 protocol that enables large language models to securely interact with external data sources, tools, and services, detailing its design, architecture, Python SDK implementation, transport mechanisms, and real‑world deployment examples such as the DW‑DBA‑MCP project.

LLM · MCP · Model Context Protocol
0 likes · 45 min read
Tencent Technical Engineering
Dec 8, 2025 · Artificial Intelligence

Building Persistent Long‑Term Memory for LLM Agents with LangGraph – A Complete Guide

This article explains how to give large language model agents lasting memory by combining short‑term and long‑term storage in LangGraph, covering concepts, implementation details, database persistence, tool integration, semantic search, memory‑management strategies, checkpoint handling, and a multi‑agent supervisor example.

Agent Memory · LLM · LangGraph
0 likes · 43 min read
Wuming AI
Dec 7, 2025 · Artificial Intelligence

What Is MCP and How It Revolutionizes AI Tool Integration

This article explains the MCP protocol for AI agents, detailing why a universal tool‑calling standard is needed, how it solves the M×N integration nightmare, the roles and execution stages involved, and demonstrates its use with Cherry Studio while highlighting current limitations.

AI Agent · Cherry Studio · LLM
0 likes · 20 min read
BirdNest Tech Talk
Dec 7, 2025 · Artificial Intelligence

Recreating DeerFlow’s Multi‑Agent Research Pipeline with LangGraphGo in 30 Minutes

This article walks through the open‑source DeerFlow framework—its multi‑agent architecture, core features, and a step‑by‑step implementation using the Go‑based LangGraphGo library, covering planner, researcher, reporter and podcast nodes, state‑graph design, CLI/web modes, and deployment instructions.

AI research · LLM · LangGraphGo
0 likes · 14 min read
21CTO
Dec 7, 2025 · Backend Development

Top Laravel AI Packages to Power Intelligent Web Apps

This article reviews the most popular and actively maintained Laravel AI packages—including Prism, LarAgent, Laravel AI Toolkit, and Laravel MCP—detailing their features, typical use‑cases, and how to choose the right one for building chatbots, automation agents, content generators, and AI‑enhanced Laravel applications.

AI · LLM · Laravel
0 likes · 6 min read
PaperAgent
Dec 7, 2025 · Industry Insights

What 1,000 Trillion Tokens Reveal About the Rise of Open‑Source LLMs

A massive 1,000‑trillion‑token study by a16z and OpenRouter shows open‑source models now hold a third of the LLM market, programming tasks have surged to over 50% of usage, role‑play scenarios dominate open‑source traffic, and price elasticity is surprisingly low, reshaping the competitive landscape.

AI Market · Industry Analysis · LLM
0 likes · 6 min read
Baobao Algorithm Notes
Dec 7, 2025 · Artificial Intelligence

Can RL Really Boost LLM Reasoning? A Critical Review of Recent Findings

This article critically examines recent RL‑for‑LLM studies, revealing that reinforcement learning improves search efficiency but does not extend the intrinsic reasoning capabilities of base models, and explores the underlying model‑conditioned optimization bias, comparisons with SFT distillation, and the trade‑off with catastrophic forgetting.

Catastrophic Forgetting · LLM · Model Optimization
0 likes · 11 min read
Data Party THU
Dec 6, 2025 · Artificial Intelligence

Why Adding Toxic Data Can Make Language Models Safer and More Capable

A recent study shows that deliberately mixing a moderate amount of toxic content into large‑language‑model pre‑training actually sharpens the model’s internal representation of toxicity, enabling post‑training interventions to more effectively detoxify the model while preserving or even improving its general capabilities.

LLM · Model Alignment · Toxic Data
0 likes · 10 min read
Bighead's Algorithm Notes
Dec 5, 2025 · Artificial Intelligence

Quantitative Finance Paper Summaries (Nov 29–Dec 5 2025)

This article presents concise summaries of five recent AI‑driven finance papers, covering a stress‑testing framework for LLM trading agents, an orchestration framework for financial agents, an event‑reflection memory model for stock forecasting, a hybrid LLM‑Bayesian network architecture for options wheel strategies, and their experimental results.

Benchmarking · Financial AI · LLM
0 likes · 12 min read
PaperAgent
Dec 5, 2025 · Artificial Intelligence

Can LLMs Be Trained to Confess? Inside the “Confession” Method for Honest AI

The article reviews OpenAI’s “Confession” training approach for large language models, explains why traditional RLHF fails to ensure honesty, details the confession methodology and PPO update, presents experimental results showing higher honesty rates, analyzes error cases, and discusses limitations and future risks.

AI Honesty · Artificial Intelligence · Confession Training
0 likes · 6 min read
Wu Shixiong's Large Model Academy
Dec 5, 2025 · Artificial Intelligence

Why Do LLM Function Calls Hallucinate Parameters and How to Prevent It?

This article explains the root causes of hallucinated parameters in LLM Function Calls, outlines five common failure patterns, and presents a systematic five‑step engineering framework—including schema design, prompt rules, dynamic routing, result validation, and clarification—to reliably eliminate such errors in real‑world AI agents.

AI Agent · LLM · function call
0 likes · 11 min read
Frontend AI Walk
Dec 5, 2025 · Artificial Intelligence

Master Prompt Engineering: From Random Chat to Precise Control with Zero-shot, Few-shot, and Chain‑of‑Thought

This article explains how to converse effectively with large language models by mastering three core prompting techniques—Zero‑shot, Few‑shot, and Chain‑of‑Thought—illustrated with front‑end analogies, code snippets, and a step‑by‑step DeepSeek JSON‑generation exercise that shows common pitfalls and best practices.

Chain-of-Thought · DeepSeek · Few-Shot
0 likes · 12 min read
Fun with Large Models
Dec 5, 2025 · Artificial Intelligence

DeepSeek Math V2 & V3.2: A Plain‑Language Deep Dive into Core Innovations

This article provides a detailed, easy‑to‑understand analysis of DeepSeek‑Math‑V2’s self‑verification training method and DeepSeek‑V3.2’s GRPO framework, sparse‑attention DSA mechanism, massive agent data pipeline, and benchmark results that place both models among the world’s top open‑source large language models.

DeepSeek · GRPO · LLM
0 likes · 19 min read
Bighead's Algorithm Notes
Dec 4, 2025 · Artificial Intelligence

Paper Review: RETuning Boosts Large‑Model Stock Trend Prediction Reasoning

This article analyzes the RETuning framework, which addresses LLMs' bias toward analyst opinions and lack of evidence weighting in stock movement prediction by introducing a two‑stage cold‑start fine‑tuning and reinforcement learning pipeline, evaluating it on the large Fin‑2024 dataset and demonstrating significant F1 gains, inference‑time scaling, and out‑of‑distribution robustness.

Fin-2024 · GRPO · Inference Scaling
0 likes · 12 min read
DataFunTalk
Dec 4, 2025 · Artificial Intelligence

Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking: Cutting‑Edge AI Search Techniques

This article reviews three advanced AI search solutions—Alibaba Cloud's Agentic RAG architecture for multi‑modal retrieval, Huawei's LLM‑enhanced recommendation system with factorized prompting, and Baidu's generative ranking model GRAB—detailing their technical challenges, design choices, performance gains, and deployment insights.

AI search · Baidu · Generative Ranking
0 likes · 8 min read
ShiZhen AI
Dec 4, 2025 · Artificial Intelligence

What Is a Context Window? Explaining LLM Memory Capacity

The article explains that a context window defines an LLM's token‑level memory capacity, shows how longer windows cause quadratic computation growth, introduces KV Cache as a way to extend context without exploding resources, and covers advanced techniques like Ring Attention, NIAH benchmarking, and attention decay in long sequences.

Context Window · KV cache · LLM
0 likes · 6 min read
Aikesheng Open Source Community
Dec 4, 2025 · Artificial Intelligence

Gemini 3 Pro vs DeepSeek‑V3.2‑Exp: Which LLM Dominates SQL Understanding, Optimization, and Dialect Conversion?

This report evaluates the professional‑grade LLMs Gemini 3 Pro and DeepSeek‑V3.2‑Exp on three SQL‑related dimensions—understanding, optimization, and dialect conversion—using the SCALE benchmark, presenting detailed scores, strengths, weaknesses, and practical recommendations for database engineers and decision makers.

DeepSeek · Gemini · LLM
0 likes · 16 min read
Wuming AI
Dec 3, 2025 · Artificial Intelligence

How to Reduce LLM Hallucinations: Model Selection, Web Search, and Verification Agents

This article explains a step‑by‑step workflow for mitigating large‑language‑model hallucinations by picking low‑hallucination models, leveraging internet‑enabled search tools, rephrasing queries, and creating a dedicated verification assistant with concrete prompts and a Claude implementation.

LLM · Prompt engineering · hallucination
0 likes · 6 min read
Tencent Technical Engineering
Dec 3, 2025 · Artificial Intelligence

Why Transformers Power Modern LLMs: A Deep Dive into Architecture and Mechanics

This article provides a comprehensive, step‑by‑step explanation of the Transformer architecture that underpins large language models, covering tokenization, embeddings, positional encoding, attention mechanisms, feed‑forward networks, layer stacking, a detailed translation example, visualized attention weights, and a survey of recent open‑source LLM designs such as DeepSeek V3, OLMo 2, and Gemma 3.

Embedding · LLM · Neural Network
0 likes · 38 min read
360 Smart Cloud
Dec 3, 2025 · Artificial Intelligence

How Model Distillation Enhances LLM Performance on the TLM Platform

This article explains the TLM large‑model development platform and details how knowledge distillation—using soft labels, temperature scaling, and combined loss functions—compresses teacher models into efficient student models, with practical steps and evaluation on the platform.

AI · LLM · TLM platform
0 likes · 5 min read
AntData
Dec 3, 2025 · Artificial Intelligence

How to Build and Refine Your Personal AI Agent Assistant

This article walks through turning a generic AI model into a personal assistant by explaining user‑centric workflows, crafting effective natural‑language prompts, adding clarification steps, validating AI‑generated results through multiple methods, and handling errors with product interactions to create a reliable, evolving assistant.

ChatBI · LLM · result validation
0 likes · 10 min read
DataFunTalk
Dec 2, 2025 · Artificial Intelligence

How Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking Are Redefining AI Search

This article reviews three cutting‑edge AI search and recommendation techniques—Alibaba Cloud's Agentic RAG architecture, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's GRAB generative ranking model—detailing their design challenges, multi‑modal retrieval strategies, performance gains, and real‑world deployment results.

AI agents · AI search · Generative Ranking
0 likes · 8 min read
AsiaInfo Technology: New Tech Exploration
Dec 2, 2025 · Artificial Intelligence

How LLMs Can Revolutionize Test Case Generation: Methods, Benefits, and Challenges

This article examines the shortcomings of manual test case creation, explains how large language models (LLMs) can dramatically improve efficiency, coverage, consistency, and knowledge sharing in software testing, outlines the key capabilities required, and presents a detailed end‑to‑end solution with practical steps, evaluation metrics, and future outlook.

AI automation · Knowledge Base · LLM
0 likes · 20 min read
Frontend AI Walk
Dec 2, 2025 · Artificial Intelligence

Understanding LLMs: A Frontend Developer’s Primer on Large Language Models

The article demystifies large language models for frontend developers by likening token prediction to autocomplete, explaining tokens, context windows, temperature, the two-stage training process, and the critical role of prompts, using concrete code examples and analogies to familiar frontend concepts.

Fine-tuning · Frontend Analogy · LLM
0 likes · 10 min read
Tencent Technical Engineering
Dec 1, 2025 · Artificial Intelligence

Do Machines Really Think? Inside Deep Reasoning, Scaling Laws & RLHF for LLMs

This article examines whether large language models truly think, explores the origins of deep reasoning through transformer architectures and scaling laws, reviews chain‑of‑thought and its variants, and analyzes how reinforcement learning from human feedback—including PPO, DPO, and GRPO—helps internalise step‑by‑step reasoning while pointing to future directions such as atomic thought, hierarchical models, and training‑free in‑context knowledge bases.

AI Alignment · Chain-of-Thought · LLM
0 likes · 35 min read
AI Large Model Application Practice
Dec 1, 2025 · Artificial Intelligence

Which Open‑Source Agent Memory Engine Wins? Deep Dive into Mem0, Graphiti & Cognee

This article examines the limitations of LLM short‑term context windows and compares three open‑source long‑term memory frameworks—Mem0, Graphiti, and Cognee—by detailing their architectures, storage modes, integration steps, code examples, strengths, drawbacks, and practical selection guidance for building smarter AI agents.

Agent Memory · Graphiti · LLM
0 likes · 20 min read
Bighead's Algorithm Notes
Nov 30, 2025 · Artificial Intelligence

How TSci Uses LLMs to Automate End‑to‑End Time‑Series Forecasting

The article reviews the TSci framework, an LLM‑driven multi‑agent system that automates data diagnosis, model selection, ensemble forecasting, and report generation for time‑series prediction, achieving up to 38% lower MAE than LLM baselines and improving report quality across five evaluation dimensions.

Agent Framework · LLM · TSci
0 likes · 10 min read
DataFunSummit
Nov 29, 2025 · Artificial Intelligence

How LLMs Are Transforming Long-Term Cross-Domain Interest Modeling for Recommendations

The DataFun Summit 2025 talk by JD algorithm engineer Tian Mingyang explains how generative AI is driving a paradigm shift in recommendation systems, detailing the limits of traditional models, the new dynamic cross‑domain inference chain technique, joint engineering‑algorithm optimizations, and the remaining challenges for future deployment.

AI · Cross-Domain Modeling · Engineering Optimization
0 likes · 32 min read
Data Party THU
Nov 29, 2025 · Artificial Intelligence

Unlocking AI Agents: From Fundamentals to Building Your First LLM‑Powered Agent

This comprehensive guide explores the concept of AI agents, detailing their definitions, classifications, and core interaction loops, then walks you through building a functional LLM‑driven travel assistant with step‑by‑step code, tool integration, and practical insights on agent versus workflow paradigms.

AI agents · Agent Architecture · LLM
0 likes · 39 min read
PaperAgent
Nov 29, 2025 · Industry Insights

NeurIPS 2025 Insights: AI Agents, Reasoning, and the Shift to Real-World Systems

An analysis of the 5,984 papers accepted at NeurIPS 2025 shows a decisive move from ever‑larger models toward agents, reasoning‑focused LLMs, efficiency engineering, AI for Science, and trustworthy AI, signaling the transition from a research‑toy era to an engineering‑driven AI ecosystem.

AI for Science · AI trends · LLM
0 likes · 7 min read
Huya Tech Engineering
Nov 28, 2025 · Operations

How LLMs Accelerate Root‑Cause Diagnosis in Large‑Scale Microservices

By abstracting a massive microservice system as a dynamic multi‑layer graph and integrating large language models, the article outlines three evolution stages—from manual expert debugging to rule‑based AIOps and finally LLM‑driven cognitive reasoning—detailing practical workflows, context engineering, and real‑world case studies that dramatically improve MTTR and accuracy.

Context Engineering · LLM · Microservices
0 likes · 20 min read
Bilibili Tech
Nov 28, 2025 · Artificial Intelligence

How We Built an LLM‑Powered AI Hub to Read and Analyze Community Chats

This article details the design and deployment of a multi‑layer LLM system that automatically reads massive creator group chats, extracts structured insights, mitigates hallucinations with dual‑model verification, uses few‑shot prompting for stable output, and delivers real‑time risk alerts and operational reports.

AI Operations · Few‑Shot Learning · LLM
0 likes · 14 min read
ShiZhen AI
Nov 28, 2025 · Artificial Intelligence

DeepSeekMath‑V2 Scores 118/120 on Putnam and Achieves Gold‑Level IMO Performance

DeepSeekMath‑V2, released open‑source on 27 Nov 2025, attains gold‑level results on IMO 2025, scores 118 out of 120 on the Putnam 2024 competition, introduces a generator‑verifier self‑verification architecture, uses GRPO training, and outperforms leading closed‑source models on IMO‑ProofBench.

DeepSeekMath-V2 · GRPO · LLM
0 likes · 7 min read
phodal
Nov 27, 2025 · Artificial Intelligence

How AutoDev’s Agentic RAG Turns Docs into a Programmable Knowledge Base

This article explains how AutoDev builds an Agentic Retrieval‑Augmented Generation system with a Document Query Language (DocQL) that lets LLM agents navigate hierarchical code and documentation structures using JSONPath‑like queries, detailing implementation, multi‑level keyword expansion, and experimental findings.

AI · Agentic RAG · DocQL
0 likes · 12 min read
Data Party THU
Nov 27, 2025 · Artificial Intelligence

Choosing an Agent Framework: AutoGen, AgentScope, CAMEL, LangGraph Compared

This article examines the evolution of intelligent agent frameworks, presenting a comprehensive overview of AutoGen, AgentScope, CAMEL, and LangGraph, analyzing their architectures, strengths, limitations, and suitable use cases, and offering guidance on selecting the most appropriate framework for complex multi‑agent applications.

Agent Frameworks · LLM · comparative analysis
0 likes · 31 min read
Bilibili Tech
Nov 27, 2025 · Artificial Intelligence

Mastering Agentic Systems with Blades: Concepts, Code, and Workflow Patterns

This article explains what an AI Agent is, distinguishes it from traditional workflows, and demonstrates how to build and customize agents using the Go‑based Blades framework, covering core concepts, code examples, five workflow patterns, best‑practice guidelines, and reference resources.

AI · Blades · Go
0 likes · 11 min read
phodal
Nov 26, 2025 · Artificial Intelligence

How Multi‑Agent AI Transforms Code Review into Automated Fixes

AutoDev leverages a multi‑agent architecture and comprehensive information aggregation to turn traditional, fragmented code review into an intelligent, end‑to‑end process that not only detects issues but also generates and applies corrective patches automatically.

AI · Code review · DevOps
0 likes · 9 min read
Java Tech Enthusiast
Nov 26, 2025 · Artificial Intelligence

How LLM, RAG, and AI Agents Work Together

The article clarifies how large language models (LLMs), retrieval‑augmented generation (RAG), and AI agents complement each other, describing the brain‑like reasoning of LLMs, the dynamic knowledge access provided by RAG, and the autonomous action capabilities of AI agents, plus practical usage scenarios.

AI Agent · Artificial Intelligence · LLM
0 likes · 7 min read
Bighead's Algorithm Notes
Nov 25, 2025 · Artificial Intelligence

FinSentLLM: A Multi‑LLM Framework for Financial Sentiment Prediction

FinSentLLM integrates multiple LLM experts with structured financial semantic signals, achieving 3‑6% higher accuracy and F1 on the Financial PhraseBank compared to baselines, while DCC‑GARCH and Johansen cointegration analyses confirm a statistically significant long‑term co‑movement between the predicted sentiment signals and stock market dynamics.

DCC-GARCH · FinSentLLM · Financial Sentiment Analysis
0 likes · 12 min read
AI Info Trend
Nov 25, 2025 · Artificial Intelligence

Why Claude Opus 4.5 Is the New Powerhouse for Enterprise AI Agents

Claude Opus 4.5, Anthropic’s latest flagship LLM, dramatically upgrades reasoning, tool use, and multi‑step automation, targeting high‑intensity enterprise scenarios, offering stronger coding, longer context handling, and better cost‑effectiveness, while still requiring careful prompt engineering and budgeting for token usage.

Claude Opus 4.5 · Coding Automation · Enterprise AI
0 likes · 7 min read
Tencent Technical Engineering
Nov 24, 2025 · Artificial Intelligence

Inside Google gemini-cli: Turning the Terminal into an AI Agent with ReAct Architecture

This article systematically dissects Google’s open‑source gemini‑cli, revealing how it transforms a traditional command‑line terminal into an AI‑driven collaborative interface by detailing its ReAct loop, tool‑calling mechanisms, context management, and extensible architecture for building similar terminal agents.

AI Agent · CLI · Gemini CLI
0 likes · 27 min read
Wu Shixiong's Large Model Academy
Nov 24, 2025 · Artificial Intelligence

Why Dynamic Function Routing Is the Key to Stable LLM Agents

In real‑world LLM agents, giving the model too many tools at once leads to frequent function‑call errors, but applying dynamic function routing to narrow the candidate set dramatically reduces the error rate—from over 20% down to around 1%—and provides clear guidelines on when and how to implement it.

Function Calling · LLM · agent
0 likes · 9 min read
Architect's Guide
Nov 24, 2025 · Artificial Intelligence

Building Java LLM Applications with LangChain4j: A Hands‑On Guide

This tutorial walks through the fundamentals of large language models, prompt engineering, and word embeddings, then shows how to set up a LangChain‑based LLM stack in Java using LangChain4j, covering core modules, memory, retrieval, chains, agents, and complete code examples.

AI agents · LLM · LangChain
0 likes · 15 min read
AI Large Model Application Practice
Nov 24, 2025 · Artificial Intelligence

How to Turn Text into an AI‑Powered PPT Video: A Step‑by‑Step Guide

This article breaks down the end‑to‑end engineering pipeline that converts a knowledge source such as a URL or PDF into a narrated PPT‑style video, detailing six core stages—from knowledge extraction and script generation to image creation, voice synthesis, and final video stitching—while highlighting practical model choices, prompt design, and stability tricks.

Artificial Intelligence · LLM · Multimodal
0 likes · 16 min read
AI Tech Publishing
Nov 23, 2025 · Artificial Intelligence

How Agents Leverage File Systems for Context Engineering

The article examines why file system access is crucial for autonomous agents, outlining common context‑engineering failures such as missing, excessive, or irrelevant information, and demonstrates how using file‑system tools like ls, grep, and write‑file can reduce token waste, enable dynamic storage, improve targeted search, and support continual learning.

Autonomous Agents · Context Engineering · LLM
0 likes · 11 min read
Wu Shixiong's Large Model Academy
Nov 21, 2025 · Artificial Intelligence

How to Build a Multi‑Layer Cache for Dynamic RAG Systems

This article explains why dynamic Retrieval‑Augmented Generation (RAG) requires a layered caching strategy rather than simple result caching, details a four‑level cache architecture—including embedding, search, answer, and pipeline caches—provides practical key‑generation and TTL guidelines, and outlines dirty‑data defenses to keep caches consistent and performant.

AI Engineering · LLM · RAG
0 likes · 10 min read
Youzan Coder
Nov 21, 2025 · Artificial Intelligence

How to Build, Evaluate, and Optimize AI Test Agents: A Practical Guide

This guide walks you through creating AI‑powered test agents, defining success metrics, building evaluation datasets, crafting and refining system prompts with techniques like chain‑of‑thought, XML, few‑shot and concise inputs, and scaling the workflow by splitting agents and managing prompt versions.

AI agents · LLM · Prompt engineering
0 likes · 21 min read
Qunhe Technology Quality Tech
Nov 20, 2025 · Artificial Intelligence

How to Build a Quantifiable Quality Assurance System for AI‑Native Products

This article explains the background of AI‑native products, uses VoxDeck as a case study to illustrate typical generation successes and failures, and proposes a systematic, metric‑driven quality‑assurance framework—including data sampling, multi‑dimensional anomaly detection, AI‑assisted checks, and continuous improvement—to boost efficiency, reliability, and business value of AI‑generated content.

AI-native · LLM · Prompt engineering
0 likes · 14 min read
Baobao Algorithm Notes
Nov 20, 2025 · Artificial Intelligence

Why Reinforcement Learning Preserves LLM Generality Better Than Supervised Fine‑Tuning

The article analyzes why reinforcement learning (RL) fine‑tuning retains a large language model's general abilities better than supervised fine‑tuning (SFT), explaining the off‑policy distribution shift of SFT and the on‑policy data consistency, KL penalty, and trust‑region mechanisms that give RL its anti‑forgetting properties.

Catastrophic Forgetting · LLM · On-Policy Data
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Nov 19, 2025 · Big Data

How We Migrated 100k BigQuery SQL Scripts to MaxCompute Using AST and LLM Automation

This article details a real‑world migration of a Southeast Asian tech group’s data warehouse from Google BigQuery to Alibaba Cloud MaxCompute, describing the challenges of converting 100,000 SQL scripts, the AST‑driven and LLM‑assisted automation pipeline, rule‑engine iteration, quality control, and the measurable performance and cost benefits achieved.

AST · BigQuery · LLM
0 likes · 12 min read
Baidu Maps Tech Team
Nov 19, 2025 · Artificial Intelligence

Boosting Socio‑Economic Q&A: The ARAG Framework Merges Structured Data Analysis with RAG

ARAG introduces a novel Retrieval‑Augmented Generation framework that tightly integrates LLM‑driven structured data analysis with unstructured information retrieval, addressing the “structured + unstructured” reasoning gap in socio‑economic queries, and demonstrates superior accuracy, robustness, and hallucination resistance through extensive evaluations.

LLM · RAG · Socio-economic AI
0 likes · 12 min read
Data STUDIO
Nov 19, 2025 · Artificial Intelligence

Why TOON Beats JSON for LLM Data Exchange: Token Savings and Accuracy Gains

The article explains how the Token‑Oriented Object Notation (TOON) format reduces token usage by 30‑60% and improves accuracy compared to JSON when feeding structured data to large language models, offering concrete syntax, benchmark results, code examples, and guidance on when to adopt it.

JSON alternative · LLM · Python
0 likes · 10 min read