Tagged articles

2016 articles

Page 12 of 21

Jul 22, 2025 · Artificial Intelligence

Convert Any PDF to Clean Markdown with a Local LLM (Gemma 3)

Learn how to transform any PDF—including scanned documents—into well‑structured Markdown using a local LLM (Gemma 3 via Ollama), Python, PyMuPDF and Pillow, without cloud APIs or API keys, by converting pages to images, prompting the model, and saving the output.

Gemma · LLM · Ollama

0 likes · 12 min read

DaTaobao Tech

Jul 18, 2025 · Artificial Intelligence

Build a Minimal Java ReAct Agent in 200 Lines: A Hands‑On Tutorial

This tutorial walks you through constructing a lightweight ReAct agent using Java, explaining the Thought‑Action‑Observation loop, providing a 200‑line code example, and demonstrating a real‑world approval workflow with prompts, tool definitions, and step‑by‑step interaction logs.

Agent · LLM · React

0 likes · 21 min read

Architect's Alchemy Furnace

Jul 17, 2025 · Artificial Intelligence

Explore the Ultimate Open-Source LLM Catalog: Models, Tools, and Resources

This article compiles a comprehensive, up‑to‑date inventory of open‑source large language models from Chinese and international organizations, detailing each model’s architecture, parameter count, multilingual capabilities, deployment requirements, and associated tools, offering a valuable reference for AI researchers and developers.

AI · LLM · large language model

0 likes · 50 min read

Tencent Advertising Technology

Jul 17, 2025 · Artificial Intelligence

LEADRE: Knowledge‑Enhanced LLMs Supercharge Display Ad Recommendations

The paper introduces LEADRE, a multi‑faceted knowledge‑enhanced large language model‑driven display advertisement recommender that tackles user interest modeling, knowledge alignment, and low‑latency deployment, achieving significant GMV gains in Tencent’s ad platforms through innovative prompt engineering, semantic alignment, and TensorRT‑accelerated inference.

Knowledge Alignment · LLM · TensorRT

0 likes · 16 min read

Tech Freedom Circle

Jul 17, 2025 · Artificial Intelligence

DeepSeek V3 Architecture Deep Dive: MoE, MLA, DualPipe, FP8 Mixed Precision & Multi‑Token Prediction

This article provides a detailed technical analysis of DeepSeek‑V3, covering its MoE architecture, the novel Multi‑head Latent Attention (MLA) mechanism, the DualPipe pipeline‑parallel algorithm, mixed‑precision FP8 training, and the Multi‑Token Prediction (MTP) inference improvements that together boost performance and efficiency.

DeepSeek · Distributed Training · DualPipe

0 likes · 44 min read

Alimama Tech

Jul 17, 2025 · Artificial Intelligence

How to Build a High‑Scoring AI Werewolf Agent: Strategies, Prompt Engineering, and Code

This article details the author's experience designing a top‑performing AI Werewolf agent for the Taotian Group's AI Werewolf Challenge, covering game rules, core challenges, prompt engineering, caching, concurrent requests, model selection, reinforcement‑learning‑style tuning, and tactical strategies for each role, with code examples.

AI Agent · LLM · Reinforcement Learning

0 likes · 25 min read

Alibaba Cloud Big Data AI Platform

Jul 16, 2025 · Artificial Intelligence

Master Post-Training: Fine-Tune LLMs with SFT, DPO, and GRPO on Alibaba PAI

This article explains post‑training concepts, compares SFT, DPO, and GRPO fine‑tuning methods, and provides step‑by‑step guidance for using Alibaba Cloud's PAI platform—including Model Gallery and DSW—to fine‑tune large language models with code examples and practical tips.

DPO · Fine-tuning · GRPO

0 likes · 14 min read

DataFunSummit

Jul 16, 2025 · Artificial Intelligence

How Tencent Cloud ES Powers RAG with Hybrid Search and Massive Vector Optimizations

This article explores how Tencent Cloud Elasticsearch combines decades of text search expertise with cutting‑edge vector retrieval and large language models to deliver a one‑stop Retrieval‑Augmented Generation solution, detailing the underlying models, hybrid search architecture, performance tricks, and real‑world case studies.

Elasticsearch · Hybrid Search · LLM

0 likes · 24 min read

Volcano Engine Developer Services

Jul 16, 2025 · Information Security

Securing the Model Context Protocol (MCP): Volcano Engine’s End‑to‑End Approach

This article explains how Volcano Engine safeguards the Model Context Protocol (MCP) throughout its lifecycle, detailing MCP fundamentals, core components, a step‑by‑step interaction example, seven major security risks, official design principles, and a comprehensive security architecture covering admission control, native design, and runtime protection.

LLM · MCP · Model Context Protocol

0 likes · 21 min read

Instant Consumer Technology Team

Jul 16, 2025 · Artificial Intelligence

How to Build a Text‑to‑Video Workflow in Dify Using LLMs

This guide walks you through creating a Dify workflow that turns user prompts into videos by chaining LLM‑generated descriptions with a Text‑to‑Video model, covering workflow types, system variables, model setup, node configuration, plugin installation, and final testing steps.

AI · Dify · LLM

0 likes · 14 min read

DaTaobao Tech

Jul 16, 2025 · Artificial Intelligence

From GPT‑4 to Agentic AI: How LLM Architecture Evolved (2023‑2025)

Since GPT‑4’s 2023 debut, large language models have shifted from sheer scale to efficiency‑driven designs, advanced reasoning with chain‑of‑thought, and agentic tool use, as illustrated by MoE, MLA, and new attention mechanisms, reshaping benchmarks, commercial strategies, and the future of AI.

Agentic AI · LLM · Model Scaling

0 likes · 24 min read

AntTech

Jul 16, 2025 · Artificial Intelligence

Can AI Auditors Match Human Experts? Inside RepoAudit’s LLM‑Powered Code Review

The EXPRESS Workshop at ISSTA 2025, hosted by Ant Group, featured a keynote by Purdue’s Prof. Zhang on an LLM‑driven “Human‑like AI Auditor” called RepoAudit, which demonstrated high‑accuracy automated code review, uncovering dozens of real bugs and hundreds of zero‑day vulnerabilities across major open‑source projects.

AI · LLM · RepoAudit

0 likes · 6 min read

IT Services Circle

Jul 16, 2025 · Artificial Intelligence

How a Simple Colon Can Trick Top LLMs – The Master‑RM Fix

A recent study reveals that tiny symbols like colons or generic reasoning prefixes can cause large language models used as reward judges to issue false‑positive rewards, but an enhanced reward model called Master‑RM, trained with adversarial data, eliminates this vulnerability across multiple LLMs and languages.

AI Safety · LLM · Master-RM

0 likes · 10 min read

Architects' Tech Alliance

Jul 15, 2025 · Artificial Intelligence

Why High‑Bandwidth Memory (HBM) Is Critical for Modern AI and How It Works

This article explains what high‑bandwidth memory (HBM) is, outlines its brief history, compares it with DDR, LPDDR and GDDR, describes why large language models and generative AI drive its demand, and reviews its architecture, PCB requirements, market status, and future outlook.

AI hardware · HBM · LLM

0 likes · 3 min read

Alibaba Cloud Developer

Jul 15, 2025 · Information Security

Boost Web Vulnerability Scanning with LLM‑Powered MCP Server Automation

This article explores how large language models can be integrated with MCP Server and Burp Suite to automate web application vulnerability detection, detailing environment setup, workflow steps, code snippets, challenges such as token limits and payload formatting, and the advantages and limitations of the approach.

Automated Vulnerability Scanning · Burp Suite · Kotlin

0 likes · 12 min read

Tencent Cloud Developer

Jul 15, 2025 · Artificial Intelligence

How RAG Evolved: From Naive to Agentic – A Complete Guide

This article systematically outlines the evolution of Retrieval‑Augmented Generation (RAG) from its naive three‑step pipeline to advanced, modular, and agentic architectures, highlighting each generation's motivations, core features, advantages, drawbacks, and practical implementation details for large language model applications.

Agentic RAG · Artificial Intelligence · LLM

0 likes · 20 min read

Fun with Large Models

Jul 15, 2025 · Artificial Intelligence

Getting Started with LangChain & LangGraph: Core Concepts of AI Agents

This article introduces AI Agents and explains why LangChain is the leading framework, detailing its core concepts, three‑layer architecture, key features, comparison with other agent frameworks, and showcasing popular projects built with LangChain and LangGraph.

AI Agent · LLM · LangChain

0 likes · 10 min read

Tencent Technical Engineering

Jul 14, 2025 · Artificial Intelligence

Demystifying AIGC, Agents, and MCP: Core Concepts and How They Interact

This article provides a concise overview of the latest AI concepts—including AIGC, Retrieval‑Augmented Generation, Function‑Calling models, intelligent agents, and the Model Context Protocol—explaining their principles, differences, and how they can be combined to build more powerful AI applications for developers outside the AI field.

AIGC · Agent · Function Calling

0 likes · 15 min read

Architect's Alchemy Furnace

Jul 12, 2025 · Artificial Intelligence

Why GraphRAG Is the Future of Retrieval‑Augmented Generation

This article explains how GraphRAG combines knowledge graphs with retrieval‑augmented generation to overcome the limitations of vector‑only RAG, delivering higher accuracy, better explainability, easier development, and stronger governance for generative AI applications across various domains.

AI · GraphRAG · LLM

0 likes · 23 min read

AI Frontier Lectures

Jul 11, 2025 · Artificial Intelligence

Can LLMs ‘Squint’ to Recognize Hidden Faces? A Comparative Test

The article evaluates several large language models—including ChatGPT, Gemini, Grok, Qwen, and o3‑Pro—on a visual illusion that requires squinting to identify the Mona Lisa, revealing varied success rates, reasoning differences, and insights into model capabilities and limitations.

LLM · model comparison · prompt engineering

0 likes · 6 min read

Instant Consumer Technology Team

Jul 11, 2025 · Artificial Intelligence

Boost LLM Performance with Prompt‑Optimizer: Open‑Source Prompt Tuning Made Easy

Prompt‑Optimizer is an open‑source tool that uses AI models to automatically refine and compare prompts, offering multi‑model support, security features, and cross‑platform access, while providing step‑by‑step Docker deployment instructions for developers and prompt engineers.

AI tools · Docker · LLM

0 likes · 7 min read

Qborfy AI

Jul 11, 2025 · Artificial Intelligence

Building a Dynamic Agent Workflow with LangGraph: A Step‑by‑Step Guide

This tutorial walks through creating a full‑featured LLM Agent workflow using LangGraph, covering goal definition, task decomposition, execution nodes, state updates, re‑planning logic, and user feedback, while comparing ReAct and Reflexion approaches and providing complete Python code examples.

LLM · LangChain · LangGraph

0 likes · 11 min read

Tech Freedom Circle

Jul 11, 2025 · Artificial Intelligence

The Three Core Protocols of AI Agents 2.0: MCP, A2A, and AG‑UI

This article explains the three foundational protocols—MCP for tool access, A2A for inter‑agent communication, and AG‑UI for Agent‑UI interaction—detailing their origins, technical roles, example implementations, and how they together form the communication backbone of modern AI applications.

A2A · AG-UI · AI Agent

0 likes · 18 min read

Fun with Large Models

Jul 10, 2025 · Artificial Intelligence

Grok 4: The ‘Problem‑Solving Champion’ That Falters in Real‑World Use – Detailed Evaluation

The article reviews Grok 4’s flashy launch and claimed first‑principles advantage, then presents benchmark results—showing strong reasoning, multimodal and agent scores but disappointing coding performance versus DeepSeek‑R1—concluding that the model’s real‑world capabilities fall short of its hype.

Agent · Grok4 · LLM

0 likes · 11 min read

Instant Consumer Technology Team

Jul 10, 2025 · Artificial Intelligence

How LLMs and Vector Search Power Real-Time Icon Recommendations

This article explains a system that combines large language models with multimodal vector retrieval to automatically understand user intent and instantly recommend the most relevant icons, detailing the workflow, semantic vectorization, offline indexing, online inference, and evaluation methods.

CLIP · HNSW · LLM

0 likes · 13 min read

Tencent Cloud Developer

Jul 10, 2025 · Artificial Intelligence

Demystifying AIGC, Agents, and MCP: Essential AI Concepts for Developers

This article provides a concise, developer‑focused overview of emerging AI concepts—including AIGC, multimodal models, Retrieval‑Augmented Generation, intelligent agents, Function‑Calling, and the Model Context Protocol (MCP)—explaining their core principles, differences, and how they interrelate to enable advanced AI applications.

AI · AIGC · Agent

0 likes · 16 min read

Instant Consumer Technology Team

Jul 9, 2025 · Artificial Intelligence

How Easy Dataset Automates High‑Quality LLM Fine‑Tuning Data from Unstructured Docs

The article introduces Easy Dataset, a GUI‑driven framework that transforms heterogeneous documents into high‑quality, persona‑driven fine‑tuning data for large language models, details its architecture, core contributions, experimental validation on financial QA, and compares it with existing data‑synthesis tools.

Artificial Intelligence · Fine-tuning · GUI

0 likes · 12 min read

Alimama Tech

Jul 9, 2025 · Artificial Intelligence

How to Make LLMs Recognize and Resolve Their Own Uncertainty

This article introduces ConfuseBench, a benchmark that classifies LLM uncertainty into document‑missing, ability‑limited, and ambiguous types, and presents methods—including retrieval, chain‑of‑thought, and clarification—to detect and actively resolve uncertainty, improving answer quality across diverse tasks.

Chain-of-Thought · Clarification · Inquiry

0 likes · 17 min read

AntTech

Jul 9, 2025 · Artificial Intelligence

How KAG-Thinker Boosts Structured Reasoning in Large Language Models

The KAG-Thinker model, a collaborative effort by Ant Group, Zhejiang University, and Tongji University, introduces a hierarchical "breadth splitting + depth solving" framework that enhances logical stability, knowledge utilization, and retrieval robustness for complex multi‑hop reasoning tasks across general and specialized domains.

AI · KAG-Thinker · Knowledge Retrieval

0 likes · 10 min read

High Availability Architecture

Jul 9, 2025 · Artificial Intelligence

How LLMs Evolved from GPT‑4 to Agentic AI: Trends, Techniques, and Future Directions

This article analyzes the rapid evolution of large language models from the GPT‑4 era through efficiency‑focused sparsity and attention innovations, to inference‑time reasoning and tool‑using agents, highlighting key architectures, benchmark breakthroughs, competitive strategies, and emerging research directions toward embodied AI.

Agentic AI · LLM · Transformer

0 likes · 24 min read

Alibaba Cloud Big Data AI Platform

Jul 8, 2025 · Artificial Intelligence

How Video Retrieval‑Augmented Generation Transforms Multimodal AI Search

This article explains the end‑to‑end implementation of Video RAG in OpenSearch LLM, covering offline parsing, key‑frame extraction, audio transcription, slice creation, multimodal vectorization, hybrid indexing, and online query processing while addressing challenges like recall performance and long‑video efficiency.

ASR · Key Frame Extraction · LLM

0 likes · 10 min read

Alibaba Cloud Developer

Jul 8, 2025 · Artificial Intelligence

From GPT‑4 to Thinking Models: How LLM Architecture Evolved After 2023

This article traces the evolution of large language models from the GPT‑4 era through 2024‑2025, highlighting the shift from pure scaling to efficiency‑focused architectures, the rise of reasoning‑centric "thinking" models, and the emergence of agentic capabilities that enable tools and real‑world interaction.

LLM · Transformer · agents

0 likes · 27 min read

Instant Consumer Technology Team

Jul 4, 2025 · Artificial Intelligence

How AI Agents Boost Development: Inside the ReAct Framework & Prompt Engineering

This article explains how AI agents, using the ReAct framework, enable a human‑machine pair‑programming workflow, details the reasoning‑acting‑observation loop, showcases practical Python examples with smolagents and DeepSeek, and provides prompt‑engineering guidelines for effective tool‑calling.

AI Agent · LLM · Python

0 likes · 19 min read

DaTaobao Tech

Jul 4, 2025 · Artificial Intelligence

How Taobao Live’s AI Digital Humans Transform E‑Commerce: Architecture, Algorithms, and Engineering Insights

This article details the end‑to‑end design of Taobao Live's AI digital human system, covering six core components such as LLM‑driven content creation, interactive dialogue, TTS voice synthesis, visual synchronization, audio‑video engineering, and a scalable backend, while also discussing product evolution, automation challenges, and future roadmap.

AI · Digital Human · LLM

0 likes · 19 min read

macrozheng

Jul 4, 2025 · Artificial Intelligence

Build Java LLM Applications with LangChain4j: A Hands‑On Guide

This tutorial walks through the fundamentals of large language models, prompt engineering, and word embeddings, then shows how to use the LangChain framework (including its Java implementation LangChain4j) to build AI‑driven applications with memory management, retrieval, and chaining, with practical code examples.

AI · Embedding · LLM

0 likes · 17 min read

Alipay Experience Technology

Jul 3, 2025 · Artificial Intelligence

How MCP Transforms Agent Development: From Complex Tools to Plug‑and‑Play

This talk explains the Model Context Protocol (MCP), how it simplifies agent tool integration by replacing numerous custom interfaces with a single standardized protocol, and details its adoption, architecture, security, and future directions within Ant Group's ecosystem.

AI · Agent · Infrastructure

0 likes · 21 min read

DataFunTalk

Jul 3, 2025 · Artificial Intelligence

How Vivo’s Blue Heart XiaoV Leverages LLMs to Transform Conversational Recommendations

In an interview with Vivo AI engineer Liang Tianan, the article explores the challenges of post‑Q&A recommendation, the integration of large language models into recall, ranking and evaluation pipelines, and the engineering trade‑offs required to deliver high‑quality, diverse suggestions on mobile devices.

LLM · Mobile AI · Multimodal

0 likes · 15 min read

DataFunSummit

Jul 3, 2025 · Artificial Intelligence

Boosting LLM Function Call Capabilities: From Data Construction to RLHF Optimization

On July 12, 2025, the DataFun Summit will feature a technical session where China Telecom AI Research Institute engineer Yao Yitong presents a deep dive into enhancing large language model Function Call abilities through systematic data and training optimizations, offering practical insights for AI practitioners.

AI · LLM · RLHF

0 likes · 4 min read

DaTaobao Tech

Jul 2, 2025 · Artificial Intelligence

How AI Powers 24/7 Digital Human Live Streams: Architecture, Challenges, and Innovations

This article presents a comprehensive overview of the AI‑driven digital‑human live‑streaming solution used by Taobao, detailing six core components—including LLM‑based content generation and interaction, TTS, visual driving, audio‑video engineering, and backend services—while sharing architectural diagrams, cost‑reduction strategies, productization insights, and future directions.

AI · Digital Human · LLM

0 likes · 8 min read

Cognitive Technology Team

Jul 1, 2025 · Artificial Intelligence

How We Built a Live‑Streaming TTS Engine: From Data Pipelines to AI Voice Generation

This article presents a comprehensive practice summary of building an intelligent digital‑human system, covering six core modules—LLM content generation, LLM interaction, TTS synthesis, visual driving, audio‑video engineering, and backend services—while detailing data collection, signal processing, ASR annotation, speaker clustering, model optimization (V1‑V4), evaluation metrics, and future research directions.

AI voice · Audio Processing · Digital Human

0 likes · 23 min read

Go Programming World

Jul 1, 2025 · Artificial Intelligence

What Is the Model Context Protocol (MCP) and How It’s Shaping AI Development

Model Context Protocol (MCP), an open-source standard from Anthropic, standardizes how large language models interact with external tools and data sources, introducing a client‑server architecture with hosts, clients, and servers, and promises to simplify AI application development compared to traditional function‑calling approaches.

AI · LLM · MCP

0 likes · 5 min read

JavaEdge

Jun 30, 2025 · Artificial Intelligence

How GPULlama3.java Brings GPU‑Accelerated Llama 3 to Pure Java

GPULlama3.java, released by the University of Manchester's Beehive Lab, is the first native Java implementation of Llama 3 that leverages TornadoVM to automatically accelerate inference on GPUs without writing CUDA or native code, supporting NVIDIA, Intel and Apple Silicon back‑ends and modern Java 21 features.

AI · GPU Acceleration · LLM

0 likes · 7 min read

Alibaba Cloud Big Data AI Platform

Jun 30, 2025 · Artificial Intelligence

Unlocking Small LLM Power: Variable‑Length Chain Distillation with DistillQwen‑ThoughtY

This article introduces a variable‑length chain‑of‑thought distillation technique built on Alibaba Cloud PAI’s EasyDistill toolkit, presents the high‑quality OmniThought‑0528 dataset, details the training of the DistillQwen‑ThoughtY 4B/8B/32B models, and provides code and usage examples for researchers and practitioners.

Chain-of-Thought · Dataset · Distillation

0 likes · 15 min read

DaTaobao Tech

Jun 30, 2025 · Artificial Intelligence

One‑Click AI Digital Human for Live Commerce: LLM, Lip Sync & Real‑Time Tech

This article outlines the end‑to‑end architecture and practical solutions behind creating intelligent digital humans for live commerce, covering LLM‑driven content generation, real‑time lip‑sync, image‑driven avatar creation, automated material review, lightweight model training, and a roadmap toward fully automated, high‑performance virtual presenters.

AI · Digital Human · LLM

0 likes · 19 min read

Qborfy AI

Jun 28, 2025 · Artificial Intelligence

Mastering LangGraph: Build Stateful, Looping LLM Agents with Python

This tutorial walks through the limitations of linear LangChain workflows, introduces LangGraph’s state‑node‑edge architecture, and provides step‑by‑step code examples—including a Hello‑World tool, conditional branching, multi‑turn conversation handling, and graph visualization—so readers can construct robust, persistent LLM agents.

Agent · LLM · LangChain

0 likes · 9 min read

MaGe Linux Operations

Jun 28, 2025 · Artificial Intelligence

Master Dify: From Local Deployment to Advanced AI Workflows in 2025

This guide walks you through installing and configuring Dify, an open‑source LLM application platform, on your local machine using Docker, integrating it with Ollama for custom models, and exploring its core features such as chat assistants, agents, workflows, and tool extensions, all illustrated with step‑by‑step screenshots and code snippets.

AI workflow · Dify · Docker

0 likes · 12 min read

Fighter's World

Jun 28, 2025 · Artificial Intelligence

What Is the Generator‑Verifier Gap and Why It Matters for LLM Reasoning

The article explains the Generator‑Verifier Gap (GVG)—the asymmetry where verifying a solution is far cheaper than generating it—covers its origin, its impact on test‑time scaling for large language models, reinforcement‑learning approaches, and how the concept can shape agent architectures and AI product strategy.

Agent Architecture · Generator-Verifier Gap · LLM

0 likes · 21 min read

AI Algorithm Path

Jun 28, 2025 · Artificial Intelligence

Implementing Greedy and Beam Decoding for Large Language Models from Scratch

This article walks through the mechanics of greedy search and beam search in large language models, demonstrates both methods with GPT‑2 on the prompt "I have a dream", visualizes the decoding trees, compares their scores, and discusses the trade‑offs between efficiency and output quality.

Beam Search · GPT-2 · Greedy Search

0 likes · 16 min read

JavaEdge

Jun 27, 2025 · Artificial Intelligence

Why Inference Engines Are Essential for Deploying Large Language Models in Production

The article explains what inference engines are, why they are needed beyond raw Python scripts, and outlines best practices such as model quantization, batching, and parallelism, while comparing popular open‑source and commercial options for production AI workloads.

AI deployment · Batching · Inference Engine

0 likes · 14 min read

360 Zhihui Cloud Developer

Jun 27, 2025 · Operations

How AI‑Powered Ops‑Nexus Transforms Intelligent Operations for 100k+ Servers

This article details the design, technology choices, functional modules, core implementation, performance optimizations, and future roadmap of Ops‑Nexus, an AI‑driven intelligent operations platform that streamlines alarm analysis, log processing, and host health checks for large‑scale monitoring environments.

AI Ops · Intelligent Operations · LLM

0 likes · 12 min read

Fun with Large Models

Jun 27, 2025 · Artificial Intelligence

Boost Answer Accuracy: Detailed GraphRAG Retrieval Steps with Knowledge Graphs

This article walks through GraphRAG’s retrieval phase, showing how knowledge‑graph entities, relationships, and community reports are assembled into a query context, comparing local and global modes with traditional RAG, and illustrating the process with a concrete “Age of Big Data” example.

GraphRAG · LLM · global mode

0 likes · 14 min read

AI Algorithm Path

Jun 26, 2025 · Artificial Intelligence

The 10 Essential Components of a Retrieval‑Augmented Generation (RAG) System

This guide breaks down the ten core building blocks of a production‑ready RAG pipeline—from input handling and vector stores to prompt engineering, LLM inference, observability, and evaluation—showing why each piece matters, common pitfalls, and practical best‑practice recommendations.

LLM · RAG · Retrieval Augmented Generation

0 likes · 9 min read

Java Architecture Diary

Jun 25, 2025 · Artificial Intelligence

Build a Text‑to‑SQL Chatbot with Spring AI and DeepSeek LLM

This tutorial walks through creating a natural‑language‑to‑SQL chatbot using Spring AI, configuring a MySQL school database with Flyway, defining system prompts for a DeepSeek LLM, implementing service beans and a REST API, and interacting with the bot via curl commands.

Chatbot · DeepSeek · LLM

0 likes · 15 min read

Continuous Delivery 2.0

Jun 25, 2025 · Artificial Intelligence

How Model Context Protocol Turns LLMs into Plug‑and‑Play AI Assistants

The Model Context Protocol (MCP) is an open, standardized adapter that lets large language models seamlessly connect to tools, data sources, and workflows, offering plug‑and‑play intelligence, cross‑platform compatibility, security, and modular extensibility for building real‑world AI applications.

AI integration · LLM · MCP

0 likes · 11 min read

AntTech

Jun 23, 2025 · Artificial Intelligence

Can AI Auditors Ensure Reliable Software? Highlights from EXPRESS 2025 at ISSTA

The EXPRESS 2025 workshop at ISSTA in Norway will showcase AI‑driven code auditing, present cutting‑edge research on trustworthy software systems, and invite researchers and practitioners to discuss transparency, reliability, and security challenges in modern software engineering.

AI auditing · ISSTA 2025 · LLM

0 likes · 5 min read

Alibaba Cloud Native

Jun 23, 2025 · Artificial Intelligence

From If/Else to Goal‑Oriented Agents: How LLMs Are Shaping Software 3.0

The article reflects on Andrej Karpathy’s AI Startup School talk, outlining the evolution from traditional if‑else programming (Software 1.0) through data‑driven models (Software 2.0) to goal‑oriented natural‑language agents (Software 3.0), and examines LLMs as operating‑system‑like infrastructure, prompting, and engineering challenges.

LLM · software evolution

0 likes · 5 min read

DaTaobao Tech

Jun 23, 2025 · Artificial Intelligence

How We Built a High‑Accuracy AI‑Powered Digital Human Script Engine for Live Commerce

This article details the end‑to‑end AI pipeline for creating intelligent digital humans in live streaming, covering LLM‑driven script generation, multimodal data integration, error‑prone number handling, DPO fine‑tuning, experimental results, and future directions for more human‑like presentations.

AI · Digital Human · LLM

0 likes · 35 min read

Architecture & Thinking

Jun 23, 2025 · Artificial Intelligence

Building AI Assistants with Eino: A Go Framework for Large‑Model Applications

This article introduces Eino, an open‑source Golang framework for large‑model AI applications, explains its core capabilities, walks through creating a simple AI assistant with message templates and chat model integration, and demonstrates how to extend the system with tools and a modular architecture for future expansion.

AI Assistant · Eino · Framework

0 likes · 17 min read

DataFunSummit

Jun 22, 2025 · Artificial Intelligence

How Vivo’s BlueHeart AI Assistant Optimizes Post‑Conversation Recommendations with LLMs

In a detailed interview, Vivo AI engineer Liang Tianan explains how the BlueHeart Small V assistant leverages large language models, multi‑stage recall, ranking, and reward‑model fine‑tuning (SFT/DPO) to generate high‑quality, diverse post‑dialogue recommendation items while balancing latency, cost, and evaluation challenges.

DPO · LLM · SFT

0 likes · 15 min read

Tech Freedom Circle

Jun 21, 2025 · Artificial Intelligence

How MCP + LLM + Agent Architecture Becomes the AI Agent’s Neural Hub and New Infrastructure

The article explains the Model Context Protocol (MCP) as a zero‑code bridge that lets large language models seamlessly access databases, call external APIs, and execute code, detailing its benefits for developers and everyday users, its core components, step‑by‑step workflow, real‑world examples, and how it outperforms traditional APIs in modern AI agent systems.

AI Agent · LLM · MCP

0 likes · 37 min read

Spring Full-Stack Practical Cases

Jun 21, 2025 · Artificial Intelligence

Master AI Agent Workflows with Spring Boot 3: From Chains to Orchestrators

This article introduces the fundamentals of augmented large language model agents, explains six workflow patterns—including chain, parallel, routing, orchestrator‑workers, evaluator‑optimizer, and autonomous agents—and provides complete Spring Boot 3 code examples, configuration, and test results for each pattern.

Backend · LLM · Spring Boot

0 likes · 15 min read

Fighter's World

Jun 21, 2025 · Artificial Intelligence

Speculating Devin’s Context Engineering Architecture: How Long‑Horizon Agents Preserve Complete Context

The article analyzes why context engineering is crucial for multi‑agent AI systems, illustrates the fragility caused by fragmented context with a Flappy Bird analogy, and proposes three detailed speculative components—a compression‑to‑structure pipeline, a hybrid layered memory architecture, and a context‑aware coordination mechanism—culminating in a unified reference design for long‑horizon agents.

Agent Coordination · Compression Pipeline · Context Engineering

0 likes · 22 min read

Alibaba Cloud Developer

Jun 20, 2025 · Artificial Intelligence

How to Build High‑Availability AI Agents: Challenges, Strategies, and Real‑World Insights

This article explores the evolving concept of AI agents, debates their definitions, outlines four major deployment challenges—including prompt instability, planning balance, domain knowledge integration, and response speed—and presents practical strategies such as prompt engineering, workflow design, multi‑agent architectures, and model optimization to build reliable, high‑availability agents.

AI Agent · LLM · Multi-Agent

0 likes · 32 min read

dbaplus Community

Jun 19, 2025 · Artificial Intelligence

How Constrained Decoding Guarantees 100% Correct SQL from Large Language Models

This article explains how constrained decoding, built on context‑free grammars, Jinja templates, and the XGrammar engine, can enforce strict SQL syntax and custom business rules during LLM generation, enabling reliable, production‑grade NL‑to‑SQL services.

CFG · Jinja · LLM

0 likes · 37 min read

Alibaba Cloud Developer

Jun 19, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Empowers LLMs?

The article introduces Model Context Protocol (MCP), explains its architecture of Host, Client, and Server, describes its components—Resources, Tools, Prompts—and demonstrates practical integration with IDE plugins to extend LLM capabilities such as real‑time ticket queries, highlighting its significance for AI development.

AI integration · AI tooling · Function Calling

0 likes · 11 min read

Sohu Tech Products

Jun 18, 2025 · Backend Development

How LLMs Transform Traffic Replay Testing for Backend Services

This article walks through the challenges of traditional traffic replay, explains the design and benefits of a conventional replay system, and then details how integrating large language models can automate data preparation, script generation, and validation to make backend testing more accurate, scalable, and efficient.

Backend testing · LLM · service reliability

0 likes · 18 min read

DataFunTalk

Jun 18, 2025 · Artificial Intelligence

Can LLMs Really Beat Human Olympiad Programmers? Insights from LiveCodeBench Pro

This article examines the LiveCodeBench Pro benchmark, revealing that while large language models achieve impressive scores on knowledge‑ and logic‑heavy coding problems, they still fall short of human experts on high‑difficulty, observation‑intensive tasks, especially without external tool support.

AI Evaluation · LLM · algorithmic reasoning

0 likes · 11 min read

AIWalker

Jun 18, 2025 · Artificial Intelligence

Six New Directions for Large Language Models

Large language models are booming, and this article highlights six cutting‑edge research directions—LLM‑plus synthetic data, reward modeling, inference techniques, LLM‑as‑a‑Judge, safety alignment, and long‑context handling—each illustrated with recent papers, experimental results, and links to code repositories.

Inference · LLM · Reward Modeling

0 likes · 9 min read

Aikesheng Open Source Community

Jun 17, 2025 · Artificial Intelligence

Introducing SCALE: An Open‑Source Benchmark Redefining LLM SQL Capabilities

This article presents SCALE, a community‑driven, open‑source benchmark that expands beyond simple Text‑to‑SQL accuracy to evaluate large language models on performance, dialect conversion, and deep SQL understanding, offering developers, researchers, and CTOs a realistic measure of AI‑assisted database tasks.

AI · LLM · benchmark

0 likes · 10 min read

Tencent Technical Engineering

Jun 16, 2025 · Artificial Intelligence

Mastering RAG and AI Agents: Practical Tips, Code Samples, and Evaluation Strategies

This comprehensive guide walks you through the fundamentals of Retrieval‑Augmented Generation (RAG) and AI agents, explains their inner workings, shares optimization tricks, provides ready‑to‑run code snippets, and demonstrates how to evaluate performance with metrics such as recall, faithfulness, and answer relevance.

AI agents · LLM · RAG

0 likes · 36 min read

AsiaInfo Technology: New Tech Exploration

Jun 16, 2025 · Artificial Intelligence

How LangGraph Implements Shared Memory for Multi‑Agent Systems: Techniques, Tools, and Future Directions

This article examines the theory and practice of shared memory in multi‑agent systems, tracing its evolution from classic blackboard models to modern solutions like Mem0.ai, Open Memory, and A‑MEM, and provides concrete design patterns, integration strategies, and future research directions for LangGraph users.

AI memory · Distributed Systems · LLM

0 likes · 37 min read

Network Intelligence Research Center (NIRC)

Jun 15, 2025 · Cloud Native

How MicroOps Enables Easy Deployment and Management of Virtual Networks on Kubernetes

The article details MicroOps' virtual network feature on Kubernetes, covering manual and intent‑driven deployment, topology visualization and editing, node types, monitoring with Prometheus and Fluentd, chaos injection via ChaosMesh and VN_Chaos, and upcoming alarm and self‑healing modules.

Fluentd · Kubernetes · LLM

0 likes · 6 min read

ITPUB

Jun 15, 2025 · Artificial Intelligence

How to Build a High‑Performance Enterprise RAG System with Model Context Protocol (MCP)

This article presents a step‑by‑step guide for constructing a scalable enterprise Retrieval‑Augmented Generation (RAG) solution using the Model Context Protocol (MCP), covering architecture comparison, system design, Milvus‑backed knowledge store, Python client implementation, deployment scripts, code examples, and best‑practice recommendations.

KnowledgeBase · LLM · MCP

0 likes · 22 min read

Fighter's World

Jun 14, 2025 · Artificial Intelligence

How Can LLMs Learn to “Think” in Complex Industry Scenarios?

The article analyzes how large language models can acquire true reasoning abilities for hard‑to‑score industry tasks by combining Chain‑of‑Thought prompting with reinforcement learning, addressing vague reward signals, reward hacking, and loyalty, and proposing a toolbox of reward engineering, synthetic data, hierarchical RL and multi‑agent collaboration.

Chain-of-Thought · LLM · Reinforcement Learning

0 likes · 22 min read

Alibaba Cloud Big Data AI Platform

Jun 13, 2025 · Artificial Intelligence

How EasyDistill Cuts LLM Costs: Mastering DistilQwen-ThoughtX on Alibaba Cloud

EasyDistill, an open-source framework from Alibaba Cloud PAI, streamlines knowledge distillation for large language models, introducing the DistilQwen-ThoughtX series with variable-length chain-of-thought reasoning, and provides comprehensive best-practice guidance for training, fine-tuning, evaluation, compression, and deployment via the PAI-ModelGallery.

AI inference · LLM · knowledge distillation

0 likes · 12 min read

Instant Consumer Technology Team

Jun 12, 2025 · Artificial Intelligence

How to Build a Production-Ready RAG System with Qwen3 Embedding and Reranker Models

This guide walks through using Alibaba's new Qwen3-Embedding and Qwen3-Reranker models to build a two‑stage Retrieval‑Augmented Generation pipeline with Milvus, covering environment setup, data ingestion, vector indexing, reranking, and LLM‑driven answer generation, demonstrating production‑grade performance across multilingual queries.

Embedding · LLM · Milvus

0 likes · 19 min read

Alibaba Cloud Developer

Jun 11, 2025 · Artificial Intelligence

From Chat to Autonomous Agents: Architecture, ReAct, Prompt Engineering

This article chronicles the evolution from simple chat interactions to sophisticated autonomous agents, detailing stages of LLM development, ReAct reasoning, memory management, tool integration, and practical implementation using the browser-use project, while offering prompt design insights and future directions for AI agents.

AI Agent · LLM · MCP

0 likes · 30 min read

Architecture & Thinking

Jun 11, 2025 · Artificial Intelligence

Accelerate LLM App Development with Eino: A Go Framework Walkthrough

Eino is an open‑source Golang framework for building large‑model applications, offering reusable components, robust orchestration, clean APIs, best‑practice templates, and full‑cycle DevOps tools, with code examples for both Ollama and OpenAI modes, plus streaming and normal output options.

AI Development · Framework · Go

0 likes · 10 min read

Instant Consumer Technology Team

Jun 10, 2025 · Artificial Intelligence

Unlocking AI Agent Integration with Model Context Protocol (MCP): A Complete Guide

This article explains how the Model Context Protocol (MCP) standardizes AI agent communication with external tools, outlines its benefits, describes its core components, showcases open‑source implementations, and provides step‑by‑step Python examples for building MCP servers and clients.

Function Calling · LLM · MCP

0 likes · 22 min read

Alibaba Cloud Developer

Jun 10, 2025 · Artificial Intelligence

How AI Application Architectures Evolve: From Simple LLM Calls to Guardrails, Routing, and Agents

This article traces the evolution of AI application architectures—from the earliest minimal user‑LLM interaction to advanced designs featuring context enhancement, input/output guardrails, intent routing, model gateways, caching strategies, agent capabilities, monitoring, and inference performance optimizations—providing practical insights and references for developers.

AI Architecture · Agent · Inference Optimization

0 likes · 21 min read

DataFunSummit

Jun 8, 2025 · Artificial Intelligence

Mastering LLM Applications: Practical Agent Design and Implementation Strategies

This comprehensive guide explores the core implementation paths for large language model (LLM) applications, focusing on agent design, workflow orchestration, tool integration, memory management, multi‑agent architectures, and future trends, providing actionable methodologies and real‑world examples for practitioners.

AI Agent · Agent Design · LLM

0 likes · 25 min read

dbaplus Community

Jun 7, 2025 · Artificial Intelligence

How Large Language Models Are Transforming Data Warehousing: Real-World Experiments and Lessons

The article shares practical experiences using large language models such as Cursor and DeepSeek in data‑warehouse workflows, covering assisted coding, automated metric extraction, self‑service analysis, documentation generation, their benefits, limitations, and the broader impact on data engineering roles.

AI automation · Business Intelligence · LLM

0 likes · 9 min read

AI2ML AI to Machine Learning

Jun 6, 2025 · Artificial Intelligence

Tackling the Top Challenges of Retrieval‑Augmented Generation (RAG)

The article enumerates common pitfalls of Retrieval‑Augmented Generation—such as missing content, low‑rank document misses, context limits, format errors, incomplete answers, scalability bottlenecks, complex PDF extraction, data‑quality issues, domain adaptation gaps, hallucinations, and feedback‑loop deficiencies—and offers concrete mitigation strategies ranging from data cleaning and prompt design to hybrid search, hierarchical retrieval, document compression, and automated evaluation.

Data Quality · Hybrid Search · LLM

0 likes · 9 min read

Youzan Coder

Jun 6, 2025 · Artificial Intelligence

How AI Agents Turn Manual Data Retrieval into Fully Automated Insights

This article examines the challenges of manual data extraction in data‑driven enterprises, explains why large language models alone fall short, and details how the Cursor‑Agent framework automates end‑to‑end querying, knowledge‑base integration, and result validation to become a self‑sufficient "data master" for both technical and non‑technical users.

AI Agent · Cursor Agent · Data Automation

0 likes · 26 min read

DaTaobao Tech

Jun 6, 2025 · Artificial Intelligence

Redefining Business Core Assets in the LLM Era: Agent Evolution & Collaboration

This article examines how the rise of large language models reshapes core business assets, defines agents and tools, explores multi‑agent collaboration patterns, task allocation and conflict resolution mechanisms, and evaluates MCP and the engineering requirements for building scalable, flexible agent platforms.

Agent Architecture · LLM · MCP protocol

0 likes · 9 min read

JavaEdge

Jun 5, 2025 · Artificial Intelligence

How Amazon’s Strands Agents SDK Simplifies Building AI Agents

Amazon’s newly open‑sourced Strands Agents SDK lets developers create AI agents with minimal code by defining prompts, tools, and models, offering a lightweight, production‑ready framework that supports multiple model providers, observability, multi‑agent collaboration, and extensible tooling via dedicated packages.

AI agents · Amazon · LLM

0 likes · 7 min read

Didi Tech

Jun 5, 2025 · Artificial Intelligence

Unlocking Modern AI Application Architecture: From RAG to Agents and MCP

This article surveys the evolution of AI applications, explains large language model fundamentals, outlines architectural challenges, and introduces three core patterns—Retrieval‑Augmented Generation (RAG), autonomous Agents, and Model Context Protocol (MCP)—while providing practical LangChain code snippets and integration guidance.

AI · Agent · LLM

0 likes · 28 min read

AI Frontier Lectures

Jun 5, 2025 · Artificial Intelligence

Bridging Thought Leaps: How CoT‑Bridge Boosts LLM Reasoning Accuracy

This paper introduces the Thought Leap Bridge task and the CoT‑Bridge model, which detect and fill missing intermediate steps in chain‑of‑thought reasoning, dramatically improving large language model performance on mathematical and logical benchmarks and enhancing downstream distillation and reinforcement‑learning pipelines.

Chain-of-Thought · CoT-Bridge · LLM

0 likes · 8 min read

AI Algorithm Path

Jun 4, 2025 · Artificial Intelligence

Why LLMs Hallucinate and How to Mitigate the Problem

The article explains that hallucinations in large language models stem mainly from the supervised fine‑tuning stage, illustrates the issue with concrete examples, and presents mitigation techniques such as knowledge‑probing data generation and web‑search tool integration using special tokens.

LLM · Meta · OpenAssistant

0 likes · 12 min read

Architect's Alchemy Furnace

Jun 4, 2025 · Artificial Intelligence

What Is an AI Engineer? Roles, Skills, and the Future of LLM‑Powered Systems

This article examines the evolving role of the AI engineer, contrasting it with AI researchers, ML engineers, and software engineers, outlines essential skills such as prompt engineering, MLOps, and data integration, and predicts how AI engineering will become a pivotal, high‑demand discipline in the coming years.

AI Engineering · AI systems · Agentic RAG

0 likes · 17 min read

Baobao Algorithm Notes

Jun 4, 2025 · Artificial Intelligence

Do Recent LLM‑RL Papers Overstate Their Gains? A Critical Review

This article critically examines seven high‑profile reinforcement‑learning papers for large language models, exposing flawed baseline evaluations, unrealistic settings, and modest actual improvements despite bold claims of dramatic performance gains.

AI research · LLM · baseline evaluation

0 likes · 8 min read

Xiaohongshu Tech REDtech

Jun 4, 2025 · Artificial Intelligence

From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN

MarkerGen introduces a novel, plug‑and‑play framework that decomposes length‑controllable text generation into four sub‑abilities—identifying, counting, planning, and aligning—integrates external tokenizers and dynamic markers, and achieves significantly lower length errors and higher quality across diverse models, tasks, and languages.

LLM · Length-Controlled Generation · MarkerGen

0 likes · 14 min read

DaTaobao Tech

Jun 4, 2025 · Artificial Intelligence

Understanding Large Language Model Architecture, Parameters, Memory, Storage, and Fine‑Tuning Techniques

This article provides a comprehensive overview of large language models (LLMs), covering their transformer architecture, parameter counts, GPU memory and storage requirements, and detailed fine‑tuning methods such as prompt engineering, data construction, LoRA, PEFT, RLHF, and DPO, along with practical deployment and inference acceleration strategies.

DPO · Fine-tuning · LLM

0 likes · 17 min read

AI Frontier Lectures

Jun 3, 2025 · Artificial Intelligence

Master LLM Engineering: Model Conversion, Parallel Inference, and Channel‑Loss Techniques

This article outlines essential LLM engineering skills, including scripts for converting various model checkpoints to Llama format, customizing modeling files for advanced features, building a multi‑GPU inference class, and adding channel‑aware loss tracking to fine‑tuning pipelines.

Flash Attention · LLM · Training Optimization

0 likes · 6 min read

Alibaba Cloud Observability

Jun 3, 2025 · Artificial Intelligence

How to Build an MCP Server for AI-Powered Observability: 6 Practical Design Tips

Discover how to design and implement an MCP Server that integrates AI-driven observability, covering essential components, best practices, code examples, and real-world lessons learned to enable natural language interaction with monitoring data and streamline system analysis.

AI · Design · LLM

0 likes · 16 min read

ITFLY8 Architecture Home

Jun 2, 2025 · Artificial Intelligence

Choosing the Right LLM AI Agent Protocol: A Four‑Category Guide

This article provides a systematic overview of existing LLM AI Agent communication protocols, categorizing them into four major types, detailing their functions, benefits, and use‑cases, and compares four representative protocols—MCP, A2A, ANP, and Agora—through a concrete travel‑planning scenario.

AI Agent · Communication Protocol · LLM

0 likes · 11 min read

Fighter's World

Jun 2, 2025 · Artificial Intelligence

Why Is Context King for Large Language Models?

This article provides a comprehensive technical analysis of LLM context, covering its definition, types, tokenization, window‑size evolution, diminishing returns, management techniques such as RAG, CoT, memory‑as‑a‑service, and future challenges like multimodal fusion, privacy, and autonomous agent memory.

Agent Memory · LLM · Memory-as-a-Service

0 likes · 48 min read

Radish, Keep Going!

Jun 2, 2025 · Databases

Human‑Crafted XOR Trick Beats LLMs in Detecting Redis Vector Set Bugs

The author recounts fixing a complex Redis Vector Sets bug, explores how human creativity outperforms LLMs in devising efficient data‑consistency checks, and shares experimental ideas—including XOR accumulators and MurmurHash—to detect non‑mutual links in large HNSW graphs.

Data Consistency · LLM · Vector Sets

0 likes · 8 min read

JavaEdge

May 30, 2025 · Artificial Intelligence

How to Build a Deep Research Workflow in Dify Using AI Agents

This guide explains how to construct a deep research workflow in Dify that leverages AI agents, loop variables, and structured outputs to automatically explore complex topics, gather sources, and synthesize comprehensive reports with proper citations.

AI workflow · Agent · Deep Research

0 likes · 9 min read

Instant Consumer Technology Team

May 30, 2025 · Artificial Intelligence

Why Streamable HTTP Is Replacing SSE in AI Communication: An MCP Protocol Deep Dive

This article explains how the Model Context Protocol (MCP) standardizes AI‑assistant communication, compares the traditional Server‑Sent Events (SSE) transport with the newer Streamable HTTP mechanism, and provides step‑by‑step code examples for building both MCP servers and clients that leverage Streamable HTTP for bidirectional, session‑aware data exchange.

AI · LLM · MCP

0 likes · 22 min read
