Tagged articles

2016 articles

Apr 4, 2025 · Artificial Intelligence

Why Test‑Time Scaling Is Revolutionizing LLM Reasoning in 2025

This article surveys the latest research on large language model reasoning, highlighting test‑time scaling methods, chain‑of‑thought variants, and novel inference‑time techniques that boost performance while exposing trade‑offs, costs, and future directions for AI developers.

AI, LLM, Test-Time Scaling

0 likes · 26 min read

Ops Development & AI Practice

Apr 4, 2025 · Artificial Intelligence

Decoding LLM Endpoint Features: Quantization, Tokens, and Tool Support Explained

This article breaks down the key endpoint features of large language models—such as quantization, max token limits, streaming cancellation, tool support, and reasoning ability—explaining what each term means, why it matters, and how to choose models wisely for different applications.

AI model evaluation, Endpoint Features, LLM

0 likes · 11 min read

Ops Development & AI Practice

Apr 3, 2025 · Artificial Intelligence

What Powers LLMs? Unpacking Transformers, Architectures, and Context Windows

This article explains the core Transformer architecture behind large language models, compares encoder‑decoder and decoder‑only designs, and dives into the crucial concept of the context window, including its limits, examples, and ongoing research to extend it.

AI Architecture, Context Window, LLM

0 likes · 10 min read

Alimama Tech

Apr 3, 2025 · Artificial Intelligence

UQABench: A Personalized QA Benchmark for Evaluating User Embeddings in LLM‑Driven Recommendation Systems

UQABench introduces the first benchmark for assessing high‑density user embeddings that serve as soft prompts in LLM‑driven recommendation, featuring a three‑stage pre‑train‑align‑evaluate pipeline, seven personalized QA tasks, and findings that transformer encoders, side‑information, simple linear adapters, and larger models markedly improve accuracy while cutting input tokens to about five percent.

AI, LLM, benchmark

0 likes · 12 min read

ByteDance Cloud Native

Apr 3, 2025 · Operations

How to Seamlessly Integrate CloudWeGo with APMPlus for Full‑Stack Observability

This article explains the challenges of observability in distributed microservice and LLM architectures, introduces CloudWeGo and APMPlus, and provides step‑by‑step integration guides for Kitex, Hertz, and Eino frameworks, including code samples, data reporting methods, and advanced monitoring features such as RED metrics, LLM‑specific indicators, service topology, and future roadmap.

APM, APMPlus, CloudWeGo

0 likes · 13 min read

MaGe Linux Operations

Apr 3, 2025 · Artificial Intelligence

How to Build and Deploy a Dify LLM Application Platform on CentOS

This guide explains what Dify is, outlines its key features and application scenarios, and provides step‑by‑step instructions for preparing the environment, installing Docker and Docker‑Compose, and deploying Dify on a CentOS 7.9 system, including verification of a successful setup.

AI Platform, Dify, Docker

0 likes · 9 min read

BirdNest Tech Talk

Apr 3, 2025 · Artificial Intelligence

How Genspark’s Super Agent Outperforms OpenAI and Manus in GAIA Benchmarks

Genspark’s newly released Super Agent, built on a Mixture‑of‑Agents architecture that combines eight specialized LLMs and over 80 tools, claims to autonomously plan, execute, and integrate external services across tasks such as travel planning and video summarization, and reportedly surpasses OpenAI and Manus in the GAIA benchmark while offering instant access without an invitation code.

AI Agent, GAIA benchmark, LLM

0 likes · 4 min read

Big Data Technology & Architecture

Apr 3, 2025 · Artificial Intelligence

Understanding Model Context Protocol (MCP), Retrieval-Augmented Generation (RAG), and Vector Databases for LLM Integration

This article explains the Model Context Protocol (MCP) as a standard for LLM‑data integration, describes Retrieval‑Augmented Generation (RAG) techniques to reduce hallucinations, and introduces vector databases like Milvus that store high‑dimensional embeddings for efficient AI retrieval tasks.

LLM, MCP, Milvus

0 likes · 7 min read

DevOps

Apr 2, 2025 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG): Concepts, Evolution, and Types

This article explains Retrieval‑Augmented Generation (RAG), its role in mitigating large language model knowledge cutoff and hallucination, outlines the evolution from naive to advanced, modular, graph, and agentic RAG, and discusses future directions such as intelligent and multi‑modal RAG systems.

Artificial Intelligence, Knowledge Retrieval, LLM

0 likes · 10 min read

AntTech

Apr 2, 2025 · Artificial Intelligence

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

The PEAR framework introduces a position‑embedding‑agnostic attention re‑weighting method that detects and suppresses detrimental attention heads in large language models, dramatically improving retrieval‑augmented generation performance without adding any inference overhead, as demonstrated on multiple RAG benchmarks and LLM families.

Attention Re-weighting, LLM, PEAR

0 likes · 6 min read

JD Retail Technology

Apr 2, 2025 · Artificial Intelligence

One4All: A Scalable Multi‑Task Generative Recommendation Framework for CPS Advertising

The paper introduces One4All, a scalable multi‑task generative recommendation framework for CPS advertising that combines few‑shot intent prompting, a Rewards‑in‑Context multi‑objective optimization, and an online model‑selection strategy, delivering 2‑3× offline HitRate/NDCG gains and notable online CTR, CVR, and commission improvements.

Advertising, LLM, large language models

0 likes · 14 min read

AI Algorithm Path

Apr 2, 2025 · Artificial Intelligence

Master the Three Essential LLM Training Stages for 2025

The article breaks down the three core stages of large‑language‑model training—pre‑training, supervised fine‑tuning, and RLHF—explaining their purpose, methods, and concrete examples while noting DeepSeek‑R1’s recent breakthrough and its implications for AI development.

AI training, DeepSeek, LLM

0 likes · 5 min read

Huolala Tech

Apr 1, 2025 · Frontend Development

How Frontend Teams Can Leverage LLMs for Real‑Time Compliance Checks

This article explains how frontend developers can use large language models to detect and prevent marketing content violations in WeChat mini‑programs, covering pain‑point discovery, LLM‑driven compliance architecture, prompt optimization, model selection, testing methods, and seamless frontend integration with Feishu notifications.

AI, Integration, LLM

0 likes · 10 min read

Code Mala Tang

Mar 31, 2025 · Artificial Intelligence

Unlocking LLM Power: A Hands‑On Guide to Function Calling with Mistral, Llama, and Qwen

This tutorial explains how large language models can use function calling to access real‑time data, walks through setting up a Flask endpoint, demonstrates integration with Mistral Small, Llama 3.2‑1B, and Qwen models, and provides complete Python code examples for end‑to‑end execution.

API, Function Calling, LLM

0 likes · 10 min read

Efficient Ops

Mar 31, 2025 · Artificial Intelligence

How the Model Context Protocol (MCP) Is Revolutionizing AI Operations

The Model Context Protocol (MCP) lets large language models safely and directly access diverse data sources and tools, breaking data silos and enabling seamless AI‑driven automation across development, operations, and multi‑agent workflows.

AI integration, LLM, MCP

0 likes · 5 min read

Architect's Alchemy Furnace

Mar 31, 2025 · Artificial Intelligence

How to Deploy and Run Large Language Models with Xinference: A Step‑by‑Step Guide

Xinference is a powerful distributed inference framework that enables quick deployment and efficient serving of open‑source large language models via Docker or source installation, offering Web UI, CLI, and API interfaces with detailed setup, model launching, and Chatbox integration instructions.

API, Docker, Inference

0 likes · 11 min read

Architect

Mar 31, 2025 · Artificial Intelligence

A Comprehensive Study of Failure Modes in Large‑Language‑Model Based Multi‑Agent Systems

This paper presents a systematic investigation of failure patterns in LLM‑driven multi‑agent systems, introducing a 14‑type taxonomy (MASFT) derived from over 150 annotated dialogues, evaluating it with an LLM‑as‑a‑judge pipeline, and exploring modest intervention strategies while releasing all data and tools for future research.

AI, Agentic, LLM

0 likes · 29 min read

Baobao Algorithm Notes

Mar 30, 2025 · Artificial Intelligence

Why Scaling, Data, and Infra Matter More Than Reward Design in R1 Replication

The article analyses two months of community attempts to reproduce DeepSeek R1, highlighting that model scaling, high‑quality data, robust training infrastructure, and careful hyper‑parameter tuning outweigh pure reward‑based tricks, and it outlines common pitfalls and future research directions.

DeepSeek, Infrastructure, LLM

0 likes · 13 min read

Rare Earth Juejin Tech Community

Mar 30, 2025 · Backend Development

Implementing Model Context Protocol (MCP) with SSE and HTTP in SpringBoot

This article explains the Model Context Protocol (MCP) for seamless LLM integration, describes its background, presents a sequence diagram of its architecture, and provides step‑by‑step Java SpringBoot code for SSE streaming, HTTP POST handling, and annotation‑based tool registration.

Backend, LLM, MCP

0 likes · 11 min read

Architect

Mar 29, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build Powerful LLM Apps: Prompt Engineering, RAG, and AI Agents Explained

This article guides developers without an AI background through the fundamentals of building large‑language‑model applications, covering prompt engineering, multi‑turn interaction, function calling, retrieval‑augmented generation, vector databases, code assistants, and the MCP protocol for AI agents.

AI Agent, Embedding, Function Calling

0 likes · 51 min read

Qborfy AI

Mar 29, 2025 · Artificial Intelligence

Mastering LangChain: Build LLM Apps with Chains, Agents, and Vector Stores

This tutorial walks through the limitations of simple prompt usage, introduces LangChain as a framework for building full‑featured LLM applications, explains its core concepts and components, and provides step‑by‑step code examples for installing, configuring, and running a basic LangChain demo.

AI Application, LLM, LangChain

0 likes · 11 min read

DevOps

Mar 27, 2025 · Artificial Intelligence

From Personal AI Tools to Industry Platforms: A Multi-Level Framework for AI Application Development

The article outlines a hierarchical model for AI application development, from basic user tools through personal assistants, SOP platforms, industry tools, and base models, emphasizing the importance of industry know‑how, data quality, and engineering to overcome model limitations and drive practical AI adoption.

AI, LLM, SOP

0 likes · 24 min read

Architect's Alchemy Furnace

Mar 27, 2025 · Artificial Intelligence

Xinference vs Ollama: Which Open‑Source LLM Engine Fits Your Needs?

This article provides a comprehensive side‑by‑side comparison of the open‑source LLM serving tools Xinference and Ollama, examining their core goals, architecture, model support, deployment options, performance, ecosystem integration, typical use cases, future roadmap, and guidance on selecting the right solution for enterprise or personal projects.

Comparison, LLM, Model Serving

0 likes · 7 min read

JavaEdge

Mar 27, 2025 · Artificial Intelligence

Can a Single LLM Both See and Reason? Exploring Visual Reasoning Models (VRM)

This article examines the limitations of current vision‑language and reasoning models, proposes a visual reasoning model (VRM) that can process images and perform deep logical inference, and discusses architecture, training methods, reinforcement‑learning reward designs, and practical challenges.

Artificial Intelligence, Deep Learning, LLM

0 likes · 8 min read

AI Large Model Application Practice

Mar 27, 2025 · Artificial Intelligence

Mastering AutoGen 0.4: Build Multi‑Agent Tools with Python and MCP

This article walks through the major changes in Microsoft AutoGen 0.4, explains its layered modular architecture and event‑driven multi‑agent design, details the built‑in Tools types, and provides step‑by‑step Python code for creating a Tools Agent and integrating it with an MCP server.

AutoGen, LLM, MCP

0 likes · 9 min read

Baobao Algorithm Notes

Mar 27, 2025 · Artificial Intelligence

Why a Robust Training Pipeline Beats Fancy LLM Tricks – Lessons from DAPO

The article analyzes the DAPO technical report, showing how dynamic‑sampling pipelines and token‑level loss handling in SFT and RL training outperform ad‑hoc algorithm tricks, and compares the training dynamics of reinforce_baseline and GRPO with concrete code examples.

Dynamic Sampling, GRPO, LLM

0 likes · 8 min read

DevOps

Mar 26, 2025 · Artificial Intelligence

Introducing Model Context Protocol (MCP): An Open Standard for LLM Integration with Data Sources and Tools

The article explains Anthropic's open Model Context Protocol (MCP), detailing its client‑server architecture, resource and prompt definitions, tool discovery and execution, sampling workflow, security features, and provides a complete Python example that demonstrates building, running, and testing an MCP server and client for real‑time data retrieval.

AI integration, LLM, MCP

0 likes · 12 min read

Architect

Mar 26, 2025 · Artificial Intelligence

Agent Memory Mechanisms and Dify Knowledge Base Segmentation & Retrieval Details

This article explains the fundamentals of AI agent memory—including short‑term, long‑term, and working memory types and their storage designs—and then details Dify's knowledge‑base segmentation modes, indexing strategies, and retrieval configurations for effective RAG applications.

Agent Memory, Dify, Knowledge Base

0 likes · 14 min read

Architecture Digest

Mar 26, 2025 · Artificial Intelligence

Getting Started with LangChain in Java: Building Large Language Model Applications

This tutorial introduces the fundamentals of LangChain, explains large language models, prompt engineering, word embeddings, and demonstrates how to use the Java implementation LangChain4j with Maven dependencies, model I/O, memory, retrieval, chains, and agents to build sophisticated LLM‑driven applications.

AI, LLM, LangChain

0 likes · 18 min read

DaTaobao Tech

Mar 26, 2025 · Artificial Intelligence

Overview of Retrieval-Augmented Generation (RAG) and Related AI Technologies

The article surveys Retrieval‑Augmented Generation (RAG) as a solution to large language model limits—such as outdated knowledge, hallucinations, and security risks—by integrating vector‑database retrieval with LLM generation, and discusses related tools, multi‑agent frameworks, prompt engineering, fine‑tuning methods, and emerging optimization trends.

AI applications, LLM, RAG

0 likes · 29 min read

ELab Team

Mar 26, 2025 · Artificial Intelligence

Uncovering LLM Blind Spots in AI Coding: Common Pitfalls and Solutions

Large language models often struggle with coding tasks, failing to stop when encountering obstacles, ignoring black‑box testing principles, and making unnecessary refactors; this article examines those blind spots, offers practical examples, and suggests strategies such as preparatory refactoring, stateless tools, and careful prompting to improve AI‑assisted development.

AI Coding, LLM, best practices

0 likes · 59 min read

Network Intelligence Research Center (NIRC)

Mar 26, 2025 · Artificial Intelligence

Enable Traditional LLMs to Use DeepSeek’s Multi‑Head Latent Attention Without Retraining

The paper introduces MHA2MLA, a data‑efficient fine‑tuning framework that converts pre‑trained multi‑head attention LLMs to DeepSeek’s Multi‑Head Latent Attention architecture, achieving up to 92% KV‑cache compression with less than 0.5% performance loss on long‑context tasks.

LLM, Low-Rank Approximation, Multi-Head Latent Attention

0 likes · 8 min read

Programmer DD

Mar 25, 2025 · Artificial Intelligence

How to Build an MCP Client‑Server with Spring AI for LLM‑Powered Apps

This article demonstrates how to implement the Model Context Protocol (MCP) using Spring AI, covering the creation of MCP hosts, clients, and servers, configuring dependencies, integrating Claude, adding Brave Search and filesystem tools, and building a functional chatbot that leverages external data sources through standardized LLM interfaces.

LLM, Model Context Protocol, ai-integration

0 likes · 15 min read

21CTO

Mar 25, 2025 · Artificial Intelligence

Which LLM Is Best for Coding? Speed, Hallucination, and Context Compared

This article breaks down major large language models, defining key comparison metrics such as speed, hallucination rate, and context window, then evaluates each model with benchmarks like HumanEval+, ChatBot Arena, and Aider to help you choose the most suitable LLM for your coding tasks.

AI, LLM, benchmark

0 likes · 10 min read

Open Source Tech Hub

Mar 24, 2025 · Artificial Intelligence

Break Data Silos for LLMs with Model Context Protocol (MCP) – PHP SDK Guide

This article explains the data‑isolation problem facing large language models, introduces the Model Context Protocol (MCP) as a standard bridge to external data sources, and provides a step‑by‑step PHP SDK tutorial—including installation, server and client code, and optional advanced logging—to help developers integrate AI models securely and efficiently.

LLM, MCP, Model Context Protocol

0 likes · 13 min read

AI Algorithm Path

Mar 24, 2025 · Artificial Intelligence

How to Use Pydantic for Structured LLM Output

The article explains why LLM responses can be inconsistent, introduces Pydantic as a way to define custom output schemas, and walks through concrete examples—both with OpenAI and Ollama models—showing how to build a LangChain pipeline that parses responses into structured data.

LLM, LangChain, Ollama

0 likes · 7 min read

JavaEdge

Mar 24, 2025 · Artificial Intelligence

Why Large Language Models Still Struggle with Complex Reasoning – Challenges and Solutions

This article examines the fundamental reasoning limitations of large language models, illustrates real‑world failure cases, and outlines current research directions such as better datasets, chain‑of‑thought prompting, external verification, and specialized solvers to improve their logical capabilities.

AI, Chain-of-Thought, LLM

0 likes · 8 min read

Alibaba Cloud Developer

Mar 24, 2025 · Artificial Intelligence

Boost LLM Evaluation with Semantic Enrichment and Vector Search

This article explains how semantic enrichment, vector and hybrid search, and clustering techniques can be applied to large language model logs to evaluate inputs and outputs, improve compliance auditing, and enhance model iteration across various business scenarios.

AI, LLM, Vector Search

0 likes · 12 min read

Alibaba Cloud Developer

Mar 24, 2025 · Artificial Intelligence

Why LLM Internet Search Fails and How to Fix It: A Deep Dive into Qwen, Doubao, and DeepSeek

This article analyses the shortcomings of large‑model internet search—such as unverifiable sources, fabricated content, and poor instruction compliance—by comparing Qwen‑max, Doubao‑1.5‑pro‑256k, and DeepSeek‑v3, and proposes prompt engineering, post‑processing, and custom tool improvements to boost reliability.

AI, LLM, evaluation

0 likes · 22 min read

AI Large Model Application Practice

Mar 24, 2025 · Artificial Intelligence

How to Build a Multimodal RAG Pipeline for PPT Documents with Vision LLMs

This article explains a step‑by‑step implementation of a multimodal Retrieval‑Augmented Generation system that parses PPT/PDF files, extracts rich text and images with vision models, indexes them in a vector store, and generates answers that combine markdown and relevant slide screenshots.

LLM, Multimodal, Python

0 likes · 9 min read

Ma Wei Says

Mar 24, 2025 · Artificial Intelligence

Master BGE Multilingual Embeddings: Models, Installation, and Quick Usage

Explore the BGE (BAAI General Embedding) family—including v1, v1.5, M3, Multilingual Gemma2, and EN‑ICL—detailing their multilingual capabilities, model variants, token limits, optimal use cases, and step‑by‑step installation and Python usage instructions with code examples for embedding generation and similarity scoring.

Embedding, LLM, Python

0 likes · 8 min read

Alibaba Cloud Big Data AI Platform

Mar 24, 2025 · Artificial Intelligence

How to Build a Real‑Time Data Analysis Agent with LLMs, Hologres, and MCP

This article explains the challenges LLMs face in data analysis, introduces the Model Context Protocol (MCP) as a standard bridge, and provides a step‑by‑step guide to integrate Hologres, MCP, and large language models—using Claude Desktop as an example—to create a fast, multi‑source data‑analysis agent.

AI Agent, Hologres, LLM

0 likes · 11 min read

Architect

Mar 23, 2025 · Artificial Intelligence

The Future of AI Agents: From Prompt‑Driven Workflows to Model‑as‑Product and Reinforcement‑Learning‑Powered Agents

The article argues that the next wave of AI agents will shift from brittle, prompt‑driven workflows like Manus to truly autonomous, model‑centric agents trained with reinforcement learning and reasoning, exemplified by OpenAI's DeepResearch and Anthropic's Claude 3.7 Sonnet, while the API‑driven market model collapses.

AI agents, Claude, DeepResearch

0 likes · 28 min read

Baobao Algorithm Notes

Mar 23, 2025 · Artificial Intelligence

Why Future AI Agents Must Evolve Beyond Prompt‑Driven Workflows

The article argues that the next generation of AI agents should focus on improving the model itself through reinforcement learning and reasoning rather than relying on pre‑designed prompt‑driven workflows, highlighting industry trends, technical challenges, and the shift toward treating models as products.

DeepSearch, LLM, model as product

0 likes · 29 min read

Architect

Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑augmented generation (RAG) combines external knowledge retrieval with large language models to improve answer accuracy, but it often suffers from retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs, which this article analyzes and offers solutions for.

AI reliability, LLM, RAG

0 likes · 32 min read

Cognitive Technology Team

Mar 22, 2025 · Artificial Intelligence

Three Stages of Developing Large Language Models and Practical Guidance

The article outlines the three development phases of large language models—building, pre‑training, and fine‑tuning—describes usage options, highlights key factors such as data scale, architecture, training processes, and evaluation, and offers practical advice for cost‑effective development.

Fine-tuning, LLM, Model Development

0 likes · 3 min read

Baobao Algorithm Notes

Mar 21, 2025 · Artificial Intelligence

Unlocking LLM Reasoning: A Deep Dive into Post‑Training Techniques

This article provides a comprehensive technical overview of large language model post‑training, covering fine‑tuning methods (full, parameter‑efficient, LoRA families, prompt tuning), domain‑adaptive tuning, reinforcement‑learning reward modeling, process vs. outcome rewards, inference‑enhancement strategies, dynamic compute allocation, verifier‑augmented reasoning, current challenges, and emerging research directions such as meta‑cognition, physical reasoning, and swarm intelligence.

LLM, meta-cognition, post-training

0 likes · 21 min read

Meituan Technology Team

Mar 20, 2025 · Artificial Intelligence

Meituan Tech Team's Selected Papers on Large Language Models and AI (2024-2025)

The article compiles Meituan’s recent 2024‑2025 research on large language models, presenting a diverse set of papers that explore transformer enhancements, scaling laws, safety optimization, instruction fine‑tuning, temporal decay learning, code generation, agent refinement, cost‑efficient MoE inference, quantization, fast parallel inference, speculative decoding, multilingual speech, vision‑language models, evaluation benchmarks, and jailbreak robustness.

ACL, AI, LLM

0 likes · 4 min read

AI Large Model Application Practice

Mar 20, 2025 · Artificial Intelligence

Mastering Model Context Protocol (MCP): Build AI Agents with LlamaIndex & LangGraph

This guide explains the Model Context Protocol (MCP), its architecture, and how to create and debug MCP servers and clients in Python, then shows how to integrate third‑party MCP servers with LlamaIndex or LangGraph to quickly build powerful LLM agents.

LLM, LangGraph, LlamaIndex

0 likes · 12 min read

Sohu Tech Products

Mar 19, 2025 · Artificial Intelligence

How to Recreate a Translation Agent with LangGraph and LLMs

This guide demonstrates building a steerable LLM‑based translation workflow using LangGraph, covering the initial translation, model‑generated reflection suggestions, and final improvement steps with full Python code examples and a complete execution result.

AI, LLM, LangGraph

0 likes · 34 min read

Ops Development & AI Practice

Mar 19, 2025 · Artificial Intelligence

How Integrating LLMs with the Model Context Protocol Could Transform AI Workflows

Integrating large language models with the open‑standard Model Context Protocol enables direct access to file systems, databases, and APIs, unlocking use cases such as automated file management, intelligent data analysis, personalized content generation, and task automation, while also raising security, privacy, and maturity challenges for future AI‑human collaboration.

LLM, MCP, cross-domain

0 likes · 10 min read

Ops Development & AI Practice

Mar 19, 2025 · Artificial Intelligence

Can Cache‑Augmented Generation Outperform RAG? A Deep Dive into LLM Efficiency

Cache‑augmented generation (CAG) preloads documents into LLM context using KV caches to eliminate retrieval latency, offering faster inference for static knowledge bases, while RAG remains more flexible for dynamic or large corpora; this article compares their definitions, performance, implementation steps, and future prospects.

CAG, Cache Augmentation, Inference Optimization

0 likes · 11 min read

DaTaobao Tech

Mar 19, 2025 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Principles, Challenges, and Implementation Techniques

Retrieval‑augmented generation (RAG) enhances large language models by integrating a preprocessing pipeline—cleaning, chunking, embedding, and vector storage—with a query‑driven retrieval and prompt‑injection workflow, leveraging vector databases, multi‑stage recall, advanced prompting, and comprehensive evaluation metrics to mitigate knowledge cut‑off, hallucinations, and security issues.

LLM, RAG, Retrieval Augmented Generation

0 likes · 27 min read

Tencent Cloud Developer

Mar 19, 2025 · Artificial Intelligence

Inside Tencent Hunyuan Turbo S: Speed, Cost, and Hybrid Mamba Transformer Explained

Tencent's new Hunyuan Turbo S model combines a 44% faster response time, dramatically lower token costs, and a hybrid Mamba‑Transformer architecture that merges linear attention with full attention, offering insights into fast‑thinking versus slow‑thinking LLM designs, MoE scaling laws, low‑precision training effects, and long‑short chain fusion techniques.

AIArchitectureHybridMambaLLM

0 likes · 14 min read

Alibaba Cloud Infrastructure

Mar 18, 2025 · Cloud Native

Gray Release of LoRA and Base Models Using ACK Gateway with AI Extension on Kubernetes

This guide explains how to deploy large language model inference services on a GPU-enabled Kubernetes cluster, configure ACK Gateway with AI Extension for intelligent routing and load balancing, and perform gray releases for both LoRA fine‑tuned models and base models such as QwQ‑32B and DeepSeek‑R1, including step‑by‑step commands and validation procedures.

ACK Gateway · AI inference · Cloud Native

0 likes · 25 min read

JD Tech Talk

Mar 18, 2025 · Artificial Intelligence

Generative Recommendation for CPS Advertising: Intent Sensing, Multi‑Objective Optimization, and the One4All Framework

This article surveys recent advances in generative recommendation for CPS advertising, detailing explicit intent‑aware controllable product recommendation, multi‑objective optimization techniques based on reward‑in‑context and DPO, and the scalable One4All framework that unifies behavior and language modeling across diverse ad scenarios.

CPS advertising · Generative Recommendation · LLM

0 likes · 14 min read

JD Cloud Developers

Mar 18, 2025 · Artificial Intelligence

How Generative LLMs Are Transforming CPS Advertising Recommendations

Since large language models have excelled in NLP, researchers are now enhancing CPS advertising recommendation systems by integrating generative LLMs for explicit intent perception, multi‑objective optimization, and a unified One4All framework, achieving significant offline and online performance gains across click‑through, conversion, and revenue metrics.

CPS advertising · Generative Recommendation · LLM

0 likes · 19 min read

AI Large Model Application Practice

Mar 18, 2025 · Artificial Intelligence

Master OpenAI’s New Agents SDK: 10 Core Concepts with a Complete Example

This guide walks you through OpenAI's open‑source Agents SDK, explaining ten essential concepts—from model configuration and agent creation to runners, tools, context handling, guardrails, handoffs, structured output, tracing, and orchestration—while providing runnable Python code and visual demos.

LLM · OpenAI Agents · Python

0 likes · 17 min read

AI Algorithm Path

Mar 17, 2025 · Artificial Intelligence

Agentic AI vs Generative AI: Key Differences and Comparative Analysis

The article defines Agentic AI as autonomous, goal‑directed systems that can act and learn from experience, contrasts it with Generative AI’s passive, single‑step content generation, and illustrates the practical advantage of Agentic workflows through Andrew Ng’s HumanEval benchmark where a step‑wise approach outperforms zero‑shot prompting even for older models.

AI autonomy · Agentic AI · HumanEval

0 likes · 10 min read

Infra Learning Club

Mar 17, 2025 · Artificial Intelligence

Testing OpenManus with DeepSeek: A Hands‑On Evaluation

The author walks through installing OpenManus, configuring it to use DeepSeek (and an Ollama‑based vision model), runs a sample financial data query, and reports that the system is slow, sometimes inaccurate, and still requires further optimization.

AI agents · Conda · DeepSeek

0 likes · 5 min read

Ops Development & AI Practice

Mar 17, 2025 · Artificial Intelligence

Unlocking LLM Power: A Hands‑On Guide to Open WebUI

Open WebUI offers a user‑friendly, open‑source web interface that simplifies interaction with large language models, supporting multiple back‑ends, offline operation, and extensible plugins, making AI experimentation accessible for developers, researchers, and enthusiasts alike.

AI · LLM · Model Management

0 likes · 4 min read

Alibaba Cloud Native

Mar 17, 2025 · Cloud Native

How to Deploy DeepSeek as an Enterprise AI Assistant on DingTalk Using Alibaba Cloud

This guide walks you through deploying the DeepSeek large‑language model on Alibaba Cloud PAI, integrating it with DingTalk via the Magic Wand AI platform, and configuring multi‑model routing, authentication, rate limiting, content safety, caching, web‑search, and observability using the Cloud Native API Gateway.

AI · Alibaba Cloud · DingTalk

0 likes · 15 min read

Alibaba Cloud Infrastructure

Mar 17, 2025 · Cloud Native

Boost LLM Inference with ACK Gateway AI Extension: A Step‑by‑Step Guide

This guide demonstrates how to deploy the QwQ‑32B large language model on an Alibaba Cloud ACK cluster, configure OSS storage, enable the ACK Gateway with AI Extension, set up InferencePool and InferenceModel resources, and benchmark intelligent routing versus standard gateway routing, revealing latency and throughput improvements.

ACK Gateway · AI Extension · Kubernetes

0 likes · 16 min read

Cognitive Technology Team

Mar 17, 2025 · Artificial Intelligence

Leveraging Large Language Models to Optimize Traditional Machine Learning Pipelines

Large language models can assist and enhance each stage of traditional machine learning—including sample generation, data cleaning, feature engineering, model selection, hyper‑parameter tuning, and workflow automation—by generating synthetic data, refining features, selecting models, and orchestrating pipelines, though challenges such as bias, privacy, and noise remain.

Data Generation · LLM · feature engineering

0 likes · 11 min read

Spring Full-Stack Practical Cases

Mar 17, 2025 · Backend Development

Generate SQL with Spring AI: LLM‑Powered Queries in Spring Boot 3

This article demonstrates how to use Spring AI with a large language model to automatically generate and execute SELECT SQL statements in a Spring Boot 3 application, covering dependency setup, configuration files, prompt templates, controller implementation, and testing with example scripts.

LLM · SQL generation · Spring Boot

0 likes · 9 min read

Ops Development & AI Practice

Mar 16, 2025 · Artificial Intelligence

How Function Calling Helps LLMs Overcome Hallucinations

This article explains how LLM function calling works, from defining external functions to processing API responses, and demonstrates a Python example using OpenAI's GPT‑4o to fetch real‑time weather, showing how the technique mitigates hallucinations and expands practical AI applications.

AI · Function Calling · LLM

0 likes · 8 min read

Architect

Mar 15, 2025 · Artificial Intelligence

Why Building Your Own RAG System Is a Costly Mistake

The article explains that developing a custom Retrieval‑Augmented Generation (RAG) solution incurs hidden infrastructure, personnel, and security costs, leads to operational overload and budget overruns, and is rarely justified compared to purchasing a proven vendor solution.

AI · LLM · RAG

0 likes · 11 min read

AI Algorithm Path

Mar 15, 2025 · Artificial Intelligence

Why the Industry Is Shifting From AI Agents to Agentic Workflows

The article explains that low accuracy and security risks of current AI agents—evidenced by a Claude AI Agent achieving only 14% of human performance and an average success rate of about 20%—are driving a move toward agentic workflows, which offer observable, auditable, and data‑synthesizing pipelines that dramatically improve enterprise productivity.

AI agents · LLM · agentic workflows

0 likes · 7 min read

DataFunSummit

Mar 14, 2025 · Artificial Intelligence

Insights from Zhihu's ZhiLight Large‑Model Inference Framework: Architecture, Parallelism, and Performance Optimizations

The article summarizes Zhihu's machine‑learning platform lead Wang Xin's presentation on the ZhiLight large‑model inference framework, covering model execution mechanisms, GPU workload analysis, pipeline and tensor parallelism, GPU architecture evolution, open‑source engine comparisons, ZhiLight's compute‑communication overlap and quantization optimizations, benchmark results, supported models, and future directions.

GPU · Inference · LLM

0 likes · 13 min read

AI Large Model Application Practice

Mar 14, 2025 · Artificial Intelligence

Why Softmax Is the Secret Behind LLM Probabilities and Creative Generation

This article explains how the Softmax function converts raw neural‑network scores into a proper probability distribution, why this conversion is essential for training and inference in large language models, and how the temperature parameter shapes the model's creativity and diversity.

LLM · Softmax · Temperature

0 likes · 9 min read

Baidu Geek Talk

Mar 12, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Semantic Embeddings: Models, Methods, and Trends

This article reviews how large language models (LLMs) enhance semantic text embeddings by comparing traditional methods with LLM‑based approaches, detailing synthetic data generation, backbone model designs, key model families, experimental results on the MTEB benchmark, and future research challenges.

LLM · contrastive learning · model comparison

0 likes · 30 min read

Alibaba Cloud Developer

Mar 12, 2025 · Artificial Intelligence

Deploy Alibaba Cloud’s QwQ-32B LLM: Benchmarks, Agent Features, and One‑Click Setup

This guide introduces Alibaba Cloud’s open‑source QwQ-32B large language model, highlights its superior benchmark performance over competing models, explains its integrated agent capabilities, and provides step‑by‑step instructions for one‑click deployment via the PAI‑Model Gallery.

Alibaba Cloud · LLM · Model Deployment

0 likes · 7 min read

DaTaobao Tech

Mar 12, 2025 · Artificial Intelligence

Multimodal Automatic Layout Generation for E-commerce

The project develops a multimodal automatic layout generation system for e‑commerce by fine‑tuning the qwen‑vl‑7b vision‑language model with LoRA on poster and Taobao image‑layout data, employing diffusion‑based image generation and coordinate‑prediction methods to produce structured layouts that power poster, marketing image, and video‑cover creation with over 90% adoption, while exploring multi‑image, style‑aware, and iterative refinement extensions.

LLM · Multimodal AI · diffusion

0 likes · 12 min read

Cognitive Technology Team

Mar 11, 2025 · Artificial Intelligence

Deploying DeepSeek R1:7b Model Locally with Ollama and Building AI Applications Using Dify

This tutorial explains how to set up Ollama for CPU or GPU environments, run the DeepSeek R1:7b large language model, and use the open‑source Dify platform to create and deploy a custom AI application, providing step‑by‑step commands and configuration details.

AI · DeepSeek · Dify

0 likes · 8 min read

NewBeeNLP

Mar 11, 2025 · Artificial Intelligence

How DeepSeek’s New Architecture Redefines LLM Efficiency and Performance

This article analyzes DeepSeek’s recent breakthroughs—including the Multi‑Head Latent Attention (MLA), Group Relative Policy Optimization (GRPO), and a refined Mixture‑of‑Experts design—along with its three‑stage training pipeline, RL‑only R1‑Zero variant, and benchmark comparisons against GPT‑4o‑Mini and Llama 3.1, highlighting both gains and remaining challenges.

DeepSeek · LLM · Mixture of Experts

0 likes · 18 min read

Tencent Cloud Developer

Mar 11, 2025 · Artificial Intelligence

Fine‑Tuning Local LLaMA‑Factory Models and Building Networked AI Applications

The article walks through preparing a GPU‑enabled environment, downloading and LoRA‑fine‑tuning a DeepSeek model with LLaMA‑Factory, merging the adapter, then wrapping the model in a web UI that queries a ChromaDB vector store via crawled web data, illustrating security‑focused use cases and forecasting domain‑specific LLM adoption.

AI · Fine-tuning · LLM

0 likes · 17 min read

Architect

Mar 10, 2025 · Artificial Intelligence

What Makes DeepSeek’s New Architecture a Game‑Changer? Inside MLA, GRPO, and MoE Innovations

This article analyzes DeepSeek’s latest large‑model breakthroughs, covering the MLA attention compression, GRPO alignment algorithm, MoE load‑balancing redesign, multi‑stage training pipelines, reinforcement‑learning tricks, and performance comparisons with GPT‑4o‑Mini and Llama 3.1, highlighting both strengths and remaining challenges.

AI training · DeepSeek · GRPO

0 likes · 19 min read

AI Algorithm Path

Mar 10, 2025 · Artificial Intelligence

How Much GPU Memory Does an LLM Service Really Need?

This article explains a simple formula for estimating the GPU VRAM required to serve large language models, demonstrates the calculation with a 7‑billion‑parameter example, clarifies why a 20% safety buffer is needed, and offers practical strategies such as quantization, CPU offload, and multi‑GPU parallelism to reduce memory usage.

Deployment · GPU Memory · LLM

0 likes · 6 min read

Tencent Technical Engineering

Mar 10, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build LLM Apps: Prompt Engineering, RAG, and Function Calling Explained

This guide shows non‑AI developers how to create large‑model applications by mastering prompt engineering, multi‑turn interactions, Retrieval‑Augmented Generation, function calling, and AI‑Agent integration, with practical code examples, tool design patterns, and deployment tips.

AI Agent · Embedding · Function Calling

0 likes · 48 min read

Baobao Algorithm Notes

Mar 10, 2025 · Artificial Intelligence

Why DeepSeek V3’s FP8 Training Beats Traditional Schemes: A Deep Dive

This article provides a detailed technical analysis of FP8 training, comparing Nvidia’s TransformerEngine approach with DeepSeek V3’s novel scheme, and examines how block‑wise scaling, high‑precision accumulation, and vector length and correlation affect quantization error and signal‑to‑noise ratio in large‑language‑model training.

DeepSeek · FP8 · LLM

0 likes · 20 min read

phodal

Mar 10, 2025 · Artificial Intelligence

How AutoDev Bridge Uses LLMs to Accelerate Legacy System Migration

AutoDev Bridge combines large‑model reasoning, C4 architecture analysis, AST‑based business logic extraction, and IDE‑integrated tooling to automate the migration of legacy systems, reducing manual effort and migration risk while highlighting the unique advantages of modern AI agents.

AI · Code Translation · LLM

0 likes · 7 min read

Java Architecture Diary

Mar 10, 2025 · Artificial Intelligence

Simplify Java AI Integration with Spring AI Custom Annotations and AI Services

AI Services, inspired by Spring Data JPA and Retrofit, offers a declarative Java API that abstracts LLM interactions, supporting input formatting, output parsing, chat memory, function calling, and RAG, with detailed examples using LangChain4j, custom Spring AI annotations, AOP aspects, and controller integration.

AI services · LLM · aop

0 likes · 7 min read

DevOps

Mar 9, 2025 · Artificial Intelligence

A Beginner's Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling, and AI Agents

This article provides a comprehensive introduction to developing large language model (LLM) applications, covering prompt engineering, zero‑ and few‑shot techniques, function calling, retrieval‑augmented generation (RAG) with embedding and vector databases, code assistants, and the MCP protocol for building AI agents, all aimed at non‑AI specialists.

AI Agent · Embedding · Function Calling

0 likes · 48 min read

Architects' Tech Alliance

Mar 9, 2025 · Industry Insights

How DeepSeek’s LLMs Slash Training Costs and Reshape China’s Compute Landscape

DeepSeek’s three‑model LLM lineup (V3, R1‑Zero, and R1) delivers high performance while cutting training expenses to under $6 million, a fraction of the $0.6‑1 billion typical for comparable models, signaling a major shift in China’s AI compute demand and supply chain dynamics.

AI compute · China · DeepSeek

0 likes · 3 min read

Alibaba Cloud Infrastructure

Mar 9, 2025 · Cloud Computing

Deploy QwQ-32B LLM Inference on Alibaba Cloud ACS with vLLM: Step‑by‑Step Guide

This guide walks you through using Alibaba Cloud Container Compute Service (ACS) to provision GPU resources, prepare the QwQ-32B model, configure persistent storage, deploy the model with vLLM, set up OpenWebUI, verify the service, and optionally benchmark its performance, all with detailed commands and YAML examples.

ACS · Alibaba Cloud · GPU

0 likes · 17 min read

AI Frontier Lectures

Mar 9, 2025 · Industry Insights

Why the Model Is Becoming the Product: AI Market Trends and Risks

The article argues that AI models are evolving into standalone products, examines scaling limits, integration challenges, reinforcement‑learning economics, and investment dynamics, and warns that reliance on large‑lab APIs may jeopardize future profitability for integrators.

AI · IndustryInsights · LLM

0 likes · 15 min read

Alibaba Cloud Infrastructure

Mar 8, 2025 · Artificial Intelligence

Deploying QwQ-32B LLM with vLLM on Alibaba Cloud ACK and Configuring Intelligent Routing

This guide explains how to deploy the QwQ-32B large language model using vLLM on an Alibaba Cloud ACK Kubernetes cluster, configure storage, set up OpenWebUI, enable ACK Gateway with AI Extension for intelligent routing, and benchmark the inference service performance.

ACK · Inference · Kubernetes

0 likes · 17 min read

AI Product Manager Community

Mar 8, 2025 · Artificial Intelligence

Deploy OpenManus Locally and Let It Generate a Complete WeChat Mini‑Program

This article walks through installing OpenManus locally using Python 3.12, cloning its GitHub repository, configuring DeepSeek LLM credentials, launching the service, and prompting the agent to generate a full WeChat mini‑program, while sharing observations on performance, token cost, and limitations.

AI Agent · DeepSeek · LLM

0 likes · 5 min read

Qunhe Technology Quality Tech

Mar 7, 2025 · Artificial Intelligence

How AI is Revolutionizing Software Testing: 2025 Roadmap and Real-World Successes

The Qunhe Technology Quality team outlines a 2025 strategy that leverages advanced AI models, a user-friendly AI testing platform, and AI‑driven automation to boost test efficiency, streamline workflows, and promote AI adoption across the testing organization.

AI · LLM · efficiency

0 likes · 14 min read

Alibaba Cloud Big Data AI Platform

Mar 7, 2025 · Artificial Intelligence

How QwQ-32B Outperforms OpenAI o1-mini and Deploys in One Click on Alibaba Cloud

Alibaba Cloud's newly released QwQ-32B model delivers benchmark‑level performance rivaling top open‑source LLMs, integrates agent capabilities, and can be deployed with a single click through the PAI‑Model Gallery, offering a cost‑effective solution for developers seeking advanced AI inference.

AI Benchmark · Alibaba Cloud · LLM

0 likes · 5 min read

dbaplus Community

Mar 7, 2025 · Artificial Intelligence

Master Prompt Engineering: Frameworks, Strategies, and Real‑World Examples for Large Language Models

This comprehensive guide explains what prompts are, outlines essential prompt components and multiple engineering frameworks, presents practical strategies for crafting clear and structured prompts, addresses model limitations such as hallucinations, and showcases a wide range of advanced prompting techniques with code examples.

AI · LLM · few-shot prompting

0 likes · 29 min read

DevOps

Mar 6, 2025 · Artificial Intelligence

Building Multi-Model Chat Agents with Dify: Integrating DeepSeek‑R1 and Gemini

This article explains how to create a high‑performance multi‑model chat agent on the Dify platform by combining DeepSeek‑R1 for reasoning and Gemini for answer generation, covering the underlying principles, configuration steps, API integration, performance benchmarks, and practical deployment guidance.

Chatbot · DeepSeek · Dify

0 likes · 12 min read

Cognitive Technology Team

Mar 5, 2025 · Artificial Intelligence

Comparative Analysis of Java AI Frameworks: LangChain4j, Spring AI, and Agent-Flex

This article examines three leading Java AI frameworks—LangChain4j, Spring AI, and Agent-Flex—by comparing their architectures, core capabilities, and ideal use‑cases, helping developers choose the most suitable solution for enterprise, domestic, or rapid‑prototype projects.

AI · Agent-Flex · LLM

0 likes · 5 min read

Cognitive Technology Team

Mar 4, 2025 · Artificial Intelligence

Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval

The article introduces Deep Searcher, an open‑source Agentic Retrieval‑Augmented Generation system that combines large language models, Milvus vector databases, and multi‑step reasoning to deliver enterprise‑grade search, reporting, and complex query capabilities, and compares its performance against traditional RAG and Graph RAG approaches.

Agentic · Enterprise search · LLM

0 likes · 18 min read

AI Algorithm Path

Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.

Beam Search · Greedy Search · LLM

0 likes · 9 min read

Alibaba Cloud Developer

Mar 4, 2025 · Artificial Intelligence

Build a Smart Knowledge Base with DeepSeek R1 and Alibaba Cloud Low‑Code

This tutorial guides you through creating an AI‑powered, customizable knowledge space by integrating DeepSeek R1 via Alibaba Cloud Bailian's Model‑as‑a‑Service with the low‑code Mobinext platform, covering setup, configuration, deployment, and future expansion for multi‑tenant use.

AI · Alibaba Cloud · DeepSeek

0 likes · 12 min read

Tencent Cloud Developer

Mar 4, 2025 · Artificial Intelligence

A Practical Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling and AI Agents

The guide teaches non‑AI developers how to build practical LLM‑powered applications by mastering prompt engineering, function calling, retrieval‑augmented generation, and AI agents, and introduces the Model Context Protocol (MCP) for seamless tool integration, offering a clear learning path to leverage large language models without deep theory.

AI Agent · Function Calling · LLM

0 likes · 48 min read

Architect

Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use‑cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods (inference‑time scaling, pure RL, SFT + RL, and distillation), and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling

0 likes · 27 min read

Code Mala Tang

Mar 3, 2025 · Artificial Intelligence

Unlock AI’s Full Potential with Structured Prompt Decorators

Prompt Decorators are structured prefixes that standardize and enhance AI responses, addressing common challenges like vague prompts, inconsistent answers, and lack of reasoning by guiding the model to produce clear, logical, and well‑organized outputs across various use cases.

AI · LLM · automation

0 likes · 23 min read

Fighter's World

Mar 3, 2025 · Artificial Intelligence

How OpenAI’s Deep Research Is Sparking a Wave of LLM‑Powered Search Experiments

The article explains what Deep Research agents are, walks through a concrete example of investigating the $6 million training cost controversy of DeepSeek V3, details the multi‑step plan‑edit‑execute workflow, and discusses broader implications for AI efficiency, market dynamics, and product design.

AI agents · Deep Research · LLM

0 likes · 10 min read