Tagged articles
2016 articles
AI Frontier Lectures
Apr 4, 2025 · Artificial Intelligence

Why Test‑Time Scaling Is Revolutionizing LLM Reasoning in 2025

This article surveys the latest research on large language model reasoning, highlighting test‑time scaling methods, chain‑of‑thought variants, and novel inference‑time techniques that boost performance while exposing trade‑offs, costs, and future directions for AI developers.

AI · LLM · Test-Time Scaling
26 min read
Alimama Tech
Apr 3, 2025 · Artificial Intelligence

UQABench: A Personalized QA Benchmark for Evaluating User Embeddings in LLM‑Driven Recommendation Systems

UQABench introduces the first benchmark for assessing high‑density user embeddings that serve as soft prompts in LLM‑driven recommendation, featuring a three‑stage pre‑train‑align‑evaluate pipeline, seven personalized QA tasks, and findings that transformer encoders, side‑information, simple linear adapters, and larger models markedly improve accuracy while cutting input tokens to about five percent.

AI · LLM · benchmark
12 min read
ByteDance Cloud Native
Apr 3, 2025 · Operations

How to Seamlessly Integrate CloudWeGo with APMPlus for Full‑Stack Observability

This article explains the challenges of observability in distributed microservice and LLM architectures, introduces CloudWeGo and APMPlus, and provides step‑by‑step integration guides for the Kitex, Hertz, and Eino frameworks, including code samples, data reporting methods, and advanced monitoring features such as RED metrics, LLM‑specific indicators, and service topology, along with a future roadmap.

APM · APMPlus · CloudWeGo
13 min read
MaGe Linux Operations
Apr 3, 2025 · Artificial Intelligence

How to Build and Deploy a Dify LLM Application Platform on CentOS

This guide explains what Dify is, outlines its key features and application scenarios, and provides step‑by‑step instructions for preparing the environment, installing Docker and Docker‑Compose, and deploying Dify on a CentOS 7.9 system, including verification of a successful setup.

AI Platform · Dify · Docker
9 min read
BirdNest Tech Talk
Apr 3, 2025 · Artificial Intelligence

How Genspark’s Super Agent Outperforms OpenAI and Manus in GAIA Benchmarks

Genspark’s newly released Super Agent, built on a Mixture‑of‑Agents architecture that combines eight specialized LLMs and over 80 tools, claims to autonomously plan, execute, and integrate external services across tasks such as travel planning and video summarization, and reportedly surpasses OpenAI and Manus in the GAIA benchmark while offering instant access without an invitation code.

AI Agent · GAIA benchmark · LLM
4 min read
Big Data Technology & Architecture
Apr 3, 2025 · Artificial Intelligence

Understanding Model Context Protocol (MCP), Retrieval-Augmented Generation (RAG), and Vector Databases for LLM Integration

This article explains the Model Context Protocol (MCP) as a standard for LLM‑data integration, describes Retrieval‑Augmented Generation (RAG) techniques to reduce hallucinations, and introduces vector databases like Milvus that store high‑dimensional embeddings for efficient AI retrieval tasks.

LLM · MCP · Milvus
7 min read
DevOps
Apr 2, 2025 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG): Concepts, Evolution, and Types

This article explains Retrieval‑Augmented Generation (RAG), its role in mitigating large language model knowledge cutoff and hallucination, outlines the evolution from naive to advanced, modular, graph, and agentic RAG, and discusses future directions such as intelligent and multi‑modal RAG systems.

Artificial Intelligence · Knowledge Retrieval · LLM
10 min read
AntTech
Apr 2, 2025 · Artificial Intelligence

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

The PEAR framework introduces a position‑embedding‑agnostic attention re‑weighting method that detects and suppresses detrimental attention heads in large language models, dramatically improving retrieval‑augmented generation performance without adding any inference overhead, as demonstrated on multiple RAG benchmarks and LLM families.

Attention Re-weighting · LLM · PEAR
6 min read
JD Retail Technology
Apr 2, 2025 · Artificial Intelligence

One4All: A Scalable Multi‑Task Generative Recommendation Framework for CPS Advertising

The paper introduces One4All, a scalable multi‑task generative recommendation framework for CPS advertising that combines few‑shot intent prompting, a Rewards‑in‑Context multi‑objective optimization, and an online model‑selection strategy, delivering 2‑3× offline HitRate/NDCG gains and notable online CTR, CVR, and commission improvements.

Advertising · LLM · large language models
14 min read
AI Algorithm Path
Apr 2, 2025 · Artificial Intelligence

Master the Three Essential LLM Training Stages for 2025

The article breaks down the three core stages of large‑language‑model training—pre‑training, supervised fine‑tuning, and RLHF—explaining their purpose, methods, and concrete examples while noting DeepSeek‑R1’s recent breakthrough and its implications for AI development.

AI training · DeepSeek · LLM
5 min read
Huolala Tech
Apr 1, 2025 · Frontend Development

How Frontend Teams Can Leverage LLMs for Real‑Time Compliance Checks

This article explains how frontend developers can use large language models to detect and prevent marketing content violations in WeChat mini‑programs, covering pain‑point discovery, LLM‑driven compliance architecture, prompt optimization, model selection, testing methods, and seamless frontend integration with Feishu notifications.

AI · Integration · LLM
10 min read
Architect
Mar 31, 2025 · Artificial Intelligence

A Comprehensive Study of Failure Modes in Large‑Language‑Model Based Multi‑Agent Systems

This paper presents a systematic investigation of failure patterns in LLM‑driven multi‑agent systems, introducing a 14‑type taxonomy (MASFT) derived from over 150 annotated dialogues, evaluating it with an LLM‑as‑a‑judge pipeline, and exploring modest intervention strategies while releasing all data and tools for future research.

AI · Agentic · LLM
29 min read
Architect
Mar 29, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build Powerful LLM Apps: Prompt Engineering, RAG, and AI Agents Explained

This article guides developers without an AI background through the fundamentals of building large‑language‑model applications, covering prompt engineering, multi‑turn interaction, function calling, retrieval‑augmented generation, vector databases, code assistants, and the MCP protocol for AI agents.

AI Agent · Embedding · Function Calling
51 min read
Qborfy AI
Mar 29, 2025 · Artificial Intelligence

Mastering LangChain: Build LLM Apps with Chains, Agents, and Vector Stores

This tutorial walks through the limitations of simple prompt usage, introduces LangChain as a framework for building full‑featured LLM applications, explains its core concepts and components, and provides step‑by‑step code examples for installing, configuring, and running a basic LangChain demo.

AI Application · LLM · LangChain
11 min read
Architect's Alchemy Furnace
Mar 27, 2025 · Artificial Intelligence

Xinference vs Ollama: Which Open‑Source LLM Engine Fits Your Needs?

This article provides a comprehensive side‑by‑side comparison of the open‑source LLM serving tools Xinference and Ollama, examining their core goals, architecture, model support, deployment options, performance, ecosystem integration, typical use cases, future roadmap, and guidance on selecting the right solution for enterprise or personal projects.

Comparison · LLM · Model Serving
7 min read
JavaEdge
Mar 27, 2025 · Artificial Intelligence

Can a Single LLM Both See and Reason? Exploring Visual Reasoning Models (VRM)

This article examines the limitations of current vision‑language and reasoning models, proposes a visual reasoning model (VRM) that can process images and perform deep logical inference, and discusses architecture, training methods, reinforcement‑learning reward designs, and practical challenges.

Artificial Intelligence · Deep Learning · LLM
8 min read
DevOps
Mar 26, 2025 · Artificial Intelligence

Introducing Model Context Protocol (MCP): An Open Standard for LLM Integration with Data Sources and Tools

The article explains Anthropic's open Model Context Protocol (MCP), detailing its client‑server architecture, resource and prompt definitions, tool discovery and execution, sampling workflow, security features, and provides a complete Python example that demonstrates building, running, and testing an MCP server and client for real‑time data retrieval.

AI integration · LLM · MCP
12 min read
Architect
Mar 26, 2025 · Artificial Intelligence

Agent Memory Mechanisms and Dify Knowledge Base Segmentation & Retrieval Details

This article explains the fundamentals of AI agent memory—including short‑term, long‑term, and working memory types and their storage designs—and then details Dify's knowledge‑base segmentation modes, indexing strategies, and retrieval configurations for effective RAG applications.

Agent Memory · Dify · Knowledge Base
14 min read
DaTaobao Tech
Mar 26, 2025 · Artificial Intelligence

Overview of Retrieval-Augmented Generation (RAG) and Related AI Technologies

The article surveys Retrieval‑Augmented Generation (RAG) as a solution to large language model limits—such as outdated knowledge, hallucinations, and security risks—by integrating vector‑database retrieval with LLM generation, and discusses related tools, multi‑agent frameworks, prompt engineering, fine‑tuning methods, and emerging optimization trends.

AI applications · LLM · RAG
29 min read
ELab Team
Mar 26, 2025 · Artificial Intelligence

Uncovering LLM Blind Spots in AI Coding: Common Pitfalls and Solutions

Large language models often struggle with coding tasks, failing to stop when encountering obstacles, ignoring black‑box testing principles, and making unnecessary refactors; this article examines those blind spots, offers practical examples, and suggests strategies such as preparatory refactoring, stateless tools, and careful prompting to improve AI‑assisted development.

AI Coding · LLM · best practices
59 min read
Network Intelligence Research Center (NIRC)
Mar 26, 2025 · Artificial Intelligence

Enable Traditional LLMs to Use DeepSeek’s Multi‑Head Latent Attention Without Retraining

The paper introduces MHA2MLA, a data‑efficient fine‑tuning framework that converts pre‑trained multi‑head attention LLMs to DeepSeek’s Multi‑Head Latent Attention architecture, achieving up to 92% KV‑cache compression with less than 0.5% performance loss on long‑context tasks.

LLM · Low-Rank Approximation · Multi-Head Latent Attention
8 min read
Programmer DD
Mar 25, 2025 · Artificial Intelligence

How to Build an MCP Client‑Server with Spring AI for LLM‑Powered Apps

This article demonstrates how to implement the Model Context Protocol (MCP) using Spring AI, covering the creation of MCP hosts, clients, and servers, configuring dependencies, integrating Claude, adding Brave Search and filesystem tools, and building a functional chatbot that leverages external data sources through standardized LLM interfaces.

LLM · Model Context Protocol · ai-integration
15 min read
21CTO
Mar 25, 2025 · Artificial Intelligence

Which LLM Is Best for Coding? Speed, Hallucination, and Context Compared

This article breaks down major large language models, defining key comparison metrics such as speed, hallucination rate, and context window, then evaluates each model with benchmarks like HumanEval+, ChatBot Arena, and Aider to help you choose the most suitable LLM for your coding tasks.

AI · LLM · benchmark
10 min read
Open Source Tech Hub
Mar 24, 2025 · Artificial Intelligence

Break Data Silos for LLMs with Model Context Protocol (MCP) – PHP SDK Guide

This article explains the data‑isolation problem facing large language models, introduces the Model Context Protocol (MCP) as a standard bridge to external data sources, and provides a step‑by‑step PHP SDK tutorial—including installation, server and client code, and optional advanced logging—to help developers integrate AI models securely and efficiently.

LLM · MCP · Model Context Protocol
13 min read
AI Algorithm Path
Mar 24, 2025 · Artificial Intelligence

How to Use Pydantic for Structured LLM Output

The article explains why LLM responses can be inconsistent, introduces Pydantic as a way to define custom output schemas, and walks through concrete examples—both with OpenAI and Ollama models—showing how to build a LangChain pipeline that parses responses into structured data.

LLM · LangChain · Ollama
7 min read
Alibaba Cloud Developer
Mar 24, 2025 · Artificial Intelligence

Boost LLM Evaluation with Semantic Enrichment and Vector Search

This article explains how semantic enrichment, vector and hybrid search, and clustering techniques can be applied to large language model logs to evaluate inputs and outputs, improve compliance auditing, and enhance model iteration across various business scenarios.

AI · LLM · Vector Search
12 min read
Alibaba Cloud Developer
Mar 24, 2025 · Artificial Intelligence

Why LLM Internet Search Fails and How to Fix It: A Deep Dive into Qwen, Doubao, and DeepSeek

This article analyses the shortcomings of large‑model internet search—such as unverifiable sources, fabricated content, and poor instruction compliance—by comparing Qwen‑max, Doubao‑1.5‑pro‑256k, and DeepSeek‑v3, and proposes prompt engineering, post‑processing, and custom tool improvements to boost reliability.

AI · LLM · evaluation
22 min read
Ma Wei Says
Mar 24, 2025 · Artificial Intelligence

Master BGE Multilingual Embeddings: Models, Installation, and Quick Usage

Explore the BGE (BAAI General Embedding) family—including v1, v1.5, M3, Multilingual Gemma2, and EN‑ICL—detailing their multilingual capabilities, model variants, token limits, optimal use cases, and step‑by‑step installation and Python usage instructions with code examples for embedding generation and similarity scoring.

Embedding · LLM · Python
8 min read
Alibaba Cloud Big Data AI Platform
Mar 24, 2025 · Artificial Intelligence

How to Build a Real‑Time Data Analysis Agent with LLMs, Hologres, and MCP

This article explains the challenges LLMs face in data analysis, introduces the Model Context Protocol (MCP) as a standard bridge, and provides a step‑by‑step guide to integrate Hologres, MCP, and large language models—using Claude Desktop as an example—to create a fast, multi‑source data‑analysis agent.

AI Agent · Hologres · LLM
11 min read
Architect
Mar 23, 2025 · Artificial Intelligence

The Future of AI Agents: From Prompt‑Driven Workflows to Model‑as‑Product and Reinforcement‑Learning‑Powered Agents

The article argues that the next wave of AI agents will shift from brittle, prompt‑driven workflows like Manus to truly autonomous, model‑centric agents trained with reinforcement learning and reasoning, exemplified by OpenAI's DeepResearch and Anthropic's Claude 3.7 Sonnet, while the API‑driven market model collapses.

AI agents · Claude · DeepResearch
28 min read
Baobao Algorithm Notes
Mar 23, 2025 · Artificial Intelligence

Why Future AI Agents Must Evolve Beyond Prompt‑Driven Workflows

The article argues that the next generation of AI agents should focus on improving the model itself through reinforcement learning and reasoning rather than relying on pre‑designed prompt‑driven workflows, highlighting industry trends, technical challenges, and the shift toward treating models as products.

DeepSearch · LLM · model as product
29 min read
Architect
Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑augmented generation (RAG) combines external knowledge retrieval with large language models to improve answer accuracy, but it often suffers from retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs. This article analyzes these failure modes and offers solutions.

AI reliability · LLM · RAG
32 min read
Cognitive Technology Team
Mar 22, 2025 · Artificial Intelligence

Three Stages of Developing Large Language Models and Practical Guidance

The article outlines the three development phases of large language models—building, pre‑training, and fine‑tuning—describes usage options, highlights key factors such as data scale, architecture, training processes, and evaluation, and offers practical advice for cost‑effective development.

Fine-tuning · LLM · Model Development
3 min read
Baobao Algorithm Notes
Mar 21, 2025 · Artificial Intelligence

Unlocking LLM Reasoning: A Deep Dive into Post‑Training Techniques

This article provides a comprehensive technical overview of large language model post‑training, covering fine‑tuning methods (full, parameter‑efficient, LoRA families, prompt tuning), domain‑adaptive tuning, reinforcement‑learning reward modeling, process vs. outcome rewards, inference‑enhancement strategies, dynamic compute allocation, verifier‑augmented reasoning, current challenges, and emerging research directions such as meta‑cognition, physical reasoning, and swarm intelligence.

LLM · meta-cognition · post-training
21 min read
Meituan Technology Team
Mar 20, 2025 · Artificial Intelligence

Meituan Tech Team's Selected Papers on Large Language Models and AI (2024-2025)

The article compiles Meituan’s recent 2024‑2025 research on large language models, presenting a diverse set of papers that explore transformer enhancements, scaling laws, safety optimization, instruction fine‑tuning, temporal decay learning, code generation, agent refinement, cost‑efficient MoE inference, quantization, fast parallel inference, speculative decoding, multilingual speech, vision‑language models, evaluation benchmarks, and jailbreak robustness.

ACL · AI · LLM
4 min read
Sohu Tech Products
Mar 19, 2025 · Artificial Intelligence

How to Recreate a Translation Agent with LangGraph and LLMs

This guide demonstrates building a steerable LLM‑based translation workflow using LangGraph, covering the initial translation, model‑generated reflection suggestions, and final improvement steps with full Python code examples and a complete execution result.

AI · LLM · LangGraph
34 min read
Ops Development & AI Practice
Mar 19, 2025 · Artificial Intelligence

How Integrating LLMs with the Model Context Protocol Could Transform AI Workflows

Integrating large language models with the open‑standard Model Context Protocol enables direct access to file systems, databases, and APIs, unlocking use cases such as automated file management, intelligent data analysis, personalized content generation, and task automation, while also raising security, privacy, and maturity challenges for future AI‑human collaboration.

LLM · MCP · cross-domain
10 min read
Ops Development & AI Practice
Mar 19, 2025 · Artificial Intelligence

Can Cache‑Augmented Generation Outperform RAG? A Deep Dive into LLM Efficiency

Cache‑augmented generation (CAG) preloads documents into LLM context using KV caches to eliminate retrieval latency, offering faster inference for static knowledge bases, while RAG remains more flexible for dynamic or large corpora; this article compares their definitions, performance, implementation steps, and future prospects.

CAG · Cache Augmentation · Inference Optimization
11 min read
DaTaobao Tech
Mar 19, 2025 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Principles, Challenges, and Implementation Techniques

Retrieval‑augmented generation (RAG) enhances large language models by integrating a preprocessing pipeline—cleaning, chunking, embedding, and vector storage—with a query‑driven retrieval and prompt‑injection workflow, leveraging vector databases, multi‑stage recall, advanced prompting, and comprehensive evaluation metrics to mitigate knowledge cut‑off, hallucinations, and security issues.

LLM · RAG · Retrieval Augmented Generation
27 min read
Tencent Cloud Developer
Mar 19, 2025 · Artificial Intelligence

Inside Tencent Hunyuan Turbo S: Speed, Cost, and Hybrid Mamba Transformer Explained

Tencent's new Hunyuan Turbo S model combines a 44% faster response time, dramatically lower token costs, and a hybrid Mamba‑Transformer architecture that merges linear attention with full attention, offering insights into fast‑thinking versus slow‑thinking LLM designs, MoE scaling laws, low‑precision training effects, and long‑short chain fusion techniques.

AI · Architecture · HybridMamba · LLM
14 min read
Alibaba Cloud Infrastructure
Mar 18, 2025 · Cloud Native

Gray Release of LoRA and Base Models Using ACK Gateway with AI Extension on Kubernetes

This guide explains how to deploy large language model inference services on a GPU-enabled Kubernetes cluster, configure ACK Gateway with AI Extension for intelligent routing and load balancing, and perform gray releases for both LoRA fine‑tuned models and base models such as QwQ‑32B and DeepSeek‑R1, including step‑by‑step commands and validation procedures.

ACK Gateway · AI inference · Cloud Native
25 min read
JD Tech Talk
Mar 18, 2025 · Artificial Intelligence

Generative Recommendation for CPS Advertising: Intent Sensing, Multi‑Objective Optimization, and the One4All Framework

This article surveys recent advances in generative recommendation for CPS advertising, detailing explicit intent‑aware controllable product recommendation, multi‑objective optimization techniques based on reward‑in‑context and DPO, and the scalable One4All framework that unifies behavior and language modeling across diverse ad scenarios.

CPS advertising · Generative Recommendation · LLM
14 min read
JD Cloud Developers
Mar 18, 2025 · Artificial Intelligence

How Generative LLMs Are Transforming CPS Advertising Recommendations

Since large language models have excelled in NLP, researchers are now enhancing CPS advertising recommendation systems by integrating generative LLMs for explicit intent perception, multi‑objective optimization, and a unified One4All framework, achieving significant offline and online performance gains across click‑through, conversion, and revenue metrics.

CPS advertising · Generative Recommendation · LLM
19 min read
AI Algorithm Path
Mar 17, 2025 · Artificial Intelligence

Agentic AI vs Generative AI: Key Differences and Comparative Analysis

The article defines Agentic AI as autonomous, goal‑directed systems that can act and learn from experience, contrasts it with Generative AI’s passive, single‑step content generation, and illustrates the practical advantage of Agentic workflows through Andrew Ng’s HumanEval benchmark where a step‑wise approach outperforms zero‑shot prompting even for older models.

AI autonomy · Agentic AI · HumanEval
10 min read
Infra Learning Club
Mar 17, 2025 · Artificial Intelligence

Testing OpenManus with DeepSeek: A Hands‑On Evaluation

The author walks through installing OpenManus, configuring it to use DeepSeek (and an Ollama‑based vision model), runs a sample financial data query, and reports that the system is slow, sometimes inaccurate, and still requires further optimization.

AI agents · Conda · DeepSeek
5 min read
Ops Development & AI Practice
Mar 17, 2025 · Artificial Intelligence

Unlocking LLM Power: A Hands‑On Guide to Open WebUI

Open WebUI offers a user‑friendly, open‑source web interface that simplifies interaction with large language models, supporting multiple back‑ends, offline operation, and extensible plugins, making AI experimentation accessible for developers, researchers, and enthusiasts alike.

AI · LLM · Model Management
4 min read
Alibaba Cloud Infrastructure
Mar 17, 2025 · Cloud Native

Boost LLM Inference with ACK Gateway AI Extension: A Step‑by‑Step Guide

This guide demonstrates how to deploy the QwQ‑32B large language model on an Alibaba Cloud ACK cluster, configure OSS storage, enable the ACK Gateway with AI Extension, set up InferencePool and InferenceModel resources, and benchmark intelligent routing versus standard gateway routing, revealing latency and throughput improvements.

ACK Gateway · AI Extension · Kubernetes
16 min read
Cognitive Technology Team
Mar 17, 2025 · Artificial Intelligence

Leveraging Large Language Models to Optimize Traditional Machine Learning Pipelines

Large language models can assist and enhance each stage of traditional machine learning—including sample generation, data cleaning, feature engineering, model selection, hyper‑parameter tuning, and workflow automation—by generating synthetic data, refining features, selecting models, and orchestrating pipelines, though challenges such as bias, privacy, and noise remain.

Data Generation · LLM · feature engineering
11 min read
Ops Development & AI Practice
Mar 16, 2025 · Artificial Intelligence

How Function Calling Helps LLMs Overcome Hallucinations

This article explains how LLM function calling works, from defining external functions to processing API responses, and demonstrates a Python example using OpenAI's ChatGPT‑4o to fetch real‑time weather, showing how the technique mitigates hallucinations and expands practical AI applications.

AI · Function Calling · LLM
8 min read
Architect
Mar 15, 2025 · Artificial Intelligence

Why Building Your Own RAG System Is a Costly Mistake

The article explains that developing a custom Retrieval‑Augmented Generation (RAG) solution incurs hidden infrastructure, personnel, and security costs, leads to operational overload and budget overruns, and is rarely justified compared to purchasing a proven vendor solution.

AI · LLM · RAG
11 min read
AI Algorithm Path
Mar 15, 2025 · Artificial Intelligence

Why the Industry Is Shifting From AI Agents to Agentic Workflows

The article explains that low accuracy and security risks of current AI agents—evidenced by a Claude AI Agent achieving only 14% of human performance and an average success rate of about 20%—are driving a move toward agentic workflows, which offer observable, auditable, and data‑synthesizing pipelines that dramatically improve enterprise productivity.

AI agents · LLM · agentic workflows
7 min read
DataFunSummit
Mar 14, 2025 · Artificial Intelligence

Insights from Zhihu's ZhiLight Large‑Model Inference Framework: Architecture, Parallelism, and Performance Optimizations

The article summarizes Zhihu's machine‑learning platform lead Wang Xin's presentation on the ZhiLight large‑model inference framework, covering model execution mechanisms, GPU workload analysis, pipeline and tensor parallelism, GPU architecture evolution, open‑source engine comparisons, ZhiLight's compute‑communication overlap and quantization optimizations, benchmark results, supported models, and future directions.

GPU · Inference · LLM
0 likes · 13 min read
Baidu Geek Talk
Mar 12, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Semantic Embeddings: Models, Methods, and Trends

This article reviews how large language models (LLMs) enhance semantic text embeddings by comparing traditional methods with LLM‑based approaches, detailing synthetic data generation, backbone model designs, key model families, experimental results on the MTEB benchmark, and future research challenges.

LLM · contrastive learning · model comparison
0 likes · 30 min read
DaTaobao Tech
Mar 12, 2025 · Artificial Intelligence

Multimodal Automatic Layout Generation for E-commerce

The project builds a multimodal automatic layout generation system for e‑commerce by fine‑tuning the qwen‑vl‑7b vision‑language model with LoRA on poster and Taobao image‑layout data. It combines diffusion‑based image generation with coordinate prediction to produce structured layouts for posters, marketing images, and video covers, reports over 90% adoption, and explores multi‑image, style‑aware, and iterative‑refinement extensions.

LLM · Multimodal AI · diffusion
0 likes · 12 min read
NewBeeNLP
Mar 11, 2025 · Artificial Intelligence

How DeepSeek’s New Architecture Redefines LLM Efficiency and Performance

This article analyzes DeepSeek’s recent breakthroughs—including the Multi‑Head Latent Attention (MLA), Group Relative Policy Optimization (GRPO), and a refined Mixture‑of‑Experts design—along with its three‑stage training pipeline, RL‑only R1‑Zero variant, and benchmark comparisons against GPT‑4o‑Mini and Llama 3.1, highlighting both gains and remaining challenges.

DeepSeek · LLM · Mixture of Experts
0 likes · 18 min read
Tencent Cloud Developer
Mar 11, 2025 · Artificial Intelligence

Fine‑Tuning Local LLaMA‑Factory Models and Building Networked AI Applications

The article walks through preparing a GPU‑enabled environment, downloading and LoRA‑fine‑tuning a DeepSeek model with LLaMA‑Factory, merging the adapter, then wrapping the model in a web UI that queries a ChromaDB vector store via crawled web data, illustrating security‑focused use cases and forecasting domain‑specific LLM adoption.

AI · Fine-tuning · LLM
0 likes · 17 min read
Architect
Mar 10, 2025 · Artificial Intelligence

What Makes DeepSeek’s New Architecture a Game‑Changer? Inside MLA, GRPO, and MoE Innovations

This article analyzes DeepSeek’s latest large‑model breakthroughs, covering the MLA attention compression, GRPO alignment algorithm, MoE load‑balancing redesign, multi‑stage training pipelines, reinforcement‑learning tricks, and performance comparisons with GPT‑4o‑Mini and Llama 3.1, highlighting both strengths and remaining challenges.

AI training · DeepSeek · GRPO
0 likes · 19 min read
AI Algorithm Path
Mar 10, 2025 · Artificial Intelligence

How Much GPU Memory Does an LLM Service Really Need?

This article explains a simple formula for estimating the GPU VRAM required to serve large language models, demonstrates the calculation with a 7‑billion‑parameter example, clarifies why a 20% safety buffer is needed, and offers practical strategies such as quantization, CPU offload, and multi‑GPU parallelism to reduce memory usage.

Deployment · GPU Memory · LLM
0 likes · 6 min read
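The summary above does not reproduce the article's formula, so the sketch below assumes the commonly used approximation: parameter count × bytes per parameter × 1.2, where the factor 1.2 is the 20% safety buffer for activations, KV cache, and runtime overhead. The function name and its defaults are illustrative, not taken from the article.

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_param: int = 16,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB for serving a model (assumed approximation).

    params_billion: model size in billions of parameters
    bits_per_param: precision the weights are loaded in (16 for FP16/BF16,
                    8 or 4 for quantized weights)
    overhead:       multiplier for the safety buffer (1.2 = 20% extra)
    """
    bytes_per_param = bits_per_param / 8
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB cancels out,
    # leaving billions of parameters * bytes per parameter.
    return params_billion * bytes_per_param * overhead

# A 7-billion-parameter model at 16-bit precision needs roughly 16.8 GB.
print(round(estimate_vram_gb(7), 1))

# The same model quantized to 4 bits needs roughly 4.2 GB.
print(round(estimate_vram_gb(7, bits_per_param=4), 1))
```

Comparing the two calls shows why quantization is the first lever to pull: reducing weight precision from 16 to 4 bits cuts the estimate to a quarter before CPU offload or multi‑GPU parallelism is even considered.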
Tencent Technical Engineering
Mar 10, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build LLM Apps: Prompt Engineering, RAG, and Function Calling Explained

This guide shows non‑AI developers how to create large‑model applications by mastering prompt engineering, multi‑turn interactions, Retrieval‑Augmented Generation, function calling, and AI‑Agent integration, with practical code examples, tool design patterns, and deployment tips.

AI Agent · Embedding · Function Calling
0 likes · 48 min read
Baobao Algorithm Notes
Mar 10, 2025 · Artificial Intelligence

Why DeepSeek V3’s FP8 Training Beats Traditional Schemes: A Deep Dive

This article provides a detailed technical analysis of FP8 training, comparing Nvidia’s TransformerEngine approach with DeepSeek V3’s novel scheme, and examines how block‑wise scaling, high‑precision accumulation, and vector length and correlation affect quantization error and signal‑to‑noise ratio in large‑language‑model training.

DeepSeek · FP8 · LLM
0 likes · 20 min read
phodal
Mar 10, 2025 · Artificial Intelligence

How AutoDev Bridge Uses LLMs to Accelerate Legacy System Migration

AutoDev Bridge combines large‑model reasoning, C4 architecture analysis, AST‑based business logic extraction, and IDE‑integrated tooling to automate the migration of legacy systems, reducing manual effort and migration risk while highlighting the unique advantages of modern AI agents.

AI · Code Translation · LLM
0 likes · 7 min read
DevOps
Mar 9, 2025 · Artificial Intelligence

A Beginner's Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling, and AI Agents

This article provides a comprehensive introduction to developing large language model (LLM) applications, covering prompt engineering, zero‑ and few‑shot techniques, function calling, retrieval‑augmented generation (RAG) with embedding and vector databases, code assistants, and the Model Context Protocol (MCP) for building AI agents, all aimed at non‑AI specialists.

AI Agent · Embedding · Function Calling
0 likes · 48 min read
Alibaba Cloud Infrastructure
Mar 9, 2025 · Cloud Computing

Deploy QwQ-32B LLM Inference on Alibaba Cloud ACS with vLLM: Step‑by‑Step Guide

This guide walks you through using Alibaba Cloud Container Compute Service (ACS) to provision GPU resources, prepare the QwQ-32B model, configure persistent storage, deploy the model with vLLM, set up OpenWebUI, verify the service, and optionally benchmark its performance, all with detailed commands and YAML examples.

ACS · Alibaba Cloud · GPU
0 likes · 17 min read
AI Frontier Lectures
Mar 9, 2025 · Industry Insights

Why the Model Is Becoming the Product: AI Market Trends and Risks

The article argues that AI models are evolving into standalone products, examines scaling limits, integration challenges, reinforcement‑learning economics, and investment dynamics, and warns that reliance on large‑lab APIs may jeopardize future profitability for integrators.

AI · IndustryInsights · LLM
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Mar 7, 2025 · Artificial Intelligence

How QwQ-32B Outperforms OpenAI o1-mini and Deploys in One Click on Alibaba Cloud

Alibaba Cloud's newly released QwQ-32B model delivers benchmark‑level performance rivaling top open‑source LLMs, integrates agent capabilities, and can be deployed with a single click through the PAI‑Model Gallery, offering a cost‑effective solution for developers seeking advanced AI inference.

AI Benchmark · Alibaba Cloud · LLM
0 likes · 5 min read
dbaplus Community
Mar 7, 2025 · Artificial Intelligence

Master Prompt Engineering: Frameworks, Strategies, and Real‑World Examples for Large Language Models

This comprehensive guide explains what prompts are, outlines essential prompt components and multiple engineering frameworks, presents practical strategies for crafting clear and structured prompts, addresses model limitations such as hallucinations, and showcases a wide range of advanced prompting techniques with code examples.

AI · LLM · few-shot prompting
0 likes · 29 min read
DevOps
Mar 6, 2025 · Artificial Intelligence

Building Multi-Model Chat Agents with Dify: Integrating DeepSeek‑R1 and Gemini

This article explains how to create a high‑performance multi‑model chat agent on the Dify platform by combining DeepSeek‑R1 for reasoning and Gemini for answer generation, covering the underlying principles, configuration steps, API integration, performance benchmarks, and practical deployment guidance.

Chatbot · DeepSeek · Dify
0 likes · 12 min read
Cognitive Technology Team
Mar 4, 2025 · Artificial Intelligence

Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval

The article introduces Deep Searcher, an open‑source Agentic Retrieval‑Augmented Generation system that combines large language models, Milvus vector databases, and multi‑step reasoning to deliver enterprise‑grade search, reporting, and complex query capabilities, and compares its performance against traditional RAG and Graph RAG approaches.

Agentic · Enterprise search · LLM
0 likes · 18 min read
AI Algorithm Path
Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.

Beam Search · Greedy Search · LLM
0 likes · 9 min read
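To make the three sampling parameters concrete, here is a minimal pure‑Python sketch of how Temperature, Top‑k, and Top‑p are typically applied to a model's logits before a token is drawn. It is an illustration under assumptions rather than the article's code: the function names and the toy logits are hypothetical, and production inference engines implement the same steps on tensors.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature must be > 0.
    Values below 1 sharpen the distribution, values above 1 flatten it."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def filter_top_k_top_p(probs, top_k=0, top_p=1.0):
    """Keep the top_k most likely tokens (0 disables the limit), then the
    smallest prefix whose cumulative probability reaches top_p, and
    renormalize. Returns {token_index: probability}."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {index: p / total for index, p in kept}

def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Draw one token index from the filtered, renormalized distribution."""
    candidates = filter_top_k_top_p(softmax(logits, temperature), top_k, top_p)
    indices = list(candidates)
    weights = [candidates[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]

toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
print(softmax(toy_logits, temperature=1.0))
print(softmax(toy_logits, temperature=0.5))
print(filter_top_k_top_p(softmax(toy_logits), top_p=0.8))
```

Setting `top_k=1` reduces the procedure to greedy search, since only the single most likely token survives the filter.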
Tencent Cloud Developer
Mar 4, 2025 · Artificial Intelligence

A Practical Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling and AI Agents

The guide teaches non‑AI developers how to build practical LLM‑powered applications by mastering prompt engineering, function calling, retrieval‑augmented generation, and AI agents, and introduces the Model Context Protocol (MCP) for seamless tool integration, offering a clear learning path to leverage large language models without deep theory.

AI Agent · Function Calling · LLM
0 likes · 48 min read
Architect
Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods (inference‑time scaling, pure RL, SFT + RL, and distillation), and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling
0 likes · 27 min read
Code Mala Tang
Mar 3, 2025 · Artificial Intelligence

Unlock AI’s Full Potential with Structured Prompt Decorators

Prompt Decorators are structured prefixes that standardize and enhance AI responses, addressing common challenges like vague prompts, inconsistent answers, and lack of reasoning by guiding the model to produce clear, logical, and well‑organized outputs across various use cases.

AI · LLM · automation
0 likes · 23 min read