Tagged articles
2016 articles
Alibaba Cloud Developer
May 29, 2025 · Artificial Intelligence

Build a Minimal Large Language Model from Scratch with Python and PyTorch

This tutorial walks through creating a simple bigram language model in pure Python and refactoring it into a PyTorch implementation. Along the way it explains core concepts such as tokenization, embedding layers, loss functions, gradient descent, training loops, and text generation, preparing you to build a full GPT model.

Bigram · LLM · LanguageModel
0 likes · 31 min read
Alibaba Cloud Big Data AI Platform
May 29, 2025 · Artificial Intelligence

How OmniThought Enables Adaptive Reasoning Chains for Better LLM Performance

This article introduces the OmniThought dataset, which annotates over two million chain‑of‑thought reasoning steps with Reasoning Verbosity and Cognitive Difficulty scores, and explains how these metrics guide the training of DistilQwen‑ThoughtX models that adapt chain length to task difficulty, achieving superior performance compared to existing distilled LLMs.

CoT · Dataset · Distillation
0 likes · 16 min read
Tencent Technical Engineering
May 28, 2025 · Artificial Intelligence

A Beginner-friendly Overview of LLMs, Transformers, Prompts, Function Calling, MCP and Agents

This article provides a concise, easy-to-understand introduction to large language models, the transformer architecture, prompt engineering, temperature settings, function calling, the Model Context Protocol (MCP), agent communication (A2A), and future AI programming trends, using simple analogies and illustrative examples.

AI · Agent · Function Calling
0 likes · 11 min read
Alibaba Cloud Developer
May 28, 2025 · Artificial Intelligence

Unlocking LLM Fine‑Tuning: From Architecture to LoRA, DPO and Deployment

This article provides a comprehensive guide to large language model fine‑tuning, covering model architecture, parameter and memory calculations, prompt engineering, data construction, LoRA and PEFT techniques, preference‑optimization methods such as DPO, and practical deployment workflows on internal platforms.

Fine‑Tuning · LLM · LoRA
0 likes · 21 min read
JavaEdge
May 27, 2025 · Artificial Intelligence

Boost LLM App Performance: Master Parallel Workflows in Dify v0.8.0

Version 0.8.0 of Dify introduces parallel workflow capabilities, allowing multiple branches to run concurrently, which dramatically reduces latency for complex LLM tasks; the guide explains how to create simple, nested, iterative, and conditional parallel branches, with step‑by‑step instructions and visual examples.

Dify · LLM · parallel processing
0 likes · 8 min read
Instant Consumer Technology Team
May 27, 2025 · Artificial Intelligence

How to Build a Text‑to‑SQL Assistant: From Prompt Tricks to Enterprise‑Ready Solutions

This comprehensive guide explains the Text2SQL concept, showcases real‑world scenarios, compares three implementation architectures—including a simple prompt‑based method, a LangChain‑based pipeline, and an enterprise‑grade Vanna solution—while providing practical tips, security measures, and advanced enhancements for deploying robust natural‑language‑to‑SQL systems.

LLM · Text2SQL · Vanna
0 likes · 26 min read
Architecture & Thinking
May 25, 2025 · Artificial Intelligence

Which AI Workflow Platform Wins? A Deep Dive into n8n, Dify, and Coze

This article compares three leading AI workflow tools—n8n, Dify, and Coze—by examining their origins, technical architectures, core advantages, typical use cases, real‑world case studies, and future deployment trends, helping developers and businesses choose the right "intelligent assistant" for their needs.

AI · LLM · automation
0 likes · 11 min read
Youzan Coder
May 23, 2025 · Artificial Intelligence

How LLMs Supercharge SaaS Alert Monitoring: An AI‑Powered Workflow

This article explains how a SaaS company leveraged large language models to automatically ingest, enrich, and analyze stability alerts, turning noisy notifications into actionable insights through configurable pipelines, Feishu integration, and a streamlined AI workflow that boosts incident response speed and reduces manual effort.

AI · Alert Monitoring · LLM
0 likes · 6 min read
Volcano Engine Developer Services
May 22, 2025 · Artificial Intelligence

How LLMs Can Automate Ticket Escalation: Inside ByteBrain’s TickIt System

This article introduces TickIt, a ByteBrain system that leverages large language models to automatically identify and escalate critical Oncall tickets, detailing its multi‑class escalation, deduplication, and category‑guided fine‑tuning modules, experimental results, and the operational impact on cloud services.

LLM · Oncall analysis · Supervised Fine‑Tuning
0 likes · 13 min read
JD Tech Talk
May 22, 2025 · Artificial Intelligence

From Academic Research to Industrial Anti‑Fraud: Leveraging LLMs, Reinforcement Learning, and Model Distillation for Advertising Risk Detection

The article recounts Xiaoting’s journey from a PhD research background to leading JD.com’s ad‑fraud detection, detailing how large language models, reinforcement learning, and model distillation were applied to identify hidden address codes, reduce false‑positive rates to 0.3%, and balance accuracy with real‑time performance in a high‑traffic e‑commerce environment.

AI · Ad Fraud · Advertising
0 likes · 11 min read
Sohu Tech Products
May 21, 2025 · Artificial Intelligence

Beyond LLM Limits: Function Calling, MCP, and A2A Compared

The article examines the inherent knowledge cutoff of large language models, introduces function calling, Model Context Protocol (MCP), and Agent‑to‑Agent (A2A) as solutions for real‑time data access, compares their architectures, communication patterns, and use cases, and discusses their respective strengths and drawbacks.

A2A · AI protocols · Function Calling
0 likes · 17 min read
Alibaba Cloud Developer
May 21, 2025 · Artificial Intelligence

How to Seamlessly Integrate MCP Protocol with Spring AI for Powerful LLM Tool Calls

This article explains the challenges of integrating diverse tools without MCP, then demonstrates step‑by‑step how to configure Spring‑AI and the native MCP SDK to call LLMs, register tools, handle SSE and stdio services, and troubleshoot common issues, providing code snippets and best‑practice recommendations.

AI tool integration · LLM · MCP
0 likes · 16 min read
DeWu Technology
May 19, 2025 · Artificial Intelligence

AI-Powered Automated Test Case Generation: Design, Implementation, and Future Plans

This article presents a comprehensive AI-driven solution for automatically generating functional test cases, detailing the AI background, the design scheme, the core components (PRD parsing, test‑point generation, test‑case creation, and knowledge‑base construction), implementation results, and future development directions.

AI · Knowledge Base · LLM
0 likes · 7 min read
Huawei Cloud Developer Alliance
May 19, 2025 · Artificial Intelligence

What Is AI MCP and How It Revolutionizes Model Integration?

AI MCP (Model Context Protocol) is an open protocol that standardizes communication between large language model applications and external data sources or tools, offering pre‑installed services, fast registration, ecosystem openness, and automatic discovery within Huawei Cloud ModelArts Studio, while eliminating the need for per‑API integration code.

AI · Huawei · Integration
0 likes · 7 min read
Youzan Coder
May 16, 2025 · Artificial Intelligence

Intelligent Address Recognition: AI‑Assisted Hybrid Solution and Prompt Engineering

This article describes how a hybrid architecture that combines third‑party address‑recognition APIs with large‑language‑model (LLM) processing, along with carefully engineered prompts and a TSV output format, dramatically improves address parsing accuracy and latency in a retail checkout scenario.

AI · Hybrid Architecture · LLM
0 likes · 12 min read
Alibaba Cloud Developer
May 16, 2025 · Artificial Intelligence

Designing Robust MCP Servers for Alibaba Cloud Observability 2.0 – Lessons & Best Practices

This article explains the Model Context Protocol (MCP), its components, and how to integrate MCP servers with Alibaba Cloud Observability 2.0, offering practical design experiences, tool simplification tips, default parameter strategies, output size control, and future AI‑driven observability insights.

LLM · MCP · observability
0 likes · 17 min read
AI Large Model Application Practice
May 16, 2025 · Artificial Intelligence

Why Residual Connections Keep Deep Neural Networks Stable

This article explains why residual connections are essential in deep neural networks, describing the problems of network degradation and gradient vanishing, how shortcut paths add the input to the layer output, the requirement of matching dimensions, and the resulting stability for training large language models.

LLM · Neural Networks · Residual Connections
0 likes · 7 min read
Alibaba Cloud Big Data AI Platform
May 15, 2025 · Artificial Intelligence

How to Build a Qwen3‑Powered ChatBI Agent with PAI‑LangStudio and Hologres

This guide walks you through creating a ChatBI intelligent agent by integrating Alibaba's Qwen3 large language model with PAI‑LangStudio, configuring the Model Context Protocol (MCP) server, and connecting to Hologres real‑time data warehouse, covering setup, deployment, and verification steps for enterprise data analysis.

ChatBI · Hologres · LLM
0 likes · 11 min read
StarRocks
May 13, 2025 · Artificial Intelligence

How StarRocks MCP Server Enables LLMs to Query Databases Without Custom Plugins

StarRocks MCP Server provides a universal adapter that lets large language models such as Claude, OpenAI's models, and Gemini execute SQL queries directly against StarRocks, simplifying data Q&A, intelligent analysis, and automated reporting by eliminating the need for bespoke plugins or complex prompt engineering.

AI agents · Data Analytics · LLM
0 likes · 14 min read
Tencent Cloud Developer
May 13, 2025 · Artificial Intelligence

Function Calling and Model Context Protocol (MCP): Bridging Large Language Models with Real‑World Systems

The article reviews the shortcomings of traditional large language models, explains how function calling extends LLMs beyond pure text, introduces the Model Context Protocol (MCP) as a standardized USB‑C‑like interface for AI tools, and demonstrates a Python MCP example that integrates LLMs with Tencent Advertising APIs.

AI integration · API · Function Calling
0 likes · 16 min read
Tencent Technical Engineering
May 12, 2025 · Artificial Intelligence

Comprehensive Summary and Expansion of Andrej Karpathy’s 7‑Hour LLM Lecture

This article provides a detailed Chinese‑to‑English summary of Andrej Karpathy’s 7‑hour LLM tutorial, covering chat process analysis, tokenization, pre‑training data pipelines, model architecture, training strategies, post‑training fine‑tuning, reinforcement learning, chain‑of‑thought reasoning, and current industry applications.

AI · LLM · Model architecture
0 likes · 25 min read
AI Algorithm Path
May 9, 2025 · Artificial Intelligence

A Visual Guide to Mixture of Experts (MoE) Architecture in Large Language Models

This article explains the Mixture of Experts (MoE) technique used in modern LLMs, detailing its core components—experts and router—comparing dense and sparse layers, describing load‑balancing, expert capacity, and routing strategies, and showcasing real‑world examples such as Switch Transformer, Vision‑MoE, and Mixtral 8x7B.

Expert Capacity · LLM · Mixture of Experts
0 likes · 15 min read
phodal
May 9, 2025 · Artificial Intelligence

Why Pre‑Generated Context Is the Key to Faster, More Accurate AI Code Retrieval

The article examines how pre‑generating structured context for codebases can overcome the uncertainty and quality issues of traditional Retrieval‑Augmented Generation, outlines the technical and business challenges of RAG, compares existing code‑search tools, and introduces AutoDev’s Context Worker as a practical solution.

AI · LLM · RAG
0 likes · 11 min read
Bilibili Tech
May 9, 2025 · Artificial Intelligence

How an AI Gateway Scales LLM Services: Architecture, Auth, Quotas, and Load Balancing

This article explains the design of an AI gateway that centralizes LLM access, detailing its background, overall architecture, authentication, quota management, multi‑model routing, load‑balancing strategies, multi‑tenant isolation, observability features, and the supported API protocols for enterprise integration.

AI gateway · Authentication · LLM
0 likes · 17 min read
G7 EasyFlow Tech Circle
May 9, 2025 · Artificial Intelligence

How LLMs + Python Are Redefining Data Analysis: A Practical Guide

This article explains how large language models combined with Python's data‑science ecosystem can automate metadata extraction, data cleaning, and analysis tasks—illustrated with a step‑by‑step Titanic passenger dataset case study, complete prompts, code snippets, and best‑practice recommendations.

LLM · Python · data analysis
0 likes · 18 min read
Youzan Coder
May 8, 2025 · Artificial Intelligence

Building and Optimizing a Store Smart Assistant with Aily: Architecture, Workflow, and Practical Lessons

The article details how Youzan’s Store Smart Assistant was built on the Feishu Aily platform, describing why Aily was chosen, the three‑stage development process, deep system integration, practical tips for knowledge‑base management and model stability, and the resulting efficiency gains such as handling 80% of routine queries.

AI Assistant · Aily platform · Knowledge Base
0 likes · 24 min read
Architect's Alchemy Furnace
May 7, 2025 · Artificial Intelligence

Which LLM Inference Engine Reigns Supreme? A Deep Dive into Transformers, vLLM, Llama.cpp, SGLang, MLX and Ollama

This article provides a comprehensive comparison of seven popular large‑language‑model inference engines—Transformers, vLLM, Llama.cpp, SGLang, MLX, Ollama and others—detailing their core features, performance characteristics, hardware compatibility, concurrency support, and ideal use‑cases, plus practical installation guidance for Xinference.

Inference · LLM · MLX
0 likes · 17 min read
Alibaba Cloud Developer
May 7, 2025 · Artificial Intelligence

What Is an AI Agent? Understanding the Shift from Chatbots to Intelligent Automation

This article explores the concept of AI agents, contrasting them with traditional software and chatbots, outlines their core components, workflow, and the technological and market forces driving their evolution, and provides practical guidance for improving agent performance and choosing between workflow and LLM approaches.

AI Agent · LLM · prompt engineering
0 likes · 24 min read
JD Tech
May 6, 2025 · Artificial Intelligence

One4All Generative Recommendation Framework for CPS Advertising

This article reviews recent advances in applying large language models to CPS advertising recommendation, outlines business requirements and core technical challenges, proposes an extensible multi‑task generative framework with explicit intent perception and multi‑objective optimization, and presents offline and online performance gains along with future research directions.

AI Optimization · CPS advertising · Generative Models
0 likes · 13 min read
AI Large Model Application Practice
May 6, 2025 · Artificial Intelligence

How to Build an Agentic RAG System from Scratch Using MCP Architecture

This article walks through the design and full implementation of an Agentic Retrieval‑Augmented Generation (RAG) system built on the MCP standard, covering the conceptual fusion of MCP and RAG, server‑side tool creation with LlamaIndex, client‑side agent construction with LangGraph, configuration files, caching strategies, code examples, and an end‑to‑end demonstration.

Agentic RAG · LLM · LangGraph
0 likes · 15 min read
Data Thinking Notes
May 5, 2025 · Artificial Intelligence

How MCP’s Text2SQL Service Turns Natural Language into Powerful Database Queries

This article explores the MCP platform’s data service capabilities, detailing its core components—Resources, Prompts, and Tools—and demonstrates how its Text2SQL feature enables natural‑language queries to retrieve table schemas, perform data sampling, and execute complex relational analyses across multiple database tables.

AI · Data Integration · LLM
0 likes · 7 min read
21CTO
May 3, 2025 · Artificial Intelligence

Meet Mellum: JetBrains’ Purpose‑Built Code Completion LLM Now Open‑Source

JetBrains has released its purpose‑built code‑completion large language model, Mellum, as an open‑source project on Hugging Face, highlighting its focus on specialized code‑completion tasks, low runtime costs, support for many programming languages, and its potential for AI/ML researchers and educators.

AI · LLM · code completion
0 likes · 4 min read
AI Algorithm Path
May 3, 2025 · Artificial Intelligence

DeepSeek Prover V2: Pioneering the Next Era of AI‑Driven Formal Math Reasoning

DeepSeek‑Prover‑V2, an open‑source LLM specialized for Lean 4, bridges intuitive high‑level reasoning and strict formal verification through sub‑goal decomposition, dual operation modes, and a novel cold‑start data pipeline, achieving state‑of‑the‑art results on MiniF2F, PutnamBench and CombiBench while highlighting trade‑offs in inference cost and model scalability.

AI mathematics · DeepSeek Prover V2 · LLM
0 likes · 18 min read
Baobao Algorithm Notes
May 2, 2025 · Artificial Intelligence

Do Reinforcement Learning Techniques Really Boost LLM Reasoning? A Deep Dive into Recent Models

This article analyzes whether reinforcement learning enhances large language model reasoning, compares findings from DeepSeek-Math, a Tsinghua‑Shanghai Jiao‑Tong paper, and Qwen3, and outlines practical training pipelines—including Seed‑Thinking‑v1.5, DeepSeek‑R1, Kimi‑K1.5, and Qwen3—that aim to endow LLMs with robust reasoning capabilities.

Artificial Intelligence · LLM · Model Training
0 likes · 12 min read
AI Algorithm Path
May 1, 2025 · Artificial Intelligence

Uncovering the Secrets of LLM Inference Optimization

This article dissects the major bottlenecks of large‑language‑model serving—prefill vs. decode, sparsity, memory bandwidth, KV‑cache growth—and walks through concrete engineering tricks such as paged attention, radix‑tree KV caches, compressed attention, speculative decoding, FlexGen weight scheduling, FastServe queuing, plus a runnable vLLM code snippet.

FastServe · FlexGen · Inference Optimization
0 likes · 18 min read
Architecture & Thinking
Apr 30, 2025 · Artificial Intelligence

Unlocking AI Integration: How the Model Context Protocol (MCP) Bridges LLMs with External Tools

This article introduces the Model Context Protocol (MCP) released by Anthropic, explains its core features and client‑server architecture, walks through building a Go‑based MCP server and client with time, weather, and schedule tools, demonstrates testing with MCP Inspector, and highlights MCP's advantages and typical AI application scenarios.

AI integration · Go · LLM
0 likes · 22 min read
Tencent Cloud Developer
Apr 29, 2025 · Artificial Intelligence

Comparative Analysis of MCP and A2A Protocols for AI Agent Coordination

The article compares Google’s A2A coordination protocol with Anthropic’s Model Context Protocol, showing through a financial‑report case study that A2A enables deeper LLM‑driven interactions while MCP provides tool‑wrapper services, evaluates three integration paths, discusses SDK, latency and cost challenges, and predicts A2A could become the dominant orchestration layer for AI agents.

A2A · AI agents · Comparison
0 likes · 23 min read
Data Thinking Notes
Apr 27, 2025 · Artificial Intelligence

Step‑by‑Step MCP Demo: Build Server and Claude/DeepSeek Clients

This guide walks developers through creating a complete MCP application, covering the workflow, server setup with Python, debugging tools, and client implementation using both Claude and DeepSeek models, complete with code snippets, environment configuration, and testing procedures to demonstrate end‑to‑end LLM tool integration.

Claude · DeepSeek · LLM
0 likes · 10 min read
Baobao Algorithm Notes
Apr 27, 2025 · Artificial Intelligence

How DeepSeek R1T‑Chimera Cuts Tokens by 40% Without Fine‑Tuning

The DeepSeek‑R1T‑Chimera model merges DeepSeek‑R1 reasoning with the V3‑0324 architecture, reusing most V3 weights and swapping in only the R1 routing experts, achieving intelligence on par with R1 while reducing output tokens by about 40% and running faster, all without any fine‑tuning or distillation.

Artificial Intelligence · DeepSeek · LLM
0 likes · 5 min read
Youzan Coder
Apr 25, 2025 · Artificial Intelligence

AI-Powered Code Review System: Design, Implementation, and Lessons Learned

The team built a low‑cost AI‑powered code‑review assistant that injects line‑level comments into GitLab merge requests, using LLMs via Feishu, iterating quickly through MVP and optimization phases, achieving 64 integrations, 150+ daily comments, feedback‑driven prompt refinement, and demonstrating high ROI for small‑to‑medium teams while outlining future IDE and rule‑based extensions.

AI · Code review · GitLab
0 likes · 17 min read
Alibaba Cloud Developer
Apr 25, 2025 · Artificial Intelligence

Unlocking AI Agents: Theory, Design Patterns, and Hands‑On Experiments

This article combines theoretical analysis and practical case studies to systematically explore the core components, design patterns, and future directions of AI agents, detailing the implementation of OpenManus, custom memory and planning modules, experimental evaluations, and insights for improving agent reliability and scalability.

AI Agent · LLM · OpenManus
0 likes · 31 min read
JavaEdge
Apr 24, 2025 · Artificial Intelligence

How to Customize HTTP Clients for LangChain4j LLM Integration in Java

This guide explains how LangChain4j modules let you replace the default HTTP client used to call LLM provider APIs, showing two out‑of‑the‑box implementations (JdkHttpClient and SpringRestClient) and providing step‑by‑step code examples for custom JDK and Spring RestClient configurations.

HTTP client · LLM · LangChain4j
0 likes · 4 min read
Alimama Tech
Apr 23, 2025 · Artificial Intelligence

Explainable LLM-driven Multi-dimensional Distillation for E-Commerce Relevance Learning

The paper introduces an explainable LLM framework (ELLM‑rele) that uses chain‑of‑thought reasoning and a multi‑dimensional knowledge distillation pipeline to compress large‑model relevance judgments into lightweight student models, achieving superior offline relevance scores and online click‑through and conversion improvements in Taobao’s search advertising.

Chain-of-Thought · LLM · explainability
0 likes · 17 min read
AI Algorithm Path
Apr 22, 2025 · Artificial Intelligence

Understanding LLM Quantization: GPTQ, QAT, AWQ, GGUF, and GGML Explained

The article walks through the fundamentals of large‑language‑model quantization, presenting a concrete int8 example, detailed explanations of GPTQ, GGUF/GGML, QAT, and AWQ methods, and provides step‑by‑step code snippets, formulas, calibration procedures, and performance observations for each technique.

AWQ · GGML · GGUF
0 likes · 15 min read
Volcano Engine Developer Services
Apr 22, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Transforms LLM Applications

Model Context Protocol (MCP) is an open standard that defines how large language models interact with external tools and data, enabling seamless function calls, simplifying prompt engineering, and allowing developers to build modular AI applications without handling low‑level integration details.

AI integration · Function Calling · LLM
0 likes · 16 min read
Tencent Cloud Developer
Apr 22, 2025 · Industry Insights

Can Vibe Coding Revolutionize Software Development? A Deep Dive into AI‑Driven Programming

Vibe Coding, introduced by AI expert Andrej Karpathy in 2025, lets developers describe functionality in natural language and rely on large language models to generate code, shifting the programmer’s role to guiding AI, boosting productivity, lowering entry barriers, and reshaping software development practices.

AI programming · LLM · Software Development
0 likes · 16 min read
DaTaobao Tech
Apr 21, 2025 · Artificial Intelligence

How MNN LLM Delivers Fast, Stable On‑Device LLM Inference for Android, iOS, and Desktop

Facing DeepSeek R1 server instability, the open‑source MNN LLM framework offers local, mobile‑friendly deployment with model quantization and hardware‑specific optimizations, dramatically improving inference speed, stability, and download reliability across Android, iOS, and desktop platforms while supporting multimodal inputs.

Android · LLM · MNN
0 likes · 11 min read
Nightwalker Tech
Apr 21, 2025 · Artificial Intelligence

Turning AI into a Reliable Engineering Partner: Methodology, Rules, and Practices

This article outlines a comprehensive methodology for integrating AI—particularly large language models—into software development workflows by establishing knowledge‑base templates, rule systems, multi‑model collaboration, context management, and task decomposition to transform AI from a whimsical code generator into a trustworthy engineering partner.

AI · LLM · Software Development
0 likes · 16 min read
AI Algorithm Path
Apr 20, 2025 · Artificial Intelligence

Boosting Visual Reasoning in VLMs with Reinforcement Learning

The article analyzes how reinforcement learning, which transformed LLM reasoning in DeepSeek, can be applied to visual‑language models to overcome the limitations of traditional chain‑of‑thought prompting and supervised fine‑tuning, presenting concrete reward designs, training pipelines, and a critical assessment of their strengths and weaknesses.

Chain-of-Thought · LLM · RL training
0 likes · 10 min read
DataFunTalk
Apr 19, 2025 · Artificial Intelligence

Microsoft Research's Open‑Source Native 1‑Bit LLM BitNet b1.58 2B4T: Design, Performance, and Deployment

Microsoft Research released BitNet b1.58 2B4T, the first open‑source native 1‑bit large language model with 2 billion parameters, 1.58‑bit effective precision and a 0.4 GB footprint, achieving performance comparable to full‑precision models of similar size while enabling efficient CPU and GPU inference for edge AI applications.

1-bit quantization · CPU inference · LLM
0 likes · 10 min read
Fun with Large Models
Apr 18, 2025 · Artificial Intelligence

How RAG Works: From Data Prep to LLM Generation Explained

This article breaks down Retrieval‑Augmented Generation (RAG) into its three core stages—data preparation, data retrieval, and LLM generation—showing how document chunking, embedding, vector databases, similarity search, and optional re‑ranking combine to let large language models produce more accurate, knowledge‑grounded answers.

Embedding · LLM · RAG
0 likes · 9 min read
Data Thinking Notes
Apr 17, 2025 · Artificial Intelligence

How Dify Accelerates Generative AI App Development with Low‑Code and Modular Design

Dify is an open‑source LLM application platform that blends BaaS and LLMOps, offering low‑code development, modular components, extensive model support, and advanced retrieval features. The article also details its current limitations and recent enhancements such as MySQL integration and Elasticsearch‑based RAG capabilities.

AI · Elasticsearch · LLM
0 likes · 7 min read
AI Frontier Lectures
Apr 17, 2025 · Artificial Intelligence

Why Reinforcement Learning Fails to Boost Small LLM Reasoning: A Deep Dive

This article analyzes a recent study on language‑model reasoning, revealing that reinforcement learning often brings little or no improvement, while evaluation variance caused by seeds, hardware, and decoding settings can dramatically affect benchmark results, and supervised fine‑tuning emerges as a more reliable path.

LLM · Reinforcement Learning · Reproducibility
0 likes · 12 min read
21CTO
Apr 17, 2025 · Artificial Intelligence

How AI Will Revolutionize Software Development in 2025

This article explores how context‑aware AI, on‑premise model training, autonomous agents, and new metrics for AI impact will reshape software development, boost productivity, improve code quality, and give forward‑looking enterprises a decisive market advantage.

AI · Enterprise · LLM
0 likes · 8 min read
Java Captain
Apr 17, 2025 · Artificial Intelligence

Demonstrating the Full Lifecycle of Model Context Protocol (MCP) with Tool Calls

This article explains how the Model Context Protocol (MCP) enables large language models to retrieve up‑to‑date external information through standardized tool calls, illustrating the complete end‑to‑end workflow with Python code for the MCP server, client, and host, and discussing its advantages for building AI agents.

AI Agent · LLM · MCP
0 likes · 21 min read
Alibaba Cloud Infrastructure
Apr 16, 2025 · Artificial Intelligence

Optimizing Multi‑Node Distributed LLM Inference with ACK Gateway and vLLM

This article presents a step‑by‑step guide for deploying and optimizing large‑language‑model inference across multiple GPU‑enabled nodes using ACK Gateway with Inference Extension, vLLM’s tensor‑ and pipeline‑parallel techniques, and Kubernetes resources such as LeaderWorkerSet, PVCs, and custom routing policies, followed by performance benchmarking and analysis.

ACK Gateway · Distributed inference · Kubernetes
0 likes · 19 min read
Java Architecture Diary
Apr 16, 2025 · Artificial Intelligence

Mastering Prompt Engineering with Spring AI: Patterns and Practical Java Examples

An in‑depth guide shows how to configure Spring AI for various LLM providers, tune model parameters such as temperature and max tokens, and apply a range of prompt‑engineering patterns—including zero‑shot, few‑shot, chain‑of‑thought, self‑consistency, role‑based and automatic prompting—using concise Java code examples.

ChatOptions · LLM · spring-ai
0 likes · 18 min read
Ops Development & AI Practice
Apr 15, 2025 · Frontend Development

How to Build an AI‑Powered VS Code Extension in Minutes

This guide walks you through the VS Code extension architecture and provides a step‑by‑step example that creates a simple AI text‑explanation plugin, covering preparation, project scaffolding, command registration, API integration, debugging, and best‑practice security tips.

AI integration · Extension Development · LLM
0 likes · 12 min read
Baobao Algorithm Notes
Apr 15, 2025 · Industry Insights

Why GLM‑Z1‑AirX Hits 150‑200 TPS: A Deep Dive into LLM Speed Benchmarking

The article examines the slowdown caused by long‑chain‑of‑thought LLMs, presents a Python benchmarking script, compares the tokens‑per‑second throughput of several models—including the ultra‑fast GLM‑Z1‑AirX—and demonstrates a real‑time anti‑fraud use case that benefits from sub‑second response times.

GLM-Z1-AirX · LLM · Python
0 likes · 13 min read
DeWu Technology
Apr 14, 2025 · Artificial Intelligence

Overview of Recent Large Language Model Quantization Techniques

The article surveys modern post‑training quantization approaches for large language models, detailing weight‑only and activation‑aware methods such as GPTQ, AWQ, HQQ, SmoothQuant, QuIP, QuaRot, SpinQuant, QQQ, QoQ, and FP8, and compares their precision levels, algorithmic steps, accuracy‑throughput trade‑offs, and implementation considerations for efficient inference.

AI · LLM · model compression
0 likes · 32 min read
Open Source Tech Hub
Apr 14, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Turns AI Into a Universal Interface?

This article explains the Model Context Protocol (MCP) – an open, consensus‑based standard that lets large language models seamlessly interact with external tools and data, describes its architecture, why it’s needed, how models choose tools, and provides a step‑by‑step Python server implementation with code examples.

LLM · MCP · Tool Calling
0 likes · 22 min read
Ops Development & AI Practice
Apr 10, 2025 · Artificial Intelligence

Debugging LLM Model Context Protocol Servers Made Easy with MCP Inspector

This article introduces MCP Inspector, a GUI-based debugger for Model Context Protocol (MCP) servers that lets developers visualize tool registrations, prompt templates, resources, and real-time interactions, and provides commands to launch, control, and troubleshoot LLM applications, streamlining development and reducing debugging friction.

LLM · MCP Inspector · Model Context Protocol
0 likes · 8 min read
AI Algorithm Path
Apr 10, 2025 · Artificial Intelligence

Beginner-Friendly Guide to Understanding Large Language Models

This article walks readers through the fundamentals of large language models, covering what tokens are, how tokenization works, the conversion of tokens to numeric IDs, the transformer architecture—including positional encoding, self‑attention, feed‑forward networks and softmax—and explains how these components enable next‑token prediction.

Artificial Intelligence · Embedding · LLM
0 likes · 9 min read
Spring Full-Stack Practical Cases
Apr 10, 2025 · Artificial Intelligence

Build a RAG-Powered Knowledge Base with Spring Boot, Milvus, and Ollama

This guide walks through creating a Retrieval‑Augmented Generation (RAG) system using Spring Boot 3.4.2, Milvus vector database, and the bge‑m3 embedding model via Ollama, covering environment setup, dependency configuration, vector store operations, and integration with a large language model to deliver refined, similarity‑based answers.

Embedding · LLM · Milvus
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Apr 10, 2025 · Artificial Intelligence

Building a Pet Hospital AI Assistant with RAG and LLMs

This article covers the motivation and core concepts of Retrieval‑Augmented Generation, then gives a step‑by‑step guide to constructing a pet‑hospital AI assistant on Alibaba Cloud using LLMs, vector databases, and automated pipelines, complete with code examples and practical tips.

AI Assistant · Alibaba Cloud · LLM
0 likes · 18 min read
Beijing SF i-TECH City Technology Team
Apr 7, 2025 · Artificial Intelligence

LLM Application in Text Information Detection and Extraction: A Case Study of Blue-Collar Recruitment Data Processing

This article explores the application of large language models (LLMs) to text information detection and extraction, focusing on blue-collar recruitment data processing. It details how prompt engineering, RAG enhancement, and model fine-tuning are used to improve the efficiency and accuracy of data cleaning.

AI applications · LLM · RAG
0 likes · 31 min read
JD Cloud Developers
Apr 7, 2025 · Artificial Intelligence

Why Bigger Prompts Fail: Modular Strategies for Building Efficient AI Agents

This article explains why overloading prompts and tools harms AI‑Agent performance, and offers practical modular design, intent‑driven instruction splitting, and efficient context management strategies such as curated function‑call tools and dynamic RAG to reduce token costs, improve response speed, and avoid hallucinations.

AI Agent · LLM · RAG
0 likes · 13 min read
AI Frontier Lectures
Apr 6, 2025 · Artificial Intelligence

Can Multi‑Round Thinking Boost LLM Accuracy Without Extra Training?

A new study from the a‑m‑team introduces “Think Twice”, a test‑time multi‑round reasoning technique that, without additional training or model changes, repeatedly prompts large language models to self‑correct, yielding notable accuracy gains across benchmarks such as AIME, MATH‑500, GPQA‑Diamond and LiveCodeBench, while also producing shorter, more confident answers.

Artificial Intelligence · LLM · Multi-round reasoning
0 likes · 6 min read
21CTO
Apr 5, 2025 · Artificial Intelligence

AI Platform Highlights: Amazon Nova, Solo.io MCP, Kong Gateway, and More

Developers can stay current with recent AI advancements as Anthropic introduces Claude’s educational mode, Amazon launches the Nova model hub and Act SDK, Solo.io unveils the MCP Gateway for AI tool integration, Kong updates its AI Gateway to curb hallucinations, env0 releases Cloud Analyst, CodeSignal adds AI skill assessments, and Zencoder offers new AI coding and testing agents.

AI · AI Platforms · LLM
0 likes · 8 min read
Ops Development & AI Practice
Apr 5, 2025 · Artificial Intelligence

Why Do LLMs Follow Instructions So Well? Unpacking the Secrets

This article explains the concept of instruction‑following in large language models, compares early and modern LLMs, details the training techniques that enable it, highlights its importance, offers practical prompting tips, and discusses current challenges and future directions.

AI · LLM · instruction following
0 likes · 10 min read