Tagged articles

891 articles

Nov 18, 2024 · Artificial Intelligence

Solving Knowledge Challenges in Retrieval‑Augmented Generation: Practical Optimizations

This article shares half a year of hands‑on experience with Retrieval‑Augmented Generation. It analyzes why simple RAG setups often feel unintelligent, identifies three core knowledge issues, and presents concrete optimization strategies (chunking, knowledge expansion, and tag‑based conflict resolution) to improve retrieval and generation performance in low‑resource environments.

AI · RAG · information retrieval

0 likes · 25 min read

dbaplus Community

Nov 16, 2024 · Artificial Intelligence

Are LLM Frameworks Overhyped? A Critical Look at RAG and Reusability

The article critiques LLM frameworks, comparing them to early ORM tools, explains how Retrieval Augmented Generation works, warns against premature optimization, and advises developers to favor simple, visible practices over complex, abstracted frameworks for better control and understanding.

AI · LLM · ModelEvaluation

0 likes · 7 min read

ITPUB

Nov 15, 2024 · Databases

Why Vector Databases Matter: Deploying PgVector on PostgreSQL for Scalable AI Retrieval

This article explains the need for vector databases in the AI era, reviews PostgreSQL's extensible ecosystem, compares vector‑database options, provides step‑by‑step PgVector installation and usage, shares operational best practices, performance tuning tips, and real‑world Qunar & Tujia case studies.

AI · PostgreSQL · RAG

0 likes · 27 min read

Alibaba Cloud Developer

Nov 14, 2024 · Artificial Intelligence

Building a High‑Accuracy Automotive Maintenance Q&A System with Multi‑Agent LLMs

This article details how to design, implement, and evaluate a complex‑table intelligent Q&A solution for automotive maintenance using large language models, RAG pipelines, multi‑agent architectures, prompt engineering, and Alibaba Cloud services, achieving up to 93.8% accuracy.

LLM · Multi-Agent · Prompt engineering

0 likes · 31 min read

Alibaba Cloud Developer

Nov 13, 2024 · Artificial Intelligence

Boost RAG Accuracy with GraphRAG: Combining Knowledge Graphs and Vectors on PolarDB

This article explains how to build a GraphRAG system that integrates knowledge graphs and vector embeddings using PolarDB, Alibaba Cloud's Tongyi Qianwen LLM, and LangChain, demonstrating improved retrieval‑augmented generation accuracy through combined graph‑and‑vector search.

GraphRAG · Knowledge Graph · LangChain

0 likes · 23 min read

Architects' Tech Alliance

Nov 12, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts Enterprise AI with Intel Optimizations

This article explains the fundamentals of Retrieval‑Augmented Generation (RAG), its four‑step workflow, architecture, and how Intel’s hardware and software optimizations—including vector search, quantized embeddings, and advanced inference extensions—enhance performance, security, and scalability for enterprise LLM applications.

AI inference · Embedding Quantization · Intel Optimization

0 likes · 14 min read

Data Thinking Notes

Nov 12, 2024 · Artificial Intelligence

Unlock Data Power with DB‑GPT: An Open‑Source AI Framework for Data Development

DB‑GPT is an open‑source AI‑native data application framework that unifies multi‑model management, RAG, agents, and workflow orchestration to simplify building large‑model‑driven data solutions, offering features such as private Q&A, multi‑source analytics, automated fine‑tuning, and robust privacy and security protections.

AI · Data Framework · Fine-tuning

0 likes · 13 min read

Aikesheng Open Source Community

Nov 12, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models

ChatDBA is a conversational AI system built by Shanghai Aikesheng that employs large language models and Retrieval‑Augmented Generation to help database administrators diagnose faults, learn domain knowledge, and generate or optimize SQL, with a redesigned architecture that addresses early‑stage shortcomings and outlines future enhancements.

ChatDBA · Fault Diagnosis · Knowledge Base

0 likes · 10 min read

Aikesheng Open Source Community

Nov 11, 2024 · Databases

ChatDBA: An AI‑Powered Intelligent Assistant for Database Fault Diagnosis and Management

ChatDBA is an AI‑driven conversational system developed by Shanghai Aikesheng that assists DBAs with fault diagnosis, knowledge learning, SQL generation and optimization by leveraging large language models, RAG architecture, and advanced retrieval and document‑processing techniques.

ChatDBA · Fault Diagnosis · LLM

0 likes · 10 min read

Baidu Tech Salon

Nov 11, 2024 · Cloud Native

Baidu Cloud Native Data Platform: Empowering Enterprise AI in the LLM Era

To empower enterprise AI in the LLM era, Baidu Cloud unveils a cloud‑native data platform featuring upgraded databases—PegaDB, GaiaDB 5.0, Vector DB 2.0, Palo 2.0—and integrated services like DBSC 2.0, EDAP 2.0, and DBStack, delivering high‑performance, cost‑effective handling of structured, unstructured, and vector data for fine‑tuning and Enterprise RAG.

DBStack · Data Lakehouse · EDAP

0 likes · 10 min read

JavaEdge

Nov 9, 2024 · Artificial Intelligence

Build an AI‑Powered Airline Ticket Agent with Spring AI Alibaba

This tutorial walks through creating an intelligent airline‑ticket customer‑service agent using Spring AI Alibaba, covering requirements, architecture, RAG integration, function calling, chat memory, core capabilities, code implementation with ChatClient, and a sample running result.

AI Agent · Alibaba · Chat Memory

0 likes · 9 min read

DataFunSummit

Nov 8, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Retrieval‑Augmented Generation

ChatDBA, developed by Shanghai Aikesheng, is an AI-driven database operation assistant that leverages large language models and Retrieval‑Augmented Generation to provide fault diagnosis, knowledge learning, SQL generation and optimization, addressing challenges such as vague outputs, complex troubleshooting logic, and memory management through a structured architecture and multi‑modal retrieval strategies.

AI · Fault Diagnosis · RAG

0 likes · 10 min read

AI Large Model Application Practice

Nov 8, 2024 · Artificial Intelligence

How to Build a Multimodal Embedding RAG with Cohere and LlamaIndex

This guide explains how to overcome the limitations of text‑only embeddings for enterprise AI search by using a multimodal embedding model to index and retrieve both text and images, detailing the full workflow, code examples, and performance benefits.

Cohere · LLM · LlamaIndex

0 likes · 13 min read

NewBeeNLP

Nov 7, 2024 · Artificial Intelligence

Tackling Large Model Hallucinations: Causes, Detection, and Mitigation Strategies

This article provides a comprehensive analysis of large language model hallucinations, detailing their definitions, classifications, root causes, detection techniques, and a wide range of mitigation approaches—including RAG pipelines, decoding strategies, and model‑enhancement methods—to improve reliability and safety in real‑world AI applications.

AI Safety · Model Evaluation · Prompt engineering

0 likes · 22 min read

Sohu Tech Products

Nov 6, 2024 · Artificial Intelligence

RAG2.0 Engine Design Challenges and Implementation

The talk outlines RAG2.0’s design challenges (low vector recall, complex documents, and semantic gaps) and presents a two‑stage architecture using deep multimodal understanding and knowledge‑graph‑enhanced retrieval, detailing advanced chunking, multi‑index and multi‑path retrieval, efficient ranking models such as ColBERT, and future multi‑modal and memory‑augmented agent directions.

ColBERT · Delayed Interaction · Enterprise AI

0 likes · 23 min read

37 Interactive Technology Team

Nov 4, 2024 · Artificial Intelligence

Developing RAG and Agent Applications with LangChain: A Case Study of an AI Assistant for Activity Components

The article outlines a step‑by‑step methodology for creating Retrieval‑Augmented Generation and custom Agent applications with LangChain, illustrated by an AI assistant for activity components that evolves from a rapid Dify prototype to a LangChain‑based RAG system and finally a hand‑crafted ReAct‑style agent, detailing LCEL chain composition, vector‑search integration, model performance trade‑offs, and a unified routing layer.

AI Assistant · Agent · Cloud-native

0 likes · 6 min read

DataFunTalk

Oct 31, 2024 · Artificial Intelligence

Tencent OlaChat: An LLM‑Powered Intelligent Business Intelligence Platform – Architecture, Capabilities, and Practice

This article presents the evolution from traditional to intelligent BI, explores how large language models enable natural‑language data analysis, details the OlaChat platform’s architecture, metadata‑enhanced retrieval methods, Text2SQL pipeline, multi‑turn dialogue system, and shares practical deployment insights and Q&A.

Business Intelligence · Intelligent Analytics · LLM

0 likes · 20 min read

JD Tech

Oct 31, 2024 · Artificial Intelligence

Design and Implementation of the Logistics Intelligent Robot “Yunli XiaoZhi” Powered by Large Language Models

The article details the development of Yunli XiaoZhi, an AI‑driven logistics chatbot that combines knowledge‑base Q&A, data‑analysis, proactive alerts and report‑pushing to streamline SOP access, reduce manual query effort, and improve operational efficiency for operators, carriers and drivers.

AI chatbot · Knowledge Base · RAG

0 likes · 22 min read

AI Large Model Application Practice

Oct 30, 2024 · Artificial Intelligence

How to Efficiently Incrementally Update Knowledge in RAG Applications

Incremental knowledge updates in Retrieval‑Augmented Generation (RAG) systems can be achieved by using document‑level or chunk‑level strategies, leveraging hash fingerprints, record managers, and framework‑specific APIs such as LangChain’s index() with cleanup modes or LlamaIndex’s ingestion pipeline, reducing redundant computation and cost.

LangChain · LlamaIndex · RAG

0 likes · 12 min read

Baobao Algorithm Notes

Oct 29, 2024 · Industry Insights

Inside Perplexity AI: How RAG Powers the Next‑Gen Search Engine

In this interview, Perplexity AI CEO Aravind Srinivas explains the company’s retrieval‑augmented generation architecture, multi‑model strategy, vector‑database use, competitive positioning against Google, monetization plans, and future product road‑map, offering a deep industry perspective on AI‑driven search.

AI startup · Industry analysis · LLM

0 likes · 38 min read

Baidu Geek Talk

Oct 28, 2024 · Artificial Intelligence

Baidu Intelligent Cloud Qianfan AppBuilder: Enterprise-Level Large Model Application Development Platform

Baidu Intelligent Cloud’s Qianfan AppBuilder 3.0 offers an enterprise‑grade platform that simplifies large‑model application development by providing high‑accuracy RAG, robust agent scheduling, extensive integration, secure private‑or‑hybrid deployment, and a guided methodology, enabling industries to transform processes, add AI copilots, and create novel capabilities.

AI integration · Baidu Intelligent Cloud · Digital Transformation

0 likes · 12 min read

DevOps

Oct 27, 2024 · Artificial Intelligence

Best Practices for Building Efficient Retrieval‑Augmented Generation (RAG) Systems

This article reviews Wang et al.'s 2024 research on Retrieval‑Augmented Generation, outlining optimal practices such as query classification, chunk sizing, hybrid metadata search, embedding selection, vector databases, query transformation, reranking, document repacking, summarization, fine‑tuning, and multimodal retrieval to guide developers in constructing high‑performance RAG pipelines.

LLM · Query Classification · RAG

0 likes · 11 min read

DataFunSummit

Oct 27, 2024 · Artificial Intelligence

How Siemens Harnesses Generative AI to Build the Enterprise Knowledge Chatbot “XiaoYu”

This article describes Siemens' journey in applying generative AI and Retrieval‑Augmented Generation to create an internal knowledge chatbot, detailing the business challenges, technical architecture, data integration, multi‑modal capabilities, deployment outcomes, and strategic lessons for enterprise AI adoption.

AI chatbot · Data Integration · Enterprise Knowledge Management

0 likes · 21 min read

Alibaba Cloud Native

Oct 26, 2024 · Artificial Intelligence

Build a Real‑Time Semantic Search with EventBridge, DashVector, and FunctionCompute

This tutorial walks through constructing a zero‑to‑one RAG pipeline that ingests OSS text files via EventBridge, transforms them into embeddings with DashScope, stores vectors in DashVector, and performs semantic search using FunctionCompute and a Qwen‑Turbo LLM, complete with code samples and configuration steps.

DashVector · Embedding · EventBridge

0 likes · 10 min read

DataFunSummit

Oct 25, 2024 · Artificial Intelligence

Progress and Standardization of Large Model + Data Intelligence Applications by the China Academy of Information and Communications Technology

This article reviews the China Academy of Information and Communications Technology's advancements in large‑model‑driven data intelligence, covering development trends, key deployment technologies such as prompt engineering, fine‑tuning and RAG, emerging application paradigms, challenges, and a series of newly drafted standards to guide industry adoption.

AI · Data Intelligence · Knowledge Graph

0 likes · 13 min read

DataFunSummit

Oct 24, 2024 · Big Data

Bilibili’s Large Language Model‑Based Intelligent Assistant for the Big Data Platform: Architecture, Principles, and Deployment

This article details Bilibili’s implementation of a large‑language‑model‑driven intelligent assistant for its massive big‑data platform, covering background, problem analysis, architectural design, knowledge‑base construction, precision and recall challenges, deployment across offline and real‑time Spark/Flink diagnostics, and the future outlook.

Agent · Big Data · Flink

0 likes · 23 min read

21CTO

Oct 23, 2024 · Artificial Intelligence

IBM Unveils Granite 3.0 LLMs: Open‑Source, Secure, and Cost‑Effective AI Models

IBM introduced the Granite 3.0 series, an open‑source family of large language models that combine cutting‑edge performance with enhanced security, multi‑language support, and cost‑efficiency, while offering a variety of base, instruct, and specialist variants for enterprise use.

AI models · Granite · IBM

0 likes · 4 min read

DaTaobao Tech

Oct 23, 2024 · Artificial Intelligence

Retrieval-Augmented Generation (RAG): Principles, Applications, Limitations and Challenges

Retrieval-Augmented Generation (RAG) combines a retriever that fetches relevant external documents and a generator that uses them, improving LLM accuracy, relevance, privacy, and up-to-date information, but faces challenges such as retrieval latency, computational cost, chunking strategies, embedding selection, and system integration complexity.

AI · Knowledge Retrieval · LLM

0 likes · 13 min read

Alibaba Cloud Big Data AI Platform

Oct 22, 2024 · Artificial Intelligence

How Alibaba Cloud Optimizes Enterprise RAG: Key Techniques for AI Search

At the 2024 Alibaba Cloud Apsara (Yunqi) Conference, senior AI Search expert Xing Shaomin detailed the enterprise‑grade Retrieval‑Augmented Generation (RAG) pipeline, covering the architecture of its key stages, optimizations for effectiveness, performance, and cost, as well as practical applications, vector store enhancements, LLM agents, and deployment strategies.

AI search · Cost Optimization · Enterprise AI

0 likes · 16 min read

DataFunSummit

Oct 21, 2024 · Artificial Intelligence

Retrieval‑Augmented Generation (RAG) for Office Applications: Architecture, Challenges, and Practical Practices

This article introduces Retrieval‑Augmented Generation (RAG) as a solution to the hallucination, freshness, and data‑privacy issues of large language models, details its modular architecture, explains the layered system design and hybrid retrieval pipeline, and shares the practical challenges and engineering tricks encountered when deploying RAG in enterprise office scenarios.

AI · Hybrid Retrieval · Prompt engineering

0 likes · 19 min read

Alibaba Cloud Native

Oct 18, 2024 · Artificial Intelligence

How Spring AI Alibaba Simplifies Java AI Application Development

This article introduces the open‑source Spring AI Alibaba framework, explains its background, core features such as chat model abstraction, prompt templates, structured output, function calling, RAG and chat memory, and walks through a complete smart‑ticket‑assistant example with code snippets and deployment guidance.

AI Framework · Chat Memory · Function Calling

0 likes · 17 min read

DataFunSummit

Oct 18, 2024 · Artificial Intelligence

Building Efficient RAG Applications with a Small Team: Insights from PingCAP AI Lab

This article details how PingCAP's three‑person AI Lab leveraged Retrieval‑Augmented Generation (RAG) techniques—including basic RAG, fine‑tuned embeddings, re‑ranking, graph RAG, and agent‑based RAG—to create scalable, multilingual document‑question answering services while addressing large‑scale documentation challenges, model limitations, and user feedback loops.

Agent · Embedding · Fine-tuning

0 likes · 14 min read

Architecture Digest

Oct 18, 2024 · Databases

Redis Introduces Multi‑Threaded Query Engine to Boost Vector Search Performance

Redis has launched an enhanced, multi‑threaded query engine that dramatically increases throughput and reduces latency for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications while maintaining sub‑10 ms response times.

Database Performance · Query Engine · RAG

0 likes · 7 min read

Alibaba Cloud Big Data AI Platform

Oct 18, 2024 · Artificial Intelligence

Integrate Alibaba Cloud AI Search with Elasticsearch: A Step‑by‑Step Guide

This tutorial walks you through configuring Elasticsearch’s Open Inference API to connect with Alibaba Cloud AI Search, covering setup of text generation, rerank, sparse and dense vector services, and demonstrates end‑to‑end requests with code examples for building RAG and semantic search applications.

Alibaba Cloud AI Search · Elasticsearch · Inference API

0 likes · 11 min read

Baobao Algorithm Notes

Oct 17, 2024 · Artificial Intelligence

How Contextual Retrieval Slashes RAG Failures by Up to 67% and Cuts Costs

Anthropic’s Contextual Retrieval augments traditional RAG with contextual embeddings and BM25, reducing retrieval failure rates by 49% (up to 67% with reranking), improving accuracy across domains, and lowering latency and cost through Claude’s prompt‑caching feature.

AI · BM25 · Contextual Retrieval

0 likes · 11 min read

Alibaba Cloud Developer

Oct 17, 2024 · Artificial Intelligence

Build AI-Powered Java Apps Fast with Spring AI Alibaba: Features & Demo

Spring AI Alibaba is an open‑source Java framework that integrates Alibaba Cloud's large‑model services with Spring AI, offering high‑level abstractions for chat models, prompts, function calling, RAG, and conversation memory, and includes a complete ticket‑assistant example with code snippets.

AI Framework · Chatbot · Function Calling

0 likes · 17 min read

AntData

Oct 16, 2024 · Artificial Intelligence

Building a Data Assistant Application with DB‑GPT V0.6.0

This tutorial walks through the end‑to‑end process of creating a data‑assistant application using DB‑GPT V0.6.0, covering prerequisite deployment, knowledge‑base construction, sub‑agent creation, RAG‑based QA, AWEL workflow installation, intent‑recognition knowledge base, and unified multi‑agent orchestration.

AI · DB-GPT · Data Assistant

0 likes · 12 min read

Baobao Algorithm Notes

Oct 16, 2024 · Artificial Intelligence

How the DB3 Team Won the Meta CRAG RAG Challenge: Prompts, Retrieval, and LoRA Fine‑Tuning

This article analyzes the Meta Comprehensive RAG (CRAG) benchmark, detailing its three tasks, evaluation metrics, and the champion DB3 team's end‑to‑end solution that combines data preprocessing, dual‑stage retrieval, prompt engineering, LoRA‑based fine‑tuning, and public data augmentation to achieve top scores across all tasks.

Benchmark · Knowledge Graph · LLM

0 likes · 17 min read

AI Large Model Application Practice

Oct 14, 2024 · Artificial Intelligence

Build a Multimodal RAG Pipeline with Kotaemon, Azure Document Intelligence, and VLM

This guide walks through setting up the open‑source Kotaemon framework, configuring Azure Document Intelligence and a visual large model, and implementing code to extract and caption images and tables from PDFs for end‑to‑end multimodal RAG applications.

Azure · Python · RAG

0 likes · 12 min read

21CTO

Oct 10, 2024 · Artificial Intelligence

5 Practical AI Projects to Build Your Skills with Python

This article presents five hands‑on AI project ideas—from resume optimization to multimodal search—complete with step‑by‑step instructions, required Python libraries, and code snippets, helping beginners and intermediate developers quickly build valuable AI applications.

AI · Automation · LLM

0 likes · 12 min read

DaTaobao Tech

Oct 9, 2024 · Artificial Intelligence

Building a Vertical Domain QA Bot with Vector Search, RAG, and SFT

This guide walks entry‑level developers through building a logistics‑focused QA bot by first embedding documents for vector similarity search, then adding retrieval‑augmented generation, fine‑tuning a small model, integrating hybrid checks, and optimizing deployment with feedback loops to achieve fast, accurate, out‑of‑scope‑aware answers.

AI · Chatbot · Fine-tuning

0 likes · 15 min read

DataFunTalk

Oct 9, 2024 · Artificial Intelligence

Interview on Data Fabric, Data Virtualization, and AI Integration with Denodo Leaders

In this interview, Denodo executives discuss the origins, challenges, and future of data fabric and data virtualization, explore how generative AI and retrieval‑augmented generation enhance data management, share customer success stories, and offer strategic insights for enterprises navigating digital transformation.

Data Fabric · Denodo · Enterprise Data Management

0 likes · 19 min read

DevOps

Oct 8, 2024 · Artificial Intelligence

Top 20+ Retrieval‑Augmented Generation (RAG) Interview Questions and Answers

This article presents over twenty essential Retrieval‑Augmented Generation (RAG) interview questions with detailed answers, covering fundamentals, applications, architecture, training, limitations, ethical considerations, and integration, offering AI enthusiasts and job candidates a comprehensive guide to mastering RAG concepts.

AI Interview · NLP · RAG

0 likes · 15 min read

Java Tech Enthusiast

Oct 8, 2024 · Artificial Intelligence

Spring AI Framework for Java Developers

Spring AI is a Java‑centric framework that unifies access to chat, text‑to‑image, embedding and retrieval‑augmented generation models—including OpenAI, Anthropic and Alibaba’s Tongyi Qianwen—through synchronous or asynchronous APIs, POJO mapping, function calling, vector‑store integration and fluent tooling for rapid AI agent development.

AI frameworks · Function Calling · Java development

0 likes · 5 min read

JD Tech Talk

Oct 8, 2024 · Artificial Intelligence

Building a Retrieval‑Augmented Generation (RAG) System with Rust and Qdrant

This article explains how to construct a Retrieval‑Augmented Generation pipeline in Rust, covering knowledge‑base creation with Qdrant, model loading and embedding using the candle library, data ingestion, and integration of a Rust‑based inference service based on mistral.rs, while also discussing resource usage and common pitfalls.

AI · Embedding · LLM

0 likes · 16 min read

JD Cloud Developers

Oct 8, 2024 · Artificial Intelligence

How to Build a Rust-Powered Retrieval‑Augmented Generation (RAG) System from Scratch

This article explains how to construct a Retrieval‑Augmented Generation pipeline in Rust, covering knowledge‑base creation with Qdrant, model loading and embedding generation using candle, and integrating a Rust‑based inference service to answer queries with up‑to‑date external data.

Embedding · LLM · Qdrant

0 likes · 17 min read

Architect

Oct 7, 2024 · Artificial Intelligence

Master Prompt Engineering: A Universal Framework for Building Effective LLM Prompts

This article presents a systematic, four‑part Prompt engineering framework—role definition, problem description, goal setting, and requirement specification—augmented with RAG, few‑shot examples, memory handling, and model‑parameter tuning, enabling developers to craft high‑quality prompts for large language models across diverse tasks.

Few‑Shot Learning · Model Parameters · Prompt engineering

0 likes · 28 min read

JavaEdge

Oct 2, 2024 · Artificial Intelligence

Boost RAG Retrieval Accuracy with Contextual Embeddings and BM25

This article presents a contextual retrieval technique that combines contextual embeddings and contextual BM25 to reduce RAG miss rates by up to 67%, explains the underlying methods, implementation steps, cost considerations, experimental results, and practical deployment guidance.

AI · BM25 · Contextual Retrieval

0 likes · 17 min read

DataFunSummit

Oct 2, 2024 · Artificial Intelligence

NVIDIA’s Solutions for Large Language Models: NeMo Framework, TensorRT‑LLM, and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end stack for large language models, covering the NeMo Framework for data processing, training, and deployment, the open‑source TensorRT‑LLM inference accelerator, and the Retrieval‑Augmented Generation (RAG) technique that enriches model outputs with external knowledge.

NeMo · Nvidia · RAG

0 likes · 17 min read

JD Cloud Developers

Sep 30, 2024 · Artificial Intelligence

How a Large‑Model Powered Bot Boosts Logistics Ops with Smart Q&A and Data Insights

This article describes the design, implementation, and impact of a large‑model‑driven logistics chatbot that unifies knowledge Q&A, data analysis, proactive alerts, and report pushing to streamline operations for functional staff, frontline workers, and managers, dramatically reducing query time and improving decision efficiency.

AI chatbot · Enterprise AI · Knowledge Base

0 likes · 20 min read

JD Tech Talk

Sep 30, 2024 · Artificial Intelligence

Yunli XiaoZhi: An AI‑Powered Intelligent Assistant for Knowledge Q&A and Data Analysis in Logistics Operations

The document describes the design, implementation, and operational results of Yunli XiaoZhi, an AI‑driven portable knowledge‑base and data‑analysis chatbot that consolidates SOPs, manuals, and real‑time information for logistics staff, using LangChain‑based RAG, vector databases, and large‑model prompting to improve query efficiency, proactive alerts, and reporting across multiple user groups.

AI · Chatbot · Knowledge Base

0 likes · 19 min read

JD Tech Talk

Sep 29, 2024 · Artificial Intelligence

Building a Simple Local AI Question‑Answer System with Java, LangChain, Ollama, and ChromaDB

This article explains how to set up a lightweight local AI Q&A system using Java, LangChain (and LangChain4J), Ollama for LLM inference, embedding techniques, and a vector database (ChromaDB), covering core concepts, environment preparation, Maven dependencies, and sample code.

AI · LLM · LangChain

0 likes · 21 min read

JD Cloud Developers

Sep 29, 2024 · Artificial Intelligence

Build a Local AI Q&A System with Java, Ollama, and LangChain4J

This article walks through building a local AI question‑answer system using Java, Ollama, LangChain4J, embeddings, and a Chroma vector database, covering LLM fundamentals, embedding techniques, RAG architecture, setup steps, Maven dependencies, and sample code to retrieve and answer queries.

AI · Embedding · Java

0 likes · 19 min read

21CTO

Sep 28, 2024 · Artificial Intelligence

How Digital Twins and Generative AI Are Transforming Real‑Time Monitoring

This article explores how digital twins evolve from design tools to real‑time monitoring platforms, how integrating generative AI and retrieval‑augmented generation (RAG) boosts AI accuracy and situational awareness, and why software teams must adopt these combined technologies to stay ahead in modern operations.

Digital Twin · RAG · generative AI

0 likes · 11 min read

Tencent Cloud Developer

Sep 27, 2024 · Artificial Intelligence

A Comprehensive Prompt Engineering Framework: Universal Templates, RAG, Few‑Shot, Memory, and Automated Optimization

The article presents a universal four‑part prompt template—role, problem description, goal, and requirements—augmented with role definitions, RAG‑based knowledge retrieval, few‑shot examples, memory handling, temperature/top‑p tuning, and automated optimization techniques such as APE, APO, and OPRO, enabling developers to reliably craft high‑quality prompts for LLMs.

AI Prompt Optimization · Few‑Shot Learning · Prompt engineering

0 likes · 26 min read

iQIYI Technical Product Team

Sep 26, 2024 · Artificial Intelligence

AI-Powered Search in iQIYI: Techniques, Architecture, and Implementation

iQIYI’s AI‑powered search expands beyond title‑only queries by handling fuzzy role, plot, star, award, and semantic searches, using Chain‑of‑Thought‑generated TIPS, Retrieval‑Augmented Generation with sophisticated indexing, chunking, embedding, reranking, and prompt‑engineering to deliver personalized, accurate video recommendations that boost user engagement.

AI search · Embedding · Query Guidance

0 likes · 15 min read

AntData

Sep 26, 2024 · Artificial Intelligence

DB-GPT: Open-Source AI-Native Data Application Development Framework

DB‑GPT is an open‑source AI‑native data‑application framework that provides multi‑model management, Text‑to‑SQL optimization, RAG, multi‑agent collaboration, and intelligent workflow orchestration, enabling developers to build scalable large‑model database applications, with proven enterprise adoption, community growth, and academic publications.

AI · RAG · data engineering

0 likes · 6 min read

JavaEdge

Sep 24, 2024 · Artificial Intelligence

Mastering RAG with LangChain4j: From Simple Setup to Advanced Retrieval‑Augmented Generation

This article explains how to extend large language models with domain‑specific knowledge using Retrieval‑Augmented Generation (RAG) in LangChain4j, covering the concepts of RAG, its indexing and retrieval stages, simple RAG setup, detailed API usage, and advanced customization options such as query transformers and content injectors.

Embedding · Java · LLM

0 likes · 24 min read

Alibaba Cloud Developer

Sep 23, 2024 · Artificial Intelligence

Boosting Aviator Script Development with AI—No Model Training Required

This article details an engineering‑focused practice that uses large language models, RAG, prompt engineering, and reranking to automatically generate, review, and refine Aviator scripts for decision‑center policies without any model pre‑training, offering practical insights and code examples for developers.

AI code generation · Aviator script · LLM

0 likes · 29 min read

Fighter's World

Sep 22, 2024 · Artificial Intelligence

How Large-Model AI Transforms Smart Customer Service – Alibaba Cloud Insights

The talk outlines the evolution of intelligent customer service over three decades, explains how generative large-model AI like ChatGPT has raised service expectations, and presents Alibaba Cloud’s four-stage implementation—experience, efficiency, capability, and insight—through three concrete cases and a roadmap for SMEs to build their own smart service systems.

AI agents · Alibaba-Cloud · Large Model

0 likes · 12 min read

Senior Brother's Insights

Sep 19, 2024 · Artificial Intelligence

Rule Engines vs AI Models: Choosing the Right Approach for Product Logic

The article compares traditional rule‑engine architectures with AI‑driven models, explains their differing characteristics, outlines when deterministic rule matching is preferable over flexible AI inference, and recommends practical technologies such as Drools for rule‑based solutions and LLM‑based RAG/Agent frameworks for AI‑centric scenarios.

AI · Drools · LLM

0 likes · 9 min read

JavaEdge

Sep 19, 2024 · Artificial Intelligence

Unlock Java LLM Power: A Deep Dive into LangChain4j Features and Architecture

LangChain4j streamlines the integration of large language models into Java applications by offering a standardized API, extensive support for over a dozen LLM providers and vector stores, a rich toolbox for RAG, chat memory, and tool calling, plus two abstraction layers that cater to both low‑level control and high‑level convenience.

AI · Integration · Java

0 likes · 10 min read

Qunhe Technology Quality Tech

Sep 19, 2024 · Artificial Intelligence

Deploy FastGPT Locally: Step‑by‑Step Docker & Source Code Guide for RAG AI

This article explains how to set up FastGPT, a Retrieval‑Augmented Generation (RAG) knowledge‑base system powered by large language models, covering both Docker‑compose image deployment and source‑code installation, including environment configuration, database setup, and API usage examples.

AI · Docker · FastGPT

0 likes · 13 min read

21CTO

Sep 17, 2024 · Artificial Intelligence

How to Build AI-Powered Semantic Search with OpenAI Embeddings and Milvus

This guide explains vector embeddings, OpenAI's text‑embedding models, and how to use PyMilvus with Zilliz Cloud to create a Retrieval‑Augmented Generation (RAG) system for fast, accurate semantic search.

Milvus · OpenAI · Python

0 likes · 12 min read

DevOps

Sep 13, 2024 · Artificial Intelligence

15 Advanced Retrieval‑Augmented Generation (RAG) Techniques for Production‑Ready AI Solutions

The article outlines fifteen advanced Retrieval‑Augmented Generation (RAG) techniques—from hierarchical indexing and context caching to multimodal alignment and microservice orchestration—explaining how they help transform AI prototypes into scalable, reliable production systems while highlighting common pitfalls and a concluding call to action.

AI production · LLM · RAG

0 likes · 8 min read

Code Mala Tang

Sep 12, 2024 · Artificial Intelligence

Boost LLM Accuracy with Retrieval‑Augmented Generation Using LangChain.js

This article explains the core concepts of Retrieval‑Augmented Generation (RAG), walks through its implementation steps with LangChain.js—including text chunking, embedding, storage, retrieval, and generation—and showcases practical use cases, challenges, and best practices for building reliable AI‑powered applications.

AI applications · Embedding · LLM

0 likes · 16 min read

Baidu Geek Talk

Sep 11, 2024 · Databases

Why Vector Databases Are the Next Big Thing in AI: A Deep Dive into RAG and Baidu’s VectorDB

This article examines the 70‑year evolution of databases, explains how large‑model AI drives the rise of vector databases and Retrieval‑Augmented Generation (RAG), outlines the four‑stage RAG workflow, compares Baidu’s self‑built VectorDB with open‑source alternatives, and showcases real‑world deployments that highlight performance, scalability, and enterprise benefits.

AI · Database Architecture · RAG

0 likes · 16 min read

Alibaba Cloud Big Data AI Platform

Sep 11, 2024 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation (RAG) System with OpenSearch LLM and Dify

Learn step‑by‑step how to integrate OpenSearch LLM’s intelligent Q&A edition with the Dify large‑model platform to create a robust Retrieval‑Augmented Generation (RAG) system, covering architecture, workflow setup, API authentication, result parsing, and practical code examples.

AI · Dify · LLM

0 likes · 7 min read

Alibaba Cloud Big Data AI Platform

Sep 10, 2024 · Artificial Intelligence

Unlocking AI Search with Alibaba Cloud Elasticsearch: Vectors, HNSW & RAG

This article details Alibaba Cloud Elasticsearch's AI search advancements, covering embedding vectors, HNSW-based approximate nearest neighbor search, hardware-accelerated vector engines, sparse vectors, hybrid retrieval, the Inference API, and RAG implementations that together boost performance, efficiency, and relevance for modern AI-driven search applications.

Elasticsearch · HNSW · RAG

0 likes · 11 min read

Architect's Alchemy Furnace

Sep 6, 2024 · Artificial Intelligence

Exploring LLM Application Architectures: From AI Embedded to Multi‑Agent Systems

This article examines the typical business and technical architectures for large language model applications, covering AI Embedded, Copilot, and Agent modes, single‑ and multi‑agent systems, core frameworks, and guidance on selecting appropriate technical routes.

AI agents · LLM · Multi-Agent

0 likes · 11 min read

DataFunSummit

Sep 6, 2024 · Artificial Intelligence

Knowledge Graph and RAG Applications in 360 Document Cloud: Challenges and Solutions

This article presents a comprehensive overview of 360's document cloud knowledge management and Q&A scenarios, discussing business pain points, large‑model challenges, the advantages of the intelligent document solution, and how knowledge graphs enhance retrieval‑augmented generation and document standardization for AI‑driven enterprise applications.

AI · Document Management · Enterprise AI

0 likes · 15 min read

Baidu Intelligent Cloud Tech Hub

Sep 5, 2024 · Databases

How Vector Databases Power AI and RAG: Insights from Baidu’s DTCC 2024

This article reviews the 70‑year evolution of databases, explains how vector databases and Retrieval‑Augmented Generation (RAG) are reshaping AI applications, and details Baidu Intelligent Cloud's VectorDB architecture, performance advantages, real‑world use cases, and future trends in data engineering.

AI · Database Architecture · Distributed Systems

0 likes · 16 min read

DataFunSummit

Sep 4, 2024 · Artificial Intelligence

How Elasticsearch Powers Retrieval‑Augmented Generation (RAG) Applications

This article explains how Elasticsearch’s advanced search capabilities—including vector and semantic search, hardware acceleration, hybrid retrieval, model re‑ranking, multi‑vector support, and integrated security—enable robust RAG implementations and outlines future directions such as a new compute engine, stronger vector engines, and cloud‑native serverless deployment.

AI · Elasticsearch · Hybrid Search

0 likes · 9 min read

Full-Stack Cultivation Path

Sep 4, 2024 · Artificial Intelligence

Hot Open-Source RAG Tool for Document Chat: GraphRAG, Multimodal QA & Complex Reasoning

This article introduces Kotaemon, an open‑source Retrieval‑Augmented Generation platform that lets users chat with their documents, offering a self‑hosted web UI, support for local and API LLMs, hybrid retrieval, multimodal question answering, GraphRAG indexing, and advanced reasoning capabilities, along with step‑by‑step installation via App or Docker.

GraphRAG · LLM · Multimodal QA

0 likes · 6 min read

AI Large Model Application Practice

Sep 4, 2024 · Artificial Intelligence

When to Use GraphRAG vs. Traditional RAG and How to Combine Them

This article compares GraphRAG with traditional RAG across seven dimensions—suitable scenarios, knowledge representation, retrieval, comprehensive queries, hidden‑relationship understanding, scalability, and performance‑cost trade‑offs—explains how they can be fused, and offers guidance on selecting the right approach for complex data‑driven applications.

GraphRAG · LLM · RAG

0 likes · 13 min read

Alibaba Cloud Big Data AI Platform

Sep 2, 2024 · Artificial Intelligence

Turning PDFs and Word Docs into Searchable Knowledge for RAG Systems

This article explains why generic large language models struggle with domain‑specific data, introduces Retrieval‑Augmented Generation (RAG) as a solution, compares Word and PDF formats, outlines document‑parsing pipelines, reviews open‑source PDF tools, and presents Alibaba Cloud's rule‑based parsing architecture with performance results.

AI · Document Parsing · LLM

0 likes · 13 min read

Data Thinking Notes

Sep 1, 2024 · Artificial Intelligence

Master LLMs: Basics, Prompt Engineering, RAG, Agents & Multimodal AI

This article provides a comprehensive overview of large language models, covering their fundamental concepts, historical milestones, parameter scaling, prompt engineering techniques, retrieval‑augmented generation, autonomous agents, and multimodal model applications, illustrating how these technologies reshape AI capabilities across domains.

AI agents · LLM · Prompt engineering

0 likes · 22 min read

AI Large Model Application Practice

Aug 29, 2024 · Artificial Intelligence

8 Essential Indexing Strategies to Boost Enterprise RAG Performance

This article presents eight practical optimization recommendations for the indexing stage of enterprise‑level Retrieval‑Augmented Generation (RAG) applications, covering chunk creation, abbreviation handling, multimodal document processing, semantic enrichment, metadata usage, alternative index types, and embedding model selection.

RAG · chunking · indexing

0 likes · 15 min read

DataFunSummit

Aug 29, 2024 · Artificial Intelligence

Intelligent NPC Practices in Tencent Games: Multi‑Modal LLM Solutions and System Optimizations

This article details Tencent Games' end‑to‑end approach to building intelligent NPCs, covering the opportunities brought by AI, the practical implementation of multimodal LLM‑driven dialogue, knowledge‑augmented retrieval, long‑context handling, safety measures, multimodal expression (voice and facial animation), and system‑level performance optimizations for real‑time deployment.

AI · LLM · NPC

0 likes · 18 min read

Qunar Tech Salon

Aug 28, 2024 · Databases

Why Vector Databases Are Needed, PgVector Installation, Usage, and Operational Practices in PostgreSQL

This article explains the necessity of vector databases for AI workloads, reviews the PostgreSQL ecosystem, compares vector database options, provides detailed PgVector installation and usage steps, shares operational best‑practices, performance tuning tips, and real‑world deployment cases at Qunar and Tujia.

AI · PostgreSQL · RAG

0 likes · 24 min read

Qunhe Technology Quality Tech

Aug 27, 2024 · Artificial Intelligence

Boosting Test Code Quality: How Large Language Models Transform Code Review

This article explores how mature testing teams can leverage large language models for automated code review, outlining the advantages, challenges, and a practical implementation using FastGPT and GitLab CI to build a low‑cost, AI‑enhanced review system that improves efficiency and feedback quality.

AI · Code review · FastGPT

0 likes · 10 min read

DataFunSummit

Aug 25, 2024 · Artificial Intelligence

Applying Large AI Models to Financial Data Governance and Innovative Use Cases

This article presents a comprehensive technical overview of how large AI models are reshaping financial data production, governance, multimodal document understanding, lakehouse storage, private‑domain model deployment, data‑centric engineering methods, and multi‑agent intelligent advisory within the finance sector.

AI · Multi-Agent · RAG

0 likes · 21 min read

phodal

Aug 22, 2024 · Artificial Intelligence

What’s New in Shire 0.5? AI Coding Agent Gains SonarQube, Git, and Data‑Guarding Features

Shire 0.5 introduces SonarQube issue support, Git‑enabled ShireQL queries, a reranking function for RAG, and new data‑guarding capabilities like the redact function and customizable secret‑pattern YAML, enabling developers to securely build AI‑powered coding agents that leverage IDE assets while protecting sensitive information.

AI Coding · Git queries · IDE integration

0 likes · 8 min read

Huolala Tech

Aug 22, 2024 · Artificial Intelligence

How Large Language Models Automate Order Cancellation Responsibility at HuoLala

This article explains how HuoLala leverages large language models, multimodal feature integration, and retrieval‑augmented generation to automatically determine responsibility for order cancellations, improving accuracy, explainability, and driver‑user experience.

AI · Multimodal Retrieval · Order Cancellation

0 likes · 10 min read

Volcano Engine Developer Services

Aug 20, 2024 · Databases

How Vector Databases Power RAG: Scaling, Algorithms, and Real‑World Trade‑offs

RAG technology leverages vector databases to provide context‑aware answers without updating model parameters, and this article explores how cloud search teams integrate multiple vector algorithms, balance cost, stability and latency, and adopt open‑source solutions like OpenSearch to build scalable, enterprise‑grade retrieval systems.

AI · DiskANN · OpenSearch

0 likes · 21 min read

Alibaba Cloud Developer

Aug 19, 2024 · Artificial Intelligence

Ensuring Stable AI Agents: Engineering Practices, RAG, and Monitoring

This article shares engineering insights from Hema’s AI smart customer service deployment, detailing key stability factors for AI agents—including hallucination mitigation, memory integration, RAG enhancement, exception handling, and comprehensive monitoring—to improve reliability and performance in real‑world e‑commerce chatbot scenarios.

AI Agent · LLM · RAG

0 likes · 13 min read

Selected Java Interview Questions

Aug 18, 2024 · Backend Development

Redis Introduces a Multi‑Threaded Query Engine to Boost Vector Search Performance for Generative AI

Redis has launched a multi‑threaded query engine that vertically scales its in‑memory database, dramatically increasing query throughput and lowering latency for vector similarity searches, thereby addressing the performance demands of real‑time retrieval‑augmented generation in generative AI applications.

Backend · RAG · generative AI

0 likes · 9 min read

Programmer DD

Aug 16, 2024 · Databases

How Redis’s New Multithreaded Query Engine Supercharges Vector Search for AI

Redis has introduced a multithreaded query engine that dramatically boosts throughput and lowers latency for vector searches, enabling scalable, real‑time retrieval‑augmented generation (RAG) workloads while preserving the low‑latency performance of its core in‑memory database.

AI · RAG · multithreading

0 likes · 8 min read

AI Large Model Application Practice

Aug 16, 2024 · Artificial Intelligence

How to Query a Microsoft GraphRAG Knowledge Graph with Neo4j: Local and Global Modes

This guide explains how to query a Microsoft GraphRAG knowledge graph using the official CLI, API, and a custom Neo4j implementation, covering both local and global retrieval modes, vector index creation, Cypher query customization, and integration with LangChain for end‑to‑end RAG pipelines.

LangChain · Microsoft GraphRAG · Neo4j

0 likes · 13 min read

Qunhe Technology Quality Tech

Aug 14, 2024 · Artificial Intelligence

Should Your Testing Team Build a Private LLM or Use RAG with a General Model?

This article compares the high costs and technical challenges of building a private large language model with the benefits, flexibility, and lower risk of using Retrieval‑Augmented Generation (RAG) on a general LLM, offering practical guidance for testing teams seeking AI assistance.

AI · Model Deployment · RAG

0 likes · 11 min read

DaTaobao Tech

Aug 12, 2024 · Artificial Intelligence

Challenges and Optimization Techniques for Retrieval‑Augmented Generation (RAG)

Deploying large language models faces domain gaps, hallucinations, and high barriers, so Retrieval‑Augmented Generation (RAG) combines retrieval with generation, and advanced optimizations—such as RAPTOR’s hierarchical clustering, Self‑RAG’s self‑reflective retrieval, CRAG’s corrective evaluator, proposition‑level Dense X Retrieval, sophisticated chunking, query rewriting, and hybrid sparse‑dense methods—are essential for improving accuracy, reducing hallucinations, and achieving efficient, scalable performance.

AI · RAG · Retrieval Augmented Generation

0 likes · 22 min read

37 Interactive Technology Team

Aug 12, 2024 · Backend Development

Intelligent Backend Menu Search with OpenAI Embeddings, LangChain, and DIFY

The article demonstrates how to improve backend menu navigation by building a knowledge base of menu metadata, generating concise Chinese descriptions with OpenAI embeddings, and implementing RAG retrieval using both LangChain code orchestration and DIFY’s visual workflow, highlighting each approach’s flexibility and ease of use.

Backend Search · Knowledge Base · LangChain

0 likes · 9 min read

AI Large Model Application Practice

Aug 9, 2024 · Artificial Intelligence

How to Build and Index Microsoft GraphRAG with Neo4j: A Step‑by‑Step Guide

This article explains the fundamentals of Microsoft GraphRAG, details its indexing pipeline—including text chunking, entity‑relationship extraction, community detection, and description generation—shows how to set up the graphrag library, create adaptive prompts, build the index, and import the resulting graph into Neo4j for visualization and analysis.

AI · GraphRAG · Neo4j

0 likes · 13 min read

58 Tech

Aug 7, 2024 · Artificial Intelligence

Bridging Compute and Applications: 58.com AI Lab’s Large‑Model Platform and AI Agent Solutions

In this article, 58.com AI Lab senior director Zhan Kunlin explains how the company built a multi‑layer AI platform, created a vertical large‑language model called LingXi, and developed an AI Agent system with RAG capabilities to accelerate practical AI applications across various business scenarios.

AI Platform · AI agents · Model Deployment

0 likes · 10 min read

37 Interactive Technology Team

Aug 5, 2024 · Artificial Intelligence

Case Study: Applying AIGC to Component Activity Business with Dify

This case study shows how AIGC, implemented through Dify’s low‑code platform, enables a natural‑language AI assistant to recommend and insert the optimal components from a 200‑plus library, streamlining selection, building an embedding‑based knowledge base, exposing a RAG‑driven agent via API, and demonstrating rapid AI‑business validation compared with custom frameworks.

AI Agent · AIGC · Business Automation

0 likes · 8 min read

NewBeeNLP

Aug 5, 2024 · Industry Insights

How Alibaba Cloud Scales Search Recommendations with Big Data, AI, and LLMs

This article details Alibaba Cloud's end‑to‑end architecture for search and advertising recommendation, covering the data platform, AI services, feature‑store design, training and inference optimizations, and the integration of large language models for new recommendation scenarios.

AI Platform · Alibaba Cloud · Big Data

0 likes · 17 min read

Architect

Aug 2, 2024 · Artificial Intelligence

Building AI‑Native Applications with Spring AI: A Complete Tutorial

This article explains how to quickly develop an AI‑native application using Spring AI, covering core features such as chat models, prompt templates, function calling, structured output, image generation, embedding, vector stores, and Retrieval‑Augmented Generation (RAG), and provides end‑to‑end Java code examples for building a simple AI‑driven service.

AI-native · Backend · Function Calling

0 likes · 40 min read

DataFunTalk

Aug 2, 2024 · Artificial Intelligence

From Big Data to Large Models: Alibaba Cloud AI Platform Architecture and Practices for Search Recommendation

This presentation details Alibaba Cloud's AI platform, covering the end‑to‑end pipeline from big‑data processing and feature engineering to large‑model training, inference optimization, recommendation system architecture, and RAG applications, highlighting practical engineering solutions and performance gains.

AI Platform · Big Data · Feature Store

0 likes · 18 min read

Data Thinking Notes

Aug 1, 2024 · Artificial Intelligence

Unlocking Vertical Domain LLMs: Advantages, Challenges, and Alignment Strategies

Over the past year our team explored applying large language models to specialized domains, detailing their professional benefits, unique challenges such as accuracy and knowledge‑base maintenance, and presenting solutions like alignment enhancement via BPO, Text2API, RAG, and advanced SFT/DPO techniques.

Model Alignment · RAG · SFT

0 likes · 10 min read

Open Source Tech Hub

Jul 31, 2024 · Artificial Intelligence

Understanding LLMs, AI Agents, and Retrieval-Augmented Generation: Key Concepts and Challenges

This article explains the fundamentals of large language models, artificial general intelligence, AI-generated content, AI agents, retrieval‑augmented generation, knowledge bases, multimodal processing, fine‑tuning, alignment, tokens, vectors, and related tools, highlighting their capabilities, limitations, and practical considerations.

AI Agent · Fine-tuning · LLM

0 likes · 14 min read
