Topic: RAG
Collection size: 167 articles
Page 4 of 9
Architecture and Beyond
Feb 22, 2025 · Artificial Intelligence

Understanding Retrieval‑Augmented Generation (RAG) and Its Role in Enhancing Large Language Models

The article explains how the inherent knowledge‑staleness, hallucination, lack of private data, non‑traceable output, limited long‑text handling, and data‑security concerns of large language models can be mitigated by Retrieval‑Augmented Generation, which combines external retrieval, augmentation, and generation to provide up‑to‑date, reliable, and secure AI responses.

AI · LLM · Large Language Models
15 min read
System Architect Go
Nov 19, 2024 · Artificial Intelligence

Retrieval Augmented Generation (RAG) System Overview and Implementation with LangChain, Redis, and llama.cpp

This article explains the concept, architecture, and step‑by‑step implementation of Retrieval Augmented Generation (RAG), covering indexing, retrieval & generation processes, a practical LangChain‑Redis‑llama.cpp example on Kubernetes, code snippets, test results, challenges, and references.

AI · LLM · LangChain
6 min read
Architecture Digest
Jan 16, 2025 · Artificial Intelligence

Redis Introduces Multi‑Threaded Query Engine to Boost Vector Search Performance for Generative AI

Redis has unveiled a multi‑threaded query engine that dramatically increases query throughput and lowers latency for vector similarity searches, offering up to 16× performance gains and enabling real‑time Retrieval‑Augmented Generation (RAG) workloads in generative AI applications.

Database performance · Multi-threading · RAG
7 min read
Architecture Digest
Oct 18, 2024 · Databases

Redis Introduces Multi‑Threaded Query Engine to Boost Vector Search Performance

Redis has launched an enhanced, multi‑threaded query engine that dramatically increases throughput and reduces latency for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications while maintaining sub‑10 ms response times.

Database performance · Multi-threading · Query Engine
7 min read
Tencent Technical Engineering
May 19, 2025 · Artificial Intelligence

RAG, Agents, and Multimodal Large Models: Evolution, Challenges, and Future Trends

This article examines the evolution of large model technologies—including Retrieval‑Augmented Generation, AI agents, and multimodal models—detailing their technical foundations, practical challenges, industry applications, and future development trends, offering a comprehensive perspective for AI practitioners and researchers.

AI Agent · RAG · knowledge retrieval
14 min read
Java Tech Enthusiast
May 21, 2025 · Artificial Intelligence

How ChatGPT's New Memory Feature Works: Technical Analysis and Implementation Details

The article provides a detailed technical breakdown of OpenAI's new ChatGPT memory feature, explaining its two memory modes, underlying sub‑systems, possible implementation approaches using vector stores and scheduled jobs, and early user feedback highlighting both benefits and bugs.

AI · ChatGPT · Memory Feature
8 min read
Architect
Apr 1, 2025 · Artificial Intelligence

When to Fine‑Tune Large Language Models vs. Relying on Prompting and RAG

The article explains why most projects should start with prompt engineering or simple agent workflows, outlines the scenarios where model fine‑tuning adds real value, compares fine‑tuning with Retrieval‑Augmented Generation, and offers practical criteria for deciding which approach to adopt.

AI deployment · Large Language Models · LoRA
9 min read
Architect
Mar 26, 2025 · Artificial Intelligence

Agent Memory Mechanisms and Dify Knowledge Base Segmentation & Retrieval Details

This article explains the fundamentals of AI agent memory—including short‑term, long‑term, and working memory types and their storage designs—and then details Dify's knowledge‑base segmentation modes, indexing strategies, and retrieval configurations for effective RAG applications.

Dify · LLM · RAG
14 min read
Architect
Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑Augmented Generation (RAG) combines external knowledge retrieval with large language models to improve answer accuracy, but it is prone to failures at every stage: retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs. This article analyzes these failure modes and offers solutions for each.

AI Reliability · LLM · Prompt Engineering
32 min read
Architect
Jul 13, 2024 · Artificial Intelligence

Practical Guide to Building LLM Products: Prompt Engineering, RAG, Evaluation, and Operations

This article provides a comprehensive, step‑by‑step guide for developing large‑language‑model (LLM) applications, covering prompt design techniques, n‑shot and chain‑of‑thought strategies, retrieval‑augmented generation, structured I/O, workflow optimization, evaluation pipelines, operational best practices, and team organization to create reliable, scalable AI products.

AI operations · LLM · Prompt Engineering
54 min read
DevOps
Apr 27, 2025 · Artificial Intelligence

Large Model Technologies: RAG, AI Agents, Multimodal Applications, and Future Trends

This article examines how Retrieval‑Augmented Generation (RAG), AI agents, and multimodal large‑model techniques are reshaping AI‑industry integration, discusses their technical challenges and practical implementations, and outlines future development directions across algorithms, products, and domain‑specific applications.

AI agents · Artificial Intelligence · RAG
14 min read
DevOps
Apr 20, 2025 · Artificial Intelligence

Building a Medical Knowledge Base with RAG: A Step‑by‑Step Example

This article demonstrates how to construct an AI‑powered medical knowledge base for diabetes treatment by preprocessing literature, performing semantic chunking, generating BioBERT embeddings, storing them in a FAISS vector database, and using a RAG framework together with a knowledge graph to retrieve and generate accurate answers.

BioBERT · Medical AI · RAG
12 min read
DevOps
Apr 2, 2025 · Artificial Intelligence

Understanding Retrieval‑Augmented Generation (RAG): Concepts, Evolution, and Types

This article explains Retrieval‑Augmented Generation (RAG), its role in mitigating large language model knowledge cutoff and hallucination, outlines the evolution from naive to advanced, modular, graph, and agentic RAG, and discusses future directions such as intelligent and multi‑modal RAG systems.

Artificial Intelligence · LLM · RAG
10 min read
DevOps
Mar 9, 2025 · Artificial Intelligence

A Beginner's Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling, and AI Agents

This article provides a comprehensive introduction to developing large language model (LLM) applications, covering prompt engineering, zero‑ and few‑shot techniques, function calling, retrieval‑augmented generation (RAG) with embedding and vector databases, code assistants, and the MCP protocol for building AI agents, all aimed at non‑AI specialists.

AI Agent · LLM · Prompt Engineering
48 min read
DevOps
Feb 12, 2025 · Artificial Intelligence

A Comprehensive Guide to Prompt Engineering, RAG, and Optimization Techniques for Large Language Models

This article presents a systematic framework for crafting effective prompts, detailing the universal prompt template, role definition, task decomposition, RAG integration, few‑shot examples, memory handling, and parameter tuning to enhance large language model performance across diverse applications.

AI optimization · Large Language Models · Prompt Engineering
24 min read
DevOps
Jan 8, 2025 · Artificial Intelligence

Designing Generative AI Agents: Models, Tools, Extensions, Function Calls, and Data Storage

The article explains how generative AI agents combine language models, tool integration, self‑guided planning, prompt‑engineering frameworks, extensions, function calls, and vector‑based data storage to create adaptable, retrieval‑augmented systems that can interact with real‑world APIs and perform complex tasks.

AI agents · Data Storage · Prompt Engineering
12 min read
DevOps
Oct 27, 2024 · Artificial Intelligence

Best Practices for Building Efficient Retrieval‑Augmented Generation (RAG) Systems

This article reviews Wang et al.'s 2024 research on Retrieval‑Augmented Generation, outlining optimal practices such as query classification, chunk sizing, hybrid metadata search, embedding selection, vector databases, query transformation, reranking, document repacking, summarization, fine‑tuning, and multimodal retrieval to guide developers in constructing high‑performance RAG pipelines.

LLM · RAG · Vector Database
11 min read
DevOps
Oct 8, 2024 · Artificial Intelligence

Top 20+ Retrieval‑Augmented Generation (RAG) Interview Questions and Answers

This article presents over twenty essential Retrieval‑Augmented Generation (RAG) interview questions with detailed answers, covering fundamentals, applications, architecture, training, limitations, ethical considerations, and integration, offering AI enthusiasts and job candidates a comprehensive guide to mastering RAG concepts.

AI interview · NLP · RAG
15 min read
DevOps
Sep 13, 2024 · Artificial Intelligence

15 Advanced Retrieval‑Augmented Generation (RAG) Techniques for Production‑Ready AI Solutions

The article outlines fifteen advanced Retrieval‑Augmented Generation (RAG) techniques—from hierarchical indexing and context caching to multimodal alignment and microservice orchestration—explaining how they help transform AI prototypes into scalable, reliable production systems while highlighting common pitfalls and a concluding call to action.

AI production · LLM · RAG
8 min read
DevOps
Jul 21, 2024 · Artificial Intelligence

LLM Fundamentals, Applications, Prompt Engineering, RAG, and Agentic Workflows

This article provides a comprehensive overview of large language models (LLMs), covering their basic concepts, relationship with NLP, development history, parameter scaling, offline deployment, practical applications, prompt‑engineering frameworks, retrieval‑augmented generation, LangChain integration, agents, workflow orchestration, and future directions toward multimodal AI and AGI.

AI applications · Agent · Artificial Intelligence
36 min read