Topic

RAG

Collection size
178 articles
Page 5 of 9
JD Tech Talk
JD Tech Talk
Jul 16, 2024 · Artificial Intelligence

Task‑Aware Decoding (TaD): A Plug‑and‑Play Method to Mitigate Hallucinations in Large Language Models

TaD, a task‑aware decoding technique jointly developed by JD.com and Tsinghua University and presented at IJCAI 2024, leverages differences between pre‑ and post‑fine‑tuned LLM outputs to construct knowledge vectors, significantly reducing hallucinations across various models, tasks, and data‑scarce scenarios, especially when combined with RAG.

AILLMRAG
0 likes · 18 min read
Task‑Aware Decoding (TaD): A Plug‑and‑Play Method to Mitigate Hallucinations in Large Language Models
Cognitive Technology Team
Cognitive Technology Team
Mar 4, 2025 · Artificial Intelligence

Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval

The article introduces Deep Searcher, an open‑source Agentic Retrieval‑Augmented Generation system that combines large language models, Milvus vector databases, and multi‑step reasoning to deliver enterprise‑grade search, reporting, and complex query capabilities, and compares its performance against traditional RAG and Graph RAG approaches.

LLMRAGVector Database
0 likes · 18 min read
Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval
Cognitive Technology Team
Cognitive Technology Team
Feb 28, 2025 · Artificial Intelligence

Comparative Study of Traditional RAG, GraphRAG, and DeepSearcher for Knowledge Retrieval and Generation

This article examines why Retrieval‑Augmented Generation (RAG) is needed, compares traditional RAG, GraphRAG, and the DeepSearcher framework across architecture, data organization, retrieval mechanisms, result generation, efficiency and accuracy, and provides step‑by‑step implementation guides and experimental results using vector and graph databases.

Artificial IntelligenceDeepSearcherGraph Database
0 likes · 20 min read
Comparative Study of Traditional RAG, GraphRAG, and DeepSearcher for Knowledge Retrieval and Generation
AntData
AntData
Sep 26, 2024 · Artificial Intelligence

DB-GPT: Open-Source AI-Native Data Application Development Framework

DB‑GPT is an open‑source AI‑native data‑application framework that provides multi‑model management, Text‑to‑SQL optimization, RAG, multi‑agent collaboration, and intelligent workflow orchestration, enabling developers to build scalable large‑model database applications, with proven enterprise adoption, community growth, and academic publications.

AIRAGdata engineering
0 likes · 6 min read
DB-GPT: Open-Source AI-Native Data Application Development Framework
AntTech
AntTech
Apr 2, 2025 · Artificial Intelligence

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

The PEAR framework introduces a position‑embedding‑agnostic attention re‑weighting method that detects and suppresses detrimental attention heads in large language models, dramatically improving retrieval‑augmented generation performance without adding any inference overhead, as demonstrated on multiple RAG benchmarks and LLM families.

Attention Re-weightingContext AwarenessLLM
0 likes · 6 min read
PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead
AntTech
AntTech
Jul 2, 2024 · Artificial Intelligence

Design and Implementation of a Generalized Retrieval‑Augmented Generation (RAG) Framework with Graph RAG Support

This article surveys Retrieval‑Augmented Generation (RAG), analyzes the limitations of traditional vector‑based RAG, introduces Graph RAG that leverages knowledge graphs for more reliable context, proposes a universal RAG architecture compatible with vector, graph and full‑text indexes, and details its open‑source implementation, code components, testing, and future research directions.

AIEngineeringGraphRAGKnowledgeGraph
0 likes · 26 min read
Design and Implementation of a Generalized Retrieval‑Augmented Generation (RAG) Framework with Graph RAG Support
Zhihu Tech Column
Zhihu Tech Column
Dec 9, 2024 · Artificial Intelligence

Large Model Application Engineering: ZhiLight Inference Framework and Zhihu Direct Answer System

The article details Zhihu's technical salon on large‑model engineering, covering the RAG‑based Zhihu Direct Answer system, the open‑source ZhiLight inference framework, prompt engineering, agent research, and future plans for integrating AI into product and community workflows.

AI EngineeringAgentInference Framework
0 likes · 8 min read
Large Model Application Engineering: ZhiLight Inference Framework and Zhihu Direct Answer System
Aikesheng Open Source Community
Aikesheng Open Source Community
Nov 12, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models

ChatDBA is a conversational AI system built by Shanghai Aikesheng that employs large language models and Retrieval‑Augmented Generation to help database administrators diagnose faults, learn domain knowledge, and generate or optimize SQL, with a redesigned architecture that addresses early‑stage shortcomings and outlines future enhancements.

ChatDBADatabase AIFault Diagnosis
0 likes · 10 min read
ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models
Aikesheng Open Source Community
Aikesheng Open Source Community
Nov 11, 2024 · Databases

ChatDBA: An AI‑Powered Intelligent Assistant for Database Fault Diagnosis and Management

ChatDBA is an AI‑driven conversational system developed by Shanghai Aikesheng that assists DBAs with fault diagnosis, knowledge learning, SQL generation and optimization by leveraging large language models, RAG architecture, and advanced retrieval and document‑processing techniques.

ChatDBADatabase AIFault Diagnosis
0 likes · 10 min read
ChatDBA: An AI‑Powered Intelligent Assistant for Database Fault Diagnosis and Management
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Feb 22, 2025 · Artificial Intelligence

Deploying DeepSeek Locally with Ollama, Building Personal and Organizational Knowledge Bases, and Integrating with Spring AI

This guide explains how to locally deploy the DeepSeek large‑language model using Ollama on Windows, macOS, and Linux, configure model storage and CORS, build personal and enterprise RAG knowledge bases with AnythingLLM and Open WebUI, and integrate the model into a Spring AI application via Docker and Docker‑Compose.

ContainerizationDeepseekDocker
0 likes · 16 min read
Deploying DeepSeek Locally with Ollama, Building Personal and Organizational Knowledge Bases, and Integrating with Spring AI
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Apr 29, 2024 · Artificial Intelligence

Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices

This comprehensive guide explores the complexities of building enterprise‑level Retrieval‑Augmented Generation (RAG) systems, detailing common failure points, architectural components such as authentication, input guards, query rewriting, document ingestion, indexing, storage, retrieval, generation, observability, caching, and multi‑tenant considerations, and provides actionable best‑practice recommendations for developers and technical leaders.

LLMRAGcaching
0 likes · 32 min read
Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Apr 11, 2024 · Artificial Intelligence

AI Hackathon Journey: Building the "Novel Jump" Bot on Coze Platform

This article recounts the author's participation in a Shenzhen AI Hackathon, detailing the development of an interactive novel‑character chatbot using the Coze platform, describing the workflow design, technical challenges, model choices, knowledge‑base construction, and the final demo and award outcomes.

AIChatbotCoze
0 likes · 12 min read
AI Hackathon Journey: Building the "Novel Jump" Bot on Coze Platform
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Apr 8, 2024 · Artificial Intelligence

PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers

The article introduces PreFLMR, an open‑source, general‑purpose pre‑trained multimodal retriever that leverages fine‑grained late‑interaction to boost retrieval‑augmented generation for knowledge‑intensive visual tasks, describes its M2KR benchmark, training stages, and strong experimental results across multiple tasks.

AIFLMRRAG
0 likes · 11 min read
PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Mar 30, 2024 · Artificial Intelligence

Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design

This article provides an in‑depth overview of the Coze low‑code AI bot platform, covering its core features, product comparisons, step‑by‑step bot creation, RAG implementation, plugin usage, memory mechanisms, cron jobs, agent design, advanced workflow techniques, quality management, and future prospects.

AI BotCozeLLM
0 likes · 25 min read
Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Feb 25, 2024 · Artificial Intelligence

Pinecone Vector Database and Embedding Model Summary from DeepLearning.AI’s AI Course

This article reviews the author’s hands‑on experience with Pinecone’s serverless vector database, various embedding and generation models such as all‑MiniLM‑L6‑v2, text‑embedding‑ada‑002, clip‑ViT‑B‑32, and GPT‑3.5‑turbo‑instruct, and demonstrates how they are applied to semantic search, RAG, recommendation, hybrid, and facial similarity tasks using Python code examples.

AIEmbedding ModelsPinecone
0 likes · 9 min read
Pinecone Vector Database and Embedding Model Summary from DeepLearning.AI’s AI Course
Rare Earth Juejin Tech Community
Rare Earth Juejin Tech Community
Jan 31, 2024 · Artificial Intelligence

Advanced RAG with Semi‑Structured Data Using LangChain, Unstructured, and ChromaDB

This tutorial demonstrates how to build an advanced Retrieval‑Augmented Generation (RAG) system for semi‑structured PDF data by leveraging LangChain, the unstructured library, ChromaDB vector store, and OpenAI models, covering installation, PDF partitioning, element classification, summarization, and query execution.

AIChromaDBLangChain
0 likes · 11 min read
Advanced RAG with Semi‑Structured Data Using LangChain, Unstructured, and ChromaDB
Architecture and Beyond
Architecture and Beyond
Feb 22, 2025 · Artificial Intelligence

Understanding Retrieval‑Augmented Generation (RAG) and Its Role in Enhancing Large Language Models

The article explains how the inherent knowledge‑staleness, hallucination, lack of private data, non‑traceable output, limited long‑text handling, and data‑security concerns of large language models can be mitigated by Retrieval‑Augmented Generation, which combines external retrieval, augmentation, and generation to provide up‑to‑date, reliable, and secure AI responses.

AILLMRAG
0 likes · 15 min read
Understanding Retrieval‑Augmented Generation (RAG) and Its Role in Enhancing Large Language Models
System Architect Go
System Architect Go
Nov 19, 2024 · Artificial Intelligence

Retrieval Augmented Generation (RAG) System Overview and Implementation with LangChain, Redis, and llama.cpp

This article explains the concept, architecture, and step‑by‑step implementation of Retrieval Augmented Generation (RAG), covering indexing, retrieval & generation processes, a practical LangChain‑Redis‑llama.cpp example on Kubernetes, code snippets, test results, challenges, and references.

AIEmbeddingLLM
0 likes · 6 min read
Retrieval Augmented Generation (RAG) System Overview and Implementation with LangChain, Redis, and llama.cpp
Architecture Digest
Architecture Digest
Oct 18, 2024 · Databases

Redis Introduces Multi‑Threaded Query Engine to Boost Vector Search Performance

Redis has launched an enhanced, multi‑threaded query engine that dramatically increases throughput and reduces latency for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications while maintaining sub‑10 ms response times.

Database performanceMulti-threadingQuery Engine
0 likes · 7 min read
Redis Introduces Multi‑Threaded Query Engine to Boost Vector Search Performance