Tagged articles

RAG

1044 articles · Page 11 of 11
21CTO
May 6, 2024 · Databases

How Oracle’s New 23ai Database Brings AI-Powered Vector Search to Enterprises

Oracle’s latest release, Database 23ai, upgrades its 23c platform with AI-driven vector search, RAG capabilities, and enhanced JSON and graph querying, positioning the database as a unified, secure, and scalable solution for handling structured, semi‑structured, and unstructured data across cloud and on‑premises environments.

AI · Oracle · RAG
7 min read
AI Large Model Application Practice
May 3, 2024 · Artificial Intelligence

Can Giant Context LLMs Replace RAG? Exploring the Limits of Long‑Context Retrieval

This article examines whether the rapid growth of large‑language‑model context windows can eliminate the need for retrieval‑augmented generation, presenting experimental needle‑in‑a‑haystack tests, analysis of model performance across token lengths and needle positions, and practical guidance using an open‑source evaluation tool.

AI · Evaluation · LLM
13 min read
DataFunTalk
Apr 29, 2024 · Artificial Intelligence

Practical Experience and Q&A Exploration of Patent Large Models

This article presents a comprehensive overview of the development, training, data preparation, algorithmic strategies, evaluation methods, and RAG integration for a domain‑specific patent large language model, highlighting challenges, practical results, and future research directions.

Domain-specific Model · Patent AI · RAG
19 min read
Rare Earth Juejin Tech Community
Apr 29, 2024 · Artificial Intelligence

Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices

This comprehensive guide explores the complexities of building enterprise‑level Retrieval‑Augmented Generation (RAG) systems, detailing common failure points, architectural components such as authentication, input guards, query rewriting, document ingestion, indexing, storage, retrieval, generation, observability, caching, and multi‑tenant considerations, and provides actionable best‑practice recommendations for developers and technical leaders.

Caching · Enterprise AI · LLM
32 min read
DevOps
Apr 17, 2024 · Artificial Intelligence

Engineering Capabilities for Enterprise Large Model Applications: Prompt Engineering, RAG, and Fine‑Tuning

The article explores how enterprises can build and improve large‑model applications by combining prompt engineering, retrieval‑augmented generation (RAG), and fine‑tuning. It discusses their relationships, optimization dimensions, and testing challenges, and provides practical guidance for SE4AI implementation.

AI Engineering · Enterprise AI · RAG
20 min read
21CTO
Apr 12, 2024 · Artificial Intelligence

How I Built an AI‑Powered Resume Chatbot with LLMs and RAG

Senior developer Jon Olson shares how he created an AI resume assistant using GPT‑4/3.5, LangChain, LlamaIndex, and retrieval‑augmented generation, detailing prompt engineering, backend integration, and future routing features to help job seekers showcase their skills.

AI Chatbot · LLM · LangChain
8 min read
Rare Earth Juejin Tech Community
Apr 12, 2024 · Artificial Intelligence

Typical Business and Technical Architectures for Large Language Model Applications

This article reviews the common business and technical architectures used in large language model (LLM) applications, explains AI Embedded, AI Copilot, and AI Agent modes—including single‑ and multi‑agent systems—and offers guidance on selecting appropriate technology stacks such as prompt‑only, function‑calling agents, RAG, and fine‑tuning.

AI Agent · LLM · RAG
9 min read
Eric Tech Circle
Apr 11, 2024 · Artificial Intelligence

Build a Generative AI RAG App with Spring AI in Minutes

This guide walks you through setting up Spring AI, configuring model providers and vector stores, initializing a Spring Boot project, adding OpenAI credentials, and running a complete RAG (Retrieval‑Augmented Generation) demo with code snippets and sample API calls.

Java · OpenAI · RAG
15 min read
HelloTech
Apr 10, 2024 · Artificial Intelligence

An Overview of LangChain: Architecture, Core Components, and Code Examples

LangChain is an open‑source framework that provides Python and JavaScript SDKs, templates, and services such as LangServe and LangSmith to compose models, embeddings, prompts, indexes, memory, chains, and agents via a concise expression language, enabling rapid prototyping, debugging, and deployment of LLM‑driven applications.

AI Engineering · Agents · JavaScript
19 min read
Alibaba Cloud Developer
Apr 10, 2024 · Artificial Intelligence

Master LangChain in 10 Minutes: From Basics to Advanced AI Engineering

This guide walks AI engineers through a rapid 10‑minute bootstrap of LangChain, explaining its purpose, core concepts, design questions, environment setup, and step‑by‑step code examples that cover APIs, chains, memory, retrieval‑augmented generation, tools, agents, and the overall architecture.

AI Engineering · Agents · LLM
28 min read
Rare Earth Juejin Tech Community
Apr 8, 2024 · Artificial Intelligence

PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers

The article introduces PreFLMR, an open‑source, general‑purpose pre‑trained multimodal retriever that leverages fine‑grained late‑interaction to boost retrieval‑augmented generation for knowledge‑intensive visual tasks, and describes its M2KR benchmark, training stages, and strong experimental results across multiple tasks.

AI · FLMR · Pretrained Models
11 min read
Rare Earth Juejin Tech Community
Mar 30, 2024 · Artificial Intelligence

Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design

This article provides an in‑depth overview of the Coze low‑code AI bot platform, covering its core features, product comparisons, step‑by‑step bot creation, RAG implementation, plugin usage, memory mechanisms, cron jobs, agent design, advanced workflow techniques, quality management, and future prospects.

AI bot · Coze · LLM
25 min read
AI Large Model Application Practice
Mar 29, 2024 · Artificial Intelligence

How RAG Architecture Evolves: From Simple Chains to Flexible RAG Flows

This article examines the evolution of Retrieval‑Augmented Generation (RAG) architectures for large language models, outlines the challenges they face, introduces the modular RAG Flow concept with four workflow paradigms, and provides a step‑by‑step implementation using LangChain and LlamaIndex with code examples.

LLM · LangChain · RAG
15 min read
Sohu Tech Products
Mar 27, 2024 · Artificial Intelligence

Building a RAG Application with Baidu Vector Database and Qianfan Embedding

This tutorial walks through building a Retrieval‑Augmented Generation application by setting up Baidu’s Vector Database and Qianfan embedding service, configuring credentials, creating a document database and vector table, loading and chunking PDFs, generating embeddings, storing them, and performing scalar, vector and hybrid similarity searches, ready for integration with Wenxin LLM for answer generation.

AI Applications · Baidu Qianfan · Embedding
11 min read
Sohu Tech Products
Mar 27, 2024 · Artificial Intelligence

NVIDIA NeMo Framework, TensorRT‑LLM, and RAG for Large Language Model Solutions

NVIDIA’s LLM ecosystem combines the full‑stack NeMo Framework for data curation, distributed training, and fine‑tuning with inference acceleration through TensorRT‑LLM and Triton, plus Retrieval‑Augmented Generation and Guardrails, enabling efficient, low‑latency, knowledge‑grounded model deployment across clusters.

AI acceleration · Model Training · NVIDIA
16 min read
Eric Tech Circle
Mar 24, 2024 · Artificial Intelligence

Running Local LLMs: Ollama vs Hugging Face – A Hands‑On Comparison

This guide compares Ollama and Hugging Face for running large language models locally, detailing API and local execution methods, installation steps, model selection, resource requirements, integration with AnythingLLM, container deployment, embedding and vector store setup, and practical observations on performance and limitations.

AnythingLLM · Docker · Embedding
15 min read
NewBeeNLP
Mar 18, 2024 · Artificial Intelligence

Mastering RAG and LLM Techniques: From Retrieval to Fine‑Tuning

This article provides a comprehensive technical guide on Retrieval‑Augmented Generation (RAG), open‑source large language models such as LLaMA, fine‑tuning methods, evaluation metrics, memory‑optimization tricks, and attention‑related optimizations for modern AI systems.

LLM · LangChain · Memory optimization
19 min read
DataFunTalk
Mar 15, 2024 · Artificial Intelligence

NVIDIA’s NeMo Framework and TensorRT‑LLM: Full‑Stack Solutions for Large Language Models and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end ecosystem for large language models, covering the NeMo Framework’s data processing, distributed training, model fine‑tuning, inference acceleration with TensorRT‑LLM, deployment via Triton, and Retrieval‑Augmented Generation (RAG) techniques that enhance model reliability and performance.

AI · NVIDIA · NeMo
16 min read
Sohu Tech Products
Mar 13, 2024 · Artificial Intelligence

Build a Minimal Retrieval‑Augmented Generation (Tiny‑RAG) from Scratch

This step‑by‑step guide explains how to implement a lightweight Retrieval‑Augmented Generation system—Tiny‑RAG—by creating embedding classes, loading and chunking documents, building a simple vector store, performing similarity search, and integrating a large language model for answer generation, complete with runnable Python code.

Embedding · LLM · Python
14 min read
Baidu Geek Talk
Mar 13, 2024 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG) and Building a Personal Knowledge Base with ERNIE SDK and LangChain

The article explains Retrieval-Augmented Generation (RAG), covering its workflow, advantages, and comparison with fine-tuning, and provides a step-by-step implementation using Baidu's ERNIE SDK, LangChain, and ChromaDB to build a personal knowledge base that answers queries with retrieved context.

AI · ERNIE SDK · Knowledge Base
13 min read
Xiaohe Frontend Team
Mar 6, 2024 · Artificial Intelligence

What the New “Generative AI Act Two” Reveals About the Next AI Wave

Sequoia Capital’s “Generative AI Act Two” report highlights a shift from hype‑driven model releases to user‑centric, end‑to‑end solutions, emphasizing the rise of foundational models as components, the importance of developer tools, emerging RAG and fine‑tuning techniques, and the evolving competitive landscape.

AI market · Foundational models · Generative AI
6 min read
Alibaba Cloud Big Data AI Platform
Feb 27, 2024 · Artificial Intelligence

Build a Knowledge‑Enhanced LLM Chatbot with Alibaba Cloud PAI: A Step‑by‑Step RAG Guide

This comprehensive guide walks AI developers through building a Retrieval‑Augmented Generation (RAG) chatbot on Alibaba Cloud PAI, covering architecture, vector store setup, model deployment, knowledge ingestion, multi‑modal retrieval, fusion, re‑ranking, prompt design, and end‑to‑end configuration with code examples.

Alibaba Cloud · Chatbot · LLM
26 min read
Rare Earth Juejin Tech Community
Feb 25, 2024 · Artificial Intelligence

Pinecone Vector Database and Embedding Model Summary from DeepLearning.AI’s AI Course

This article reviews the author’s hands‑on experience with Pinecone’s serverless vector database, various embedding and generation models such as all‑MiniLM‑L6‑v2, text‑embedding‑ada‑002, clip‑ViT‑B‑32, and GPT‑3.5‑turbo‑instruct, and demonstrates how they are applied to semantic search, RAG, recommendation, hybrid, and facial similarity tasks using Python code examples.

AI · Embedding Models · Pinecone
9 min read
Cloud Native Technology Community
Feb 8, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts LLM Accuracy and Trust

Retrieval‑augmented generation (RAG) enhances large language models by fetching up‑to‑date, authoritative information from external sources, addressing hallucinations, outdated knowledge, and lack of citations, while offering cost‑effective implementation, improved relevance, user trust, and greater developer control through vector databases, semantic search, and prompt engineering.

AI · Prompt Engineering · RAG
10 min read
Baobao Algorithm Notes
Feb 4, 2024 · Industry Insights

Balancing Fun, Utility, and Slow Thinking: The Future of AI Agents

In this talk, the speaker examines the dual goals of AI agents—being entertaining and useful—while introducing the concepts of fast and slow thinking, multimodal perception, long‑term memory, retrieval‑augmented generation, and tool integration as essential steps toward building truly valuable digital companions.

AI Agents · Future AI · Multimodal
18 min read
Rare Earth Juejin Tech Community
Jan 31, 2024 · Artificial Intelligence

Advanced RAG with Semi‑Structured Data Using LangChain, Unstructured, and ChromaDB

This tutorial demonstrates how to build an advanced Retrieval‑Augmented Generation (RAG) system for semi‑structured PDF data by leveraging LangChain, the unstructured library, ChromaDB vector store, and OpenAI models, covering installation, PDF partitioning, element classification, summarization, and query execution.

AI · ChromaDB · LangChain
11 min read
DaTaobao Tech
Dec 27, 2023 · Artificial Intelligence

Deploying a Private LLM Knowledge Base on a MacBook

The guide walks through installing and quantizing the open‑source ChatGLM3‑6B model and the m3e‑base embedder on a MacBook, wrapping them with a FastAPI OpenAI‑compatible service, routing requests through a One‑API gateway, storing metadata in MongoDB and vectors in PostgreSQL pgvector, deploying FastGPT for RAG, ingesting data, and demonstrating 5‑7 second response times, while outlining future improvements.

ChatGLM3 · FastAPI · Knowledge Base
23 min read
Baobao Algorithm Notes
Dec 6, 2023 · Artificial Intelligence

How to Systematically Fix Bad Cases in Large Language Models

The article outlines a structured approach to identifying, categorizing, evaluating impact, and repairing undesirable responses from large language models, covering both model‑level interventions across training stages and practical inference‑time techniques such as parameter tuning, prompt engineering, RAG, and pre/post‑processing safeguards.

Prompt Engineering · RAG · bad case remediation
9 min read
DataFunTalk
Nov 17, 2023 · Databases

Cost as the Primary Driver of Vector Database Industry Development

Vector databases are gaining traction because they reduce the costs of storage, learning, scaling, and working around large‑model limitations, through semantic similarity search, RAG‑based prompt optimization, efficient high‑dimensional indexing, and cloud‑native architectures. The article argues that this makes them essential for modern AI applications, though it is written in a promotional context.

AI · Big Data · RAG
11 min read
Architect
Nov 8, 2023 · Artificial Intelligence

AI Agents Unleashed: From Assistants API to Multi‑Agent Frameworks

The article dissects the rise of AI agents—from OpenAI's Assistants API and multimodal perception‑brain‑action pipelines to retrieval‑augmented generation, tool‑use strategies, single‑ and multi‑agent deployments, and emerging frameworks like AutoGen—while highlighting concrete examples, benchmark results, and current limitations.

AI Agents · Assistants API · Embodied AI
38 min read
AI Large Model Application Practice
Oct 18, 2023 · Artificial Intelligence

How to Extract and Embed Tables and Images from PDFs for Multimodal RAG

This article explains a practical approach to parsing PDFs containing text, tables, and images, using the open‑source Unstructured library and LLaVA model, then embedding each modality into a vector store with multi‑vector retrieval to enable accurate semantic search in private‑knowledge RAG pipelines, with optional LangChain integration.

LLM · LangChain · PDF Processing
12 min read
dbaplus Community
Oct 14, 2023 · Artificial Intelligence

Demystifying Retrieval‑Augmented Generation: From Theory to Working Chatbot

This guide explains the Retrieval‑Augmented Generation (RAG) technique, detailing how user queries are matched to private knowledge bases, how relevant passages are retrieved, and how large language models use those passages to generate context‑aware answers, complete with code examples and practical tips.

Chatbot · Embedding · LLM
19 min read
phodal
Sep 24, 2023 · Artificial Intelligence

Designing a JVM‑Based LLM Framework: Insights from Chocolate Factory

This article explores the design principles, architectural decisions, and practical code examples behind the Chocolate Factory framework, a JVM‑centric LLM development platform inspired by LangChain, LlamaIndex, Spring AI, and PromptFlow, highlighting SDK construction, RAG workflows, and prompt engineering challenges.

JVM · LLM · Prompt Engineering
11 min read
Java High-Performance Architecture
Aug 18, 2023 · Databases

Redis 7.2 Unified Release: Boost AI, Vector Search, and Real‑Time Functions

Redis 7.2, the first Unified Redis Release, introduces AI‑ready vector indexing, hybrid semantic search, scalable RAG support, server‑side Triggers and Functions, enhanced geospatial queries, and a preview of high‑performance searchable indexes, while expanding client library support and integrating Redis Data Integration for seamless enterprise data pipelines.

AI · RAG · Redis
8 min read