Tagged articles

vector database

224 articles · Page 1 of 3
AI Engineer Programming
Jul 4, 2026 · Artificial Intelligence

How Pinecone Nexus Turns Vector Search into an Agent Knowledge Engine

The article analyzes the shift to agent‑centric AI, explains why traditional retrieval creates a costly "ten blue links" loop, and details how Pinecone Nexus’s context compiler and composable retriever, together with the KnowQL language, provide structured, governed knowledge that boosts task completion rates, cuts latency, and reduces token usage by up to 90%.

AI Agents · KnowQL · Pinecone
0 likes · 14 min read
Shuge Unlimited
Jun 29, 2026 · Databases

Inside Milvus’ Index Engine: 3‑Layer Parameter Filling, Compile‑time Hardware Split, and a 16× Memory Trade‑off

The article dissects Milvus’ index engine, revealing that AUTOINDEX relies on a three‑stage default‑parameter pipeline, that CPU/GPU index selection is fixed at compile time via Go build tags, that the C++ Knowhere engine executes the algorithms, and that version aggregation, scalar V3 format, and the new AISAQ index embody deliberate memory‑vs‑IO trade‑offs.

AISAQ · AUTOINDEX · CPU/GPU build tags
0 likes · 26 min read
AI Engineer Programming
Jun 23, 2026 · Artificial Intelligence

Why Data Lineage Is the Final Piece of RAG Governance

The article explains how data lineage in Retrieval‑Augmented Generation systems links data quality, ingestion, and incremental sync into a traceable whole, detailing the five lineage nodes, schema trade‑offs, storage choices, and how lineage supports debugging, impact analysis, and version control.

Data Governance · RAG · data lineage
0 likes · 15 min read
AI Engineer Programming
Jun 22, 2026 · Artificial Intelligence

Ensuring Consistent Incremental Sync in RAG Systems (Part 2)

The article examines how incremental synchronization, index stability, shadow‑index atomic switching, checkpointing, idempotency, backpressure handling, batch‑vs‑streaming trade‑offs, and multi‑layer validation (count reconciliation, content sampling, and retrieval regression) together keep vector‑based RAG knowledge bases reliable and up‑to‑date.

Data Governance · RAG · incremental sync
0 likes · 13 min read
Shuge Unlimited
Jun 21, 2026 · Databases

Why Deleting 1 Million Vectors in Milvus Doesn't Shrink Disk Space: A Deep Dive into 11 CompactionTypes

When Milvus appears to keep disk usage unchanged after deleting a million vectors, the cause is not a bug but a sophisticated compaction system that splits the single compact() API into eleven enum values, six independent policies, and seven special handling paths that together manage different kinds of data waste and ensure safe, incremental reclamation.

Clustering · Compaction · DataCoord
0 likes · 23 min read
AI Engineer Programming
Jun 21, 2026 · Artificial Intelligence

RAG Data Governance: Incremental Sync and Consistency (Part 1)

The article explains how additions, updates, and deletions affect a vector store differently, outlines three layers of incremental synchronization—change detection, change handling, and service stability—and compares timestamp polling, content‑hash diffing, and CDC while discussing consistency models and conflict resolution in distributed vector databases.

CDC · Data Governance · RAG
0 likes · 16 min read
Shuge Unlimited
Jun 20, 2026 · Databases

From 64 to a Million Tenants: Choosing the Right Milvus Multi‑Tenant Layer and Avoiding the 65,536 Ceiling

The article dissects Milvus's four‑layer multi‑tenant architecture—Database, Collection, Partition, and Partition Key—detailing each layer's default tenant limits, isolation strength versus scalability trade‑offs, hidden constraints like the 65,536 capacity ceiling, the Partition Key isolation switch, and practical guidance for selecting the appropriate layer in SaaS and regulated scenarios.

Database isolation · Milvus · Multi-Tenancy
0 likes · 17 min read
AI Engineer Programming
Jun 19, 2026 · Artificial Intelligence

RAG Data Quality: Old Problems in a New Bottle

Even with meticulous cleaning, residual noise, redundant legal clauses, and approximate duplicates can degrade retrieval and generation in RAG systems, while privacy risks from embedding inversion and the need for continuous, metric‑driven governance make data quality the ultimate ceiling for performance.

Data Quality · Embedding Inversion · LLM Retrieval
0 likes · 8 min read
Programmer DD
Jun 18, 2026 · Artificial Intelligence

How Cursor Instantly Understands Massive Codebases

The article dissects Cursor's code‑base indexing pipeline, explaining how semantic vector search, trigram‑based regex filtering, AST‑driven chunking, custom embeddings trained on agent trajectories, Merkle‑tree change detection, and Turbopuffer's namespace‑per‑repo vector store combine to deliver sub‑second, accurate code retrieval even in monorepos with tens of thousands of files.

Cursor · Merkle tree · code indexing
0 likes · 21 min read
Shuge Unlimited
Jun 14, 2026 · Artificial Intelligence

Beyond Vector Storage: Inside Milvus 2.6’s Three‑Layer AI Agent Architecture

Milvus 2.6 transforms from a pure vector‑storage backend into a full‑stack AI‑Agent infrastructure by introducing a three‑layer capability system—coding‑rule, protocol, and runtime—covering memory, retrieval, and tool backends, hybrid search, strict operation ordering, and multiple integration paths, while contrasting traditional RAG with agent‑driven modes.

AI Agents · Hybrid Search · MCP
0 likes · 20 min read
AI Engineer Programming
Jun 14, 2026 · Artificial Intelligence

10 RAG Architectures Every AI Engineer Should Master

The article debunks the claim that Retrieval‑Augmented Generation is obsolete, explains why huge context windows are impractical, and systematically presents ten RAG patterns—from basic Naïve RAG to advanced Graph and Multimodal RAG—detailing their trade‑offs, costs, and suitable use cases.

AI Architecture · Embedding Models · RAG
0 likes · 16 min read
SpringMeng
Jun 14, 2026 · Artificial Intelligence

How I Built an AI Contract Review System for 60,000 RMB in One Month

In 45 days a two‑person team delivered an AI‑powered contract review platform that parses PDFs, extracts key clauses, flags risks, and integrates with enterprise tools, using Python, FastAPI, LangChain, large language models, vector databases and OCR technologies.

AI · Contract Review · FastAPI
0 likes · 7 min read
DataFunTalk
Jun 13, 2026 · Artificial Intelligence

Building an Enterprise‑Grade RAG 2.0 System: Architecture, Challenges, and Best Practices

This article examines the practical challenges of deploying Retrieval‑Augmented Generation (RAG) in enterprise settings, detailing the modular architecture, offline and online pipelines, hybrid retrieval, multi‑stage ranking, knowledge filtering, and two‑stage generation techniques that together improve search completeness, ranking quality, and answer accuracy.

Enterprise AI · Hybrid Search · Knowledge Graph
0 likes · 21 min read
DataFunSummit
Jun 9, 2026 · Artificial Intelligence

From Poor RAG Performance to Production‑Ready Systems: A Deep Technical Walkthrough

The article dissects why early RAG deployments suffer from low recall, hallucinations and runaway costs, then presents a step‑by‑step diagnostic framework, hybrid search architecture, knowledge‑engineering tricks, caching and routing strategies, and explores advanced GraphRAG and Agentic RAG techniques to build reliable, enterprise‑grade solutions.

Agentic RAG · GraphRAG · Hybrid Search
0 likes · 20 min read
AI Architecture Path
Jun 5, 2026 · Artificial Intelligence

Supermemory Tops Three Authority Benchmarks, Solving AI Forgetting

Supermemory, the open‑source AI memory engine, eliminates repeated forgetting by offering a zero‑configuration, multi‑modal memory layer that tops LongMemEval, LoCoMo and ConvoMo benchmarks, integrates automatic learning, mixed RAG‑Memory search, built‑in connectors, privacy tags, and multiple deployment options from no‑code web to local offline versions.

AI memory · Privacy · RAG
0 likes · 14 min read
Java Architect Handbook
Jun 3, 2026 · Artificial Intelligence

What Is Retrieval‑Augmented Generation (RAG) and Why It Matters for LLM Interviews

The article explains Retrieval‑Augmented Generation (RAG), why large language models suffer from hallucination, knowledge cutoff, domain gaps and traceability issues, and how RAG’s offline‑online pipeline, comparison with fine‑tuning and long‑context approaches, and emerging trends like Agentic and Graph‑RAG can be discussed in technical interviews.

AI interview · Large Language Model · Prompt Engineering
0 likes · 12 min read
Linyb Geek Road
May 31, 2026 · Artificial Intelligence

From Prompt to Harness: The Three Evolutions of AI Engineering

The article traces AI engineering's three-stage evolution—from single‑turn Prompt Engineering, through multi‑turn Context Engineering, to system‑level Harness Engineering—explaining the problems each stage solves, the techniques introduced, concrete examples, and why the shift matters for scalable, reliable AI agents.

AI Engineering · Agent · Harness Engineering
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
May 27, 2026 · Artificial Intelligence

Building a Multimodal Search with Alibaba Cloud Elasticsearch and Qwen‑VL

This article demonstrates how to integrate Alibaba Cloud Elasticsearch with the Qwen‑VL large model and DashScope Embedding API to extract image features and perform multimodal vector search, covering text‑to‑image, text‑to‑text, image‑to‑image, and image‑to‑text queries, with step‑by‑step code, environment setup, data loading, indexing, and a Streamlit demo.

AI Embedding · DashScope · Elasticsearch
0 likes · 8 min read
Su San Talks Tech
May 25, 2026 · Artificial Intelligence

Mastering RAG: Chunking, Embeddings, BM25 & Multi‑Index Retrieval in Python

This tutorial explains Retrieval‑Augmented Generation (RAG) from fundamentals to a full pipeline, covering text chunking strategies, VoyageAI embeddings, vector‑store implementation, BM25 lexical search, and a multi‑index retriever that fuses semantic and lexical results with Reciprocal Rank Fusion.

BM25 · Chunking · Python
0 likes · 48 min read

Why Offline Deployment of Dify Is So Challenging – 10 Common Pitfalls and Solutions

Deploying Dify in an offline environment is fraught with hidden traps—from missing Docker images and vector‑database dependencies to network subnet conflicts, plugin‑daemon crashes, and silent external service time‑outs—requiring careful preparation, configuration, and maintenance to achieve a stable setup.

Dify · Docker · Network Configuration
0 likes · 14 min read
AI Architecture Hub
May 19, 2026 · Artificial Intelligence

Agent Memory: From Theory to Practical Implementation

The article explains how AI agents can acquire long‑term memory by combining three functions—coherence, context, and learning—with four memory types, describes the full retrieval‑store loop, and provides a step‑by‑step Python implementation using OpenAI embeddings, ChromaDB, and forgetting strategies.

AI Agents · ChromaDB · Memory systems
0 likes · 17 min read
IT Services Circle
May 17, 2026 · Artificial Intelligence

60 Essential AI Terms Every Programmer Should Master

This article walks programmers through 60 core AI concepts—from the basics of large language models and tokens to advanced topics like prompt engineering, retrieval‑augmented generation, fine‑tuning, and inference optimization—organized into progressive skill levels and illustrated with concrete examples and code snippets.

AI · Inference Optimization · Prompt Engineering
0 likes · 25 min read
AI Engineer Programming
May 16, 2026 · Artificial Intelligence

How to Boost RAG Retrieval Quality: Real‑World Cost‑Benefit Analysis

This article examines practical ways to improve Retrieval‑Augmented Generation (RAG) retrieval quality—covering vector database choices, data chunking, embedding models, query expansion, and re‑ranking—while weighing performance gains against operational costs through multiple real‑world case studies.

LLM · Query Expansion · RAG
0 likes · 16 min read
DataFunSummit
May 7, 2026 · Artificial Intelligence

How LanceDB Powers Enterprise‑Level Memory in Volcano Engine’s OpenClaw

The article details Volcano Engine’s LAS AI team’s analysis, selection, and deep optimization of the LanceDB vector database as the core memory plugin for the enterprise‑grade OpenClaw (ArkClaw) agent platform, covering comparative evaluation, custom enhancements, and a vision for a cloud‑edge collaborative memory lake.

ArkClaw · AutoDream · Context Engine
0 likes · 16 min read
java1234
May 5, 2026 · Artificial Intelligence

Spring AI 2.0: New Video Tutorial Series Empowers Java Developers with AI

The author announces a refreshed Spring AI 2.0 video tutorial series and provides a detailed overview of the framework’s design goals, provider‑agnostic API, full‑type model support, Spring integration, enterprise value, typical use cases, and a comparison with competing Java AI libraries.

AI Framework · Java · LangChain4j
0 likes · 7 min read
AI Architect Hub
May 3, 2026 · Artificial Intelligence

Choosing the Right Vector Database: Milvus, Chroma, Weaviate, Qdrant, FAISS Compared

This article compares five popular vector databases—Chroma, Milvus, Weaviate, Qdrant, and FAISS—detailing their positions, strengths, weaknesses, suitable scenarios, a selection‑dimension matrix, common pitfalls, code implementations for a unified RAG pipeline, best‑practice recommendations, and thought questions to guide engineers in choosing and migrating vector stores.

Chroma · FAISS · Milvus
0 likes · 23 min read
DataFunSummit
May 3, 2026 · Artificial Intelligence

From Flawed to Production-Ready: Deep Dive into Building Enterprise-Grade RAG Systems

The article analyzes why early RAG deployments often fall short, dissects the most common technical pain points—from document parsing to vector overload—and presents a systematic roadmap that includes hybrid search, reranking, GraphRAG, Agentic RAG, model selection, scalability tricks, and security controls for robust B‑side production.

Agentic RAG · Enterprise AI · GraphRAG
0 likes · 20 min read
AI Explorer
May 2, 2026 · Artificial Intelligence

How Sim Studio Redefines Open-Source AI Agent Orchestration with 28K+ Stars

Sim Studio is an open-source AI agent orchestration platform that provides a visual workflow builder, Copilot-driven natural-language node creation, and native vector-database integration, enabling developers and product teams to construct, deploy, and manage AI-powered employee clusters without writing glue code.

AI Agents · Copilot · Sim Studio
0 likes · 6 min read
Shuge Unlimited
Apr 29, 2026 · Databases

Milvus Storage Tuning in Practice: 25× Query Speedup and Three Tricks to Cut Memory Usage by Half

This article walks through Milvus 2.3‑2.6.x storage optimizations—Mmap, tiered storage, and clustering compaction—explaining their principles, configuration hierarchy, benchmark results, and concrete deployment templates that together can boost query performance up to 25‑fold while halving memory consumption.

Milvus · Performance Tuning · Storage Optimization
0 likes · 24 min read
AI Illustrated Series
Apr 27, 2026 · Artificial Intelligence

Comprehensive RAG Interview Q&A: 22 In-Depth Questions and Answers

This extensive interview guide covers 22 core RAG questions, detailing the definition, workflow, embedding selection, vector database choices, retrieval optimization, multi‑turn handling, context compression, evaluation metrics, knowledge‑graph integration, operational challenges, Agentic and hybrid RAG, document update strategies, similarity algorithms, and hallucination mitigation, providing concrete examples and practical advice for AI interview preparation.

AI interview · Embedding · RAG
0 likes · 29 min read
Wu Shixiong's Large Model Academy
Apr 27, 2026 · Artificial Intelligence

Can Your RAG Pass the Demo? Scaling to 5,000 Docs for Reliable Answers

The article walks through the practical challenges of turning a RAG demo into a production system for 5,000 insurance documents, covering knowledge‑base chunking, embedding model selection, recall‑threshold tuning, hybrid vector‑BM25 retrieval, intent‑aware query routing, prompt constraints, confidence scoring, and operational scaling, with concrete metrics and code examples.

Embedding · Hybrid Retrieval · Intent Recognition
0 likes · 16 min read
AI Illustrated Series
Apr 25, 2026 · Artificial Intelligence

How AI Agents Remember Everything: A Deep Dive into Memory System Design

The article explains why large language models lack persistent memory, introduces a three‑layer memory architecture for AI agents—sensory, working, and long‑term memory—and details how vector databases, embedding models, and retrieval strategies enable cross‑session knowledge retention and personalized assistance.

AI Agent · Embedding · Memory Architecture
0 likes · 24 min read
ByteDance Data Platform
Apr 23, 2026 · Artificial Intelligence

How LanceDB Powers Enterprise‑Scale Memory in OpenClaw Agents

This article details the technical evaluation and deep integration of LanceDB as a memory plugin for the OpenClaw‑based ArkClaw agent platform, covering plugin selection, core enhancements such as mixed retrieval, hierarchical memory, Autodream processing, Context Engine optimizations, Git‑style version control, and the vision of a unified edge‑cloud memory lake.

AI Agents · ArkClaw · LLM memory
0 likes · 12 min read
Linyb Geek Road
Apr 22, 2026 · Artificial Intelligence

How to Build Short‑Term and Long‑Term Memory for LLM Agents Using Vector DBs and RAG

The article analyzes Agent memory design by comparing human short‑term and long‑term memory, explains context‑window management strategies, outlines persistent storage options such as vector databases, relational stores, knowledge graphs and fine‑tuning, and presents a three‑layer architecture with write, retrieval and forgetting mechanisms.

Agent Memory · LLM · LangChain
0 likes · 15 min read
Linyb Geek Road
Apr 22, 2026 · Artificial Intelligence

How to Design an Effective Memory Module for LLM Agents?

The article analyzes why memory is essential for practical LLM agents, categorizes four memory types, proposes a perception‑judgment‑refinement‑storage pipeline, introduces a three‑dimensional retrieval scoring model, and outlines a three‑layer architecture with reflection, merging, and forgetting mechanisms.

Agent · LLM · Memory Design
0 likes · 15 min read
DeepHub IMBA
Apr 21, 2026 · Artificial Intelligence

Designing Persistent Memory for Production AI Agents: A Five‑Stage Pipeline and Four Design Patterns

Production AI agents require persistent memory to maintain continuity, learn from interactions, and recover from failures, but naïvely stuffing full conversation history into the LLM context incurs prohibitive latency and cost; this article outlines four memory types, a five‑stage pipeline, four design patterns, and practical metrics for building efficient, auditable memory systems.

AI Agents · Knowledge Graph · LLM
0 likes · 27 min read
dbaplus Community
Apr 19, 2026 · Databases

Why Vector Databases Exist: Overcoming SQL’s Blind Spot in AI Search

This guide explains how traditional relational databases and SQL struggle with semantic queries needed for AI applications, introduces vector databases and HNSW indexing for efficient similarity search, compares their architectures, and presents a real‑world fraud detection system that combines both technologies.

AI · B+Tree · HNSW
0 likes · 17 min read
AI Architect Hub
Apr 19, 2026 · Artificial Intelligence

Mastering RAG: From Data Cleaning to Vector DBs in AI Applications

This article introduces the second stage of a large‑model application series, detailing the value of Retrieval‑Augmented Generation (RAG), its architecture, and a step‑by‑step outline covering data cleaning, text chunking, vectorization, vector‑DB selection, recall strategies, reranking, and prompt construction.

AI · LLM · Prompt Engineering
0 likes · 4 min read
Big Data and Microservices
Apr 19, 2026 · Artificial Intelligence

Why Do AI Agents Forget? Understanding Short‑Term and Long‑Term Memory

This article explains how AI agents store information using short‑term (context window) and long‑term (vector database, RAG, knowledge graph) memory, illustrates the concepts with everyday analogies, and shows how proper memory design improves real‑world applications like customer service bots and personal assistants.

AI Agents · AI memory · Knowledge Graph
0 likes · 6 min read
Code Mala Tang
Apr 17, 2026 · Industry Insights

Beyond Memory: How Context Substrates Are Redefining AI Agents

A comprehensive analysis of over 900 GitHub repositories reveals two distinct paradigms for agent memory—backend storage and context substrates—highlighting their technical differences, strengths, limitations, and the emerging shift toward context engineering for long‑running AI agents.

AI · Agent Memory · Knowledge Graph
0 likes · 15 min read
Big Data and Microservices
Apr 17, 2026 · Industry Insights

What Is a Vector Database? Features, Indexing, and Top Open‑Source Options

This article explains what a vector database is, how it stores and retrieves high‑dimensional vector data, outlines its key characteristics and indexing mechanisms, compares it with traditional databases, and reviews common open‑source vector database solutions such as Milvus, Faiss, Weaviate, PgVector, Chroma, LanceDB, Elasticsearch and Qdrant.

AI · Embedding · Indexing
0 likes · 14 min read
Linyb Geek Road
Apr 17, 2026 · Artificial Intelligence

Clarifying the Key Components of AI Large‑Model Development: Vectors, Vector Models, and RAG

This article explains how vectors encode text or images, how vector (embedding) models generate these numeric representations, why specialized vector databases are needed for similarity search, and how Retrieval‑Augmented Generation (RAG) combines them to produce reliable answers while stressing the necessity of using the same model throughout the pipeline.

AI · Large Language Model · RAG
0 likes · 8 min read
Alibaba Cloud Native
Apr 14, 2026 · Artificial Intelligence

The Hidden Memory Crisis in AI Agents—and a Scalable Solution

AI agents often forget user intents after a few interactions, leading to poor experience and lost business, and while building a reliable memory system is technically feasible, teams face challenges in storage, retrieval, consistency, scalability, compliance, and operational overhead, which AgentLoop MemoryStore aims to solve with a serverless, enterprise‑grade architecture.

AI memory · AgentLoop · OpenClaw
0 likes · 21 min read
IT Services Circle
Apr 14, 2026 · Artificial Intelligence

What Is RAG? A Complete Guide to Retrieval‑Augmented Generation for AI Engineers

This article explains Retrieval‑Augmented Generation (RAG), covering why large language models need external knowledge, the full offline‑and‑online workflow, document chunking, embedding evolution, vector database choices, multi‑path retrieval, evaluation metrics, hallucination types, and practical strategies to mitigate them.

AI evaluation · Embedding · RAG
0 likes · 55 min read
Senior Tony
Apr 11, 2026 · Databases

Why Vectors Need a Dedicated Database and How Milvus Solves It

This article explains what vectors are, why traditional relational databases struggle with high‑dimensional similarity queries, and how the open‑source Milvus vector database efficiently stores, indexes, and retrieves massive vectors for AI applications such as semantic search, image matching, and recommendation.

AI Applications · ANN · Databases
0 likes · 5 min read
James' Growth Diary
Apr 10, 2026 · Artificial Intelligence

Designing Agent Memory Systems: Short‑Term, Long‑Term, and Knowledge Graph Layers

The article breaks down how to build a three‑layer memory architecture for AI agents—short‑term context windows with sliding‑window summarization, long‑term semantic retrieval via vector databases with selective storage and time decay, and a knowledge‑graph layer for relational reasoning—plus implementation tips and common pitfalls.

Agent Memory · Knowledge Graph · LangChain
0 likes · 19 min read
Shuge Unlimited
Apr 10, 2026 · Artificial Intelligence

How Zilliz’s Two Skills Enable AI to Code with pymilvus and Manage Cloud Clusters

This article dissects Zilliz’s Milvus Skill and Zilliz Cloud Skill, showing how a modular set of reference files teaches AI agents to generate pymilvus Python code for vector databases and to operate Zilliz Cloud via CLI, while comparing their architecture, security design, and ecosystem role.

AI Agent · Cloud Management · Hybrid Search
0 likes · 20 min read
AI Engineer Programming
Apr 6, 2026 · Artificial Intelligence

Designing Agent Memory: Comparative Analysis of Claude, OpenAI Codex CLI, OpenClaw, and Claude Code

This article defines agent memory, outlines its three core components and memory classifications, then provides a detailed comparative analysis of the memory designs in Claude Agent SDK, OpenAI Codex CLI, OpenClaw, and Claude Code, highlighting trade‑offs, implementation details, and engineering implications.

Agent Memory · Claude · Context Management
0 likes · 29 min read
Wu Shixiong's Large Model Academy
Apr 3, 2026 · Artificial Intelligence

Why Post‑Filtering Fails in Enterprise RAG and How to Securely Pre‑Filter

Enterprise RAG systems often mistakenly apply post‑filtering, retrieving unauthorized documents before permission checks, which violates audit compliance, wastes Top‑K slots, and risks data leakage in multi‑tenant environments; this article explains why pre‑filtering at the vector search layer, proper metadata design, token validation, and dynamic permission handling are essential.

Multi‑tenant · Permission control · RAG
0 likes · 15 min read
Wu Shixiong's Large Model Academy
Apr 1, 2026 · Artificial Intelligence

How to Design an Effective Agent Memory System for Enterprise AI Assistants

This article explains why AI agents need a structured memory module, outlines three memory types from cognitive science, details short‑term and long‑term storage architectures using vector databases, and provides concrete code and management strategies—including conflict resolution, TTL expiration, and privacy compliance—to build a robust Agent Memory system.

Agent Memory · LLM · Memory Management
0 likes · 23 min read
Wu Shixiong's Large Model Academy
Mar 27, 2026 · Artificial Intelligence

Securing RAG Systems: A Three‑Layer Permission Framework for Banking AI

This article explains why vector databases lack row‑level security, presents a three‑layer permission architecture—including JWT authentication, Milvus metadata or partition filtering, and post‑retrieval validation—covers document security levels, PostgreSQL RLS, audit logging, caching strategies, and offers interview‑ready talking points.

JWT · Milvus · Permission Management
0 likes · 18 min read
Architect's Alchemy Furnace
Mar 20, 2026 · Artificial Intelligence

Why Vector‑Based RAG Falls Short and How PageIndex’s Reasoning‑Based Retrieval Solves It

This article analyzes the fundamental limitations of traditional vector‑based Retrieval‑Augmented Generation, introduces Vectify AI’s reasoning‑driven PageIndex framework, and explains how hierarchical, non‑vector indexing enables more accurate, context‑aware document retrieval for complex, domain‑specific texts.

AI · LLM · PageIndex
0 likes · 15 min read
AI2ML AI to Machine Learning
Mar 10, 2026 · Artificial Intelligence

How Anthropic and Palantir Collaborate on Modern Warfare Information Mining

The article analyzes Palantir's ontology-driven knowledge graph dominance, its shift from graph to vector databases, the three‑layer partnership with Anthropic and AWS, the Digital Twin scaling law, and the technical challenges of data heterogeneity, scaling uncertainty, annotation scarcity, and real‑time computation in modern warfare information mining.

AWS · Anthropic · Digital Twin
0 likes · 9 min read
Woodpecker Software Testing
Mar 6, 2026 · Artificial Intelligence

How RAG Testing Teams Can Successfully Transform in 2024

With RAG becoming the backbone of enterprise AI, traditional API and UI testing misses critical semantic errors, so high hallucination rates go uncaught; this article outlines why conventional methods fail and presents a three‑pillar transformation (skill rebuilding, process reengineering, and advanced tooling) backed by real‑world case studies.

AI testing · Hallucination · LLM
0 likes · 9 min read
Shuge Unlimited
Feb 27, 2026 · Databases

Why Is Milvus, the 43K‑Star Vector Database, So Powerful?

This article analyzes Milvus—its open‑source origins, three deployment modes, four‑layer architecture, eight‑plus indexing algorithms, real‑world case studies, and a detailed comparison with competitors—highlighting its strengths, weaknesses, common pitfalls, and when it’s the right choice for large‑scale AI workloads.

AI workloads · Cloud Native · Indexing
0 likes · 15 min read
DataFunSummit
Feb 25, 2026 · Artificial Intelligence

Why RAG Fails in Production and How to Fix It: Expert Insights

This article summarizes a DataFun‑hosted roundtable where leading AI experts dissect the gap between RAG’s promise and real‑world deployment, exposing low recall, hallucinations, and cost overruns, then present systematic diagnostics, evaluation metrics, hybrid search, and engineering best practices to reliably operationalize RAG in enterprise settings.

Enterprise AI · Hybrid Search · LLM
0 likes · 18 min read
AI Waka
Feb 23, 2026 · Artificial Intelligence

Essential Books to Master Generative AI: From NLP to Multimodal Apps

This guide outlines the key competencies for generative AI professionals and curates a focused reading list—covering NLP fundamentals, software engineering, LLM libraries, vector databases, and multimodal AI—to help readers build practical expertise and deploy impactful AI solutions.

AI learning · Book Recommendations · Generative AI
0 likes · 9 min read
AI Engineering
Feb 23, 2026 · Databases

Is Zvec the ‘SQLite Moment’ for Vector Databases?

Alibaba’s newly open‑sourced Zvec brings an in‑process vector database that claims millisecond searches over billions of vectors, supports dense and sparse embeddings, installs via a single pip command, and runs on anything from laptops to edge devices, though users warn of memory limits and unverified security concerns.

Python · RAG · Zvec
0 likes · 3 min read
Qborfy AI
Feb 18, 2026 · Artificial Intelligence

How Retrieval‑Augmented Generation (RAG) Supercharges LLM Answers – Complete Guide & Code

This article explains Retrieval‑Augmented Generation (RAG), detailing its offline knowledge‑base construction and online retrieval‑enhanced generation workflow, comparing it with traditional and fine‑tuned models, and providing step‑by‑step LangChain implementations, advanced techniques, and practical use‑case demos.

Embedding Models · Hybrid Search · LangChain
0 likes · 16 min read
DataFunTalk
Feb 11, 2026 · Artificial Intelligence

Why Most RAG Deployments Fail and How to Build a Production‑Ready RAG System

This round‑table dissects the gap between RAG’s hype and real‑world production, exposing common pitfalls such as low recall, hallucinations and cost overruns, and then delivers a systematic diagnostic framework, hybrid search strategies, fine‑tuning rules, and practical best‑practice roadmaps for building reliable enterprise RAG solutions.

Agentic RAG · Hybrid Search · LLM
0 likes · 20 min read
Shuge Unlimited
Feb 11, 2026 · Operations

How to Easily Manage Operations of 10 Milvus Clusters with an Agent Skill

This article walks through the real‑world pain points of monitoring dozens of Milvus collections across multiple clusters, then details a Python‑based Skill that automates connection handling, aggregates collection metadata, evaluates index health with a three‑state model, and provides unified health checks, performance testing, and capacity analysis for reliable large‑scale vector database operations.

Index management · Milvus · Monitoring
0 likes · 18 min read
Java Architecture Diary
Feb 10, 2026 · Artificial Intelligence

Boost RAG Accuracy with LangChain4j 1.11.0 Hybrid Search on PgVector

This guide explains why pure vector retrieval often fails for version‑specific queries, introduces hybrid search that combines semantic and keyword matching, and provides step‑by‑step code and SQL examples for enabling PgVector hybrid search in LangChain4j 1.11.0.

Full-Text Search · Hybrid Search · LangChain4j
0 likes · 11 min read
Architecture and Beyond
Feb 8, 2026 · Artificial Intelligence

Designing Scalable Long-Term Memory for AI Agents: Capture, Compress, Retrieve

This article explains how to build a controllable, editable, and cost‑effective long‑term memory system for AI agents by categorizing memory types, structuring a three‑stage pipeline of capture, AI‑driven compression, and smart retrieval, and choosing appropriate storage back‑ends such as files, knowledge bases, or databases.

Agent Design · Knowledge Base · artificial-intelligence
0 likes · 18 min read
Amazon Cloud Developers
Jan 14, 2026 · Databases

How OpenSearch Service Boosts Vector Database Build Speed by Up to 10× and Cuts Costs by 75%

Amazon OpenSearch Service now offers serverless GPU‑accelerated vector indexing and automatic optimization, enabling users to build billion‑scale vector databases up to ten times faster, reduce indexing costs to one‑quarter, and balance latency, quality, and memory without manual tuning.

AWS CLI · Amazon OpenSearch Service · GPU Acceleration
0 likes · 9 min read
Sohu Tech Products
Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

AI · Knowledge Base · LLM
0 likes · 14 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

Knowledge Extraction · LLM · Python
0 likes · 15 min read
Zhuanzhuan Tech
Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR · Knowledge Base · LLM
0 likes · 11 min read
Architects' Tech Alliance
Dec 17, 2025 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: From Theory to Scalable Deployment

This guide explains how Retrieval‑Augmented Generation (RAG) overcomes LLM knowledge staleness, hallucination, and domain‑adaptation challenges by combining external knowledge bases with real‑time retrieval, and provides detailed architecture, optimization techniques, engineering practices, monitoring, cost‑control, and future trends for building production‑grade RAG systems.

AI · Cloudflare · LLM
0 likes · 15 min read
HyperAI Super Neural
Dec 12, 2025 · Artificial Intelligence

AI Open‑Source Forum Recap: Video Generation, Vision, Vector DBs, AI‑Native Language

The AI Open‑Source Forum brought together researchers from Peking University, Tsinghua, Zilliz and MoonBit to share open‑source advances in audio‑synchronized video generation, vector database architecture, lightweight vision backbones, and an AI‑native programming language, highlighting datasets, system designs, and future collaborative directions.

AI · AI‑Native Programming · Vision models
0 likes · 12 min read
macrozheng
Dec 3, 2025 · Databases

How Redis’s New Multithreaded Query Engine Boosts Vector Search Performance

Redis has introduced a multithreaded query engine that dramatically reduces latency and increases throughput—up to 16×—for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications compared to traditional single‑threaded architectures and competing vector databases.

Performance Benchmark · RAG · Redis
0 likes · 6 min read
Raymond Ops
Nov 23, 2025 · Databases

How to Install and Run Milvus Vector Database with Docker Compose

This guide introduces Milvus, an open‑source vector database for AI workloads, outlines its key features and common use cases, and provides step‑by‑step Docker‑Compose commands to set up Milvus, its storage backend MinIO, and the Attu management UI.

Attu · Docker Compose · Milvus
0 likes · 8 min read
Data Party THU
Nov 9, 2025 · Artificial Intelligence

Mastering Chunking Strategies for Effective RAG: Fixed, Recursive, Semantic, Structured, and Delayed

This article walks through the core RAG pipeline, explains why chunking is the linchpin of retrieval quality, and provides detailed definitions, trade‑offs, and implementation examples for five chunking techniques—fixed, recursive, semantic, structure‑aware, and delayed—so you can choose the right approach for any document‑heavy AI application.

AI · Chunking · LLM
0 likes · 10 min read
dbaplus Community
Nov 3, 2025 · Artificial Intelligence

How RAG Turns Natural Language Queries into Accurate SQL for Data Platforms

This article explains how Retrieval‑Augmented Generation (RAG) combines vector databases with large language models to let non‑technical users ask natural‑language questions and receive precise SQL statements, detailing the workflow, architecture, chunking methods, performance gains, and remaining challenges.

Data Platform · LLM · RAG
0 likes · 17 min read
Huawei Cloud Developer Alliance
Oct 9, 2025 · Artificial Intelligence

How Short‑Term and Long‑Term Memory Power LLM‑Based Agents

This article explains the definitions, technical implementations, functions, limitations, and collaborative workflow of short‑term and long‑term memory in large‑language‑model agents, detailing context windows, attention mechanisms, vector storage, retrieval strategies, and future research directions for building personalized, continuously learning AI agents.

Agent Memory · LLM · Short-term Memory
0 likes · 11 min read
DataFunSummit
Oct 6, 2025 · Artificial Intelligence

Why Vector Lakes Are the Next Frontier for AI Data Management

This article explains how Zilliz's Vector Lake extends traditional data lakes with a unified storage‑compute architecture optimized for massive unstructured and vector data, detailing its background, key data types, autonomous‑driving use case, data flow, architecture, and deployment options.

AI data management · Data Lake · Vector Lake
0 likes · 13 min read
JD Tech Talk
Sep 28, 2025 · Artificial Intelligence

What Is Retrieval‑Augmented Generation (RAG) and How Does It Power Modern AI?

This article explains Retrieval‑Augmented Generation (RAG), an AI framework that combines traditional information retrieval with large language models, detailing its core workflow—from knowledge preparation, chunking, and embedding to vector database storage and the question‑answering stage—while highlighting key challenges, tools, and optimization strategies.

AI · Chunking · Embedding
0 likes · 15 min read
Bilibili Tech
Sep 26, 2025 · Artificial Intelligence

How RAG Transforms Natural Language Queries into Accurate SQL for Business Users

This article explains how Retrieval‑Augmented Generation (RAG) combines large language models with vector databases to let non‑technical staff query massive membership data using plain language, detailing the workflow, technical architecture, optimization challenges, and real‑world impact on data‑driven decision making.

AI · Data Platform · LLM
0 likes · 17 min read
AI Large Model Application Practice
Sep 23, 2025 · Artificial Intelligence

How MindsDB Turns Any Data Source into an AI‑Powered Query Engine

This article walks through installing MindsDB, configuring its unified data access layer, and demonstrates how to query across relational databases, files, and vector stores while injecting AI models—including traditional ML, LLMs, and embedding models—directly into SQL for intelligent data retrieval and analysis.

AI data integration · LLM · MindsDB
0 likes · 16 min read
DataFunTalk
Sep 20, 2025 · Artificial Intelligence

Why Chroma’s Context Engineering Is Redefining AI Search Infrastructure

Jeff Huber, founder of Chroma, discusses the startup’s mission to turn AI demos into production‑grade applications, critiques the hype around RAG, emphasizes the importance of Context Engineering, and explains how Chroma’s open‑source vector database and cloud service aim to simplify AI search for developers.

AI · Chroma · context engineering
0 likes · 32 min read
Data STUDIO
Sep 18, 2025 · Artificial Intelligence

Build a RAG App from Scratch: Master Text Chunking, Vector Retrieval, and Coreference Resolution

This tutorial walks through building a Retrieval‑Augmented Generation (RAG) system from the ground up, covering document parsing, text chunking strategies, vector store creation with ChromaDB, semantic search, prompt engineering for LLMs, conversation memory, coreference handling, and practical optimization tips, all illustrated with complete Python code.

ChromaDB · Python · RAG
0 likes · 19 min read
Data Thinking Notes
Sep 7, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: How LLMs Use Retrieval and Planning to Stay Smart

This article explains the core architecture of AI agents powered by large language models, detailing how planning, short‑term and long‑term memory, and tool integration work together through vector databases, retrieval‑augmented generation, and summarization to enable stateful, intelligent interactions across multiple sessions.

AI Agent · LLM · memory
0 likes · 10 min read