Tagged articles
192 articles
AI Architecture Hub
May 19, 2026 · Artificial Intelligence

Agent Memory: From Theory to Practical Implementation

The article explains how AI agents can acquire long‑term memory by combining three functions—coherence, context, and learning—with four memory types, describes the full retrieval‑store loop, and provides a step‑by‑step Python implementation using OpenAI embeddings, ChromaDB, and forgetting strategies.

AI agents · ChromaDB · Python
0 likes · 17 min read
IT Services Circle
May 17, 2026 · Artificial Intelligence

60 Essential AI Terms Every Programmer Should Master

This article walks programmers through 60 core AI concepts—from the basics of large language models and tokens to advanced topics like prompt engineering, retrieval‑augmented generation, fine‑tuning, and inference optimization—organized into progressive skill levels and illustrated with concrete examples and code snippets.

AI · Fine-tuning · Inference Optimization
0 likes · 25 min read
AI Engineer Programming
May 16, 2026 · Artificial Intelligence

How to Boost RAG Retrieval Quality: Real‑World Cost‑Benefit Analysis

This article examines practical ways to improve Retrieval‑Augmented Generation (RAG) retrieval quality—covering vector database choices, data chunking, embedding models, query expansion, and re‑ranking—while weighing performance gains against operational costs through multiple real‑world case studies.

LLM · RAG · cost-benefit
0 likes · 16 min read
DataFunSummit
May 7, 2026 · Artificial Intelligence

How LanceDB Powers Enterprise‑Level Memory in Volcano Engine’s OpenClaw

The article details Volcano Engine’s LAS AI team’s analysis, selection, and deep optimization of the LanceDB vector database as the core memory plugin for the enterprise‑grade OpenClaw (ArkClaw) agent platform, covering comparative evaluation, custom enhancements, and a vision for a cloud‑edge collaborative memory lake.

ArkClaw · Autodream · Context Engine
0 likes · 16 min read
java1234
May 5, 2026 · Artificial Intelligence

Spring AI 2.0: New Video Tutorial Series Empowers Java Developers with AI

The author announces a refreshed Spring AI 2.0 video tutorial series and provides a detailed overview of the framework’s design goals, provider‑agnostic API, full‑type model support, Spring integration, enterprise value, typical use cases, and a comparison with competing Java AI libraries.

AI Framework · Java · LangChain4j
0 likes · 7 min read
AI Architect Hub
May 3, 2026 · Artificial Intelligence

Choosing the Right Vector Database: Milvus, Chroma, Weaviate, Qdrant, FAISS Compared

This article compares five popular vector databases (Chroma, Milvus, Weaviate, Qdrant, and FAISS), detailing their positioning, strengths, weaknesses, and suitable scenarios. It also covers a selection‑dimension matrix, common pitfalls, code implementations for a unified RAG pipeline, best‑practice recommendations, and discussion questions to guide engineers in choosing and migrating vector stores.

Chroma · FAISS · Milvus
0 likes · 23 min read
DataFunSummit
May 3, 2026 · Artificial Intelligence

From Flawed to Production-Ready: Deep Dive into Building Enterprise-Grade RAG Systems

The article analyzes why early RAG deployments often fall short, dissects the most common technical pain points, from document parsing to vector overload, and presents a systematic roadmap covering hybrid search, reranking, GraphRAG, Agentic RAG, model selection, scalability techniques, and security controls for robust enterprise (B2B) production deployments.

Agentic RAG · Enterprise AI · Fine-tuning
0 likes · 20 min read
AI Explorer
May 2, 2026 · Artificial Intelligence

How Sim Studio Redefines Open-Source AI Agent Orchestration with 28K+ Stars

Sim Studio is an open-source AI agent orchestration platform that provides a visual workflow builder, Copilot-driven natural-language node creation, and native vector-database integration, enabling developers and product teams to construct, deploy, and manage AI-powered employee clusters without writing glue code.

AI agents · Copilot · Sim Studio
0 likes · 6 min read
Shuge Unlimited
Apr 29, 2026 · Databases

Milvus Storage Tuning in Practice: 25× Query Speedup and Three Tricks to Cut Memory Usage by Half

This article walks through Milvus 2.3‑2.6.x storage optimizations—Mmap, tiered storage, and clustering compaction—explaining their principles, configuration hierarchy, benchmark results, and concrete deployment templates that together can boost query performance up to 25‑fold while halving memory consumption.

Milvus · Storage Optimization · clustering compaction
0 likes · 24 min read
AI Illustrated Series
Apr 27, 2026 · Artificial Intelligence

Comprehensive RAG Interview Q&A: 22 In-Depth Questions and Answers

This extensive interview guide covers 22 core RAG questions, detailing the definition, workflow, embedding selection, vector database choices, retrieval optimization, multi‑turn handling, context compression, evaluation metrics, knowledge‑graph integration, operational challenges, Agentic and hybrid RAG, document update strategies, similarity algorithms, and hallucination mitigation, providing concrete examples and practical advice for AI interview preparation.

AI Interview · Embedding · Knowledge Retrieval
0 likes · 29 min read
Wu Shixiong's Large Model Academy
Apr 27, 2026 · Artificial Intelligence

Can Your RAG Pass the Demo? Scaling to 5,000 Docs for Reliable Answers

The article walks through the practical challenges of turning a RAG demo into a production system for 5,000 insurance documents, covering knowledge‑base chunking, embedding model selection, recall‑threshold tuning, hybrid vector‑BM25 retrieval, intent‑aware query routing, prompt constraints, confidence scoring, and operational scaling, with concrete metrics and code examples.

Embedding · Hybrid Retrieval · Prompt engineering
0 likes · 16 min read
AI Illustrated Series
Apr 25, 2026 · Artificial Intelligence

How AI Agents Remember Everything: A Deep Dive into Memory System Design

The article explains why large language models lack persistent memory, introduces a three‑layer memory architecture for AI agents—sensory, working, and long‑term memory—and details how vector databases, embedding models, and retrieval strategies enable cross‑session knowledge retention and personalized assistance.

AI Agent · Embedding · Long-term Memory
0 likes · 24 min read
ByteDance Data Platform
Apr 23, 2026 · Artificial Intelligence

How LanceDB Powers Enterprise‑Scale Memory in OpenClaw Agents

This article details the technical evaluation and deep integration of LanceDB as a memory plugin for the OpenClaw‑based ArkClaw agent platform, covering plugin selection, core enhancements such as mixed retrieval, hierarchical memory, Autodream processing, Context Engine optimizations, Git‑style version control, and the vision of a unified edge‑cloud memory lake.

AI agents · ArkClaw · LLM Memory
0 likes · 12 min read
DeepHub IMBA
Apr 21, 2026 · Artificial Intelligence

Designing Persistent Memory for Production AI Agents: A Five‑Stage Pipeline and Four Design Patterns

Production AI agents require persistent memory to maintain continuity, learn from interactions, and recover from failures, but naïvely stuffing full conversation history into the LLM context incurs prohibitive latency and cost; this article outlines four memory types, a five‑stage pipeline, four design patterns, and practical metrics for building efficient, auditable memory systems.

AI agents · Design Patterns · Knowledge Graph
0 likes · 27 min read
dbaplus Community
Apr 19, 2026 · Databases

Why Vector Databases Exist: Overcoming SQL’s Blind Spot in AI Search

This guide explains how traditional relational databases and SQL struggle with semantic queries needed for AI applications, introduces vector databases and HNSW indexing for efficient similarity search, compares their architectures, and presents a real‑world fraud detection system that combines both technologies.

AI · B+Tree · HNSW
0 likes · 17 min read
AI Architect Hub
Apr 19, 2026 · Artificial Intelligence

Mastering RAG: From Data Cleaning to Vector DBs in AI Applications

This article introduces the second stage of a large‑model application series, detailing the value of Retrieval‑Augmented Generation (RAG), its architecture, and a step‑by‑step outline covering data cleaning, text chunking, vectorization, vector‑DB selection, recall strategies, reranking, and prompt construction.

AI · LLM · Prompt engineering
0 likes · 4 min read
Big Data and Microservices
Apr 19, 2026 · Artificial Intelligence

Why Do AI Agents Forget? Understanding Short‑Term and Long‑Term Memory

This article explains how AI agents store information using short‑term (context window) and long‑term (vector database, RAG, knowledge graph) memory, illustrates the concepts with everyday analogies, and shows how proper memory design improves real‑world applications like customer service bots and personal assistants.

AI agents · AI memory · Knowledge Graph
0 likes · 6 min read
Code Mala Tang
Apr 17, 2026 · Industry Insights

Beyond Memory: How Context Substrates Are Redefining AI Agents

A comprehensive analysis of over 900 GitHub repositories reveals two distinct paradigms for agent memory—backend storage and context substrates—highlighting their technical differences, strengths, limitations, and the emerging shift toward context engineering for long‑running AI agents.

AI · Agent Memory · Knowledge Graph
0 likes · 15 min read
Big Data and Microservices
Apr 17, 2026 · Industry Insights

What Is a Vector Database? Features, Indexing, and Top Open‑Source Options

This article explains what a vector database is, how it stores and retrieves high‑dimensional vector data, outlines its key characteristics and indexing mechanisms, compares it with traditional databases, and reviews common open‑source vector database solutions such as Milvus, Faiss, Weaviate, PgVector, Chroma, LanceDB, Elasticsearch and Qdrant.

AI · Embedding · indexing
0 likes · 14 min read
Alibaba Cloud Native
Apr 14, 2026 · Artificial Intelligence

The Hidden Memory Crisis in AI Agents—and a Scalable Solution

AI agents often forget user intents after a few interactions, leading to poor experience and lost business, and while building a reliable memory system is technically feasible, teams face challenges in storage, retrieval, consistency, scalability, compliance, and operational overhead, which AgentLoop MemoryStore aims to solve with a serverless, enterprise‑grade architecture.

AI memory · Agent Architecture · AgentLoop
0 likes · 21 min read
IT Services Circle
Apr 14, 2026 · Artificial Intelligence

What Is RAG? A Complete Guide to Retrieval‑Augmented Generation for AI Engineers

This article explains Retrieval‑Augmented Generation (RAG), covering why large language models need external knowledge, the full offline‑and‑online workflow, document chunking, embedding evolution, vector database choices, multi‑path retrieval, evaluation metrics, hallucination types, and practical strategies to mitigate them.

AI Evaluation · Embedding · RAG
0 likes · 55 min read
Senior Tony
Apr 11, 2026 · Databases

Why Vectors Need a Dedicated Database and How Milvus Solves It

This article explains what vectors are, why traditional relational databases struggle with high‑dimensional similarity queries, and how the open‑source Milvus vector database efficiently stores, indexes, and retrieves massive vectors for AI applications such as semantic search, image matching, and recommendation.

AI applications · ANN · Milvus
0 likes · 5 min read
James' Growth Diary
Apr 10, 2026 · Artificial Intelligence

Designing Agent Memory Systems: Short‑Term, Long‑Term, and Knowledge Graph Layers

The article breaks down how to build a three‑layer memory architecture for AI agents—short‑term context windows with sliding‑window summarization, long‑term semantic retrieval via vector databases with selective storage and time decay, and a knowledge‑graph layer for relational reasoning—plus implementation tips and common pitfalls.

Agent Memory · Knowledge Graph · LangChain
0 likes · 19 min read
Shuge Unlimited
Apr 10, 2026 · Artificial Intelligence

How Zilliz’s Two Skills Enable AI to Code with pymilvus and Manage Cloud Clusters

This article dissects Zilliz’s Milvus Skill and Zilliz Cloud Skill, showing how a modular set of reference files teaches AI agents to generate pymilvus Python code for vector databases and to operate Zilliz Cloud via CLI, while comparing their architecture, security design, and ecosystem role.

AI Agent · Cloud Management · Hybrid Search
0 likes · 20 min read
AI Engineer Programming
Apr 6, 2026 · Artificial Intelligence

Designing Agent Memory: Comparative Analysis of Claude, OpenAI Codex CLI, OpenClaw, and Claude Code

This article defines agent memory, outlines its three core components and memory classifications, then provides a detailed comparative analysis of the memory designs in Claude Agent SDK, OpenAI Codex CLI, OpenClaw, and Claude Code, highlighting trade‑offs, implementation details, and engineering implications.

Agent Memory · Claude · Context management
0 likes · 29 min read
Wu Shixiong's Large Model Academy
Apr 3, 2026 · Artificial Intelligence

Why Post‑Filtering Fails in Enterprise RAG and How to Securely Pre‑Filter

Enterprise RAG systems often mistakenly apply post‑filtering, retrieving unauthorized documents before permission checks, which violates audit compliance, wastes Top‑K slots, and risks data leakage in multi‑tenant environments; this article explains why pre‑filtering at the vector search layer, proper metadata design, token validation, and dynamic permission handling are essential.

Pre-filtering · RAG · Security
0 likes · 15 min read
Wu Shixiong's Large Model Academy
Apr 1, 2026 · Artificial Intelligence

How to Design an Effective Agent Memory System for Enterprise AI Assistants

This article explains why AI agents need a structured memory module, outlines three memory types from cognitive science, details short‑term and long‑term storage architectures using vector databases, and provides concrete code and management strategies—including conflict resolution, TTL expiration, and privacy compliance—to build a robust Agent Memory system.

Agent Memory · LLM · Mem0
0 likes · 23 min read
Wu Shixiong's Large Model Academy
Mar 27, 2026 · Artificial Intelligence

Securing RAG Systems: A Three‑Layer Permission Framework for Banking AI

This article explains why vector databases lack row‑level security, presents a three‑layer permission architecture—including JWT authentication, Milvus metadata or partition filtering, and post‑retrieval validation—covers document security levels, PostgreSQL RLS, audit logging, caching strategies, and offers interview‑ready talking points.

JWT · Milvus · PostgreSQL RLS
0 likes · 18 min read
Architect's Alchemy Furnace
Mar 20, 2026 · Artificial Intelligence

Why Vector‑Based RAG Falls Short and How PageIndex’s Reasoning‑Based Retrieval Solves It

This article analyzes the fundamental limitations of traditional vector‑based Retrieval‑Augmented Generation, introduces Vectify AI’s reasoning‑driven PageIndex framework, and explains how hierarchical, non‑vector indexing enables more accurate, context‑aware document retrieval for complex, domain‑specific texts.

AI · LLM · PageIndex
0 likes · 15 min read
AI2ML AI to Machine Learning
Mar 10, 2026 · Artificial Intelligence

How Anthropic and Palantir Collaborate on Modern Warfare Information Mining

The article analyzes Palantir's ontology-driven knowledge graph dominance, its shift from graph to vector databases, the three‑layer partnership with Anthropic and AWS, the Digital Twin scaling law, and the technical challenges of data heterogeneity, scaling uncertainty, annotation scarcity, and real‑time computation in modern warfare information mining.

AWS · Anthropic · Digital Twin
0 likes · 9 min read
Woodpecker Software Testing
Mar 6, 2026 · Artificial Intelligence

How RAG Testing Teams Can Successfully Transform in 2024

With RAG becoming the backbone of enterprise AI, traditional API‑UI testing misses critical semantic errors, leading to high hallucination rates; this article outlines why conventional methods fail and presents a three‑pillar transformation—skill rebuilding, process reengineering, and advanced tooling—backed by real‑world case studies.

AI testing · LLM · MLOps
0 likes · 9 min read
Shuge Unlimited
Feb 27, 2026 · Databases

Why Is Milvus, the 43K‑Star Vector Database, So Powerful?

This article analyzes Milvus—its open‑source origins, three deployment modes, four‑layer architecture, eight‑plus indexing algorithms, real‑world case studies, and a detailed comparison with competitors—highlighting its strengths, weaknesses, common pitfalls, and when it’s the right choice for large‑scale AI workloads.

AI workloads · Cloud Native · Deployment
0 likes · 15 min read
DataFunSummit
Feb 25, 2026 · Artificial Intelligence

Why RAG Fails in Production and How to Fix It: Expert Insights

This article summarizes a DataFun‑hosted roundtable where leading AI experts dissect the gap between RAG’s promise and real‑world deployment, exposing low recall, hallucinations, and cost overruns, then present systematic diagnostics, evaluation metrics, hybrid search, and engineering best practices to reliably operationalize RAG in enterprise settings.

Enterprise AI · Hybrid Search · LLM
0 likes · 18 min read
AI Waka
Feb 23, 2026 · Artificial Intelligence

Essential Books to Master Generative AI: From NLP to Multimodal Apps

This guide outlines the key competencies for generative AI professionals and curates a focused reading list—covering NLP fundamentals, software engineering, LLM libraries, vector databases, and multimodal AI—to help readers build practical expertise and deploy impactful AI solutions.

AI learning · Book Recommendations · LangChain
0 likes · 9 min read
AI Engineering
Feb 23, 2026 · Databases

Is Zvec the ‘SQLite Moment’ for Vector Databases?

Alibaba’s newly open‑sourced Zvec brings an in‑process vector database that claims millisecond searches over billions of vectors, supports dense and sparse embeddings, installs via a single pip command, and runs on anything from laptops to edge devices, though users warn of memory limits and unverified security concerns.

Python · RAG · Zvec
0 likes · 3 min read
Qborfy AI
Feb 18, 2026 · Artificial Intelligence

How Retrieval‑Augmented Generation (RAG) Supercharges LLM Answers – Complete Guide & Code

This article explains Retrieval‑Augmented Generation (RAG), detailing its offline knowledge‑base construction and online retrieval‑enhanced generation workflow, comparing it with traditional and fine‑tuned models, and providing step‑by‑step LangChain implementations, advanced techniques, and practical use‑case demos.

Hybrid Search · LangChain · Prompt engineering
0 likes · 16 min read
DataFunTalk
Feb 11, 2026 · Artificial Intelligence

Why Most RAG Deployments Fail and How to Build a Production‑Ready RAG System

This round‑table dissects the gap between RAG’s hype and real‑world production, exposing common pitfalls such as low recall, hallucinations and cost overruns, and then delivers a systematic diagnostic framework, hybrid search strategies, fine‑tuning rules, and practical best‑practice roadmaps for building reliable enterprise RAG solutions.

Agentic RAG · Fine-tuning · Hybrid Search
0 likes · 20 min read
Shuge Unlimited
Feb 11, 2026 · Operations

How to Easily Manage Operations of 10 Milvus Clusters with an Agent Skill

This article walks through the real‑world pain points of monitoring dozens of Milvus collections across multiple clusters, then details a Python‑based Skill that automates connection handling, aggregates collection metadata, evaluates index health with a three‑state model, and provides unified health checks, performance testing, and capacity analysis for reliable large‑scale vector database operations.

Index Management · Milvus · Operations Automation
0 likes · 18 min read
Java Architecture Diary
Feb 10, 2026 · Artificial Intelligence

Boost RAG Accuracy with LangChain4j 1.11.0 Hybrid Search on PgVector

This guide explains why pure vector retrieval often fails for version‑specific queries, introduces hybrid search that combines semantic and keyword matching, and provides step‑by‑step code and SQL examples for enabling PgVector hybrid search in LangChain4j 1.11.0.

Full‑Text Search · Hybrid Search · LangChain4j
0 likes · 11 min read
Architecture and Beyond
Feb 8, 2026 · Artificial Intelligence

Designing Scalable Long-Term Memory for AI Agents: Capture, Compress, Retrieve

This article explains how to build a controllable, editable, and cost‑effective long‑term memory system for AI agents by categorizing memory types, structuring a three‑stage pipeline of capture, AI‑driven compression, and smart retrieval, and choosing appropriate storage back‑ends such as files, knowledge bases, or databases.

Agent Design · Knowledge Base · Long-term Memory
0 likes · 18 min read
Sohu Tech Products
Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

AI · Knowledge Base · LLM
0 likes · 14 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

LLM · Python · RAG
0 likes · 15 min read
Zhuanzhuan Tech
Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR · Knowledge Base · LLM
0 likes · 11 min read
Architects' Tech Alliance
Dec 17, 2025 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: From Theory to Scalable Deployment

This guide explains how Retrieval‑Augmented Generation (RAG) overcomes LLM knowledge staleness, hallucination, and domain‑adaptation challenges by combining external knowledge bases with real‑time retrieval, and provides detailed architecture, optimization techniques, engineering practices, monitoring, cost‑control, and future trends for building production‑grade RAG systems.

AI · Cloudflare · LLM
0 likes · 15 min read
HyperAI Super Neural
Dec 12, 2025 · Artificial Intelligence

AI Open‑Source Forum Recap: Video Generation, Vision, Vector DBs, AI‑Native Language

The AI Open‑Source Forum brought together researchers from Peking University, Tsinghua, Zilliz and MoonBit to share open‑source advances in audio‑synchronized video generation, vector database architecture, lightweight vision backbones, and an AI‑native programming language, highlighting datasets, system designs, and future collaborative directions.

AI · AI‑Native Programming · Video Generation
0 likes · 12 min read
macrozheng
Dec 3, 2025 · Databases

How Redis’s New Multithreaded Query Engine Boosts Vector Search Performance

Redis has introduced a multithreaded query engine that dramatically reduces latency and increases throughput—up to 16×—for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications compared to traditional single‑threaded architectures and competing vector databases.

RAG · database scaling · multithreading
0 likes · 6 min read
Raymond Ops
Nov 23, 2025 · Databases

How to Install and Run Milvus Vector Database with Docker Compose

This guide introduces Milvus, an open‑source vector database for AI workloads, outlines its key features and common use cases, and provides step‑by‑step Docker‑Compose commands to set up Milvus, its storage backend MinIO, and the Attu management UI.

Attu · Docker Compose · Milvus
0 likes · 8 min read
Data Party THU
Nov 9, 2025 · Artificial Intelligence

Mastering Chunking Strategies for Effective RAG: Fixed, Recursive, Semantic, Structured, and Delayed

This article walks through the core RAG pipeline, explains why chunking is the linchpin of retrieval quality, and provides detailed definitions, trade‑offs, and implementation examples for five chunking techniques—fixed, recursive, semantic, structure‑aware, and delayed—so you can choose the right approach for any document‑heavy AI application.

AI · LLM · RAG
0 likes · 10 min read
dbaplus Community
Nov 3, 2025 · Artificial Intelligence

How RAG Turns Natural Language Queries into Accurate SQL for Data Platforms

This article explains how Retrieval‑Augmented Generation (RAG) combines vector databases with large language models to let non‑technical users ask natural‑language questions and receive precise SQL statements, detailing the workflow, architecture, chunking methods, performance gains, and remaining challenges.

Data Platform · LLM · RAG
0 likes · 17 min read
Huawei Cloud Developer Alliance
Oct 9, 2025 · Artificial Intelligence

How Short‑Term and Long‑Term Memory Power LLM‑Based Agents

This article explains the definitions, technical implementations, functions, limitations, and collaborative workflow of short‑term and long‑term memory in large‑language‑model agents, detailing context windows, attention mechanisms, vector storage, retrieval strategies, and future research directions for building personalized, continuously learning AI agents.

Agent Memory · LLM · Long-term Memory
0 likes · 11 min read
DataFunSummit
Oct 6, 2025 · Artificial Intelligence

Why Vector Lakes Are the Next Frontier for AI Data Management

This article explains how Zilliz's Vector Lake extends traditional data lakes with a unified storage‑compute architecture optimized for massive unstructured and vector data, detailing its background, key data types, autonomous‑driving use case, data flow, architecture, and deployment options.

AI data management · Data Lake · Vector Lake
0 likes · 13 min read
JD Tech Talk
Sep 28, 2025 · Artificial Intelligence

What Is Retrieval‑Augmented Generation (RAG) and How Does It Power Modern AI?

This article explains Retrieval‑Augmented Generation (RAG), an AI framework that combines traditional information retrieval with large language models, detailing its core workflow—from knowledge preparation, chunking, and embedding to vector database storage and the question‑answering stage—while highlighting key challenges, tools, and optimization strategies.

AI · Embedding · LLM
0 likes · 15 min read
Bilibili Tech
Sep 26, 2025 · Artificial Intelligence

How RAG Transforms Natural Language Queries into Accurate SQL for Business Users

This article explains how Retrieval‑Augmented Generation (RAG) combines large language models with vector databases to let non‑technical staff query massive membership data using plain language, detailing the workflow, technical architecture, optimization challenges, and real‑world impact on data‑driven decision making.

AI · Data Platform · LLM
0 likes · 17 min read
AI Large Model Application Practice
Sep 23, 2025 · Artificial Intelligence

How MindsDB Turns Any Data Source into an AI‑Powered Query Engine

This article walks through installing MindsDB, configuring its unified data access layer, and demonstrates how to query across relational databases, files, and vector stores while injecting AI models—including traditional ML, LLMs, and embedding models—directly into SQL for intelligent data retrieval and analysis.

AI data integration · LLM · MindsDB
0 likes · 16 min read
DataFunTalk
Sep 20, 2025 · Artificial Intelligence

Why Chroma’s Context Engineering Is Redefining AI Search Infrastructure

Jeff Huber, founder of Chroma, discusses the startup’s mission to turn AI demos into production‑grade applications, critiques the hype around RAG, emphasizes the importance of Context Engineering, and explains how Chroma’s open‑source vector database and cloud service aim to simplify AI search for developers.

AI · Chroma · Context Engineering
0 likes · 32 min read
Data STUDIO
Sep 18, 2025 · Artificial Intelligence

Build a RAG App from Scratch: Master Text Chunking, Vector Retrieval, and Coreference Resolution

This tutorial walks through building a Retrieval‑Augmented Generation (RAG) system from the ground up, covering document parsing, text chunking strategies, vector store creation with ChromaDB, semantic search, prompt engineering for LLMs, conversation memory, coreference handling, and practical optimization tips, all illustrated with complete Python code.

ChromaDB · Python · RAG
0 likes · 19 min read
Data Thinking Notes
Sep 7, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: How LLMs Use Retrieval and Planning to Stay Smart

This article explains the core architecture of AI agents powered by large language models, detailing how planning, short‑term and long‑term memory, and tool integration work together through vector databases, retrieval‑augmented generation, and summarization to enable stateful, intelligent interactions across multiple sessions.

AI Agent · LLM · Memory
0 likes · 10 min read
Data Thinking Notes
Aug 31, 2025 · Artificial Intelligence

Embedding's Role in Retrieval‑Augmented Generation: Basics, Challenges & Future

This article explains how embedding technology converts unstructured data into vector representations, powers precise retrieval in Retrieval‑Augmented Generation (RAG), outlines the evolution of embedding models, discusses current challenges such as long‑text handling and domain adaptation, and highlights emerging solutions.

AI · Embedding · RAG
0 likes · 12 min read
Alibaba Cloud Developer
Aug 21, 2025 · Artificial Intelligence

Why Your AI Defect Deduplication Returns Mixed Data and How to Fix It

This article details the challenges of building an AI‑powered defect deduplication system using Retrieval‑Augmented Generation, explains why LLMs produce composite (spliced) results, diagnoses the root cause as information loss in the RAG pipeline, and presents a step‑by‑step solution that restores atomicity of records for reliable duplicate detection.

AI debugging · Knowledge Base · LLM
0 likes · 14 min read
TAL Education Technology
Jul 31, 2025 · Databases

How Milvus Powers Billion-Scale Vector Search for AI at TAL Education

This article explains how TAL Education leverages the open‑source Milvus vector database—covering its architecture, features, cloud‑native deployment, monitoring, and real‑world AI applications such as intelligent grading and multimodal search—to handle billions of vectors with millisecond‑level similarity retrieval.

AI · Cloud Native · Education Technology
0 likes · 14 min read
DeWu Technology
Jul 30, 2025 · Databases

Why Milvus Outperforms Traditional Databases: Deep Dive into Vector DB Architecture

This article explores the evolution, architecture, and operational challenges of vector databases like Milvus and Zilliz, comparing them with traditional databases, detailing indexing strategies such as HNSW and DiskANN, migration plans, performance benchmarks, and future directions for large‑scale AI‑driven search systems.

AI · Milvus · indexing
0 likes · 26 min read
AI Algorithm Path
Jun 26, 2025 · Artificial Intelligence

The 10 Essential Components of a Retrieval‑Augmented Generation (RAG) System

This guide breaks down the ten core building blocks of a production‑ready RAG pipeline—from input handling and vector stores to prompt engineering, LLM inference, observability, and evaluation—showing why each piece matters, common pitfalls, and practical best‑practice recommendations.

LLM · Observability · Prompt engineering
0 likes · 9 min read
ByteDance Data Platform
Jun 11, 2025 · Databases

BlendHouse: The Award‑Winning Cloud‑Native Vector Database Redefining Search

ByteHouse’s BlendHouse, a cloud‑native vector database system presented at ICDE 2025, won the Best Industry and Application Paper Award, showcasing a high‑performance, universally designed framework with deep mixed‑query optimization that outperforms dedicated vector databases in read/write speed and supports large‑scale multimodal retrieval.

BlendHouse · ICDE 2025 · cloud-native
0 likes · 6 min read
AntData
May 20, 2025 · Artificial Intelligence

How Vector Retrieval Powers AI: Challenges, Solutions, and VSAG’s Open‑Source Breakthrough

The article examines the rapid growth of unstructured data, explains the fundamentals and resource‑intensive nature of vector retrieval, presents Ant Group’s engineering practices—including hybrid HNSW‑DiskANN indexing, performance tricks like BSA pruning and memory prefetching, sparse‑vector and feedback‑driven recall improvements—and outlines the open‑source VSAG roadmap and ecosystem integrations.

AI Infrastructure · Performance Optimization · Vector Retrieval
0 likes · 18 min read
DeWu Technology
May 9, 2025 · Artificial Intelligence

Growth Story of a Technical Lead: Building a One‑Stop Large‑Model Training and Inference Platform at Dewu

Meng, a former Tencent and Alibaba engineer, led Dewu’s one‑stop large‑model training and inference platform, cutting integration costs, creating a shared GPU pool and CI/CD pipeline, building a Milvus vector‑database, and driving self‑directed learning that boosted business value, user experience, and set a roadmap for future RAG and cloud‑native optimizations.

AI Platform · Career Development · Large Model
0 likes · 18 min read
Fun with Large Models
Apr 18, 2025 · Artificial Intelligence

How RAG Works: From Data Prep to LLM Generation Explained

This article breaks down Retrieval‑Augmented Generation (RAG) into its three core stages—data preparation, data retrieval, and LLM generation—showing how document chunking, embedding, vector databases, similarity search, and optional re‑ranking combine to let large language models produce more accurate, knowledge‑grounded answers.

Embedding · LLM · RAG
0 likes · 9 min read
Spring Full-Stack Practical Cases
Apr 10, 2025 · Artificial Intelligence

Build a RAG-Powered Knowledge Base with Spring Boot, Milvus, and Ollama

This guide walks through creating a Retrieval‑Augmented Generation (RAG) system using Spring Boot 3.4.2, Milvus vector database, and the bge‑m3 embedding model via Ollama, covering environment setup, dependency configuration, vector store operations, and integration with a large language model to deliver refined, similarity‑based answers.

Embedding · LLM · Milvus
0 likes · 11 min read
Big Data Technology & Architecture
Apr 3, 2025 · Artificial Intelligence

Understanding Model Context Protocol (MCP), Retrieval-Augmented Generation (RAG), and Vector Databases for LLM Integration

This article explains the Model Context Protocol (MCP) as a standard for LLM‑data integration, describes Retrieval‑Augmented Generation (RAG) techniques to reduce hallucinations, and introduces vector databases like Milvus that store high‑dimensional embeddings for efficient AI retrieval tasks.

LLM · MCP · Milvus
0 likes · 7 min read
Architect
Mar 29, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build Powerful LLM Apps: Prompt Engineering, RAG, and AI Agents Explained

This article guides developers without an AI background through the fundamentals of building large‑language‑model applications, covering prompt engineering, multi‑turn interaction, function calling, retrieval‑augmented generation, vector databases, code assistants, and the MCP protocol for AI agents.

AI Agent · Embedding · Function Calling
0 likes · 51 min read
DaTaobao Tech
Mar 19, 2025 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Principles, Challenges, and Implementation Techniques

Retrieval‑augmented generation (RAG) enhances large language models by integrating a preprocessing pipeline—cleaning, chunking, embedding, and vector storage—with a query‑driven retrieval and prompt‑injection workflow, leveraging vector databases, multi‑stage recall, advanced prompting, and comprehensive evaluation metrics to mitigate knowledge cut‑off, hallucinations, and security issues.

LLM · RAG · Retrieval Augmented Generation
0 likes · 27 min read
Architect
Mar 18, 2025 · Artificial Intelligence

2025 AI Agent Technology Stack: Layers, Core Functions, and Future Directions

The article outlines the 2025 AI Agent technology stack, detailing its five layered architecture—model serving, storage & memory, tooling, framework orchestration, and deployment—while discussing current trends, challenges, and future directions such as tool ecosystem expansion, self‑evolution, and edge‑cloud hybrid deployments.

AI Agent · Deployment · Observability
0 likes · 12 min read
Tencent Technical Engineering
Mar 10, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build LLM Apps: Prompt Engineering, RAG, and Function Calling Explained

This guide shows non‑AI developers how to create large‑model applications by mastering prompt engineering, multi‑turn interactions, Retrieval‑Augmented Generation, function calling, and AI‑Agent integration, with practical code examples, tool design patterns, and deployment tips.

AI Agent · Embedding · Function Calling
0 likes · 48 min read
DevOps
Mar 9, 2025 · Artificial Intelligence

A Beginner's Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling, and AI Agents

This article provides a comprehensive introduction to developing large language model (LLM) applications, covering prompt engineering, zero‑ and few‑shot techniques, function calling, retrieval‑augmented generation (RAG) with embedding and vector databases, code assistants, and the MCP protocol for building AI agents, all aimed at non‑AI specialists.

AI Agent · Embedding · Function Calling
0 likes · 48 min read
IT Services Circle
Mar 8, 2025 · Databases

PostgreSQL Overtaking MySQL: Cloud Adoption, Vector DB Advantage, and Future Database Landscape

The article analyzes recent industry data and expert observations showing PostgreSQL surpassing MySQL in cloud instance counts, CPU usage, and ecosystem support, especially in vector‑database and serverless contexts, while highlighting MySQL's strategic shortcomings and predicting PostgreSQL's dominance in the coming years.

Cloud Databases · Database Trends · PostgreSQL
0 likes · 5 min read
Cognitive Technology Team
Mar 4, 2025 · Artificial Intelligence

Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval

The article introduces Deep Searcher, an open‑source Agentic Retrieval‑Augmented Generation system that combines large language models, Milvus vector databases, and multi‑step reasoning to deliver enterprise‑grade search, reporting, and complex query capabilities, and compares its performance against traditional RAG and Graph RAG approaches.

Agentic · Enterprise search · LLM
0 likes · 18 min read
Tencent Cloud Developer
Mar 4, 2025 · Artificial Intelligence

A Practical Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling and AI Agents

The guide teaches non‑AI developers how to build practical LLM‑powered applications by mastering prompt engineering, function calling, retrieval‑augmented generation, and AI agents, and introduces the Model Context Protocol (MCP) for seamless tool integration, offering a clear learning path to leverage large language models without deep theory.

AI Agent · Function Calling · LLM
0 likes · 48 min read
Cognitive Technology Team
Feb 28, 2025 · Artificial Intelligence

Comparative Study of Traditional RAG, GraphRAG, and DeepSearcher for Knowledge Retrieval and Generation

This article examines why Retrieval‑Augmented Generation (RAG) is needed, compares traditional RAG, GraphRAG, and the DeepSearcher framework across architecture, data organization, retrieval mechanisms, result generation, efficiency and accuracy, and provides step‑by‑step implementation guides and experimental results using vector and graph databases.

DeepSearcher · GraphRAG · Knowledge Retrieval
0 likes · 20 min read