Tagged articles
26 articles
James' Growth Diary
Apr 26, 2026 · Databases

Vector Database Fundamentals: Embedding, Similarity Search, and Index Structures Explained in One Go

This article walks through the complete workflow of turning split text into high‑dimensional vectors, choosing the right embedding model, selecting an appropriate similarity metric, comparing index structures such as Flat, IVF, HNSW and PQ, and finally picking a vector database and integrating it with LangChain.js for production‑grade RAG pipelines.

LangChain, RAG, embeddings
0 likes · 25 min read
AI Engineer Programming
Apr 21, 2026 · Artificial Intelligence

From Bag‑of‑Words to Semantic Vectors: Understanding Embeddings and Similarity Search (Part 1)

The article explains how diverse data can be represented as high‑dimensional vectors, describes exact and approximate nearest‑neighbor search, explores vector quantization, product quantization, locality‑sensitive hashing, and HNSW graphs, and analyzes their speed, accuracy, and memory trade‑offs for large‑scale similarity retrieval.

HNSW, LSH, embeddings
0 likes · 16 min read
James' Growth Diary
Apr 19, 2026 · Artificial Intelligence

Vector Database Basics: Embeddings, Similarity Search, and Index Structures

This article explains how embeddings turn text into high‑dimensional vectors, compares commercial and open‑source embedding models, details cosine, Euclidean and inner‑product similarity metrics, reviews common index structures such as Flat, IVF, HNSW and PQ, and shows how to choose and use a vector database with LangChain.js while avoiding typical pitfalls.

LangChain, RAG, embeddings
0 likes · 25 min read
DeepHub IMBA
Apr 11, 2026 · Artificial Intelligence

Understanding Vector Similarity Search: Flat Index, IVF, and HNSW

This article explains why vector databases are needed for semantic search of unstructured data, covers cosine similarity, and provides a detailed, step‑by‑step comparison of three core index structures (Flat Index, IVF, and HNSW), highlighting their trade‑offs in accuracy and speed.

HNSW, IVF, embeddings
0 likes · 10 min read
Open Source Tech Hub
Feb 19, 2026 · Artificial Intelligence

Build Retrieval‑Augmented Generation (RAG) Agents in PHP with Neuron AI

This guide explains the fundamentals of Retrieval‑Augmented Generation, how embeddings and vector databases enable contextual AI agents, and provides step‑by‑step instructions for installing Neuron AI, writing a PHP RAG class, loading knowledge, and monitoring the agent in production.

AI agents, Neuron AI, PHP
0 likes · 13 min read
BirdNest Tech Talk
Oct 21, 2025 · Artificial Intelligence

How Vector Stores Enable Lightning‑Fast Semantic Search in LangChain

This article explains what vector stores are, outlines their core workflow of adding, querying, and searching embeddings, compares popular back‑ends like FAISS, Chroma, and Pinecone, and walks through a complete Chinese‑language example using LangChain’s FAISS integration with detailed code and result analysis.

AI, FAISS, LangChain
0 likes · 10 min read
BirdNest Tech Talk
Oct 20, 2025 · Artificial Intelligence

How Embedding Models Power Semantic Search: A Hands‑On LangChain Guide

This article explains what embeddings are, how LangChain’s Embeddings interface abstracts various providers, compares common models, and walks through a complete Python example that uses a Chinese‑optimized HuggingFace model to generate document and query vectors, compute cosine similarity, and identify the most relevant text.

LangChain, NLP, Python
0 likes · 9 min read
Qborfy AI
Aug 25, 2025 · Artificial Intelligence

Unlocking AI Understanding: A Deep Dive into Embeddings and Their Real‑World Applications

This article explains how embeddings transform discrete items such as text, images, or user actions into continuous vectors. It walks through the workflow step by step, from tokenization to normalization, highlights core properties, compares popular models, and showcases practical use cases in e‑commerce intent filtering and medical image retrieval, all backed by concrete examples and code.

AI fundamentals, embeddings, model comparison
0 likes · 7 min read
Qborfy AI
Jun 7, 2025 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) Chatbot with LangChain and Streamlit

This guide walks through the complete process of creating a RAG‑powered question‑answering bot using LangChain, Streamlit, and vector‑store embeddings, covering theory, architecture, data loading, chunking, vector indexing, retrieval, LLM integration, and full code implementation with practical examples.

Chatbot, LangChain, Python
0 likes · 13 min read
Coder Circle
May 28, 2025 · Artificial Intelligence

Core AI Concepts Every Spring AI Developer Should Know

This article explains fundamental AI concepts—including models, prompts, prompt templates, embeddings, tokens, structured output, data integration, RAG, and tool calling—and shows how Spring AI simplifies their use for Java developers building intelligent applications.

AI models, Prompt engineering, RAG
0 likes · 13 min read
dbaplus Community
Feb 23, 2025 · Databases

Why Vector Databases Are Really Just Search Engines in Disguise

The article traces the evolution of embedding technology from a secret weapon of tech giants to a mainstream developer tool, explains the rapid rise and subsequent integration of vector databases into traditional search engines, and argues that vector databases are essentially search engines with added vector capabilities.

AI Infrastructure, RAG, database integration
0 likes · 9 min read
AI Large Model Application Practice
Jan 20, 2025 · Artificial Intelligence

How Embeddings Transform Simple Character Codes into Powerful Vectors for LLMs

This article explains how embeddings convert basic character indices into high‑dimensional vectors, describes their training via gradient descent, introduces the embedding matrix, and shows how these vectors enable modern language models to capture semantic relationships and be reused across tasks.

LLM, Neural Networks, embeddings
0 likes · 8 min read
Ops Development & AI Practice
Mar 16, 2024 · Databases

Why ChromaDB Is Becoming the Go-To Vector Store for AI Applications

ChromaDB is an open‑source, AI‑native vector database that efficiently stores, indexes, and retrieves high‑dimensional embeddings, offering fast similarity search, easy integration via flexible APIs, strong scalability, and active community support, making it suitable for recommendation systems, NLP, and image‑recognition workloads.

AI, ChromaDB, embeddings
0 likes · 5 min read
Bitu Technology
Jan 17, 2024 · Artificial Intelligence

Rosetta Stone: Scalable ID Mapping System for Tubi's Content Library Using LLMs and Embeddings

This article describes how Tubi built the Rosetta Stone system—a flexible ID mapping workflow that leverages large language models, embedding similarity ranking, and K‑nearest‑neighbors to unify and enrich metadata across a 200,000‑title library, improve content recommendation, and streamline operations.

Big Data, LLM, content ID mapping
0 likes · 10 min read
Rare Earth Juejin Tech Community
Jan 12, 2024 · Artificial Intelligence

Understanding Vector Databases, ANN Algorithms, and Their Integration with Large Language Models

This article explains the fundamentals of vector databases, how high‑dimensional vector data is generated and stored, reviews common ANN search algorithms such as Flat, k‑means and LSH, discusses benchmarking and product selection, and demonstrates practical integration of vector stores with LLMs using LangChain and Python code.

ANN, LLM integration, Python
0 likes · 17 min read
AI Large Model Application Practice
Oct 18, 2023 · Artificial Intelligence

How to Extract and Embed Tables and Images from PDFs for Multimodal RAG

This article explains a practical approach to parsing PDFs containing text, tables, and images, using the open‑source Unstructured library and LLaVA model, then embedding each modality into a vector store with multi‑vector retrieval to enable accurate semantic search in private‑knowledge RAG pipelines, with optional LangChain integration.

LLM, LangChain, PDF processing
0 likes · 12 min read
Open Source Linux
Sep 8, 2023 · Artificial Intelligence

How ChatGPT Works: Inside the Neural Network That Generates Human‑Like Text

This article explains the inner workings of ChatGPT, covering how large language models predict the next token using probability distributions, the role of embeddings, the transformer architecture with attention heads, training methods, loss functions, and why such a massive neural network can produce coherent, human‑like language.

ChatGPT, Language Model, Neural Networks
0 likes · 79 min read
ITPUB
Jul 5, 2023 · Databases

Why Vector Databases Are Essential for Building Industry‑Specific LLM Applications

Vector databases enable efficient similarity search and storage of high‑dimensional embeddings, allowing enterprises to combine large language models with proprietary knowledge assets to create domain‑specific, accurate, and up‑to‑date AI services, as illustrated with open‑source solutions Chroma and Milvus.

AI, Chroma, LLM
0 likes · 11 min read
Architect
May 29, 2023 · Artificial Intelligence

Understanding Embeddings and Vector Databases for LLM Applications

This article explains what embeddings and vector databases are, how they are generated with models like OpenAI's Ada, why they enable semantic search and help overcome large language model token limits, and demonstrates a practical workflow for retrieving relevant document chunks using cosine similarity.

LLM, embeddings, information retrieval
0 likes · 7 min read
Alipay Experience Technology
Mar 21, 2023 · Artificial Intelligence

How to Make OpenAI’s API Understand Ultra‑Long Insurance Policies

This article explains how to overcome OpenAI's token limits by splitting massive insurance documents into manageable chunks, vectorizing them with embeddings, using a custom "broccoli" algorithm for intelligent segmentation, and compressing text with dictionary mapping and tokenization techniques to enable accurate question‑answering via the API.

API, Document Splitting, NLP
0 likes · 22 min read
Top Architect
Mar 1, 2023 · Artificial Intelligence

Understanding the Internals of ChatGPT: Neural Networks, Embeddings, and Training Techniques

This article provides a comprehensive overview of how ChatGPT works, covering its probabilistic text generation, transformer architecture, embedding representations, neural network training processes, and the underlying principles that enable large language models to produce coherent and meaningful human-like language.

AI, ChatGPT, Language Model
0 likes · 80 min read
Bitu Technology
Jul 8, 2022 · Artificial Intelligence

Applying NLP and Machine Learning to Classify Tubi User Feedback

This article explains how Tubi leverages natural‑language processing, sentence embeddings (USE and BERT), and LightGBM models to automatically categorize large volumes of Net Promoter Score comments and customer‑support tickets, enabling data‑driven product decisions and workflow automation.

LightGBM, NLP, Tubi
0 likes · 11 min read