Tagged articles
27 articles
AI Engineer Programming
Apr 26, 2026 · Artificial Intelligence

From Bag‑of‑Words to Semantics: How Embeddings Turn Meaning into Numbers (Part 2)

The article explains how embedding techniques encode semantic information into numeric vectors, covering Word2Vec and GloVe fundamentals, BERT anisotropy, SimCSE contrastive learning, alignment and uniformity metrics, ANN index structures such as HNSW, IVF and PQ, Matryoshka representation learning, practical deployment challenges, and evaluation best practices.

ANN · BERT · Embedding
0 likes · 23 min read
Senior Tony
Apr 11, 2026 · Databases

Why Vectors Need a Dedicated Database and How Milvus Solves It

This article explains what vectors are, why traditional relational databases struggle with high‑dimensional similarity queries, and how the open‑source Milvus vector database efficiently stores, indexes, and retrieves massive vectors for AI applications such as semantic search, image matching, and recommendation.

AI applications · ANN · Milvus
0 likes · 5 min read
Volcano Engine Developer Services
Dec 5, 2025 · Artificial Intelligence

Why Vectors Power Scalable AI Search and How S3 Vectors Redefines Storage

This article explains how high‑dimensional vectors enable semantic AI search, compares exact and approximate nearest‑neighbor algorithms, examines the challenges of large‑scale vector storage, and evaluates AWS S3 Vectors' architecture, pricing, and hybrid solutions for cost‑effective, high‑performance retrieval.

AI semantics · ANN · S3 Vectors
0 likes · 17 min read
Xiaohongshu Tech REDtech
May 22, 2025 · Artificial Intelligence

Scalable Overload-Aware Graph-Based Index Construction for 10‑Billion‑Scale Vector Similarity Search (SOGAIC)

The paper introduces SOGAIC, a scalable overload‑aware graph‑based index construction system for billion‑scale vector similarity search that uses adaptive overlapping partitioning and load‑balanced distributed scheduling to cut construction time by 47.3% while maintaining high recall.

ANN · Distributed Scheduling · graph index
0 likes · 13 min read
StarRocks
Feb 11, 2025 · Databases

How StarRocks Supercharges Vector Search: 7× Faster Queries and 1/3 Cost

This article explains the principles and practical implementation of vector retrieval in StarRocks, covering approximate nearest‑neighbor algorithms, index design, query planning, performance optimizations, real‑world case studies, and future challenges, and shows how query latency dropped from 15 seconds to 2 seconds while costs fell to a third.

ANN · HNSW · IVFPQ
0 likes · 25 min read
Baidu Tech Salon
Nov 22, 2024 · Artificial Intelligence

How GPU‑Accelerated ANN Search Cuts Costs and Boosts Throughput in High‑Volume Retrieval

This article analyzes a GPU‑based approximate nearest neighbor (ANN) retrieval solution built on NVIDIA's RAFT library, detailing algorithm selection, offline indexing tricks, batch online search design, performance results on a 25‑million‑vector workload, and cost‑saving implications for large‑scale search services.

ANN · GPU · IVF_INT8
0 likes · 21 min read
Baidu Geek Talk
Nov 20, 2024 · Artificial Intelligence

Boosting ANN Search with GPU: Inside RAFT’s IVF_INT8 Implementation

This article examines how Baidu and NVIDIA leveraged the open‑source RAFT library to build a GPU‑accelerated approximate nearest neighbor (ANN) retrieval system, detailing algorithm choices, offline indexing, online batch processing, performance results, and practical guidelines for deploying ANN on GPUs.

ANN · GPU · IVF_INT8
0 likes · 20 min read
JD Tech
May 17, 2024 · Artificial Intelligence

Optimizing JD Advertising Retrieval Platform: Balancing Compute, Data Scale, and Iterative Efficiency

The article details how JD's advertising retrieval platform tackles the core challenge of balancing limited compute resources with massive data by optimizing compute allocation, improving model scoring efficiency, and enhancing iteration speed through distributed execution graphs, adaptive algorithms, and platform‑level infrastructure improvements.

ANN · Advertising · Deep Learning
0 likes · 24 min read
Rare Earth Juejin Tech Community
Jan 12, 2024 · Artificial Intelligence

Understanding Vector Databases, ANN Algorithms, and Their Integration with Large Language Models

This article explains the fundamentals of vector databases, how high‑dimensional vector data is generated and stored, reviews common ANN search algorithms such as Flat, k‑means and LSH, discusses benchmarking and product selection, and demonstrates practical integration of vector stores with LLMs using LangChain and Python code.

ANN · LLM integration · Python
0 likes · 17 min read
Sohu Tech Products
Nov 1, 2023 · Databases

Engineering Practices of Douyin's Vector Database: From Retrieval Challenges to Cloud‑Native Solutions

Douyin tackled vector‑retrieval challenges by optimizing HNSW, creating a high‑performance IVF algorithm, and implementing custom scalar quantization, SIMD acceleration, and a DSL‑driven engine that merges filtering with search. It then built VikingDB, a cloud‑native vector database with storage‑compute separation that delivers sub‑10 ms latency, real‑time updates, multi‑tenant support, and secure, scalable retrieval for LLM‑driven applications.

ANN · LLM integration · Storage Compute Separation
0 likes · 18 min read
Baidu Geek Talk
Oct 31, 2023 · Artificial Intelligence

Interview on Baidu's Open‑Source Large‑Scale Vector Search Engine Puck

Baidu has open‑sourced Puck, its high‑performance vector search engine for datasets up to trillion scale. Originally built for ultra‑large image‑search workloads, Puck won multiple BIGANN categories and now supports diverse embeddings alongside Tinker, an algorithm aimed at medium‑size datasets. The release is intended to accelerate community innovation, improve code quality, and broaden AI retrieval applications across search, recommendation and cloud services.

AI · ANN · Baidu
0 likes · 12 min read
DataFunTalk
Oct 30, 2023 · Databases

Engineering Practices and Evolution of Douyin’s Cloud‑Native Vector Database

This article outlines Douyin’s step‑by‑step engineering evolution of its cloud‑native vector database, covering the background of vector search, core concepts, algorithmic optimizations, storage‑compute separation, streaming updates, multi‑tenant orchestration, and future applications such as large language model integration.

ANN · Cloud Native · Douyin
0 likes · 17 min read
Alibaba Cloud Developer
Sep 21, 2023 · Artificial Intelligence

How Vector Search Powers AI: From Embeddings to Real‑World Applications

This article explains how vector search converts unstructured data such as speech, images, video, and text into high‑dimensional embeddings, compares brute‑force search with approximate nearest‑neighbor (ANN) methods such as HNSW, and presents optimization techniques that dramatically improve recall and queries‑per‑second (QPS) for large‑scale AI retrieval systems.

AI · ANN · Embedding
0 likes · 27 min read
Baidu Geek Talk
Sep 4, 2023 · Artificial Intelligence

Puck: Baidu’s Open‑Source High‑Performance ANN Retrieval Engine

Puck is Baidu’s open‑source approximate nearest neighbor (ANN) engine, built on the in‑house Puck and Tinker algorithms. It delivers high recall, accuracy and throughput on datasets from tiny to trillion scale and outperforms rivals in benchmarks, including a first‑place finish in BIGANN 2021. It also offers a simple, extensible API, has proven reliable in dozens of Baidu services, and ships under the Apache 2.0 license to encourage community contributions.

ANN · Baidu · Benchmark
0 likes · 7 min read
dbaplus Community
Aug 26, 2023 · Databases

What Is a Vector Database? A Simple Guide from Kids to Engineers

This article demystifies vector databases by first explaining the concept with a five‑year‑old analogy, then expanding to technical details for developers, covering how embeddings work, the differences from relational databases, ANN search, indexing, similarity metrics, and why vector stores outperform raw NumPy arrays for large‑scale similarity retrieval.

ANN · databases · machine learning
0 likes · 9 min read
21CTO
May 16, 2023 · Databases

How Cassandra’s New Vector Search Transforms AI Applications

This article explains how Cassandra’s newly added vector data type and ANN search capabilities empower AI developers to store, index, and query high‑dimensional embeddings at scale, enabling use cases such as image retrieval, recommendation, and large‑language‑model integration.

AI · ANN · cassandra
0 likes · 10 min read
Zhuanzhuan Tech
Sep 21, 2022 · Artificial Intelligence

Vector Retrieval and Product Quantization with Faiss

This article explains the challenges of large‑scale vector retrieval, compares Faiss index types such as brute‑force, graph‑based and product quantization, and details how product quantization works, its memory‑speed trade‑offs, hierarchical quantization, and practical hyper‑parameter tuning.

ANN · Embedding · FAISS
0 likes · 9 min read
Baidu Geek Talk
Feb 14, 2022 · Artificial Intelligence

How Baidu’s PUCK Dominated the First BigANN Vector Search Competition

The inaugural BigANN competition, organized by NeurIPS, showcased large‑scale ANN research, and Baidu's self‑developed PUCK algorithm secured top scores across all four tracks by leveraging multi‑layer quantization, two‑level inverted indexing, and extensive system‑level optimizations.

ANN · BigANN · PUCK
0 likes · 8 min read
Kuaishou Tech
Dec 10, 2021 · Artificial Intelligence

Kuaishou and Tsinghua University Win NeurIPS'21 Billion-Scale ANN Challenge with FAISS‑Optimized KST_ANN Solution

On December 6, Kuaishou and Tsinghua University’s joint team secured first place in the NeurIPS'21 Billion‑Scale Approximate Nearest Neighbor Search Challenge by leveraging a FAISS‑optimized, memory‑efficient KST_ANN algorithm that achieved over 6% higher recall on multiple billion‑scale datasets, showcasing the practical impact of large‑scale vector retrieval in AI‑driven services.

AI · ANN · FAISS
0 likes · 5 min read
Kuaishou Tech
Nov 29, 2021 · Artificial Intelligence

Starry Vector Retrieval Platform: Architecture, Features, and Performance

The article describes the design, challenges, architecture, key features, algorithm optimizations, and future roadmap of Kuaishou's Starry vector retrieval platform, which delivers high‑performance, high‑reliability, and easy‑to‑use large‑scale ANN search for diverse business scenarios.

AI Platform · ANN · Performance Optimization
0 likes · 14 min read
Baidu Geek Talk
May 10, 2021 · Industry Insights

How Baidu’s GNOIMI Powers Billion‑Scale Rich Media Retrieval

Baidu’s rich‑media retrieval system combines CNN‑based feature extraction with an Approximate Nearest Neighbor engine called GNOIMI, employing hierarchical clustering, product quantization, and optimized indexing to achieve sub‑millisecond search over billions of images, videos and audio, supporting anti‑spam, recommendation and risk‑control across dozens of services.

ANN · GNOIMI · HNSW
0 likes · 16 min read
ITPUB
Jul 25, 2020 · Backend Development

How SimSvr Achieves Billion‑Scale Real‑Time ANN Search for Recommendations

SimSvr is a high‑performance, distributed feature‑retrieval component designed for recommendation systems that supports billion‑scale indexes, sub‑millisecond query latency, real‑time and batch updates, multi‑model AB‑testing, and advanced filtering, all while running on Tencent's production workloads.

ANN · Recommendation Systems · feature retrieval
0 likes · 17 min read
DataFunTalk
Apr 6, 2020 · Artificial Intelligence

Introducing DeepMatch: An Open‑Source Library for Deep Retrieval Matching Algorithms

DeepMatch is an open‑source Python library that implements several mainstream deep‑learning based recall‑matching algorithms, provides easy installation via pip, detailed usage examples with code, and supports exporting user and item vectors for ANN search, making it ideal for rapid experimentation and learning in recommendation systems.

ANN · Deep Learning · Python
0 likes · 10 min read