Tagged articles
7 articles
AI Engineer Programming
Apr 21, 2026 · Artificial Intelligence

From Bag‑of‑Words to Semantic Vectors: Understanding Embeddings and Similarity Search (Part 1)

The article explains how diverse data can be represented as high‑dimensional vectors, describes exact and approximate nearest‑neighbor search, explores vector quantization, product quantization, locality‑sensitive hashing, and HNSW graphs, and analyzes their speed, accuracy, and memory trade‑offs for large‑scale similarity retrieval.

HNSW · LSH · embeddings
16 min read
Mingyi World Elasticsearch
Aug 5, 2025 · Artificial Intelligence

Enterprise Semantic Search: Key Q&A on Scoring, Recall, LSH, Chunking, and Embedding Dimensions

This article answers practical questions about enterprise semantic search: how Reciprocal Rank Fusion merges results whose scores use incompatible scales by ranking position rather than raw score, how to control the number of vector results, the trade‑offs of LSH parameters, word‑ and sentence‑based chunking strategies with version‑specific defaults, and flexible embedding dimensionality.

Elasticsearch · LSH · RRF
8 min read
Mingyi World Elasticsearch
Jul 30, 2025 · Backend Development

From Keyword Matching to Semantic Understanding: Building an Intelligent E‑Commerce Search Engine

The article analyzes the semantic gap in e‑commerce search, compares traditional keyword matching with vector‑based retrieval, and provides a step‑by‑step implementation using Elasticsearch/Easysearch pipelines, embedding models, and a hybrid search strategy to improve user intent understanding.

Easysearch · Elasticsearch · Hybrid Search
11 min read
DeWu Technology
Jul 27, 2022 · Artificial Intelligence

Overview of Nearest Neighbor Search Algorithms

The article reviews how high‑dimensional vector representations in deep‑learning applications require efficient approximate nearest‑neighbor search, comparing K‑d trees, hierarchical k‑means trees, locality‑sensitive hashing, product quantization, and HNSW graphs, and discusses practical FAISS implementations and how algorithm choice depends on data size, recall, latency, and resources.

FAISS · HNSW · KD-Tree
8 min read
IEG Growth Platform Technology Team
Jan 17, 2022 · Artificial Intelligence

Introduction to Vector Retrieval, Distance Metrics, and Fundamental Algorithms

This article introduces the concept of vector retrieval, outlines its diverse application scenarios, explains common distance metrics for both floating‑point and binary vectors, and surveys fundamental approximate nearest‑neighbor algorithms including tree‑based, graph‑based, quantization, and hashing methods.

HNSW · KD-Tree · LSH
22 min read
Laiye Technology Team
Jan 7, 2022 · Artificial Intelligence

Understanding Vector Retrieval: Principles, Applications, and High‑Performance Algorithms

This article explains how deep learning transforms unstructured data into dense vectors, defines vector retrieval, outlines its many use cases such as product, video, and text search, discusses challenges in learning effective embeddings, and reviews high‑performance algorithms like LSH, neighbor graphs, and product quantization.

AI applications · Deep Learning · HNSW
21 min read
JD Tech Talk
Nov 30, 2020 · Big Data

Scalable Time Series Similarity Search in Big Data: Partitioning, Dimensionality Reduction, and LSH Approaches

This article examines the challenges of performing time‑series similarity queries on massive datasets and presents three scalable solutions—partition‑based indexing, dimensionality‑reduction using MinHash, and a combined approach with Locality Sensitive Hashing—to reduce computation while preserving similarity accuracy.

Big Data · LSH · Minhash
10 min read