How Embedding Models Power Semantic Search: A Hands‑On LangChain Guide

This article explains what embeddings are, how LangChain’s Embeddings interface abstracts various providers, compares common models, and walks through a complete Python example that uses a Chinese‑optimized HuggingFace model to generate document and query vectors, compute cosine similarity, and identify the most relevant text.

BirdNest Tech Talk

Embeddings convert discrete items such as words, sentences, or whole documents into continuous vectors—a long list of floating‑point numbers—so that semantically similar texts map to nearby points in vector space. By measuring the distance between two vectors, typically with cosine similarity, we can quantify semantic similarity, which underpins modern semantic search and Retrieval‑Augmented Generation (RAG) applications.
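To make the distance idea concrete, here is a minimal sketch of cosine similarity using NumPy. The three-dimensional vectors are toy stand-ins (real embedding models produce hundreds or thousands of dimensions), but the arithmetic is exactly what a vector search performs:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the vector norms
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings" (real models use far higher dimensions)
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
car = [0.0, 0.1, 0.95]

print(cosine_similarity(cat, kitten))  # near 1.0: semantically similar
print(cosine_similarity(cat, car))     # near 0.0: unrelated
```

A score near 1 means the texts point in the same semantic direction; a score near 0 means they are unrelated.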

LangChain Embeddings Interface

LangChain defines a standard Embeddings class that unifies interaction with various embedding providers (OpenAI, Hugging Face, Cohere, etc.). The interface offers two core methods:

embed_documents(self, texts: List[str]) -> List[List[float]]
Input: a list of document strings.
Output: a list of vectors, one per document.
Use case: batch creation of embeddings for building a vector database.

embed_query(self, text: str) -> List[float]
Input: a single query string.
Output: a vector representing the query.
Use case: generating an embedding for a real-time user query during similarity search.

Some providers train separate models for document indexing and query encoding; the split design lets LangChain call the appropriate model for optimal retrieval. For many models (e.g., OpenAI text-embedding-3-small), both methods may invoke the same underlying model.

Common Embedding Models

OpenAIEmbeddings: uses OpenAI models such as text-embedding-3-small; high performance and widely adopted in commercial settings.

HuggingFaceEmbeddings: loads and runs open-source models from the Hugging Face Hub; useful for on-premise deployment or language-specific needs (e.g., Chinese).

GoogleGenerativeAIEmbeddings: accesses Google Gemini's embedding model.

Using a Chinese‑Optimized HuggingFace Embedding Model

Because OpenAI services are unavailable in some regions, the example demonstrates HuggingFaceEmbeddings with the Chinese‑optimized model shibing624/text2vec-base-chinese.

from langchain_community.embeddings import HuggingFaceEmbeddings

embedder = HuggingFaceEmbeddings(
    model_name="shibing624/text2vec-base-chinese",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)

Next, a list of four Chinese example documents is prepared (translated here for clarity):

documents = [
    "The weather is great today, clear sky.",
    "I love eating ice cream, especially strawberry flavor.",
    "Large language models are an important branch of deep learning.",
    "How can I learn programming efficiently?"
]

The script then creates embeddings for the documents and the query, computes similarity via dot product (which, because the embeddings are normalized, equals cosine similarity), prints each document's score, and selects the most similar document.

import numpy as np

# Create document embeddings
doc_vecs = embedder.embed_documents(documents)

# Create query embedding
query = "What knowledge is needed to study AI?"
q_vec = embedder.embed_query(query)

# Compute similarity; with normalized embeddings, the dot product equals cosine similarity
sims = np.array(doc_vecs) @ np.array(q_vec)
for i, doc in enumerate(documents):
    print(f"Doc {i+1}: '{doc}'\nSimilarity = {sims[i]:.4f}\n")

most_similar_idx = int(np.argmax(sims))
print(f"\nMost relevant document: '{documents[most_similar_idx]}'")
print(f"Similarity score: {sims[most_similar_idx]:.4f}")

Running the script produces a deprecation warning (HuggingFaceEmbeddings has moved to the langchain-huggingface package) and then prints the similarity scores; in this run the fourth document ("How can I learn programming efficiently?") receives the highest similarity to the query, 0.5390.
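The deprecation warning can be resolved by installing the replacement package and updating the import; the rest of the script is unchanged. A sketch, assuming a recent LangChain release (check the package names against your installed version):

```python
# pip install langchain-huggingface sentence-transformers
from langchain_huggingface import HuggingFaceEmbeddings

embedder = HuggingFaceEmbeddings(
    model_name="shibing624/text2vec-base-chinese",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)
```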

Takeaway

The example shows how to run an open‑source Chinese embedding model locally, generate vector representations for both documents and queries, and perform a simple semantic similarity search, illustrating a practical workflow for building localized search systems.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Python, LangChain, vector databases, NLP, semantic search, HuggingFace, embeddings
Written by BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
