How to Build an Image Search Engine with CNN and Milvus: A Step‑by‑Step Guide
This article walks through the complete engineering workflow for building an image‑search system, covering CNN‑based feature extraction with VGG16, vector normalization, image preprocessing, black‑edge removal, and practical deployment of the Milvus vector database including hardware requirements, capacity planning, collection/partition design, and search result handling.
Overview
The article presents a practical implementation of an image‑search system, focusing on two core tasks: extracting image feature vectors with a convolutional neural network (CNN) and managing those vectors using the Milvus vector search engine.
CNN + VGG16 Feature Extraction
Feature vectors are generated by loading a pre‑trained VGG16 model (without the top classification layers) via Keras and TensorFlow. The following Python code demonstrates the process:
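from keras.applications import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np
# include_top=False drops the classification layers; pooling='max' collapses the
# 7x7x512 feature map into a single 512-dimensional vector per image.
model = VGG16(weights='imagenet', include_top=False, pooling='max')
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # RGB, resized to VGG16's input size
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add the batch dimension: (1, 224, 224, 3)
x = preprocess_input(x)
features = model.predict(x)  # shape (1, 512)
The resulting features array holds one row per input image; features[0] is the image's feature vector.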
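With include_top=False and a 224×224 input, VGG16's convolutional base produces a 7×7×512 feature map per image. The 512-dimensional vectors assumed in the capacity estimates later in this article correspond to pooling that map over its two spatial axes, which is what the pooling='max' argument of the Keras VGG16 constructor does. The NumPy sketch below only illustrates the operation on a random stand-in array; the names global_max_pool and fake_feature_map are hypothetical and not part of Keras.

```python
import numpy as np

def global_max_pool(feature_map):
    """Collapse a (batch, height, width, channels) array to (batch, channels)."""
    # Take the maximum over the two spatial axes, keeping batch and channels.
    return feature_map.max(axis=(1, 2))

# Random stand-in for the output of the VGG16 convolutional base.
rng = np.random.default_rng(0)
fake_feature_map = rng.random((1, 7, 7, 512), dtype=np.float32)

pooled = global_max_pool(fake_feature_map)
print(pooled.shape)  # (1, 512)
```

Average pooling (pooling='avg') would give a vector of the same length; which of the two retrieves better is something to measure on your own images.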
Normalization
To simplify later similarity calculations, the feature vector is L2‑normalized:
from numpy import linalg as LA
norm_feat = features[0] / LA.norm(features[0])
Image Loading Alternatives
Besides image.load_img, raw image bytes can be converted directly to a PIL Image object:
import io
from PIL import Image
# img_bytes: raw image bytes
img = Image.open(io.BytesIO(img_bytes)).convert('RGB')
img = img.resize((224, 224), Image.NEAREST)
Both approaches yield an equivalent img object; the key steps are RGB conversion and resizing.
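As a minimal sketch of the bytes-based path, the helper below wraps the two key steps and then builds the batched array that a Keras model expects. The function name load_rgb_from_bytes and the in-memory test image are assumptions made for illustration; in a real service the bytes would come from an upload or an object store.

```python
import io

import numpy as np
from PIL import Image

def load_rgb_from_bytes(img_bytes, size=(224, 224)):
    """Decode raw image bytes into an RGB PIL image resized to `size`."""
    img = Image.open(io.BytesIO(img_bytes)).convert('RGB')
    return img.resize(size, Image.NEAREST)

# Build a small grayscale PNG in memory to stand in for uploaded bytes.
buffer = io.BytesIO()
Image.new('L', (64, 48), color=128).save(buffer, format='PNG')

img = load_rgb_from_bytes(buffer.getvalue())
# Add the batch dimension, as np.expand_dims does in the extraction code.
batch = np.expand_dims(np.asarray(img, dtype=np.float32), axis=0)
print(img.mode, img.size, batch.shape)  # RGB (224, 224) (1, 224, 224, 3)
```

Converting to RGB first matters because grayscale, palette, or RGBA inputs would otherwise produce arrays with the wrong number of channels.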
Black‑Edge Removal
Images often contain black borders that add noise. The following function uses NumPy to remove every row whose pixels are all pure black (RGB (0,0,0)); vertical borders would need the same test applied along the column axis:
def RemoveBlackEdge(img):
    """Remove horizontal black edges from a PIL image."""
    width = img.width
    img = image.img_to_array(img)  # float32 array of shape (height, width, 3)
    # Keep only the rows in which at least one pixel is not pure black.
    img_without_black = img[~np.all(img == np.zeros((1, width, 3)), axis=(1, 2))]
    return image.array_to_img(img_without_black)
Milvus Vector Search Engine
Milvus stores and searches the feature vectors. It requires a CPU that supports the avx2 instruction set; you can verify support on Linux with:
cat /proc/cpuinfo | grep flags | grep avx2
If the command prints nothing, avx2 is absent and a compatible machine is needed.
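The same check can be scripted, for example in a deployment preflight step. The sketch below parses text in the /proc/cpuinfo format; the function name cpu_supports and the sample string are assumptions for illustration, and on a real Linux host you would pass in the contents of /proc/cpuinfo instead.

```python
def cpu_supports(flag, cpuinfo_text):
    """Return True if `flag` is listed on any 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(':')
        # Compare whole tokens so that 'avx' does not match 'avx2'.
        if key.strip() == 'flags' and flag in value.split():
            return True
    return False

# Abbreviated sample in the /proc/cpuinfo layout (illustrative only).
sample = "processor\t: 0\nflags\t\t: fpu sse4_2 avx avx2\n"
print(cpu_supports('avx2', sample))  # True
```

Matching whole tokens is slightly stricter than the grep pipeline, which matches substrings.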
Capacity Planning
Each float32 dimension occupies 4 bytes. For a 512‑dimensional vector:
1 000 vectors ≈ 2 MB
1 000 000 vectors ≈ 2 GB
10 000 000 vectors ≈ 20 GB
100 000 000 vectors ≈ 200 GB
1 000 000 000 vectors ≈ 2 TB
To keep all data in memory, the system must have at least this amount of RAM; otherwise Milvus will spill to disk.
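The figures above follow from multiplying the vector count by the dimension and by 4 bytes. A small sketch of that arithmetic, assuming 512-dimensional float32 vectors and counting raw vector data only (index structures and metadata would add to it); the function name raw_vector_bytes is hypothetical:

```python
def raw_vector_bytes(num_vectors, dim=512, bytes_per_value=4):
    """Bytes needed to hold the raw float32 vectors, excluding any index overhead."""
    return num_vectors * dim * bytes_per_value

for n in (1_000, 1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    # Report in decimal gigabytes (1 GB = 10**9 bytes).
    print(f"{n:>13,} vectors: {raw_vector_bytes(n) / 1e9:,.3f} GB")
```

For 1 000 vectors this gives 2 048 000 bytes, which the table rounds to 2 MB.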
System Configuration
Milvus uses collections (tables) and partitions (sub‑tables) to organize data. Collections store ID + vector rows; partitions are logical subdivisions of a collection. Metadata is managed internally by SQLite or optionally by an external MySQL instance. When the number of collections or partitions exceeds ~50 000 (or 4 096 in version 0.8.0), SQLite becomes a bottleneck, so MySQL is recommended for large deployments.
Index Selection
Choosing the appropriate index type (e.g., IVF_FLAT, HNSW) depends on the trade‑off between search speed, memory usage, and recall. Refer to Milvus documentation for detailed guidance.
Search Result Handling
Milvus returns ID + distance pairs, where distance ranges from 0 (identical) to 1 (completely different). Users must filter out placeholder IDs such as -1. Pagination is not built‑in; to emulate page N with page size S, request topK = N × S results and keep the last S entries.
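The filtering and pagination steps can be sketched in plain Python as follows. The code assumes the search results have already been converted to a list of (id, distance) tuples sorted by ascending distance; the function name page_of_results and the sample hits are illustrative and not part of the Milvus client API.

```python
def page_of_results(hits, page, page_size):
    """Return page `page` (1-based) from hits fetched with topK = page * page_size."""
    # Drop placeholder entries that do not refer to a stored vector.
    valid = [(vec_id, dist) for vec_id, dist in hits if vec_id != -1]
    start = (page - 1) * page_size
    return valid[start:start + page_size]

# Sample response for page=2, page_size=2 with topK = 4, plus one placeholder.
hits = [(11, 0.02), (7, 0.10), (42, 0.15), (3, 0.31), (-1, 1.0)]
print(page_of_results(hits, page=2, page_size=2))  # [(42, 0.15), (3, 0.31)]
```

Slicing by offset rather than taking the last S entries avoids repeating earlier results when fewer than N × S valid hits come back.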
Similarity Threshold
Business logic should define a distance threshold (e.g., 0.2) to decide whether two images are considered similar; vectors with a distance below the threshold are treated as matches.
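To make the threshold concrete, the sketch below compares L2-normalized vectors by Euclidean distance. It also shows why normalization simplifies similarity: for unit vectors, the squared Euclidean distance equals 2 − 2 × cosine similarity, so ranking by distance and ranking by cosine agree. The helper names and the two-dimensional toy vectors are assumptions for illustration; which distance the search engine reports depends on the configured metric, so a value such as 0.2 should be calibrated against real query results.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def is_similar(a, b, threshold=0.2):
    """Treat two unit vectors as a match if their Euclidean distance is below threshold."""
    return float(np.linalg.norm(a - b)) < threshold

a = l2_normalize(np.array([3.0, 4.0]))
b = l2_normalize(np.array([3.0, 4.1]))   # nearly the same direction as a
c = l2_normalize(np.array([4.0, -3.0]))  # orthogonal to a

print(is_similar(a, b), is_similar(a, c))  # True False

# Identity for unit vectors: ||a - b||^2 == 2 - 2 * (a . b)
squared_distance = float(np.sum((a - b) ** 2))
cosine = float(np.dot(a, b))
```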
Conclusion
The guide demonstrates a complete pipeline—from CNN‑based feature extraction and preprocessing to vector storage, indexing, and retrieval with Milvus—providing a solid foundation for building production‑grade image‑search applications.
System Architect Go
