
How to Build an Image Search Engine with CNN and Milvus: A Step‑by‑Step Guide

This article walks through the complete engineering workflow for building an image‑search system, covering CNN‑based feature extraction with VGG16, vector normalization, image preprocessing, black‑edge removal, and practical deployment of the Milvus vector database including hardware requirements, capacity planning, collection/partition design, and search result handling.

System Architect Go

Apr 11, 2020


Overview

The article presents a practical implementation of an image‑search system, focusing on two core tasks: extracting image feature vectors with a convolutional neural network (CNN) and managing those vectors using the Milvus vector search engine.

CNN + VGG16 Feature Extraction

Feature vectors are generated by loading a pre‑trained VGG16 model (without the top classification layers) via Keras and TensorFlow. The following Python code demonstrates the process:

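from keras.applications import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np

# include_top=False drops the classifier; pooling='max' collapses the final
# 7x7x512 feature map into a single 512-dimensional vector per image.
model = VGG16(weights='imagenet', include_top=False, pooling='max')
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # RGB, resized to 224x224
x = image.img_to_array(img)      # shape (224, 224, 3)
x = np.expand_dims(x, axis=0)    # add batch axis: (1, 224, 224, 3)
x = preprocess_input(x)          # VGG16-specific channel preprocessing
features = model.predict(x)      # shape (1, 512)

The resulting features array has shape (1, 512); features[0] is the image's 512‑dimensional feature vector. Without a pooling argument the model would instead return the raw 7×7×512 feature map, which does not match the 512‑dimensional vectors assumed in the capacity planning later in the article.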

Normalization

To simplify later similarity calculations, the feature vector is L2‑normalized:

from numpy import linalg as LA
norm_feat = features[0] / LA.norm(features[0])
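
Why normalization simplifies later comparisons can be seen in a small NumPy sketch. For unit-length vectors the squared Euclidean distance equals 2 - 2·cos, so ranking by L2 distance and ranking by cosine similarity agree. The helper name `l2_normalize`, the `eps` guard, and the sample vectors are illustrative assumptions, not part of the original code.

```python
import numpy as np


def l2_normalize(vec, eps=1e-12):
    """Return vec scaled to unit L2 norm (hypothetical helper).

    eps guards against division by zero for an all-zero vector.
    """
    vec = np.asarray(vec, dtype=np.float32)
    return vec / max(float(np.linalg.norm(vec)), eps)


# Two made-up feature vectors, only for illustration.
a = l2_normalize([3.0, 4.0, 0.0])
b = l2_normalize([4.0, 3.0, 0.0])

cosine = float(np.dot(a, b))
squared_l2 = float(np.sum((a - b) ** 2))

# For unit vectors: ||a - b||^2 == 2 - 2 * cos(a, b)
print(np.isclose(squared_l2, 2.0 - 2.0 * cosine, atol=1e-5))
```

The zero-vector guard is a design choice for this sketch; a real pipeline might prefer to reject such images outright.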

Image Loading Alternatives

Besides image.load_img, raw image bytes can be converted directly to a PIL Image object:

import io
from PIL import Image
# img_bytes: raw image bytes
img = Image.open(io.BytesIO(img_bytes)).convert('RGB')
img = img.resize((224, 224), Image.NEAREST)

Both approaches produce an equivalent 224×224 RGB PIL image; the key steps are RGB conversion and resizing.

Black‑Edge Removal

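Images often contain black borders that add noise. The following function uses NumPy to remove every row that is entirely black (RGB (0,0,0)), which strips horizontal bars such as letterboxing. It does not touch columns; vertical bars would need the same check along the other axis.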

def RemoveBlackEdge(img):
    """Remove horizontal black edges from a PIL image"""
    width = img.width
    img = image.img_to_array(img)
    img_without_black = img[~np.all(img == np.zeros((1, width, 3)), axis=(1,2))]
    return image.array_to_img(img_without_black)
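
A possible extension that handles both rows and columns is sketched below. It works on a plain NumPy array rather than a PIL image, and it differs from the function above in one respect: it only crops the outer border, leaving any black rows or columns inside the picture alone. The name `remove_black_borders` and the test canvas are assumptions made for illustration.

```python
import numpy as np


def remove_black_borders(pixels):
    """Crop all-black rows and columns from the edges of an H x W x 3 array.

    Hypothetical helper: only the outer border is cropped; black rows or
    columns in the interior of the image are kept.
    """
    pixels = np.asarray(pixels)
    non_black = np.any(pixels != 0, axis=2)       # H x W mask of non-black pixels
    rows = np.flatnonzero(non_black.any(axis=1))  # rows that contain content
    cols = np.flatnonzero(non_black.any(axis=0))  # columns that contain content
    if rows.size == 0:
        return pixels  # fully black image: nothing sensible to crop
    return pixels[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


# 4 x 5 black canvas with a 2 x 3 white patch in the middle.
canvas = np.zeros((4, 5, 3), dtype=np.uint8)
canvas[1:3, 1:4] = 255
cropped = remove_black_borders(canvas)
print(cropped.shape)  # (2, 3, 3)
```

To use it with PIL, one option is to convert with `np.asarray(img)` before the call and `Image.fromarray(...)` afterwards.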

Milvus Vector Search Engine

Milvus stores and searches the feature vectors. It requires a CPU that supports the avx2 instruction set; you can verify support on Linux with:

cat /proc/cpuinfo | grep flags | grep avx2

If avx2 is absent, a compatible machine is needed.
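
The same check can be scripted, for example as a pre-flight step before deployment. The sketch below parses /proc/cpuinfo-style text; the function name `has_avx2` and the sample string are assumptions for illustration.

```python
def has_avx2(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


# Made-up excerpt in the format Linux uses for /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2\n"
print(has_avx2(sample))

# On a Linux host (assumes the file exists and is readable):
# with open("/proc/cpuinfo") as f:
#     print(has_avx2(f.read()))
```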

Capacity Planning

Each float32 dimension occupies 4 bytes, so one 512‑dimensional vector takes 512 × 4 = 2 048 bytes, roughly 2 KB. Raw vector storage therefore scales as follows:

1 000 vectors ≈ 2 MB

1 000 000 vectors ≈ 2 GB

10 000 000 vectors ≈ 20 GB

100 000 000 vectors ≈ 200 GB

1 000 000 000 vectors ≈ 2 TB

These figures cover the raw vectors only; index structures need additional memory on top. To keep all data in memory, the system must have at least this amount of RAM; otherwise Milvus will spill to disk.
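
The arithmetic behind these estimates can be reproduced with a few lines of Python. This is a rough sketch that counts raw float32 vectors only (no IDs, no index overhead) and uses decimal units; the function names are illustrative.

```python
BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_vectors, dim=512):
    """Bytes needed for the raw float32 vectors only (no IDs, no index)."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def human_readable(num_bytes):
    """Format a byte count using decimal units (1 KB = 1000 bytes)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1000 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1000


for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, human_readable(raw_vector_bytes(n)))
```

Changing `dim` shows how quickly memory needs grow for larger embeddings, which is useful when comparing feature extractors.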

System Configuration

Milvus uses collections (tables) and partitions (sub‑tables) to organize data. Collections store ID + vector rows; partitions are logical subdivisions of a collection. Metadata is managed internally by SQLite or optionally by an external MySQL instance. When the number of collections or partitions exceeds ~50 000 (or 4 096 in version 0.8.0), SQLite becomes a bottleneck, so MySQL is recommended for large deployments.

Index Selection

Choosing the appropriate index type (e.g., IVF_FLAT, HNSW) depends on the trade‑off between search speed, memory usage, and recall. Refer to Milvus documentation for detailed guidance.

Search Result Handling

Milvus returns ID + distance pairs, where distance ranges from 0 (identical) to 1 (completely different). Users must filter out placeholder IDs such as -1. Pagination is not built‑in; to emulate page N with page size S, request topK = N × S results and keep the last S entries.
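
The paging approach can be sketched in plain Python. The snippet assumes the search has already been issued with topK = page × page_size and that the hits arrive as (id, distance) pairs sorted by ascending distance; the function name and the sample data are made up. It slices by offset rather than taking the last S entries, so a short final page still works after placeholder IDs are dropped.

```python
def page_of_results(hits, page, page_size):
    """Return page `page` (1-based) from hits sorted by ascending distance.

    hits is a list of (id, distance) pairs, as if returned by a search
    issued with topK = page * page_size. Placeholder IDs (-1) are dropped.
    """
    valid = [(i, d) for i, d in hits if i != -1]
    start = (page - 1) * page_size
    return valid[start:start + page_size]


# Made-up search output for topK = 2 * 3 = 6; only five real matches exist.
hits = [(11, 0.02), (42, 0.05), (7, 0.11), (93, 0.18), (5, 0.31), (-1, 0.0)]
print(page_of_results(hits, page=2, page_size=3))
```

Because every page re-runs the search with a larger topK, deep pages get progressively more expensive, so this pattern suits shallow browsing best.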

Similarity Threshold

Business logic should define a distance threshold (e.g., 0.2) to decide whether two images are considered similar; vectors with a distance below the threshold are treated as matches.
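
Applying such a threshold is a simple filter over the returned pairs. In the sketch below the function name is hypothetical, the hits are invented, and the 0.2 default only mirrors the example value above; a suitable threshold has to be tuned on real data.

```python
def similar_ids(hits, max_distance=0.2):
    """Keep IDs whose distance is strictly below max_distance.

    hits is a list of (id, distance) pairs; placeholder IDs (-1) are skipped.
    The 0.2 default mirrors the example value in the text, not a recommendation.
    """
    return [i for i, d in hits if i != -1 and d < max_distance]


# Made-up search output, including one placeholder entry.
hits = [(11, 0.02), (42, 0.19), (7, 0.2), (93, 0.45), (-1, 0.0)]
print(similar_ids(hits))  # [11, 42]
```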

Conclusion

The guide demonstrates a complete pipeline, from CNN‑based feature extraction and preprocessing to vector storage, indexing, and retrieval with Milvus, and provides a foundation for building image‑search applications.

