How Alibaba Engineers Shrank Image Features from 120KB to 13KB and Cut Euclidean Distance to 9µs

This article details how Alibaba's engineers reduced per‑image feature size from 120 KB to an average of 13 KB and accelerated Euclidean distance calculations from 50 µs to 9 µs through bitmap, offset, and duplicate‑value compression techniques, enabling efficient large‑scale image retrieval.

Alibaba Cloud Developer
Alibaba Cloud Developer
Alibaba Cloud Developer
How Alibaba Engineers Shrank Image Features from 120KB to 13KB and Cut Euclidean Distance to 9µs

Background

Users capture a photo of a clothing item on mobile devices and expect to find similar images; the project aims to satisfy this demand by indexing five categories, each containing five million images. Each image is represented by a 30,976‑dimensional float vector, which is scaled and stored as int32 values.

Challenges

Before compression, each image occupied about 120 KB, resulting in a total storage requirement of 3 TB and necessitating 100 machines with 30 GB memory each. Euclidean distance computation between two images took roughly 50 µs, which was too slow for online search.

Results

Compressed each image to an average size of 13 KB.

Reduced Euclidean distance computation time to an average of 9 µs.

Initial Attempts – Bitmap Method

Observing that each vector has about 7,000 non‑zero entries, a bitmap of 4 KB (30,976 / 8) was used to mark zero positions, and only non‑zero values were stored (≈7 KB), yielding a compressed size of ~32 KB per image.

int bitmap_len = size / 8 + 8;
uint64_t *bitmap = (uint64_t*)(cmpr_buf);
int32_t *data   = (int32_t*)(cmpr_buf + bitmap_len);
for (unsigned int i=0;i<size;i++) {
    if (list[i] != 0) {
        data[index++] = list[i];
        bitmap[i/64] |= bit_mask_tab[i % 64];
    }
}

However, distance calculation required iterating over the entire bitmap (30,976 steps) and performing bitwise AND operations, resulting in >100 µs per pair.

Offset Compression

Instead of a bitmap, only the offsets (int16) and values (int32) of non‑zero entries were stored, requiring 42 KB per image (7 KB × (2 + 4)).

for (int i=0; i<size; i++) {
    if (list[i] != 0) {
        index++;
    }
}
*(int16_t*) cmpr_buf = index;
int16_t *p_off = (int16_t*)cmpr_buf + 1;
int32_t *p_data = (int32_t*)(((char *)cmpr_buf) + sizeof(int16_t) * index + sizeof(int16_t));
index = 0;
for (int i=0; i<size; i++) {
    if (list[i] != 0) {
        p_off[index] = i;
        p_data[index] = list[i];
        index++;
    }
}

Distance calculation using merged offsets reduced time to 55 µs, but conditional branches still limited performance.

Performance Optimization

Recognizing that online queries compare one uncompressed query image against many compressed images, the algorithm pre‑computes the squared sum of the uncompressed image once, stores the squared sum of each compressed image during compression, and merges offsets to compute the dot product efficiently. This approach lowered distance time to about 11 µs.

Further Compression Techniques

Offset size reduction : Store offset differences using uint8; when a gap exceeds 255, insert a zero placeholder. This brings per‑image size to ~35 KB.

Value size reduction : Most values fit in uint16; larger values are stored as int32 at the buffer tail, achieving ~21 KB per image.

Removing duplicate consecutive values : Encode runs of identical values, compressing images to ~13 KB but increasing distance time to 35 µs due to extra bit‑mask operations.

Final method : Keep groups of duplicate values together without storing repeat counts, resulting in ~14 KB per image and restoring distance time to 9 µs (≈3000 + 1300 offset traversals).

Summary of Methods

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performance optimizationimage compressionlarge-scale retrievaleuclidean distancefeature vectors
Alibaba Cloud Developer
Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.