How Baidu’s LBS Cloud Storage Revolutionizes Massive Geospatial Data Retrieval

Baidu’s LBS Cloud Storage and Retrieval platform offers developers a fully managed, high‑performance solution for massive geospatial data, featuring free large‑scale storage, GeoHash‑based spatial queries, real‑time updates, strong isolation, and a series of architectural optimizations that dramatically improve latency, availability, and scalability.

Baidu Maps Tech Team

System Features

Free massive storage space, supporting tens of millions of records per table.

Efficient geospatial search using the GeoHash algorithm, handling tens of thousands of QPS.

High real‑time performance: data updates propagate to the search side within seconds.

High availability: four nines (99.99%) for storage, five nines (99.999%) for search.

Strong flexibility: customizable columns, attributes, and field participation in search.

Data safety: security mechanisms include three-copy replication and strict user isolation via AK (access key) credentials.
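
To make the GeoHash feature above concrete, here is a minimal sketch of standard GeoHash encoding. It is not Baidu's implementation; the function name `geohash_encode` and the `precision` parameter are illustrative assumptions. The encoder interleaves longitude and latitude bisection bits and emits one base-32 character per five bits, so nearby points tend to share a prefix.

```python
# Standard GeoHash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a GeoHash string (illustrative helper)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    refine_lon = True  # bits alternate: longitude first, then latitude
    while len(chars) < precision:
        if refine_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        refine_lon = not refine_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Because a shorter GeoHash is always a prefix of a longer one for the same point, a prefix match on the encoded key can serve as a coarse "same cell" filter. Points close to a cell boundary fall into different cells, so practical systems usually also check neighboring cells.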

Initial Architecture and Evolution

The platform consists of several core modules:

Control Service : authentication, traffic, and quota control; all external LBS cloud services pass through it.

Storage Access Layer : parses and forwards storage requests, optionally publishing updates to the search side.

Search Access Layer : parses search requests, converts stored column attributes into AS‑compatible queries, and forwards them to the backend search cluster.

AS (Advanced Search Unit) : receives search requests, runs DA analysis on them, and forwards them to the basic search cluster.

DA (Data Analyzer) : parses queries, including tokenization and where/what analysis.

AC (Access Controller) : routes search and incremental update messages to the appropriate basic search unit.

Build Cluster : periodically merges full and incremental indexes and pushes them to the basic search cluster.

Cloud Analysis : provides user search behavior analysis reports.

Cloud Display : visualizes search and analysis data.

Problem 1 – Index Isolation

The original design mixed all users' indexes in a single inverted list. As the user count grew, performance degraded, leading to long-tail latency and missed results.

Goal: Fully isolate users with independent index ranges and improve performance.

Solution: Redesign the full-index structure to support ordered intervals that may contain duplicate keys, and introduce a secondary index (Table.meta) that records the start position and length of each user's index range.

The new design keeps the indexes of tables A, B, C, and so on separate, reducing base search latency from 12.7 ms to 7 ms and cutting the share of requests slower than 100 ms from 2.82 % to 1.58 %.
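
The idea of a secondary index that records a start position and length per table can be sketched as follows. This is a simplified, in-memory illustration under assumed names (`Range`, `build_index`, `search`), not the actual Table.meta format: postings are sorted so each table occupies one contiguous range, and a lookup only scans inside that range.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    start: int   # offset of the table's first posting
    length: int  # number of postings that belong to the table


def build_index(postings):
    """Sort (table_id, term, doc_id) postings and record each table's range.

    The same term may appear in several tables; the duplicates are kept
    apart because every table owns its own contiguous interval.
    """
    ordered = sorted(postings)
    meta = {}
    for pos, (table_id, _term, _doc_id) in enumerate(ordered):
        if table_id not in meta:
            meta[table_id] = Range(pos, 1)
        else:
            r = meta[table_id]
            meta[table_id] = Range(r.start, r.length + 1)
    return ordered, meta


def search(ordered, meta, table_id, term):
    """Return doc ids for `term`, looking only inside the table's own range."""
    r = meta.get(table_id)
    if r is None:
        return []
    lo, hi = r.start, r.start + r.length
    # (table_id, term) sorts before every (table_id, term, doc_id) tuple,
    # so bisect_left lands on the first posting for this term, if any.
    i = bisect_left(ordered, (table_id, term), lo, hi)
    hits = []
    while i < hi and ordered[i][1] == term:
        hits.append(ordered[i][2])
        i += 1
    return hits
```

With this layout, a query for one table never walks postings that belong to another table, which is the isolation property the redesign aims for.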

Problem 2 – Access Layer Bottleneck

The initial access layer, implemented in PHP, was ten times slower than the C++ driver due to multiple storage cluster calls.

Solution: Rewrite the access layer in C++ and switch from short-lived to long-lived (persistent) connections, bringing latency down to a few tens of milliseconds.

Problem 3 – Summary Retrieval Latency

Fetching summary details after obtaining document IDs took more than 20 ms.

Goal: Reduce summary retrieval latency to 10 ms.

Solution: Introduce a Redis cache for hot tables, asynchronously sync data that missed the cache into it via message queues, and merge three user-related tables into two cached tables to eliminate one request.
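
The cache-with-asynchronous-fill pattern described above might look like the sketch below. To keep it self-contained, a plain dict stands in for Redis and `queue.Queue` stands in for the message queue; the class and method names are hypothetical and the real system's schema is not shown in the source.

```python
import queue


class SummaryStore:
    """Read-through lookup whose cache fill happens off the request path."""

    def __init__(self, backing):
        self.backing = backing           # stand-in for the slower storage cluster
        self.cache = {}                  # stand-in for the Redis cache
        self.fill_queue = queue.Queue()  # stand-in for the message queue

    def get(self, doc_id):
        """Serve from cache when possible; on a miss, read storage and enqueue a fill."""
        hit = self.cache.get(doc_id)
        if hit is not None:
            return hit
        value = self.backing.get(doc_id)
        if value is not None:
            # The request returns immediately; a consumer fills the cache later.
            self.fill_queue.put(doc_id)
        return value

    def drain_fill_queue(self):
        """Consumer side: copy queued documents into the cache, return the count."""
        filled = 0
        while True:
            try:
                doc_id = self.fill_queue.get_nowait()
            except queue.Empty:
                return filled
            value = self.backing.get(doc_id)
            if value is not None:
                self.cache[doc_id] = value
                filled += 1
```

In a deployment the consumer would run in a separate worker rather than being called inline, so the first request for a cold document pays the storage cost and later requests hit the cache.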

Problem 4 – Batch Operations Overload

Version 2 added asynchronous batch operations (upload, delete, update), which broke quota protection and caused massive task queues, stressing the build cluster.

Solution: Enforce per‑user batch quotas, discard excess tasks, and apply traffic shaping to protect real‑time updates.
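
A rough sketch of per-user batch quotas with task discarding and a simple form of traffic shaping is shown below. The class name, the `per_user_limit` parameter, and the backlog-based gate are assumptions for illustration; the source does not specify how the limits are measured or enforced.

```python
from collections import defaultdict, deque


class BatchAdmission:
    """Admit batch tasks per user up to a limit and gate them behind real-time load."""

    def __init__(self, per_user_limit):
        self.per_user_limit = per_user_limit
        self.pending = defaultdict(int)  # queued batch tasks per user
        self.tasks = deque()             # FIFO of (user, task)

    def submit(self, user, task):
        """Queue a batch task, or discard it if the user is at quota."""
        if self.pending[user] >= self.per_user_limit:
            return False  # excess task is dropped instead of growing the queue
        self.pending[user] += 1
        self.tasks.append((user, task))
        return True

    def next_task(self, realtime_backlog, max_backlog):
        """Release a batch task only while the real-time update backlog is low."""
        if realtime_backlog > max_backlog or not self.tasks:
            return None
        user, task = self.tasks.popleft()
        self.pending[user] -= 1
        return user, task
```

Discarding at submission time keeps the queue bounded by the number of users times the quota, and the backlog gate lets real-time updates take priority over batch work on the build cluster.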

Summary

The optimizations described here (index isolation, the C++ refactoring, caching, and quota management) significantly improved the latency, availability, and scalability of the LBS cloud storage and retrieval system, laying a foundation for future real-time update enhancements and richer storage capabilities.
