Tag

Real-time indexing

macrozheng
Apr 18, 2025 · Big Data

How to Build Near Real-Time Elasticsearch Indexes for PB-Scale Data

This article explains why traditional databases like MySQL struggle with massive data, introduces Elasticsearch’s advantages, and details a practical architecture using Hive, Canal, and Otter to achieve near real‑time indexing of petabyte‑scale datasets with minimal latency.

Big Data · Canal · Data Transfer Service
20 min read
Xianyu Technology
Sep 27, 2022 · Backend Development

Design and Real-Time Optimization of Xianyu E‑commerce Search System

The article details Xianyu’s end‑to‑end product‑search architecture, covering tokenization, indexing, the online request flow, offline index building, multi‑datacenter active‑active deployment, and the supporting ad and debugging systems. It then explains how expanding searcher capacity, separating query engines, grading updates, and diffusing auxiliary‑table writes together reduced latency from hours to near zero, enabling real‑time search.

Real-time indexing · e-commerce · scalability
11 min read
Full-Stack Internet Architecture
Apr 20, 2021 · Big Data

Building Near Real-Time Elasticsearch Indexes for PB‑Scale Data

This article explains how to construct near real‑time Elasticsearch indexes for petabyte‑level datasets by comparing MySQL limitations, describing Elasticsearch fundamentals, and detailing a pipeline that uses Hive, wide tables, MySQL binlog, Canal, and Otter to update indexes within seconds.

Big Data · Canal · Elasticsearch
18 min read
Tencent Cloud Developer
Nov 10, 2020 · Big Data

Design and Optimization of a Real-Time Video Recommendation Indexing System

The article describes a real‑time video recommendation indexing system that replaces 30‑minute batch builds with an Elasticsearch‑based service, integrates prior and posterior data pipelines, ensures consistency via locking and version checks, enables zero‑downtime upgrades, smooths write spikes, and boosts recall performance through multi‑level caching and ES tuning, delivering sub‑40 ms latency and significant business growth.

Big Data · Elasticsearch · Flink
13 min read
Big Data Technology Architecture
Oct 24, 2019 · Big Data

Real-Time Search Engine Indexing with Flink: Architecture and Implementation

This article explains how to build a real-time search engine indexing pipeline using Flink, covering background, batch versus incremental indexing strategies, a hybrid architecture that merges both approaches, and a concrete cloud‑based implementation involving MySQL binlog, Logtail, SLS, and Elasticsearch.

Big Data · Elasticsearch · Flink
5 min read
58 Tech
Jan 25, 2019 · Backend Development

Search Engineering Architecture: Lessons from Zhihu and 58 Group

The article summarizes the evolution and redesign of Zhihu's search engine, details 58 Group's high‑performance uesearch architecture, its real‑time indexing mechanisms, and its cloud‑native deployment on Kubernetes, and highlights key technical insights and future directions for large‑scale search systems.

Kubernetes · Real-time indexing · Rust
9 min read
Ctrip Technology
Jul 10, 2018 · Databases

Designing Real‑Time Sharding Index and Replication with Elasticsearch for High‑Performance Order Queries

This article describes how Ctrip's hotel R&D team tackled growing order‑volume challenges by sharding the database, building a real‑time Elasticsearch index, implementing a custom replication pipeline, and applying various write‑ and read‑optimizations to achieve low latency and stable performance.

Elasticsearch · Real-time indexing · Replication
10 min read
58 Tech
Jun 1, 2018 · Backend Development

Design and Implementation of Real-Time Indexing in 58.com’s ESearch Search Engine

This article explains how 58.com’s in‑house C++ search kernel ESearch was architected to provide real‑time indexing with latency on the order of seconds, high‑concurrency low‑latency querying, flexible ranking models, and efficient storage structures for billions of daily queries across massive classified data.

C++ · Large Scale · Real-time indexing
13 min read