Tag

Otter

macrozheng
Apr 18, 2025 · Big Data

How to Build Near Real-Time Elasticsearch Indexes for PB-Scale Data

This article explains why traditional databases like MySQL struggle with very large datasets, introduces Elasticsearch's advantages, and details a practical architecture that uses Hive, Canal, and Otter to index petabyte-scale datasets in near real time.

Big Data · Canal · Data Transfer Service
0 likes · 20 min read
Full-Stack Internet Architecture
Apr 20, 2021 · Big Data

Building Near Real-Time Elasticsearch Indexes for PB‑Scale Data

This article explains how to build near real-time Elasticsearch indexes for petabyte-scale datasets. It covers the limitations of MySQL and the fundamentals of Elasticsearch, then details a pipeline that uses Hive, wide tables, the MySQL binlog, Canal, and Otter to update indexes within seconds.

Big Data · Canal · Elasticsearch
0 likes · 18 min read