Tagged articles
312 articles
Code Ape Tech Column
Dec 5, 2025 · Big Data

Optimizing 100K Record Retrieval from 10M‑Row Pools: ClickHouse, ES Scroll, ES+HBase, RediSearch

This article examines several engineering solutions for extracting up to 100,000 records from a ten‑million‑row pool, comparing multi‑threaded ClickHouse pagination, Elasticsearch scroll‑scan, an ES‑plus‑HBase hybrid, and RediSearch + RedisJSON, and presents performance measurements and practical trade‑offs.

Big Data · ClickHouse · Elasticsearch
12 min read
Huawei Cloud Developer Alliance
Oct 28, 2025 · Databases

Why HBase Can’t Connect to Zookeeper and How to Fix It

This guide explains why HBase may fail to connect to ZooKeeper in distributed storage environments and provides step‑by‑step troubleshooting, including service checks, configuration validation, network testing, log analysis, version compatibility, service restarts, and Java code examples with retry logic.

HBase · ZooKeeper · configuration
11 min read
Tech Freedom Circle
Oct 23, 2025 · Databases

Why Consistent Hashing Fails: Why Redis, HBase, TiDB and Ceph Have Dropped It

The article examines the fundamental limitations of consistent hashing—its inability to preserve data locality, support range queries, and handle topology awareness—explaining why major storage systems such as Redis Cluster, TiDB, Ceph, and HBase have adopted alternative sharding strategies like hash slots, range partitioning, and CRUSH.

CRUSH · Ceph · HBase
45 min read
Java Baker
Jul 7, 2025 · Databases

Choosing the Right Database Schema for Dynamic Business Field Expansion

This article compares five common database extension strategies—from simple MySQL column additions to a hybrid MySQL‑HBase solution—detailing their implementation, advantages, drawbacks, and ideal scenarios, helping architects select the most scalable and maintainable design for evolving business data requirements.

Database design · Dynamic Fields · HBase
8 min read
Architect
Nov 8, 2024 · Backend Development

How Ctrip Scaled Its Travel Product Log System to Billions of Records

This article traces the evolution of Ctrip’s travel product log platform—from a single‑table DB approach to a platform‑wide ES + HBase solution—detailing the challenges of massive data volume, the architectural decisions, RowKey design, write and query flows, and the subsequent extensions that enabled billion‑scale log storage and fast retrieval.

Backend Architecture · Big Data · Ctrip
17 min read
Code Ape Tech Column
Oct 21, 2024 · Big Data

Design and Optimization of Querying 100k Records from Tens of Millions Using ClickHouse, Elasticsearch, HBase, and RediSearch

This article presents a business-driven requirement to extract no more than 100,000 records from a pool of tens of millions, evaluates four technical solutions—including multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, an ES‑HBase hybrid, and RediSearch + RedisJSON—provides implementation details, performance measurements, and practical recommendations for large‑scale data querying.

Big Data · HBase · RediSearch
11 min read
Efficient Ops
Oct 13, 2024 · Databases

Why Your MySQL Queries Are Slow and How to Fix Them with Indexes, ES & HBase

This article explains why MySQL queries become slow—covering index misuse, MDL locks, flush waits, large‑table bottlenecks, and read/write splitting—then shows how Elasticsearch’s inverted index and HBase’s column‑family design can complement MySQL for faster search and scalable storage.

Database Optimization · HBase · MySQL
20 min read
dbaplus Community
Sep 24, 2024 · Backend Development

How Ctrip Scaled Its Vacation Product Log System to Billions of Records

This article recounts the evolution of Ctrip's vacation product log platform—from a single‑table DB solution to a platformized ES + HBase architecture—detailing the challenges of massive data volume, the design of RowKey, write and query flows, and the subsequent business and supplier empowerment.

Architecture · Backend · Elasticsearch
14 min read
Wukong Talks Architecture
Sep 23, 2024 · Backend Development

Evolution of the Ctrip Travel Product Log System: Architecture, Challenges, and Solutions

This article describes the development trajectory of Ctrip's travel product log system, detailing its three major phases—from a single‑table DB approach to a platform‑based solution and finally an empowered version—while discussing technical challenges, design decisions, and the implementation of HBase, Elasticsearch, and related components to handle billions of log entries efficiently.

Architecture · Backend · Big Data
15 min read
DaTaobao Tech
Sep 20, 2024 · Databases

Database Technology Evolution: From Hierarchical to Vector Databases

The article chronicles the evolution of database technology from early hierarchical and network models through relational, column‑store, document, key‑value, graph, time‑series, HTAP, and finally vector databases, detailing each system’s architecture, strengths, limitations, typical uses, and future trends toward specialization, distributed cloud‑native designs, and AI‑driven applications.

HBase · HTAP · InfluxDB
52 min read
High Availability Architecture
Sep 11, 2024 · Backend Development

Evolution of Ctrip Vacation Product Log System: From Single‑Table DB to ES + HBase Platform

This article details the evolution of Ctrip's vacation product log system—from a simple single‑table DB in 2019, through a platformized ES + HBase architecture with custom RowKey design, to a V3.0 version that adds business and supplier empowerment, scalable storage, advanced search, and flexible data presentation for billions of daily change records.

Backend · ES · HBase
13 min read
Architect
Aug 13, 2024 · Databases

Optimizing HBase for a Large‑Scale Content Platform: Selection, Performance Tuning, and Best Practices

This article examines why the unified content platform switched from MongoDB to HBase, outlines HBase’s high‑performance, scalability, and consistency features, and details four optimization techniques—including cluster upgrade, connection pooling, column‑read strategy, and compaction tuning—that significantly improved read/write latency and operational stability.

Database Optimization · HBase · NoSQL
15 min read
360 Zhihui Cloud Developer
Aug 8, 2024 · Big Data

How to Migrate HBase and HDFS Clusters Safely Without Downtime

This guide details a step‑by‑step migration plan for HBase and HDFS clusters, covering background, high‑availability architecture, role assignments, expansion and shrinkage of ZooKeeper and JournalNode, NameNode and DataNode migration, rolling restarts, and common upgrade pitfalls.

Big Data · Cluster Migration · HBase
12 min read
58 Tech
Jul 29, 2024 · Databases

HBase Cloud Migration: Architecture, Challenges, and Solutions

This technical report details the background, architecture, construction, core issues, migration plans, and future roadmap of moving 58's HBase clusters to a cloud‑native environment, highlighting cost reduction, operational automation, and performance optimizations.

Big Data · Cloud Native · HBase
22 min read
JD Cloud Developers
Jul 17, 2024 · Databases

Choosing the Right Database: MySQL, Redis, HBase, ClickHouse, MongoDB, Elasticsearch, Neo4j, Prometheus & Milvus Explained

Explore nine major database technologies—from traditional relational MySQL to NoSQL Redis, columnar HBase and ClickHouse, document-oriented MongoDB, search engine Elasticsearch, graph Neo4j, time‑series Prometheus, and vector Milvus—plus practical best‑practice guides, real‑world polyglot persistence scenarios, and recommended resources for mastering modern data storage.

ClickHouse · Elasticsearch · HBase
50 min read
JD Tech Talk
Jul 17, 2024 · Databases

A Comprehensive Guide to 9 Database Types and Polyglot Persistence

This article provides an in‑depth overview of nine major database categories—including relational, key‑value, columnar, document, graph, time‑series, and vector databases—detailing their strengths, weaknesses, best practices, and typical application scenarios, and explains how polyglot persistence combines multiple databases for optimal performance and scalability.

ClickHouse · Elasticsearch · HBase
41 min read
JD Tech
Jul 15, 2024 · Databases

A Comprehensive Overview of Nine Database Types and Polyglot Persistence Practices

This article provides an in‑depth survey of nine database categories—including relational, key‑value, columnar, document, graph, time‑series, and vector databases—detailing their architectures, advantages, disadvantages, best‑practice recommendations, typical use cases, and how they can be combined in polyglot persistence solutions.

ClickHouse · Database Types · HBase
41 min read
vivo Internet Technology
Jul 10, 2024 · Databases

HBase Optimization Practice in Vivo's Unified Content Platform

Vivo's unified content platform replaced its unwieldy 60 TB MongoDB store with HBase, then upgraded the cluster, introduced table‑specific connection pools, column‑only reads, tuned compaction, and leveraged multi‑version cells, cutting response times from seconds to under ten milliseconds and dramatically lowering operational costs while boosting read/write performance.

Columnar Storage · Compaction Optimization · Database Optimization
16 min read
vivo Internet Technology
May 8, 2024 · Databases

Troubleshooting and Repairing HBase Meta Table Issues

The article explains how HBase’s meta table stores region metadata, outlines common failures such as slow startup, RIT, region holes and overlaps, and provides step‑by‑step online and offline repair procedures—including command‑line tools and configuration tweaks—for both HBase 1.x and 2.x clusters.

HBCK · HBase · Meta Table
20 min read
ITPUB
Apr 11, 2024 · Big Data

Query 100K Items from 10M+ Records: CK, ES Scroll, HBase, RediSearch

When faced with a business requirement to filter up to 100 000 records from a pool of tens of millions and then sort and de‑duplicate them, this article explores four technical solutions—multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, a combined Elasticsearch‑HBase approach, and RediSearch with RedisJSON—detailing their design, implementation, performance testing, and trade‑offs.

Big Data · ClickHouse · Elasticsearch
12 min read
vivo Internet Technology
Jan 24, 2024 · Big Data

Evolution of Vivo's Trillions-Scale Data Architecture: Dual-Active Real-Time and Offline Computing

Vivo’s trillion‑scale data platform evolved into a dual‑active real‑time and offline architecture that leverages multi‑datacenter clusters, Kafka/Pulsar caching, a unified sorting layer, HBase‑backed dimension tables, and micro‑batch Spark jobs to deliver low‑cost, high‑performance processing, 99.9% availability, and 99.9995% data integrity.

Data Architecture · HBase · Offline Computing
16 min read
Huolala Tech
Dec 27, 2023 · Big Data

How HBase Compaction Tuning Boosts Performance at Scale

This article explains LSM‑Tree based HBase compaction concepts, compares Minor and Major compactions, and shares practical tuning steps—including disabling automatic major compactions, controlling merge size, leveraging off‑peak windows, and improving merge efficiency—to reduce I/O, CPU usage, and latency in production environments.

Big Data · Database Optimization · HBase
11 min read
Sohu Tech Products
Aug 16, 2023 · Big Data

Understanding HBase Compaction: Principles, Process, Throttling Strategies and Real‑World Optimizations

This article explains HBase’s LSM‑Tree compaction fundamentals—including minor and major compaction triggers, file‑selection policies, dynamic throughput throttling, and practical tuning examples that show how adjusting size limits, thread pools, and off‑peak settings can dramatically improve read latency and cluster stability.

Big Data · HBase · Performance Tuning
35 min read
vivo Internet Technology
Jul 26, 2023 · Big Data

Understanding HBase Compaction: Principles, Process, Throttling Strategies, and Optimization Cases

Understanding HBase compaction involves knowing its minor and major merge types, trigger mechanisms, file‑selection policies such as RatioBased and Exploring, throttling controls based on file count, and practical tuning of key parameters to avoid latency spikes, as illustrated by real‑world production cases.

Big Data · HBase · compaction
36 min read
Huolala Tech
May 25, 2023 · Big Data

How Huolala Solved HBase Bulkload Challenges: A Practical Guide

This article details Huolala’s experience building a unified Hive‑to‑HBase pipeline, addressing low development efficiency, lack of monitoring, and HBase instability by evaluating two architectures, implementing a generic Transform tool, optimizing compaction and DistCp, and establishing stability and data‑validation mechanisms.

Distcp · HBase · bulkload
12 min read
Selected Java Interview Questions
Mar 12, 2023 · Big Data

Design and Optimization of Querying 100K Records from Tens of Millions of Data Using ClickHouse, Elasticsearch, HBase, and RediSearch

This article presents a comprehensive design and performance‑optimization study for extracting up to 100 000 records from a pool of tens of millions, comparing multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, ES + HBase, and RediSearch + RedisJSON solutions, and provides practical recommendations based on measured latency and throughput.

ClickHouse · HBase · RediSearch
11 min read
Big Data Technology & Architecture
Feb 24, 2023 · Big Data

Common Flink Task Submission Issues and Solutions on YARN

This article compiles frequent Flink job submission problems on YARN, including WordCount jar errors, HBase dependency conflicts, MySQL timeout, checkpoint restoration failures, parallelism limits, and unexpected container termination, and provides root‑cause analysis and step‑by‑step remediation instructions.

Big Data · Checkpoint · Flink
21 min read
DataFunTalk
Feb 18, 2023 · Big Data

Xiaomi Data Governance Evolution: Cost Governance Practices for HDFS and HBase

The article outlines Xiaomi's data governance journey, focusing on storage‑service cost governance, describing the transition from simple cost‑centered governance to big‑data‑driven asset management, and detailing concrete HDFS and HBase practices that achieved significant resource and cost reductions.

Big Data · Data Governance · HBase
15 min read
Java Architect Essentials
Jan 31, 2023 · Big Data

Optimizing Large-Scale Data Retrieval: ClickHouse Pagination, Elasticsearch Scroll Scan, ES+HBase, and RediSearch + RedisJSON Solutions

This article examines a business requirement to filter and rank up to 100,000 records from a pool of tens of millions, presenting and evaluating four technical solutions—multithreaded ClickHouse pagination, Elasticsearch scroll‑scan deep paging, an ES‑HBase combined query, and a RediSearch + RedisJSON approach—along with performance data and code examples.

ClickHouse · Elasticsearch · HBase
12 min read
ITPUB
Jan 4, 2023 · Databases

Can Cassandra Beat RDBMS Distributed Bottlenecks? A Deep Dive into Decentralized Databases

The article traces the evolution from Codd's relational model to modern RDBMS scaling limits, explains why centralized Hadoop/HBase architectures struggle with high‑concurrency workloads, and shows how Cassandra’s decentralized design—using consistent hashing, gossip, and virtual nodes—overcomes these bottlenecks while offering flexible consistency guarantees.

Consistency · HBase · HDFS
22 min read
ITPUB
Dec 31, 2022 · Databases

Why HBase? Strengths, Weaknesses, Real‑World Scenarios, and Architecture Explained

This article examines HBase’s high reliability and performance as a column‑oriented NoSQL store, outlines its advantages and limitations, presents two practical use cases from e‑commerce, and details its data model, architecture components, and design considerations for effective deployment.

Big Data · HBase · NoSQL
12 min read
Selected Java Interview Questions
Dec 29, 2022 · Backend Development

Optimizing Large‑Scale Data Retrieval with ClickHouse, Elasticsearch Scroll Scan, ES+HBase, and RediSearch+RedisJSON

This article examines a business requirement to filter up to 100 000 records from a pool of tens of millions, presenting and evaluating four backend solutions—multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, an ES‑HBase hybrid, and RediSearch + RedisJSON—along with performance data and implementation details.

Backend · ClickHouse · Data Retrieval
11 min read
Code Ape Tech Column
Dec 28, 2022 · Big Data

Design and Optimization of Querying 100k Records from Tens of Millions Using ClickHouse, Elasticsearch, HBase, and Redis

This article presents a comprehensive analysis and multiple design alternatives—including multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, ES+HBase hybrid, and RediSearch+RedisJSON—to efficiently filter, sort, and de‑duplicate up to 100,000 records from a pool of tens of millions, with detailed performance comparisons and code examples.

HBase · query optimization · redis
10 min read
DaTaobao Tech
Oct 19, 2022 · Databases

Overview of LSM‑Tree Architecture and Its Use in Modern Databases

LSM‑Tree stores writes in an in‑memory MemTable then flushes ordered SSTables to disk, using Bloom filters and indexes to speed reads, while periodic compactions merge files; modern systems such as LevelDB, HBase, and ClickHouse adopt this design to achieve high write throughput despite slower point and range queries and occasional compaction overhead.

ClickHouse · HBase · LSM‑Tree
11 min read
DeWu Technology
Oct 10, 2022 · Big Data

Offline and Real-Time User Profile Fusion Architecture

The architecture combines a nightly batch job that generates offline user profiles stored in HBase with a Flink‑based stream layer that lazily loads those profiles on app start and creates real‑time updates, then fuses both streams into a unified, timestamp‑ordered profile in Redis, forming a Lambda‑style pipeline.

Batch Processing · Flink · HBase
10 min read
dbaplus Community
Sep 26, 2022 · Backend Development

How Ctrip Replaced HBase with VictoriaMetrics & ClickHouse for Scalable Metrics Monitoring

Ctrip’s internal Dashboard monitoring platform, originally built on HBase, was redesigned by migrating its core writer and storage components to a hybrid VictoriaMetrics‑ClickHouse solution, delivering faster queries, higher write stability, and full Prometheus compatibility while keeping the user experience unchanged.

ClickHouse · Dashboard · HBase
13 min read
JavaEdge
Sep 7, 2022 · Databases

Understanding HBase: Architecture, Data Model, and Read/Write Mechanics

This article provides a comprehensive overview of HBase, covering its column‑oriented design, core components such as HMaster, RegionServer and ZooKeeper, the data model with column families and row keys, and detailed step‑by‑step write and read processes for distributed storage.

Big Data · HBase · NoSQL
16 min read
ITPUB
Aug 29, 2022 · Backend Development

How Ctrip Replaced HBase with VictoriaMetrics & ClickHouse for Scalable Metrics Monitoring

This article details Ctrip's internal Dashboard monitoring platform, explains why its HBase‑based TSDB became a bottleneck, and describes the step‑by‑step migration to a hybrid VictoriaMetrics‑ClickHouse solution with upgraded writers, unified query APIs, performance gains, and future roadmap.

ClickHouse · HBase · Metrics
13 min read
DeWu Technology
Aug 19, 2022 · Big Data

DeWu Reach Strategy Platform and HBase Buffer Pool Architecture

The DeWu Reach Strategy platform uses a task‑strategy‑action model and an HBase‑backed buffer pool that temporarily stores billions of user records, enabling large‑scale algorithmic push, AB testing, and dynamic horizontal scaling while ensuring even data distribution and low‑latency processing.

Big Data · HBase · Reach Strategy
9 min read
Dada Group Technology
Aug 10, 2022 · Databases

Evolution of JD.com Delivery Review System Architecture and Storage Strategy

This article details the JD.com delivery review system's business scenarios, architectural evolution from a MySQL‑based design to a diversified storage stack using HBase, Redis, Elasticsearch and TiDB, and discusses the performance challenges, solutions, and future outlook for scalable, high‑availability data management.

HBase · Scalability · TiDB
15 min read
DataFunTalk
Jul 26, 2022 · Big Data

Feature Platform Architecture and Stream‑Batch Integrated Solutions

This talk presents Shuhe Technology’s feature platform, detailing its four‑layer architecture, feature storage services, stream‑batch integrated processing, event‑center design, consistency models, and four model‑strategy invocation schemes, illustrating data flows from MySQL through Sqoop, Kafka, Flink, HBase and ClickHouse.

Big Data · ClickHouse · Flink
17 min read
DeWu Technology
Jul 8, 2022 · Big Data

Optimizing Large-Scale Product Set Refresh with RoaringBitmap

By representing pre‑ and post‑refresh SPU sets as RoaringBitmaps and diffing them, the system avoids full‑insert writes, cuts memory usage by orders of magnitude, speeds up refreshes by over 50%, and reduces write volume by nearly 87%, solving large‑scale tag‑based product refresh challenges.

Bitmap · DataStructure · HBase
14 min read
Architecture Digest
Jun 7, 2022 · Big Data

Design and Optimization Strategies for Querying 100K Records from Tens of Millions Using ClickHouse, Elasticsearch, HBase, and RediSearch

This article examines a business requirement to filter up to 100,000 items from a pool of tens of millions, presenting and evaluating four technical solutions—multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, an ES‑HBase hybrid, and RediSearch + RedisJSON—along with performance data and implementation details.

HBase · RediSearch · RedisJSON
10 min read
Java Baker
Jun 7, 2022 · Databases

Mastering HBase RowKey Design: Principles, Use Cases, and Architecture

Learn why HBase outperforms MySQL for massive, historical data, explore key rowkey design principles such as composite keys, field ordering, length alignment, and hotspot mitigation, and see practical examples like cold‑hot data separation and transaction logs, plus a concise overview of HBase’s core architecture.

Database Architecture · HBase · NoSQL
5 min read
Top Architect
Jun 6, 2022 · Big Data

Optimizing Large‑Scale Data Pagination with ClickHouse, Elasticsearch, HBase, and Redis

This article presents a comprehensive analysis and multiple optimization strategies—including multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, an ES‑HBase hybrid approach, and RediSearch + RedisJSON—to efficiently filter and sort up to 100,000 records from a pool of tens of millions while reducing query latency and system complexity.

ClickHouse · HBase · Performance
11 min read
IT Architects Alliance
Jun 5, 2022 · Big Data

Optimizing 100K‑Record Queries from Tens of Millions: CK, ES, HBase & Redis Strategies

This article examines a real‑world requirement to extract no more than 100 000 rows from a pool of tens of millions, comparing multithreaded ClickHouse pagination, Elasticsearch scroll‑scan deep paging, an ES‑HBase hybrid query, and a RediSearch‑RedisJSON approach, and presents performance measurements and practical conclusions.

ClickHouse · Elasticsearch · HBase
12 min read
IT Architects Alliance
May 19, 2022 · Big Data

How Apache Kylin Enables Sub‑Second OLAP on Massive Data Sets

This article explains how Apache Kylin uses pre‑computed OLAP cubes on Hadoop/Spark/Flink to deliver sub‑second query responses on massive datasets, covering its architecture, integration with BI platforms, user security, cube building, monitoring, and HBase‑backed storage, and showing how it overcomes big‑data analytical challenges.

Apache Kylin · Big Data · HBase
12 min read
Big Data Technology & Architecture
May 12, 2022 · Databases

Understanding B+ Trees and Log‑Structured Merge (LSM) Trees and Their Use in HBase

This article explains the fundamentals of B+ trees, introduces log‑structured merge (LSM) trees as a modern alternative for write‑intensive workloads, and demonstrates how HBase leverages LSM trees—including MemStore, HFile, compaction, and Bloom filters—to achieve efficient storage and retrieval in NoSQL environments.

B+Tree · HBase · LSM‑Tree
7 min read
vivo Internet Technology
Mar 9, 2022 · Big Data

Incremental Synchronization of Massive HBase Data to a Data Warehouse: Solution Overview and Performance Evaluation

The paper proposes a generic, timeRange‑based incremental extraction method for synchronizing tens of billions of HBase rows to a data warehouse, demonstrating that it avoids full‑table scans, automatically detects schema changes, and delivers significantly lower latency than Hive mapping or timestamp‑based approaches, and has been integrated into a unified big‑data platform.

Big Data · HBase · Incremental Sync
8 min read
High Availability Architecture
Dec 16, 2021 · Big Data

iQIYI Basic Data Platform: Architecture, High Availability, and Service Practices

The iQIYI Basic Data Platform unifies internal data exchange standards, integrates massive multi‑business data, and implements high‑availability solutions for ID services, messaging, HBase storage, and read‑write scaling, showcasing practical engineering approaches to big‑data reliability and performance.

Big Data · Distributed Systems · HBase
11 min read
iQIYI Technical Product Team
Dec 3, 2021 · Big Data

iQIYI Basic Data Platform: Architecture, High Availability, and Service Practices

iQIYI’s Basic Data Platform unifies data exchange across dozens of business lines by providing massive storage, distribution, online query and offline analysis services, employing an access layer, unified management, fine‑grained governance, dual‑cluster ID generation, active‑standby HBase with MongoDB WAL, RocketMQ messaging with server‑side filtering, and horizontally scalable read replicas to ensure high availability and performance.

Data Platform · HBase · Message Queue
13 min read
Big Data Technology & Architecture
Nov 7, 2021 · Databases

Understanding Secondary Indexes and Coprocessor Solutions in HBase

This article explains the concept of secondary indexes in HBase, describes how coprocessors (including observers and endpoints) enable server‑side processing, compares coprocessor‑based solutions such as Apache Phoenix with non‑coprocessor approaches using Elasticsearch or Solr, and outlines their advantages and trade‑offs.

Big Data · Coprocessor · HBase
0 likes · 11 min read
Architecture Digest
Nov 2, 2021 · Databases

Comparative Analysis of MySQL and HBase: Architecture, Engine, and Use Cases

This article compares MySQL and HBase across architecture, storage engine, indexing structures (B+ tree vs LSM tree), data access features, and ecosystem integration, highlighting each system's strengths, limitations, and the scenarios where HBase is a suitable complement to MySQL for large‑scale data workloads.

Architecture · B+Tree · Big Data
0 likes · 9 min read
Selected Java Interview Questions
Oct 31, 2021 · Backend Development

Interview Experiences and Technical Questions from Major Chinese Tech Companies (JD, Meituan, Alibaba, Toutiao, Kuaishou)

The author, a second‑year master's student in Java backend development, summarizes interview questions and experiences from JD, Meituan, Alibaba, Toutiao and Kuaishou, covering Java concurrency, JVM locking, Netty, Redis, MySQL/HBase, distributed systems, and several algorithm problems.

Backend · HBase · Netty
0 likes · 15 min read
Big Data Technology & Architecture
Oct 30, 2021 · Databases

HBase Common Issues, Optimization Tips, and New Features in HBase 2.0

This article compiles frequently asked HBase questions, troubleshooting steps, performance optimization techniques, configuration guidance, and an overview of new HBase 2.0 features such as off‑heap memory, Procedure v2, In‑Memory Compaction, and MOB support, providing practical solutions for administrators and developers.

HBase · In-Memory Compaction · MOB
0 likes · 29 min read
dbaplus Community
Oct 21, 2021 · Databases

How We Scaled an E‑commerce Order System with Sharding, Consistent Hashing, and Zero‑Downtime Migration

This article details how a rapidly growing e‑commerce platform migrated from a single MySQL instance to a 16‑shard architecture using Sharding‑Jdbc, introduced consistent‑hashing to mitigate data skew, leveraged ES+HBase for multi‑dimensional queries, and implemented zero‑downtime migration strategies such as dual‑write and Canal replication.

Elasticsearch · HBase · MySQL
0 likes · 21 min read
IT Xianyu
Oct 14, 2021 · Databases

Comparing MySQL and HBase: Architecture, Engine, and Application Scenarios

This article compares MySQL and HBase by examining their architectural designs, storage engines, data access patterns, and ecosystem features, highlighting the strengths and trade‑offs of each system and outlining the scenarios where HBase is a suitable complement to MySQL.

Architecture · B+Tree · Big Data
0 likes · 5 min read
Big Data Technology Architecture
Oct 14, 2021 · Databases

Performance Evaluation and Optimization of HBase 2.x Write Operations

This article presents a detailed performance test of HBase 2.x write throughput on a five‑node SSD cluster, identifies latency spikes caused by MemStore flush and ConcurrentSkipListMap size() overhead, and demonstrates how fixing the bug and applying in‑memory compaction dramatically reduces P999 and P9999 latency while preserving throughput.

HBase · In-Memory Compaction · MemStore
0 likes · 10 min read
Sohu Tech Products
Aug 18, 2021 · Databases

Understanding Slow Queries, Index Optimization, and Search Solutions with MySQL, Elasticsearch, and HBase

This article explains why MySQL queries become slow, how proper indexing and index‑pushdown can improve performance, discusses common index‑failure causes, and then introduces Elasticsearch and HBase as complementary search and storage solutions for large‑scale data, including practical usage tips and architectural considerations.

Elasticsearch · HBase · MySQL
0 likes · 18 min read
Python Programming Learning Circle
Aug 12, 2021 · Databases

Understanding MySQL Slow Queries, Index Optimization, ElasticSearch Basics, and HBase Overview

This article explains why MySQL queries become slow, how proper indexing—including B+‑tree, left‑most prefix, index push‑down, and covering indexes—can improve performance, outlines common causes of index failure, and then introduces ElasticSearch search capabilities and HBase column‑family storage as complementary solutions for large‑scale data handling.

Database Optimization · Elasticsearch · HBase
0 likes · 16 min read
Big Data Technology Architecture
Aug 12, 2021 · Databases

Understanding HBase HLog and Fault Recovery Mechanisms

This article explains HBase's write path using MemStore and HLog, details the lifecycle of HLog including construction, rolling, expiration, and deletion, and thoroughly analyzes the three fault‑recovery models—Log Splitting, Distributed Log Splitting, and Distributed Log Replay—highlighting their processes, advantages, and configuration nuances.

Distributed Systems · HBase · HLog
0 likes · 14 min read
The Dominant Programmer
Aug 2, 2021 · Big Data

How to Build a Beginner Hadoop Cluster on CentOS 7

This article introduces Apache Hadoop’s open‑source framework, explains its core components such as HDFS, MapReduce, ZooKeeper, HBase, Hive, Pig, Mahout, Sqoop, Flume, Chukwa, Oozie, Ambari and YARN, and outlines the steps to set up a beginner‑level Hadoop cluster on CentOS 7.

Big Data · CentOS 7 · HBase
0 likes · 11 min read
JD Tech
Jul 30, 2021 · Databases

Practical Use of HBase in a Logistics HR Data Preprocessing Platform

This article details how the logistics HR data preprocessing platform processes around 20 million daily records by adopting HBase for high‑performance, scalable, column‑oriented storage, covering its architecture, read/write mechanisms, best practices, and performance considerations.

Big Data · HBase · NoSQL
0 likes · 10 min read
Big Data Technology Architecture
Jul 27, 2021 · Big Data

Key Components of the Big Data Ecosystem: Hadoop, Hive, HBase, Spark, Kafka, and Elasticsearch

This article introduces the most important and still mainstream components of the big data ecosystem—including Hadoop’s storage and compute framework, Hive data warehouse, HBase NoSQL database, Spark unified engine, Kafka messaging platform, and Elasticsearch search engine—explaining their core concepts, architectures, and typical use cases.

Big Data · Elasticsearch · HBase
0 likes · 9 min read
GrowingIO Tech Team
Jul 22, 2021 · Databases

How to Diagnose and Fix Common HBase RegionServer Crashes

This article examines frequent HBase RegionServer failures caused by long GC pauses, oversized scans, and HDFS decommissioning, outlines step‑by‑step troubleshooting procedures—including log searches, GC tuning, scan size limits, and monitoring strategies—and provides practical solutions to prevent and resolve these issues.

HBase · RegionServer · gc
0 likes · 14 min read
21CTO
Jul 18, 2021 · Databases

Why Your MySQL Queries Are Slow and How ElasticSearch & HBase Can Help

This article examines common causes of slow MySQL queries, explains index mechanics and failures, then compares ElasticSearch’s fast tokenized search and HBase’s column‑oriented storage, offering practical guidance on when and how to use each technology.

Big Data · Database Performance · HBase
0 likes · 21 min read
JD Retail Technology
Jul 5, 2021 · Backend Development

Design and Implementation of JD's Real-Time Browsing Record System

The article describes JD's real-time browsing record system architecture, detailing its four modules—storage, query, real-time reporting, and offline reporting—along with hot‑cold data separation, use of Jimdb, HBase, Kafka, and Flink to achieve millisecond‑level latency and high throughput for billions of user records.

Browsing · Flink · HBase
0 likes · 12 min read
MaGe Linux Operations
Jul 4, 2021 · Databases

Why MySQL Queries Slow Down and How ES & HBase Can Help Optimize

This article explores common causes of MySQL slow queries such as index misuse and lock contention, explains indexing strategies like index pushdown and covering indexes, and then compares Elasticsearch and HBase as complementary solutions for large‑scale search and write‑intensive workloads, offering practical tips for performance optimization.

Elasticsearch · HBase · MySQL
0 likes · 19 min read
Big Data Technology & Architecture
Jun 24, 2021 · Big Data

Comprehensive Overview of HBase Architecture, Design, and Operations

This article provides an in‑depth technical overview of HBase, covering its Bigtable origins, distributed column‑store design, core components such as ZooKeeper, HMaster and RegionServer, data flow, storage formats, row‑key design, bulk loading, SQL integration, indexing, coprocessors, and performance tuning for big‑data environments.

Columnar Database · HBase · HDFS
0 likes · 30 min read
Java Architect Essentials
Jun 21, 2021 · Databases

Understanding MySQL Slow Queries, Index Optimization, and Integration with Elasticsearch and HBase

This article explains why MySQL queries become slow, how index design and common pitfalls affect performance, introduces MDL locks and large‑table strategies, then compares Elasticsearch and HBase as complementary storage and search solutions, providing practical code examples and best‑practice recommendations.

Database Optimization · HBase · MySQL
0 likes · 16 min read
Architecture Digest
Jun 21, 2021 · Databases

Using HBase for HR Performance Data Preprocessing Platform: Architecture, Concepts, and Best Practices

This article introduces the HR performance data preprocessing platform’s requirements, explains why HBase was selected as the storage solution, details its core concepts, architecture, data write/read processes, best practices, limitations, and presents performance metrics demonstrating its suitability for large‑scale, high‑throughput workloads.

Big Data · Database Architecture · HBase
0 likes · 12 min read
ITFLY8 Architecture Home
Jun 20, 2021 · Big Data

Why HBase Is the Ideal Choice for Large‑Scale HR Data Preprocessing

This article explains how HBase’s distributed column‑oriented architecture, high‑performance read/write capabilities, and flexible schema make it a cost‑effective solution for handling massive, unstructured HR performance data, covering its core concepts, cluster operation, best practices, and performance metrics.

Big Data · HBase · data preprocessing
0 likes · 11 min read
Big Data Technology Architecture
Jun 16, 2021 · Big Data

HBase Read and Write Performance Optimization Guide

This guide details practical server‑side and client‑side techniques for improving HBase read and write throughput, covering rowkey design, BlockCache configuration, HFile management, compaction tuning, scan cache sizing, bulkload usage, WAL policies, and SSD storage options.

Database Tuning · HBase · read optimization
0 likes · 8 min read
Efficient Ops
Jun 9, 2021 · Databases

Why MySQL Queries Go Slow and How to Fix Them with Indexes, ES, and HBase

This article explains why MySQL queries become slow, explores index-related pitfalls and optimization techniques, and then compares ElasticSearch and HBase as complementary solutions for large‑scale data and search scenarios, offering practical tips and code examples.

Database Performance · HBase · MySQL
0 likes · 21 min read
IT Architects Alliance
Jun 5, 2021 · Big Data

How to Build a Real‑Time Recommendation System with Flink, HBase, and Docker

This article walks through a complete real‑time recommendation system built on Apache Flink, detailing its v2.0 architecture, modules for user behavior, interest, and product profiling, the recommendation algorithms (hot‑list, collaborative filtering, item similarity), and step‑by‑step Docker deployment of MySQL, Redis, HBase, and Kafka.

Docker · Flink · HBase
0 likes · 11 min read
Top Architect
May 31, 2021 · Databases

How to Achieve Fast Queries: MySQL Index Optimization, Large‑Table Strategies, Elasticsearch Basics, and HBase Overview

This article explains common causes of slow MySQL queries, how proper indexing and lock handling can improve performance, introduces Elasticsearch’s inverted‑index advantages and suitable use cases, and outlines HBase’s column‑family storage model and row‑key design for large‑scale data.

Big Data · Database Optimization · HBase
0 likes · 18 min read