Tagged articles

HBase

312 articles · Page 3 of 4

Sep 23, 2019 · Backend Development

Design and Evolution of Feed Stream Architecture for High‑Throughput Applications

This article analyzes the business requirements, technical challenges, and mainstream architectural solutions for large‑scale feed streams, and proposes a step‑by‑step evolution path—from a simple push model using cloud Kafka and HBase to hybrid push‑pull and recommendation‑driven designs—suitable for startups and rapidly growing platforms.

HBase · Kafka · backend

0 likes · 15 min read

Big Data Technology & Architecture

Sep 22, 2019 · Databases

Alibaba Cloud BDS Service for Non‑Stop HBase Cluster Migration

This article explains how Alibaba Cloud's BDS migration service enables continuous, high‑performance migration of HBase clusters—including schema, full data, and incremental sync—across version upgrades, hardware changes, network migrations, and cross‑region scenarios, while ensuring stability and minimal impact on live workloads.

Alibaba Cloud · BDS · Big Data

0 likes · 10 min read

Big Data Technology & Architecture

Sep 13, 2019 · Big Data

Differences and Relationship Between HBase and Hive in Big Data Architecture

The article explains that HBase and Hive occupy distinct roles in big‑data systems—HBase handles real‑time random queries on massive detail data, while Hive provides batch‑oriented SQL‑based processing on HDFS—and describes how they are typically combined in a data pipeline.

Big Data · Data Architecture · HBase

0 likes · 5 min read

Big Data Technology Architecture

Sep 7, 2019 · Big Data

Getting Started with Apache Phoenix: Installation, Basic Usage, and Java API

This article introduces Apache Phoenix as an open‑source SQL layer for HBase, explains its core concepts and performance benefits, provides step‑by‑step installation instructions, demonstrates basic SQL operations, and shows how to use the Phoenix Java API with code examples.

HBase · Installation · Java

0 likes · 9 min read

Big Data Technology & Architecture

Sep 5, 2019 · Databases

Understanding HBase Connection Management and Best Practices

The article explains why HBase client connections should not be pooled, describes common misuse patterns, and details how the heavyweight, thread‑safe Connection object internally manages connections to HMaster, RegionServers, and ZooKeeper, recommending a single shared Connection per application.

Big Data · HBase · Java

0 likes · 10 min read

Big Data Technology & Architecture

Sep 2, 2019 · Databases

Apache Phoenix Tutorial: Quick Start, Data Types, DML, Indexes, Salted Tables, and Advanced Features

This comprehensive guide introduces Apache Phoenix as an HBase SQL layer, covering quick‑start steps, supported data types, DML syntax, salted tables to prevent hotspots, various secondary index types, bulk‑load methods, auto‑increment IDs, dynamic columns, pagination, query plan analysis, and data migration techniques.

Apache Phoenix · Data Migration · HBase

0 likes · 33 min read

360 Tech Engineering

Aug 22, 2019 · Big Data

Design and Implementation of XStore: A Hadoop‑Based Sample Storage System

This article details the design, architecture, and operational experience of XStore, a Hadoop‑backed sample storage system that handles billions of APK and other binary samples, addressing functional and non‑functional requirements such as real‑time upload, large‑scale storage, high‑performance reads, and disaster recovery.

HBase · HDFS · Hadoop

0 likes · 11 min read

Big Data Technology Architecture

Aug 16, 2019 · Big Data

In‑Depth Overview of HBase Architecture

This article provides a comprehensive, illustrated explanation of Apache HBase's architecture, covering its master‑slave components, region management, ZooKeeper coordination, data flow for reads and writes, storage structures, compaction processes, fault recovery, and the system's strengths and limitations within the Hadoop ecosystem.

Compaction · HBase · Hadoop

0 likes · 21 min read

Big Data Technology & Architecture

Aug 8, 2019 · Big Data

Comprehensive Guide to Apache Kylin: Architecture, Concepts, Cube Design and Optimization

This article provides an in‑depth overview of Apache Kylin’s pre‑computation architecture, data‑warehouse concepts, step‑by‑step cube creation from Hive tables, and advanced optimization techniques such as derived dimensions, aggregation groups, and HBase row‑key encoding to achieve sub‑second OLAP queries on massive datasets.

Apache Kylin · Big Data · Cube

0 likes · 20 min read

Big Data Technology Architecture

Aug 5, 2019 · Big Data

Zookeeper in Distributed Systems: Roles in Kafka, Hadoop, HBase, and Solr

This article explains ZooKeeper’s core concepts and its ZAB consensus protocol, and surveys its essential roles in major big‑data components such as Kafka, Hadoop, HBase, and Solr, illustrating how it provides configuration, naming, coordination, leader election, and high‑availability services across distributed architectures.

HBase · Hadoop · Kafka

0 likes · 5 min read

360 Tech Engineering

Jul 31, 2019 · Backend Development

Design and Key Technologies of the 360 Search Engine for Billion‑Scale Web Retrieval

This article explains how 360 Search processes billions of web pages daily, detailing its backend architecture, offline indexing, online retrieval, index organization, and relevance models that enable efficient search over a hundred‑billion‑scale web corpus.

Big Data · HBase · Indexing

0 likes · 21 min read

System Architect Go

Jul 19, 2019 · Big Data

Introduction to HBase: Architecture, Data Model, and Operations

This article provides a comprehensive overview of HBase, covering its distributed column‑oriented architecture, data model components, storage mechanisms, read/write processes, WAL lifecycle, MemStore flushing, region splitting and merging, and failure recovery within the Hadoop ecosystem.

Big Data · Distributed storage · HBase

0 likes · 20 min read

dbaplus Community

Jul 18, 2019 · Databases

How JD.com Scales HBase to 90PB: Architecture, Optimizations, and Lessons

This article examines JD.com's massive HBase deployment, detailing its evolution from early adoption to a 90PB, 7,000‑node cluster, the platform's architecture, multi‑active disaster recovery, multi‑tenant isolation, and the integration of Phoenix for SQL‑based access, offering practical insights for large‑scale distributed storage.

Big Data · Database Architecture · Distributed storage

0 likes · 15 min read

DataFunTalk

Jul 17, 2019 · Big Data

BitBase: An HBase‑Based Solution for Billion‑Scale User Feature Analysis at Kuaishou

This article describes how Kuaishou built BitBase on HBase to store and compute billions of user feature logs with millisecond‑level latency, covering business requirements, technical selection, bitmap data modeling, system architecture, device‑ID handling, performance results, and future roadmap.

BitBase · Bitmap Index · HBase

0 likes · 11 min read

Big Data Technology Architecture

Jul 16, 2019 · Big Data

Optimizing HBase‑to‑Hive Data Transfer with SnapshotScanMR to Reduce RegionServer Load

The article describes how a large‑scale ETL process that previously used HBaseStorageHandler caused severe region server pressure, and how a new HBase‑to‑Hive task based on SnapshotScanMR was designed to bypass region servers, halve execution time, and double scanning performance.

ETL · HBase · Hive

0 likes · 6 min read

Big Data Technology Architecture

Jul 15, 2019 · Databases

Understanding HBase Write Path, Memstore Limits, and Preventing RegionServer OOM

This article explains the complete HBase write flow, the problems caused by rapid writes such as memstore overflow and RegionTooBusyException, and provides configuration and operational strategies to avoid RegionServer out‑of‑memory crashes.

HBase · OOM · RegionServer

0 likes · 7 min read

Zhongtong Tech

Jul 5, 2019 · Big Data

How SnapshotScanMR Doubles HBase‑to‑Hive ETL Speed and Relieves Cluster Load

This article explains how leveraging HBase's SnapshotScanMR feature to create a custom hbase2hiveBySnapshot task dramatically reduces region server pressure, halves ETL execution time, and improves cluster stability for large‑scale data back‑fill operations.

Big Data · ETL · HBase

0 likes · 6 min read

DataFunTalk

Jun 28, 2019 · Databases

Deep Dive into Phoenix Index Creation, Maintenance, and SQL Compilation

This article provides a detailed technical analysis of Phoenix's native index creation and maintenance mechanisms, the underlying source code for index building, the role of coprocessors, and the complete SQL compilation pipeline from parsing to execution, highlighting how hints and optimizers influence index usage.

Coprocessor · HBase · Phoenix

0 likes · 26 min read

dbaplus Community

Jun 2, 2019 · Databases

Why Sharding (Database Partitioning) Beats Partitioning and NoSQL for Massive Data

The article explains why sharding (splitting databases and tables) is the preferred solution for handling massive user, order, and transaction data in high‑traffic internet applications, comparing it with partitioning and NoSQL/NewSQL alternatives, and detailing practical middleware choices, sharding column selection, and integration with Elasticsearch and HBase.

Elasticsearch · HBase · MySQL

0 likes · 14 min read

Big Data Technology Architecture

Jun 1, 2019 · Big Data

Impact of Excessive HBase Partitions and How to Calculate Reasonable Region Numbers

The article explains how excessive HBase partitions can cause frequent flushes, compaction storms, high memory usage, long master assignment times, and reduced MapReduce concurrency, and provides formulas and guidelines for calculating a reasonable number of regions per RegionServer.

Big Data · Cluster stability · HBase

0 likes · 8 min read
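
As a rough illustration of the kind of calculation this summary refers to, the sketch below applies the rule of thumb commonly cited in the HBase reference guide: the global memstore budget divided by the per-region flush size times the number of column families. The article's own formulas are not reproduced in this listing, so the function name and the example figures here are assumptions, not the author's method.

```python
def max_regions_per_regionserver(heap_mb, memstore_fraction, flush_size_mb, column_families):
    """Rule-of-thumb upper bound on actively written regions per RegionServer.

    heap_mb            -- RegionServer JVM heap in MB
    memstore_fraction  -- share of heap for memstores
                          (hbase.regionserver.global.memstore.size, default 0.4)
    flush_size_mb      -- per-memstore flush threshold in MB
                          (hbase.hregion.memstore.flush.size, default 128 MB)
    column_families    -- column families per table (one memstore each)
    """
    if min(heap_mb, memstore_fraction, flush_size_mb, column_families) <= 0:
        raise ValueError("all inputs must be positive")
    global_memstore_mb = heap_mb * memstore_fraction
    # Each region needs one memstore per column family to fill before flushing.
    return int(global_memstore_mb // (flush_size_mb * column_families))


# Hypothetical example: 32 GB heap, default fraction and flush size.
one_cf = max_regions_per_regionserver(32768, 0.4, 128, 1)
two_cf = max_regions_per_regionserver(32768, 0.4, 128, 2)
print(one_cf, two_cf)
```

Going far above this bound tends to trigger the small, frequent flushes the summary mentions, because the global memstore limit is hit before any single memstore reaches its flush size.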

Big Data Technology & Architecture

May 29, 2019 · Cloud Native

Real-Time Computing Solutions with Flink and HBase: Architecture, Market Analysis, and Use Cases

The article presents Alibaba Cloud's real-time computing solution based on Flink and HBase, covering market competition, open‑source ecosystem, containerized architecture on Kubernetes, and typical applications such as online education video analysis, city‑brain traffic management, and fraud detection.

Big Data · Flink · HBase

0 likes · 12 min read

Big Data Technology & Architecture

May 28, 2019 · Databases

Introduction to Apache Phoenix: An Open‑Source SQL Layer for HBase

Apache Phoenix is an open‑source SQL layer for HBase that lets developers use standard JDBC instead of the native HBase client API, offering features such as secondary indexes, transactions, and various SQL‑level optimizations while supporting full table creation, insertion, and querying.

HBase · JDBC · Phoenix

0 likes · 7 min read

Big Data Technology Architecture

May 27, 2019 · Databases

Understanding HBase Compaction: Types, Triggers, Parameters, and Performance Impact

This article explains HBase's compaction mechanism, covering why it is needed, the differences between minor and major compaction, the conditions that trigger compaction, key configuration parameters, thread‑pool handling, compaction policies, and how compaction influences read and write performance in a large‑scale NoSQL database.

Compaction · Databases · HBase

0 likes · 12 min read

Big Data Technology Architecture

May 21, 2019 · Databases

Postmortem Analysis of a 10‑Node HBase Cluster Outage and Mitigation Measures

This article presents a detailed post‑mortem of a 10‑node HBase cluster failure caused by excessive region count and memstore pressure, analyzes HDFS and datanode log errors, and outlines configuration adjustments and operational recommendations that restored the service and prevented future outages.

Cluster Outage · Compaction · HBase

0 likes · 16 min read

Big Data Technology Architecture

May 13, 2019 · Big Data

Problems Caused by Single-Point Region Assignment in HBase and Possible Solutions

The article analyzes how HBase regions being assigned to a single RegionServer create reliability issues such as jitter, service interruptions, and data loss, examines the underlying hardware, OS, and operational factors, and proposes system optimizations and replica-based high‑availability strategies to mitigate these problems.

HBase · High Availability · Region

0 likes · 10 min read

Big Data Technology Architecture

May 8, 2019 · Databases

Understanding HBase Scan Process and Its Performance Compared to Parquet and Kudu

The article explains why HBase read operations are complex due to its LSM‑Tree storage and multi‑version design, details the step‑by‑step Scan workflow, discusses the reasons for its multi‑request architecture, compares scan performance with Parquet and Kudu, and offers recommendations for large‑scale data scanning.

Databases · HBase · LSM‑Tree

0 likes · 7 min read

Big Data Technology & Architecture

May 7, 2019 · Databases

Design and Multi‑Tenant Management of HBase at Didi

This article details Didi's use of HBase for various online and offline workloads, covering multi‑language support, data types, rowkey designs for order, trajectory and ETA scenarios, multi‑tenant resource management with DHS and RS Group, and operational best practices.

GeoHash · HBase · Multi‑tenant

0 likes · 12 min read

Big Data Technology Architecture

May 6, 2019 · Databases

An Introduction to NoSQL and HBase: Concepts, Features, and Use Cases

This article explains the fundamentals of NoSQL databases, the CAP theorem trade‑offs, and why HBase—a column‑oriented, distributed NoSQL system—offers strong consistency, automatic sharding, high availability, and seamless Hadoop integration, while also outlining its ideal scenarios and limitations.

Databases · HBase · NoSQL

0 likes · 7 min read

Alibaba Cloud Developer

Apr 19, 2019 · Databases

Mastering HBase: From Basics to Architecture and Cluster Design

This article introduces HBase, its origins from Google Bigtable, core concepts such as RowKey, Column Family, and Versioning, and explains its logical and physical table views, storage mechanisms, and cluster architecture within the Hadoop ecosystem.

Bigtable · Distributed storage · HBase

0 likes · 8 min read

Youzan Coder

Apr 17, 2019 · Big Data

Order Data Synchronization Architecture at YouZan: From MySQL to ES and HBase

YouZan’s order data synchronization moves changes from MySQL through Canal‑parsed binlogs into a message queue. It then uses sequential SeqNo‑based optimistic locking and HBase’s column‑version timestamps to guarantee ordering for both single‑ and multi‑table updates. A Logstash‑style configurable pipeline feeds ES for search and HBase for detail queries, which eliminates ordered‑queue bottlenecks and ensures high‑throughput consistency.

Binlog · Canal · Data synchronization

0 likes · 12 min read
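
The ordering idea in this summary can be sketched in a few lines: an update is applied only if its sequence number is newer than the one already stored, so messages that arrive out of order cannot overwrite fresher data. This is a minimal single-process stand-in, assuming a hypothetical `VersionedStore` class and a per-row `seq_no`; it is not the article's implementation, and a real store would need an atomic compare-and-set rather than a plain dictionary.

```python
class VersionedStore:
    """In-memory stand-in for a table that keeps (seq_no, value) per row key."""

    def __init__(self):
        self._rows = {}

    def apply(self, key, seq_no, value):
        """Apply the update only if seq_no is newer than what is stored.

        Returns True when the write was accepted, False when it was stale.
        """
        current = self._rows.get(key)
        if current is not None and current[0] >= seq_no:
            return False  # stale or duplicate message: drop it
        self._rows[key] = (seq_no, value)
        return True

    def get(self, key):
        row = self._rows.get(key)
        return None if row is None else row[1]


store = VersionedStore()
# Messages for the same order arrive out of order from the queue.
accepted = [
    store.apply("order-1", 2, "PAID"),
    store.apply("order-1", 1, "CREATED"),   # older than what is stored
    store.apply("order-1", 3, "SHIPPED"),
]
print(accepted, store.get("order-1"))
```

Because stale writes are rejected at the store, the consumer does not depend on the queue delivering messages in order.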

Youzan Coder

Apr 12, 2019 · Industry Insights

How Youzan Scaled Its Log Platform to Handle Billions of Daily Logs

This article details Youzan's evolution from a simple Flume‑based log collector to a multi‑tenant, Kafka‑buffered, Spark‑processed, HBase‑backed logging architecture that now handles hundreds of billions of log entries per day, highlighting challenges, design decisions, and future improvements.

Elasticsearch · HBase · Kafka

0 likes · 10 min read

Architecture Digest

Apr 11, 2019 · Big Data

Understanding Hadoop and HBase: Installation, Configuration, and Basic Operations

This guide introduces Hadoop and HBase fundamentals, explains their architectures and advantages, and provides step‑by‑step instructions for setting up a multi‑node Hadoop cluster, configuring core services, installing HBase, and performing basic HBase shell operations.

Big Data · HBase · Hadoop

0 likes · 18 min read

JD Retail Technology

Apr 10, 2019 · Databases

HBase at JD.com: Architecture, Use Cases, and Evolution

This article explains how JD.com leverages the open‑source HBase database for massive, low‑latency data storage across various business lines, detailing its architecture, multi‑tenant isolation, disaster‑recovery mechanisms, and integration with Phoenix SQL for OLTP workloads.

Big Data · Database Architecture · Distributed storage

0 likes · 13 min read

58 Tech

Mar 27, 2019 · Databases

OpenTSDB Architecture, Data Model, Storage Optimizations, and Practical Use Cases

This article introduces OpenTSDB as a distributed, scalable time‑series database built on HBase, explains its architecture, data model, and storage optimizations, presents real‑world monitoring use cases, analyzes performance issues caused by high‑cardinality tags, and details the solution steps taken to restore query speed.

HBase · OpenTSDB · Storage Optimization

0 likes · 9 min read

dbaplus Community

Mar 12, 2019 · Databases

Mastering HBase Cross‑Datacenter Migration: Snapshots, Architecture, and Real‑World Tips

This article provides a comprehensive technical guide on HBase, covering its core concepts, advantages and drawbacks, architecture layers, practical use cases, and a detailed step‑by‑step process for large‑scale cross‑datacenter migration using snapshot‑based strategies, with commands, diagrams, and lessons learned.

Big Data · Data Migration · Database Architecture

0 likes · 19 min read

Efficient Ops

Feb 24, 2019 · Databases

Why Row vs Column Storage Matters: Understanding HBase’s Column‑Family Model

This article explains the differences between row‑oriented and column‑oriented storage, compares their trade‑offs, and introduces HBase’s column‑family architecture, including row keys, column qualifiers, timestamps, cells, and how it maps to a multi‑dimensional map structure.

Big Data · Columnar Storage · Databases

0 likes · 7 min read

Youzan Coder

Feb 20, 2019 · Databases

HBase Read Path Analysis

The article first outlines HBase’s overall architecture and core components, then details the end‑to‑end read path—from client request routing to RegionServer processing, data organization and filtering—and finally presents practical client‑ and server‑side optimizations such as heterogeneous storage, HDFS short‑circuit, hedged reads, high‑availability reads, and warm‑up failure fixes, illustrated with Youzan’s production cluster.

HBase · Technical Guide · database

0 likes · 17 min read

JD Tech

Feb 18, 2019 · Big Data

Understanding HBase: Advantages, Use Cases, Data Model, and Architecture

This article explains HBase as a high‑performance, column‑oriented distributed storage system, outlines its advantages and limitations, presents real‑world scenarios such as seller operation logs and message logs, and details its data structures, architecture components, and design considerations for big‑data applications.

Distributed storage · HBase · NoSQL

0 likes · 9 min read

58 Tech

Jan 7, 2019 · Big Data

Comparison of Kuaishou BlobStore and 58 WOS Object Storage Systems

The article summarizes the technical talk from the 58 Group technology salon, detailing the architectures, scalability, high‑availability mechanisms, and storage models of Kuaishou's BlobStore and 58's WOS, and compares their design choices for large‑scale object storage.

BlobStore · Distributed storage · HBase

0 likes · 9 min read

58 Tech

Dec 28, 2018 · Big Data

Kylin OLAP Platform Architecture, Optimizations, and 58.com Case Study

This article introduces Kylin, an HBase‑based multidimensional analysis platform, explains its architecture and various performance optimizations, including multi‑tenant support, dimension dictionary handling, and cube size estimation, and showcases a real‑world deployment and case study at 58.com.

Cube Optimization · Data Warehouse · HBase

0 likes · 14 min read

Youzan Coder

Dec 28, 2018 · Big Data

Quantifying HBase Write Path: Disk and Network Costs for High‑Throughput Scenarios

This article analytically breaks down HBase's write pipeline, quantifies disk and network overheads for massive random writes, derives formulas for resource consumption under realistic assumptions, and offers concrete tuning recommendations to optimize throughput and reduce cost.

Big Data · HBase · Performance

0 likes · 16 min read

Beike Product & Technology

Dec 27, 2018 · Cloud Computing

HBase Ecosystem Introduction

This article introduces HBase's ecosystem, including its components like OpenTSDB for time-series data, Kylin for cube analysis, Phoenix for SQL operations, and GeoMesa for spatial data, along with the author's experience in deploying these in a production environment.

Cloud Computing · GeoMesa · HBase

0 likes · 9 min read

58 Tech

Dec 12, 2018 · Big Data

Design and Optimization of 58.com’s HBase Platform: Multi‑Tenant Support, Data Access Interfaces, Import/Export Tools, and Performance Tuning

This article details the architecture and operational enhancements of 58.com’s HBase platform, covering multi‑tenant resource isolation, various data access APIs, bulk import/export mechanisms, and a series of performance optimizations that improve stability and scalability for massive data workloads.

HBase · Multi‑tenant · data import

0 likes · 18 min read

Youzan Coder

Dec 10, 2018 · Backend Development

How Youzan Scaled Order Export to Millions with ES, HBase, and Config‑Driven Design

This article examines the challenges of Youzan's order export system, describes the migration from PHP‑based scripts to an Elasticsearch and HBase stack, and details the step‑by‑step configuration‑driven refactor—including enum field definitions, Groovy scripts, strategy patterns, plugin architecture, and quality‑assurance practices—that enabled million‑order exports with high performance and stability.

Elasticsearch · Groovy · HBase

0 likes · 13 min read

DataFunTalk

Dec 6, 2018 · Databases

HBase RowKey and Index Design: Principles, Practices, and Case Studies

This article introduces HBase fundamentals, explores effective RowKey and secondary index design principles, discusses demand analysis, presents techniques such as reversing, salting, hashing, and reviews real-world case studies for OpenTSDB, JanusGraph, and GeoMesa, offering practical guidance for scalable NoSQL data modeling.

Database Architecture · HBase · NoSQL

0 likes · 19 min read
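
The three RowKey techniques named in this summary (reversing, salting, hashing) can be illustrated with a short sketch. The function names, the `|` separator, the bucket count, and the use of MD5 are all assumptions chosen for illustration; the article's own key layouts are not shown in this listing.

```python
import hashlib


def reversed_key(key: str) -> str:
    """Reverse the key so the fast-changing tail becomes the leading bytes."""
    return key[::-1]


def salted_key(key: str, buckets: int = 16) -> str:
    """Prefix a bucket number derived from the key's hash.

    The salt is deterministic, so a point read can recompute the prefix.
    Range scans must fan out across all buckets.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}|{key}"


def hashed_key(key: str, prefix_len: int = 8) -> str:
    """Prefix a short hash of the key to spread writes across regions."""
    prefix = hashlib.md5(key.encode("utf-8")).hexdigest()[:prefix_len]
    return f"{prefix}|{key}"


phone = "13800001234"  # hypothetical monotonically assigned identifier
print(reversed_key(phone))
print(salted_key(phone))
print(hashed_key(phone))
```

All three trade scan locality for write distribution: sequential identifiers no longer land in the same region, but a range scan over the original key order is no longer a single contiguous read.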

DataFunTalk

Dec 1, 2018 · Databases

Apache HBase: Current Status, Development, Features, and Future Roadmap

This article provides a comprehensive overview of Apache HBase, covering its core architecture, key features such as automatic sharding, LSM‑Tree storage, separation of storage and compute, the ecosystem, real‑world use cases, recent 2.0 enhancements, upgrade guidance, future plans, and community recruitment information.

Database Features · HBase · NoSQL

0 likes · 10 min read

Programmer DD

Nov 18, 2018 · Databases

How We Optimized HBase for 80 Billion Daily Logs: Real‑World Tuning Strategies

This article details the practical performance‑tuning steps applied to a large‑scale HBase deployment handling 80 billion daily log entries, covering rowkey redesign, region redistribution, HDFS write‑timeout fixes, network‑topology adjustments, and JVM parameter tweaks that together stabilized the system and dramatically improved throughput.

HBase · HDFS · Performance Tuning

0 likes · 14 min read

Qunar Tech Salon

Nov 6, 2018 · Operations

Analyzing TCP Connection States and Resolving TIME_WAIT, CLOSE_WAIT, and SYN_RECV Issues in a Java/Tomcat/HBase System

This article walks through a real‑world incident where sudden traffic drops were traced to abnormal TCP states—TIME_WAIT, CLOSE_WAIT, and SYN_RECV—by examining monitoring data, explaining the TCP handshake, reviewing relevant kernel parameters, and debugging Java/ZooKeeper/HBase code to identify and fix the root cause.

HBase · SYN_RECV · TCP

0 likes · 20 min read

DataFunTalk

Oct 19, 2018 · Databases

HBase Application and High‑Availability Practices

This article summarizes the current usage of HBase at Ping An Technology, the challenges it addresses, detailed client‑ and server‑side performance and stability optimizations, high‑availability mechanisms, data migration strategies, monitoring and repair practices, and future development plans.

Data Migration · HBase · High Availability

0 likes · 9 min read

DataFunTalk

Oct 14, 2018 · Big Data

Exploring Real-Time Data Warehouse Practices Based on HBase

The article details the evolution from an offline to a real‑time HBase data warehouse, covering business scenarios, the use of Maxwell for MySQL‑to‑Kafka ingestion, Phoenix for SQL access, CDH cluster tuning, monitoring, and several production case studies.

HBase · Kafka · Phoenix

0 likes · 14 min read

DataFunTalk

Sep 29, 2018 · Big Data

Applying HBase in a Risk‑Control System and High‑Availability Practices

This article summarizes Guo Dongdong’s presentation on leveraging HBase for a risk‑control platform, detailing its architecture, data import/export mechanisms, indexing, region server recovery challenges, monitoring, SQL interception, dual‑cluster high‑availability, and future enhancements for large‑scale, low‑latency big‑data services.

HBase · High Availability · Phoenix

0 likes · 13 min read

DataFunTalk

Sep 25, 2018 · Databases

Practical Experience with HBase at NetEase: Architecture, Use Cases, and Troubleshooting

This article presents NetEase's extensive use of HBase within its big‑data platform, covering the system’s role, real‑world application scenarios, common RIT issues, HBCK repair methods, and a systematic approach to monitoring and troubleshooting performance problems.

HBCK · HBase · RIT

0 likes · 10 min read

Alibaba Cloud Developer

Sep 17, 2018 · Big Data

How Alibaba Built a Scalable Search Offline Platform for Billions of Records

This article explains how Alibaba's search offline platform combines massive batch and real‑time processing, leveraging Hadoop, HBase, Flink (Blink) and a unified full‑incremental model to handle tens of billions of daily records and millions of TPS for diverse business lines.

Alibaba Cloud · Flink · HBase

0 likes · 18 min read

Big Data and Microservices

Jul 24, 2018 · Big Data

Why Hadoop Still Leads Big Data Processing: Core Advantages Explained

This article introduces Hadoop’s open‑source big‑data framework, explains its core components HDFS and MapReduce, and outlines four key advantages—ease of deployment, robustness, scalability, and simplicity—while also covering HBase as the Hadoop‑based column‑oriented database.

Big Data · Distributed Computing · HBase

0 likes · 4 min read

DataFunTalk

Jul 13, 2018 · Big Data

Applying BitMap Indexing with HBase for Precise Marketing in Big Data

This article details a big‑data precise‑marketing solution that leverages HBase storage and Roaring BitMap indexing to efficiently handle billions of user records, describing project background, technology selection, architecture, partitioning strategy, and coprocessor implementation for fast multidimensional queries.

HBase · Indexing · Roaring Bitmap

0 likes · 13 min read
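
To show why bitmap indexes suit multidimensional audience queries, here is a minimal sketch that keeps one bitmap per (attribute, value) pair and answers a query by AND-ing bitmaps. It uses plain Python integers as uncompressed bitsets rather than Roaring bitmaps, and the `BitmapIndex` class and the sample attributes are hypothetical; it illustrates the idea only, not the article's HBase coprocessor design.

```python
from collections import defaultdict


class BitmapIndex:
    """One bitmap per (attribute, value); bit i is set when user i matches."""

    def __init__(self):
        self._bitmaps = defaultdict(int)

    def add(self, user_id: int, **attrs):
        for name, value in attrs.items():
            self._bitmaps[(name, value)] |= 1 << user_id

    def query(self, **attrs):
        """Return sorted user ids that match every given attribute value."""
        if not attrs:
            return []
        result = None
        for name, value in attrs.items():
            bitmap = self._bitmaps.get((name, value), 0)
            # Intersect bitmaps: a user must have the bit set in all of them.
            result = bitmap if result is None else result & bitmap
        return [i for i in range(result.bit_length()) if (result >> i) & 1]


index = BitmapIndex()
index.add(1, city="beijing", gender="f")
index.add(2, city="beijing", gender="m")
index.add(5, city="shanghai", gender="f")
index.add(7, city="beijing", gender="f")
print(index.query(city="beijing", gender="f"))
```

A Roaring bitmap stores the same sets in compressed containers, which is what makes this approach practical at billions of user ids.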

Architecture Digest

May 28, 2018 · Big Data

Building a Real-Time Stream Processing Platform with Hadoop Ecosystem (Kafka, Spark Streaming, HBase)

This guide details how to construct a real-time data processing platform on CentOS 7 using the Hadoop ecosystem: installing and configuring ZooKeeper, Maven, Hadoop, Kafka, HBase, Spark, and Flume, followed by a Spark Streaming job that consumes Kafka messages and writes them into HBase.

Big Data · Flume · HBase

0 likes · 14 min read

Huawei Cloud Developer Alliance

Apr 17, 2018 · Big Data

How a Big Data Platform Powers Real‑Time Facial Recognition for Billion‑Scale Face Libraries

This case study details how Beijing Hengyuanhua Information Technology Co., Ltd. (恒远华信息技术有限公司) built a dynamic face‑capture and real‑time recognition solution on Huawei FusionInsight HD, leveraging deep‑learning algorithms, distributed storage, and stream processing to handle hundreds of millions of faces with high speed, efficiency, and security.

Apache Storm · HBase · Huawei FusionInsight

0 likes · 17 min read

dbaplus Community

Apr 2, 2018 · Databases

Why Titan Outperforms Traditional RDBMS for Complex Graph Queries

The article explains how relational databases struggle with many‑to‑many and deep relationship queries, compares popular graph databases, details Titan's modular architecture, data model, Gremlin query examples, storage layout, and demonstrates its successful deployment at Paipaidai for large‑scale fraud detection, achieving over 25% efficiency gains.

Gremlin · HBase · Titan

0 likes · 10 min read

StarRing Big Data Open Lab

Mar 23, 2018 · Databases

Mastering HBase Ops: Essential Tools and Commands for Cluster Management

This guide introduces the most commonly used HBase operational tools—including Canary, hbck, HFile viewer, CopyTable, Export/Import, ImportTsv, CompleteBulkload, RowCounter, CellCounter, and clean utilities—explaining their purposes, typical use‑cases, and exact command syntax for effective cluster administration.

Big Data · Commands · HBase

0 likes · 12 min read

Hulu Beijing

Feb 28, 2018 · Big Data

How Hulu’s Nesto Engine Delivers Near‑Real‑Time OLAP on TB‑Scale Data

This article introduces Hulu's in‑house OLAP engine Nesto, detailing its near‑real‑time data ingestion, nested data model, TB‑level storage using HBase and Parquet, MPP query execution, custom predicate library, and the overall architecture that enables sub‑second ad‑hoc queries for user analytics.

Big Data · Columnar Storage · HBase

0 likes · 22 min read

Alibaba Cloud Developer

Nov 30, 2017 · Big Data

How Ali‑HBase Cut Young GC Pauses from 120 ms to 5 ms: Inside CCSMap & BucketCache

This article explains how Alibaba engineers reduced young‑generation garbage‑collection pauses in large‑scale HBase deployments from over a hundred milliseconds to just a few milliseconds by redesigning memory management with CCSMap, BucketCache, and the tenant‑aware AliGC algorithm.

AliGC · BucketCache · CCSMap

0 likes · 15 min read

dbaplus Community

Nov 26, 2017 · Databases

Understanding HBase Region Auto‑Splitting: Policies, Process, and Pitfalls

This article explains how HBase achieves scalable region auto‑splitting, detailing the various split policies, the algorithm for locating split points, the transactional split workflow, reference file handling, data migration via compaction, cleanup procedures, and common troubleshooting tips.

HBase · Reference File · Region Split

0 likes · 17 min read

Qunar Tech Salon

Nov 14, 2017 · Backend Development

Designing Distributed Systems Inspired by McDonald’s Restaurant Operations

The article uses everyday observations from a McDonald’s restaurant to illustrate core distributed system concepts such as master‑slave architecture, two‑phase commit, microservice decomposition, task queues, and container orchestration, showing how these principles apply to backend engineering.

Cassandra · HBase · Master‑Slave

0 likes · 15 min read

21CTO

Nov 11, 2017 · Big Data

How We Built a Scalable Seller Log System with Kafka, Storm, ES & HBase

This article explains the design and implementation of a unified seller‑operation logging platform that uses Kafka for ingestion, Storm for real‑time processing, Elasticsearch for hot‑data search, and HBase for cold‑data storage, detailing the challenges faced and the optimizations applied.

Big Data · Elasticsearch · HBase

0 likes · 12 min read

dbaplus Community

Oct 15, 2017 · Big Data

How JD Built a Scalable Seller Log Platform with Kafka, Storm, ES & HBase

This article details JD's end‑to‑end seller log system architecture, explaining why Kafka, Storm, Elasticsearch and HBase were chosen, the challenges faced during scaling, and the practical solutions implemented to achieve a unified, high‑throughput logging platform for merchants and operations.

Big Data · Elasticsearch · HBase

0 likes · 13 min read

21CTO

Aug 23, 2017 · Big Data

How to Build a Real-Time Customer Behavior Collection System with Storm and NSQ

This article explains the design of a real-time customer behavior collection platform that uses NSQ for messaging, Storm for streaming processing, and HBase for storage, covering architecture, data flow, reliability guarantees, and deployment considerations.

HBase · NSQ · Real-time Streaming

0 likes · 11 min read

Alibaba Cloud Developer

Aug 10, 2017 · Big Data

Alibaba’s HBase Innovations: Powering Big Data at Scale – HBaseCon 2017 Asia Insights

At HBaseCon 2017 Asia, Alibaba showcased a series of groundbreaking HBase enhancements—including strong synchronous replication, SQL-on-HBase capabilities, cross‑cluster range data copy, and read/write path optimizations—that dramatically improve performance, reliability, and usability for large‑scale big‑data storage.

Big Data · Distributed storage · HBase

0 likes · 10 min read

ITFLY8 Architecture Home

Jul 26, 2017 · Big Data

Inside Taobao’s Massive Data Architecture: From Hadoop “Cloud Ladder” to Real‑Time “Galaxy”

This article details Taobao’s multi‑layer massive data platform, covering its five‑tier architecture, the 1500‑node Hadoop “Cloud Ladder” for batch processing, the low‑latency “Galaxy” stream engine, MySQL‑based MyFOX, HBase‑based Prom storage, the glider middle‑layer, and sophisticated caching strategies that together support petabytes of data and millions of daily queries.

Big Data · Caching · HBase

0 likes · 16 min read

ITFLY8 Architecture Home

Jul 10, 2017 · Databases

How Didi Scales HBase for Real‑Time Orders, Geo‑Tracking, and ETA

This article explains how Didi leverages HBase across multiple business scenarios—including order lifecycle queries, driver‑passenger trajectory tracking, ETA calculations, and cluster monitoring—while addressing multi‑language support, rowkey design, GeoHash indexing, and multi‑tenant resource management.

Database Design · GeoHash · HBase

0 likes · 13 min read

21CTO

Jul 6, 2017 · Big Data

How HBase Boosted Tencent Monitoring Platform Performance 3‑5×

Facing the challenge of storing over 120 billion daily monitoring points from hundreds of thousands of servers, Tencent’s monitoring platform migrated from a custom solution and OpenTSDB to a finely tuned HBase architecture, achieving 3‑5× higher throughput, improved reliability, and significant storage savings.

DistributedStorage · HBase · PerformanceTuning

0 likes · 11 min read

Architecture Digest

Jul 5, 2017 · Big Data

Design and Practice of Using HBase for Massive TMP Monitoring Data Storage

This article analyzes the limitations of the original TMP monitoring storage architecture, evaluates OpenTSDB's shortcomings at large scale, and details the design, implementation, and performance tuning of a custom HBase-based solution that achieves 3‑5× higher throughput for billions of monitoring data points per day.

HBase · OpenTSDB · Performance Tuning

0 likes · 12 min read

21CTO

Jun 19, 2017 · Databases

How Didi Scales HBase for Real‑Time Orders, Geo‑Tracking, ETA and Monitoring

This article explains how Didi leverages HBase’s distributed architecture, multi‑language APIs, and custom rowkey designs to support online order queries, driver‑passenger trajectory tracking with GeoHash, real‑time ETA calculations, and a monitoring platform, while managing multi‑tenant resources through DHS and RS Group.

Didi · Distributed storage · GeoHash

0 likes · 13 min read

dbaplus Community

May 24, 2017 · Operations

How to Replace a ZooKeeper Node in a 5‑Node Cluster Without Downtime

This guide details the step‑by‑step process for replacing a faulty ZooKeeper node (myid 5) in a five‑node cluster, covering configuration updates in zoo.cfg, Hadoop’s hdfs-site.xml and yarn-site.xml, and HBase’s hbase-site.xml, plus the service restarts required to maintain continuous high availability.

HBase · Hadoop · High Availability

0 likes · 10 min read

Efficient Ops

May 2, 2017 · Big Data

Mastering ZooKeeper: Core Concepts and Real-World Big Data Applications

This article introduces ZooKeeper’s fundamental architecture, explains its key concepts such as cluster roles, sessions, ZNodes, watches, and ACLs, and then details how it powers essential distributed coordination tasks—including configuration management, naming services, master election, and distributed locks—in large‑scale Hadoop and HBase ecosystems.

Distributed Coordination · Distributed Locks · HBase

0 likes · 25 min read

Qunar Tech Salon

Apr 21, 2017 · Big Data

Ensuring Exact‑Once Semantics in Spark Streaming with Kafka: Offline Repair and Data Deduplication Strategies

This article explains why Spark Streaming combined with Kafka can only guarantee at‑least‑once delivery, outlines the challenges of delayed and out‑of‑order events, and presents practical offline‑repair, deduplication, and output‑format techniques, with code examples, for achieving exactly‑once semantics in big‑data pipelines.

Exact-Once · HBase · HDFS

0 likes · 11 min read

Alibaba Cloud Developer

Apr 1, 2017 · Databases

Alibaba’s HBase Scaling Secrets: High‑Availability, Replication, Performance

This article details how Alibaba has evolved HBase from an internal storage solution to a cloud service, covering its architecture, high‑availability design, asynchronous and synchronous replication, multi‑link data flows, cost‑effective redundancy, performance optimizations, and future development directions.

Distributed storage · HBase · replication

0 likes · 28 min read

Alibaba Cloud Developer

Mar 24, 2017 · Databases

Alibaba’s Secrets to Scaling HBase for PB‑Level Big Data

This article explains how Alibaba built, customized, and operated a massive HBase platform—covering its architecture, high‑availability design, asynchronous and synchronous replication, multi‑link data flow, cost‑aware redundancy, cross‑cluster migration, performance optimizations, and future directions for the distributed NoSQL database.

Alibaba · HBase · distributed database

0 likes · 29 min read

Tencent Cloud Developer

Mar 8, 2017 · Big Data

HBase Data Migration from Version 0.94.15 to 1.2.1: Issues and Solutions

The migration of 500 GB HBase and 5 TB Solr data from version 0.94.15 to 1.2.1 required fixing hardware clock drift, DNS hostname issues, and missing Snappy support, and demonstrated that a brute‑force HDFS transfer is more reliable than import/export when handling deprecated parameters.

Data Migration · HBase · Hadoop

0 likes · 9 min read

Meituan Technology Team

Mar 2, 2017 · Big Data

Meituan Waimai Feature Archive Platform: Architecture, Tag System, and Data Processing

Meituan Waimai’s Feature Archive platform processes billions of daily orders by managing ~200 user tags and ~400 merchant tags through a three‑layer architecture built on Hive, Elasticsearch, HBase, and MySQL. It offers visual tag selection, instant self‑service queries, full data extraction, and a predicate‑logic query language, and is designed for future extensibility.

Big Data · Elasticsearch · HBase

0 likes · 14 min read

Qunar Tech Salon

Feb 26, 2017 · Big Data

Comparative Analysis of Big Data Storage and Query Solutions

This article reviews major big‑data storage and query architectures—including HBase, Dremel/Parquet, pre‑aggregation systems, Lucene, and the custom Tindex solution—evaluating their strengths, weaknesses, and suitability for real‑time, high‑volume analytical workloads.

Big Data · HBase · Lucene

0 likes · 20 min read

Weidian Tech Team

Feb 24, 2017 · Big Data

How We Built a Scalable Dump Index Architecture for 60M Users and 1.3B Products

Facing the challenges of searching across 60 million users and 1.3 billion products, Weidian’s engineering team designed a dump‑based indexing pipeline—Ergate—that consolidates, transforms, version‑controls, and monitors data from MySQL to HBase, enabling fast, flexible, and reliable search across massive datasets.

HBase · Platformization · data indexing

0 likes · 7 min read

Tencent Cloud Developer

Dec 23, 2016 · Databases

Analysis of HBase Write-Ahead Log (WAL) Mechanism and Source Code Call Chain

The article explains HBase’s write‑ahead‑log architecture, detailing how client put/delete requests travel through RPC to the RegionServer, are processed by MultiRowMutationService, written to the WAL via FSHLog.append and sync, and finally stored in MemStore, while describing durability options and the underlying source‑code call chain.

Big Data · Distributed storage · HBase

0 likes · 10 min read

dbaplus Community

Dec 22, 2016 · Databases

Latest Features in MongoDB 3.4, Redis 3.2, HBase 1.2.4, Geode 1.0 & TiDB RC1

This article summarizes the most recent releases of major NoSQL and NewSQL databases—including MongoDB 3.4, Redis 3.0/3.2, HBase 1.2.4, Apache Geode 1.0, TiDB RC1, and the final update of RethinkDB—highlighting key new capabilities, performance improvements, and operational enhancements.

Database Releases · Geode · HBase

0 likes · 13 min read

dbaplus Community

Nov 20, 2016 · Databases

How to Slash HBase Read Latency: Proven Client, Server, and HDFS Tweaks

This article examines the common causes of high read latency in HBase—such as full GC, region‑server imbalance, low write throughput, and inefficient client settings—and provides concrete optimization steps for the client, server, column‑family design, and HDFS layers to dramatically improve performance.

Client Tuning · HBase · HDFS

0 likes · 16 min read

ITFLY8 Architecture Home

Oct 27, 2016 · Big Data

Inside Taobao’s Massive Data Architecture: How 1.5 PB Daily Is Processed and Served

The article explains Taobao’s five‑layer data product architecture—covering data sources, compute, storage, query, and product layers—and describes how massive volumes of data are ingested, processed in batch and streaming, stored in MySQL and HBase clusters, and served efficiently through a unified middle‑layer and sophisticated caching mechanisms.

Big Data · Caching · HBase

0 likes · 15 min read

Java High-Performance Architecture

Oct 23, 2016 · Databases

How to Use Phoenix SQL on HBase: Quick Guide with Code Examples

Phoenix adds a SQL layer to HBase, enabling easy table creation, data import, and complex queries via JDBC, with features like secondary indexes and integration with Spark, Hive, and more, illustrated through step‑by‑step examples and sample code.

HBase · JDBC · Phoenix

0 likes · 5 min read

Java High-Performance Architecture

Oct 18, 2016 · Databases

How HBase Locates Data and Manages Writes: Regions, Meta Table, and ZooKeeper

This article explains how HBase finds the correct region server for a given row key using the hbase:meta table, whose location is tracked in ZooKeeper, and describes the write path involving the MemStore, the HLog (write-ahead log), StoreFile creation on flush, and subsequent maintenance tasks.

HBase · Meta Table · Region

0 likes · 4 min read

Java High-Performance Architecture

Oct 17, 2016 · Databases

How Does HBase Store Massive Tables? Inside Its Architecture

HBase stores huge tables by splitting them into regions, distributing these across region servers managed by a master, and further dividing each region into column-family stores, memstores, and StoreFiles, forming a layered architecture built on Hadoop’s distributed storage.

Distributed storage · HBase · Hadoop

0 likes · 2 min read

Java High-Performance Architecture

Oct 13, 2016 · Databases

How HBase Stores Data: From Relational Tables to Column Families

This article compares traditional relational table design with HBase's column‑family model, showing how tables, rows, column families, and versioned cells are defined and how data is inserted using HBase's put commands.

Column Family · HBase · NoSQL

0 likes · 5 min read

Architecture Digest

Sep 12, 2016 · Artificial Intelligence

Design and Implementation of a Real‑Time, Highly Available General Recommendation Platform at YHD

The article describes how YHD's precision recommendation team built a real‑time, highly available, traceable general recommendation platform, detailing its background, overall architecture, visual configuration and traceability subsystems, and reporting significant improvements in development speed, reuse and user satisfaction.

AI · HBase · Kafka

0 likes · 8 min read

Ctrip Technology

Aug 26, 2016 · Big Data

Exploring OLAP Engine with Apache Kylin: Architecture, Theory, and Practical Applications in Flight Ticket Big Data

This article presents a comprehensive overview of the Qdata session on OLAP engine exploration, detailing the limitations of traditional MySQL‑based solutions, the requirements for large‑scale analytics, the architecture and theoretical foundations of Apache Kylin, its cube construction process, storage in HBase, query rewriting, real‑world flight‑ticket data applications, and the encountered challenges with corresponding optimization practices.

Apache Kylin · Cube · Data Warehouse

0 likes · 7 min read

Ctrip Technology

Aug 19, 2016 · Big Data

HBase‑Based Packet Capture and Retrieval System for Large‑Scale Network Traffic

The article presents a method that leverages HBase to capture, store, index, and quickly retrieve massive network packets, using PF_RING and libpcap for high‑performance capture and providing APIs for time‑, IP‑, protocol‑, and port‑based packet backtracking.

Big Data · HBase · PF_RING

0 likes · 7 min read

Qunar Tech Salon

Aug 16, 2016 · Big Data

Exploring OLAP Engine with Apache Kylin: Architecture, Theory, and Applications in Qunar's Big Data Platform

This article presents Qunar's experience transitioning from MySQL‑based OLAP to Apache Kylin, detailing the performance challenges, required features, Kylin's architecture and theory, cube construction process, storage mechanisms, real‑world applications, and the pitfalls and optimization practices discovered along the way.

Apache Kylin · Cube · HBase

0 likes · 6 min read

dbaplus Community

Aug 16, 2016 · Databases

How JD.com Scales Billions of Product Reviews with MySQL, HBase, and Elasticsearch

This article explains how JD.com designs a multi‑layer data storage architecture—using MySQL for core metadata, HBase and MongoDB for comment text, Solr/Elasticsearch for indexing, and Redis for caching—to handle billions of daily review requests with high performance and availability.

Comment System · Elasticsearch · HBase

0 likes · 14 min read

MaGe Linux Operations

Aug 11, 2016 · Big Data

Essential MapReduce, HBase, and Spark Configuration Parameters for Faster, More Stable Jobs

This article compiles the most frequently used configuration parameters for MapReduce, HBase, and Spark, explaining their purposes and recommended settings to improve job performance, reliability, and resource utilization in big‑data environments.

Big Data · HBase · MapReduce

0 likes · 8 min read

dbaplus Community

Aug 1, 2016 · Databases

How Facebook Scaled Its Data Storage with NoSQL: Cassandra, HBase, and Beyond

This article traces Facebook's evolution from a small social site to a global platform, explains how its massive data‑storage challenges led to the adoption of NoSQL solutions like Cassandra and HBase, and breaks down the core patterns, consistency models, and scaling techniques that power such large‑scale systems.

Cassandra · Facebook · HBase

0 likes · 15 min read

ITPUB

Jun 15, 2016 · Databases

Understanding HBase’s Physical Architecture: Regions, Stores, and WAL

This article explains HBase’s internal architecture, covering the roles of the Client, ZooKeeper, Master, and RegionServer (HRegionServer), the physical storage layout, the StoreFile and HFile structures, and the Write-Ahead Log mechanism that ensures data durability and fault tolerance.

HBase · HDFS · NoSQL

0 likes · 13 min read

Architecture Digest

Jun 9, 2016 · Databases

Understanding HBase Architecture and Core Principles

This article provides a comprehensive overview of HBase, covering its distributed architecture, component roles, data organization, read/write mechanisms, and best practices for schema and region design to ensure efficient big‑data storage and retrieval.

Big Data · HBase · RegionServer

0 likes · 17 min read

dbaplus Community

May 24, 2016 · Databases

Which NoSQL DB Fits Your Node Project? HBase, Redis, MongoDB, Couchbase, LevelDB Compared

This article provides a detailed comparison of five popular NoSQL databases—HBase, Redis, MongoDB, Couchbase, and LevelDB—covering their data models, performance characteristics, CAP classification, Node.js client options, advantages, drawbacks, and ideal use‑cases to help developers choose the right storage solution for a new Node project.

Couchbase · Database Comparison · HBase

0 likes · 28 min read

Architect

May 6, 2016 · Big Data

Integrating Kylin, Mondrian, and Saiku to Build an OLAP Analysis Tool

This article describes how the Youzan data team combined Apache Kylin, Mondrian, and Saiku into a three‑layer OLAP system, covering background, component overviews, technical architecture, schema integration challenges, count‑distinct handling, Kylin‑specific SQL quirks, and practical solutions.

Big Data · HBase · Hive

0 likes · 12 min read
