Tagged articles
312 articles
Qunar Tech Salon
Apr 30, 2016 · Big Data

Designing and Optimizing Log Storage and Query in HBase

This article analyzes the characteristics of log data, explains why HBase is chosen for log storage, discusses the shortcomings of self‑built indexes, and presents optimization strategies such as rowKey design, filter usage, coprocessor integration, and third‑party indexing to improve query performance.

HBase · Rowkey Design · indexing
0 likes · 12 min read
21CTO
Apr 16, 2016 · Databases

Optimizing HBase Log Queries: Index Design and RowKey Strategies

This article examines the challenges of storing and querying log data in HBase, outlines the drawbacks of custom indexing, and presents practical rowKey design, filter usage, and integration with external search engines to improve query performance.

Big Data · HBase · NoSQL
0 likes · 15 min read
ITPUB
Jan 11, 2016 · Big Data

Building Real‑Time Recommendations with Kiji: A Hands‑On Guide

This article explains how to use the open‑source Kiji framework together with HBase, Avro, and MapReduce to build a scalable, entity‑centric real‑time recommendation system that can instantly refresh suggestions based on user context and recent interactions.

Avro · HBase · Kiji
0 likes · 13 min read
High Availability Architecture
Jan 3, 2016 · Databases

HBase 2015 Technical Developments

An overview of HBase's 2015 milestones: the stable 1.0 release, a clearer client API with BufferedMutator, region replicas for high availability, per‑column‑family flush improvements, separate RPC call queues, and online configuration changes. It closes with Q&A on performance, replication, and best‑practice design considerations.

API · HBase · database
0 likes · 10 min read
21CTO
Nov 28, 2015 · Databases

Choosing the Right NoSQL Database: MongoDB, Cassandra, or HBase?

While Hadoop enjoys a strong reputation in big‑data applications, the article argues that NoSQL databases—specifically MongoDB, Cassandra, and HBase—are more widely deployed, comparing their strengths, use cases, and market popularity to help developers decide which technology best fits their needs.

HBase · NoSQL · cassandra
0 likes · 10 min read
Efficient Ops
Oct 14, 2015 · Big Data

Spark vs Hadoop, Flink, HBase/Cassandra, Kafka & Tachyon: Expert Q&A

During a lively “Sit and Discuss” session, experts compared Spark and Hadoop, evaluated Flink against Spark, contrasted HBase with Cassandra, explained why Kafka (and sometimes Flink) is preferred for distributed messaging, and shared insights on Tachyon’s role in modern big‑data ecosystems.

Flink · HBase · Hadoop
0 likes · 10 min read
21CTO
Sep 15, 2015 · Databases

Zero‑Downtime Online Data Migration: Step‑by‑Step Guide

This article explains how to migrate live service data between systems without downtime, covering migration types, a four‑stage process, practical examples with MySQL and HBase, and key techniques for ensuring consistency and smooth cut‑over.

Data Consistency · HBase · MySQL
0 likes · 11 min read
21CTO
Aug 14, 2015 · Big Data

Mastering HBase: Table Architecture, API Usage, and Performance Tuning

This article explains HBase's column‑oriented data model, demonstrates Java API examples for creating, reading, and deleting tables, and provides practical optimization techniques—including pre‑splitting, Rowkey design, ColumnFamily reduction, caching, and compaction settings—to improve read/write performance in large‑scale deployments.

Database Optimization · HBase · Java API
0 likes · 19 min read

Design Choices for Distributed Storage Metadata: Comparing GlusterFS, Hadoop, GridFS, HBase, and FastDFS

The article examines various distributed storage design approaches—decentralized (GlusterFS), centralized (Hadoop), database‑based (GridFS and HBase), and metadata‑bypassing (FastDFS)—detailing their advantages, drawbacks, and practical considerations for cloud storage systems.

FastDFS · GlusterFS · GridFS
0 likes · 17 min read

Mastering HBase: Table Structure, API Usage, and Performance Tuning

This article explains HBase's column‑oriented architecture, key concepts such as Rowkey, ColumnFamily, and Region, provides Java API examples for table operations, and offers practical optimization techniques—including pre‑splitting, Rowkey design, caching, and compaction settings—to improve read/write performance.

Big Data · Database Optimization · HBase
0 likes · 20 min read

Non‑Intrusive High‑Performance Complex Query Engine for HBase Using Secondary Multi‑Column Indexes

This article presents a non‑intrusive, high‑performance engine that adds secondary multi‑column indexes to Apache HBase, enabling efficient complex condition queries while preserving HBase's scalability, and details its principles, architecture, query API, index configuration, and practical trade‑offs.

Coprocessor · HBase · NoSQL
0 likes · 18 min read