How to Use Phoenix SQL on HBase: Quick Guide with Code Examples

Phoenix adds a SQL layer on top of HBase, so you can create tables, import data, and run complex queries through SQL and JDBC. This guide walks through those steps with sample code and notes features such as secondary indexes and integration with Spark and Hive.

Java High-Performance Architecture

What is Phoenix

In short, Phoenix is a framework that lets us operate HBase databases using SQL.

HBase is a NoSQL database; its shell client only supports simple operations and can be confusing.

For example, a full-table scan in the HBase shell (shown as a screenshot in the original article) prints every stored cell on its own line.

The output is hard to read, and complex queries require writing native HBase API programs, which is cumbersome.

With Phoenix, you can query HBase with SQL and add secondary indexes to improve query performance, among other useful features.

Phoenix also provides a JDBC driver, so application code can work with HBase through standard JDBC calls, which is more convenient than the native API.
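
As a rough illustration of the JDBC route, the sketch below opens a Phoenix connection and prints the rows returned by an arbitrary query. It assumes the Phoenix client JAR is on the classpath and that the ZooKeeper quorum is reachable at the host you pass in. The class and method names (PhoenixQuery, jdbcUrl, printRows) are made up for this example and are not part of Phoenix.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class for illustration; not part of Phoenix.
final class PhoenixQuery {

    // Builds a Phoenix JDBC URL from a ZooKeeper host, e.g. "localhost".
    static String jdbcUrl(String zkHost) {
        return "jdbc:phoenix:" + zkHost;
    }

    // Runs one query, prints each row as tab-separated values,
    // and returns the number of rows read.
    static int printRows(Connection conn, String sql) throws SQLException {
        int rows = 0;
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            int cols = meta.getColumnCount();
            while (rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= cols; i++) {
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(rs.getString(i));
                }
                System.out.println(line);
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: PhoenixQuery <zookeeper-host> <sql>");
            return;
        }
        // Assumption: the Phoenix JDBC driver is on the classpath.
        try (Connection conn = DriverManager.getConnection(jdbcUrl(args[0]))) {
            int rows = printRows(conn, args[1]);
            System.out.println(rows + " row(s)");
        } catch (SQLException e) {
            System.err.println("query failed: " + e.getMessage());
        }
    }
}
```

The URL format matches the jdbc:phoenix:localhost prompt that the Phoenix command-line client displays in the examples that follow.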

Usage Examples

Create Table

CREATE TABLE IF NOT EXISTS us_population (
     state CHAR(2) NOT NULL,
     city VARCHAR NOT NULL,
     population BIGINT
     CONSTRAINT my_pk PRIMARY KEY (state, city));

Show Tables

0: jdbc:phoenix:localhost> !tables

View the tables in HBase:

hbase(main):041:0> list

Result:

TABLE
SYSTEM.CATALOG
SYSTEM.FUNCTION
SYSTEM.SEQUENCE
SYSTEM.STATS
US_POPULATION
...

US_POPULATION appears in the list, so the table was created successfully.

Add Data

First create a test data file us_population.csv with the following content:

NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
TX,Houston,2016582
PA,Philadelphia,1463281
AZ,Phoenix,1461575
TX,San Antonio,1256509
CA,San Diego,1255540
TX,Dallas,1213825
CA,San Jose,912332

Run the following command to import the file into the database:

./psql.py localhost us_population.csv

Query the table data:

0: jdbc:phoenix:localhost> select * from US_POPULATION;
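
Rows can also be written from application code instead of psql.py. The following is a minimal sketch, assuming the us_population table above exists and the Phoenix client JAR is on the classpath. Phoenix uses UPSERT rather than INSERT, and its connections do not auto-commit by default, so the sketch calls commit() explicitly. UpsertPopulation, parseLine, and upsertAll are hypothetical names chosen for this example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class for illustration; not part of Phoenix.
final class UpsertPopulation {

    static final String UPSERT_SQL =
        "UPSERT INTO us_population (state, city, population) VALUES (?, ?, ?)";

    // Splits one CSV line such as "NY,New York,8143197" into its three fields.
    static String[] parseLine(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        return new String[] { fields[0].trim(), fields[1].trim(), fields[2].trim() };
    }

    // Upserts every line and commits once at the end.
    static int upsertAll(Connection conn, String[] lines) throws SQLException {
        int written = 0;
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            for (String line : lines) {
                String[] f = parseLine(line);
                ps.setString(1, f[0]);
                ps.setString(2, f[1]);
                ps.setLong(3, Long.parseLong(f[2]));
                written += ps.executeUpdate();
            }
        }
        // Phoenix connections do not auto-commit by default.
        conn.commit();
        return written;
    }

    public static void main(String[] args) {
        String[] lines = { "NY,New York,8143197", "CA,Los Angeles,3844829" };
        // Assumption: Phoenix is reachable through ZooKeeper on localhost.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
            System.out.println("rows upserted: " + upsertAll(conn, lines));
        } catch (SQLException e) {
            System.err.println("upsert failed: " + e.getMessage());
        }
    }
}
```

Because UPSERT overwrites any existing row with the same primary key (state, city), running the sketch twice leaves the table unchanged.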

Example

Phoenix includes a small web statistics example. First import it:

bin/psql.py localhost examples/WEB_STAT.sql examples/WEB_STAT.csv

This command first runs the create-table SQL in WEB_STAT.sql, then imports the data from WEB_STAT.csv.

Query table data:

0: jdbc:phoenix:localhost> select * from WEB_STAT;

The CORE and DB columns represent CPU and database usage.

Group by DOMAIN to view average CPU and DB usage per group:

SELECT DOMAIN, AVG(CORE) Average_CPU_Usage, AVG(DB) Average_DB_Usage
FROM WEB_STAT
GROUP BY DOMAIN
ORDER BY DOMAIN DESC;

View access counts per domain, sorted descending:

select domain,count(1) num 
from web_stat 
group by domain 
order by num desc;
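
An application would usually want the result of such an aggregate query as typed values rather than console output. The sketch below runs the per-domain count over JDBC and collects it into an ordered map. It assumes the WEB_STAT example table has been loaded and the Phoenix client JAR is on the classpath. DomainHits, load, and render are hypothetical names for this example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Phoenix.
final class DomainHits {

    static final String SQL =
        "SELECT domain, COUNT(1) num FROM web_stat GROUP BY domain ORDER BY num DESC";

    // Reads (domain, count) pairs; LinkedHashMap preserves the ORDER BY order.
    static Map<String, Long> load(Connection conn) throws SQLException {
        Map<String, Long> hits = new LinkedHashMap<>();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(SQL)) {
            while (rs.next()) {
                hits.put(rs.getString(1), rs.getLong(2));
            }
        }
        return hits;
    }

    // Renders one "domain<TAB>count" line per entry.
    static String render(Map<String, Long> hits) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Long> e : hits.entrySet()) {
            out.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: Phoenix is reachable through ZooKeeper on localhost.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
            System.out.print(render(load(conn)));
        } catch (SQLException e) {
            System.err.println("query failed: " + e.getMessage());
        }
    }
}
```

Reading columns by position avoids depending on how the driver reports column labels for aliased aggregates.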

Summary

Phoenix’s core function is adding a SQL layer on top of HBase, making HBase easier to use.

It offers many valuable features such as secondary indexes, namespace mapping, views, multi‑tenant support, dynamic columns, transactions, and more.

Phoenix is now mature and integrates with Spark, Hive, Pig, MapReduce, and Flume.
