How to Use Phoenix SQL on HBase: Quick Guide with Code Examples
Phoenix adds a SQL layer to HBase, so you can create tables, import data, and run complex queries over JDBC. It also provides features such as secondary indexes and integrates with Spark, Hive, and other tools. This guide walks through step-by-step examples with sample code.
What is Phoenix
In short, Phoenix is a framework that lets us operate HBase databases using SQL.
HBase is a NoSQL database; its shell client only supports simple operations and can be confusing.
For example, returning all the data in a table from the shell produces output that is hard to read (the original article showed a screenshot of this).
Complex queries require writing programs against the native HBase API, which is cumbersome.
Using Phoenix, you can query with SQL, which is convenient and efficient, and you can add secondary indexes to improve performance, among other useful features.
Phoenix also lets application code work with HBase through JDBC, which is more convenient than the native API.
Usage Examples
Create Table
CREATE TABLE IF NOT EXISTS us_population (
state CHAR(2) NOT NULL,
city VARCHAR NOT NULL,
population BIGINT
CONSTRAINT my_pk PRIMARY KEY (state, city));
Show Tables
0: jdbc:phoenix:localhost> !tables
View in HBase:
hbase(main):041:0> list
Result:
TABLE
SYSTEM.CATALOG
SYSTEM.FUNCTION
SYSTEM.SEQUENCE
SYSTEM.STATS
US_POPULATION
...
The US_POPULATION table was created successfully.
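The table was created as us_population, yet HBase lists it as US_POPULATION. Phoenix converts unquoted identifiers to upper case, while identifiers wrapped in double quotes keep their case. The Python sketch below models only that rule; it is a simplified illustration rather than Phoenix's actual parser, and normalize_identifier is a hypothetical helper name.

```python
def normalize_identifier(name: str) -> str:
    """Simplified model of how Phoenix treats a table or column name.

    Assumption: only the basic rule is modeled (unquoted -> upper case,
    double-quoted -> case preserved). Escapes and schema prefixes are ignored.
    """
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted identifier: strip the quotes, keep the case as written.
        return name[1:-1]
    # Unquoted identifier: folded to upper case.
    return name.upper()


print(normalize_identifier("us_population"))    # US_POPULATION
print(normalize_identifier('"us_population"'))  # us_population
```

In practice this means a table created with a quoted lower-case name must be quoted the same way in every later query.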
Add Data
First create a test data file us_population.csv with the following content:
NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
TX,Houston,2016582
PA,Philadelphia,1463281
AZ,Phoenix,1461575
TX,San Antonio,1256509
CA,San Diego,1255540
TX,Dallas,1213825
CA,San Jose,912332
Run the command to import the file into the database:
./psql.py localhost us_population.csv
Query table data:
0: jdbc:phoenix:localhost> select * from US_POPULATION;
Example
Phoenix includes a small web statistics example. First import it:
bin/psql.py localhost examples/WEB_STAT.sql examples/WEB_STAT.csv
The command first executes the create-table SQL, then imports the CSV data file.
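Conceptually, a CSV import turns each line of the file into one row write, and Phoenix SQL writes rows with UPSERT rather than INSERT. The Python sketch below shows that mapping for the first three lines of the earlier us_population.csv. It only parses the text and prints the statement with its parameters; it does not connect to Phoenix, and the names rows_from_csv and UPSERT_SQL are illustrative assumptions, not part of psql.py.

```python
import csv
import io

# First three lines of us_population.csv from the "Add Data" step.
CSV_TEXT = """NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
"""

# Parameterized statement with JDBC-style "?" placeholders (illustrative).
UPSERT_SQL = "UPSERT INTO us_population (state, city, population) VALUES (?, ?, ?)"


def rows_from_csv(text):
    """Parse CSV text into (state, city, population) tuples."""
    rows = []
    for state, city, population in csv.reader(io.StringIO(text)):
        # population is declared BIGINT in the table, so convert it to int.
        rows.append((state, city, int(population)))
    return rows


rows = rows_from_csv(CSV_TEXT)
for row in rows:
    # A real loader would bind these values and execute the statement.
    print(UPSERT_SQL, row)
```

Because the primary key is (state, city), upserting a line whose state and city already exist would overwrite that row's population instead of adding a duplicate.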
Query table data:
0: jdbc:phoenix:localhost> select * from WEB_STAT;
The CORE and DB fields represent CPU and database usage.
Group by DOMAIN to view average CPU and DB usage per group:
SELECT DOMAIN, AVG(CORE) Average_CPU_Usage, AVG(DB) Average_DB_Usage
FROM WEB_STAT
GROUP BY DOMAIN
ORDER BY DOMAIN DESC;
View access counts per domain, sorted descending:
select domain,count(1) num
from web_stat
group by domain
order by num desc;
Summary
Phoenix’s core function is adding a SQL layer on top of HBase, making HBase easier to use.
It offers many valuable features such as secondary indexes, namespace mapping, views, multi‑tenant support, dynamic columns, transactions, and more.
It is now mature and integrates with Spark, Hive, Pig, MapReduce, and Flume.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.