Understanding Big Data Storage Engines and the Baidu Palo Project

This article highlights the critical role of databases in big data ecosystems, surveys existing open-source and commercial storage and query engines (Druid, Kylin, Impala, Greenplum, Vertica, and Redshift), and introduces Baidu’s open-source Palo project, asking what distinguishes it and how it performs.

Big Data Technology & Architecture

Aug 24, 2019

Big data depends heavily on storage, which makes databases a crucial component of the big data stack. The industry therefore keeps looking for better storage and query engines. Existing open-source options include Druid, Kylin, and Impala; commercial options include EMC Greenplum, HP Vertica, and AWS Redshift.

Baidu’s open-source Palo project raises three questions: what kind of database engine is it, how does it differ from the engines listed above, and how does its performance compare?



Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
