Tagged articles

MetaStore

7 articles

Sep 15, 2022 · Big Data

Bilibili Offline Platform: Migration from Hive to Spark and Large‑Scale Optimizations

This article details how Bilibili moved its offline computing platform from Hadoop‑based Hive to Spark, covering the migration process, automated SQL conversion, result verification, stability and performance enhancements, metastore optimizations, and future work on remote shuffle and vectorized execution.

Data Skipping · Hive · MetaStore

28 min read

ITPUB

Aug 1, 2022 · Big Data

How Bilibili Scaled Offline Computing: Migrating from Hive to Spark and Boosting Performance

This article details Bilibili's evolution from a Hadoop‑based offline platform to a Spark‑driven architecture, covering the Hive‑to‑Spark migration, automated SQL conversion, result validation, stability enhancements, performance tuning, metastore federation, and future directions for large‑scale data processing.

Big Data · Data Skipping · Hive

31 min read

DataFunTalk

Aug 29, 2021 · Big Data

Building and Optimizing the Offline Computing Platform at Autohome: Challenges, Solutions, and Future Plans

This article details the evolution of Autohome's offline computing platform from a 50‑node cluster in 2013 to a multi‑thousand‑node Hadoop ecosystem. It describes the performance and stability challenges, multi‑tenant operational issues, and low resource utilization the team encountered, the technical solutions adopted to address them, and the roadmap for future work.

AI on Hadoop · MetaStore · Offline Computing

11 min read

Big Data Technology Architecture

Apr 13, 2021 · Big Data

Hive Metadata Migration and Merging Tool for Consolidating Multiple Hive Metastores

This article describes how NetEase developed a Hive metadata migration and merging tool that consolidates metadata from multiple independent Hive clusters into a single Hive metastore without moving HDFS data, detailing the challenges, ID handling, database operations, and step‑by‑step migration process.

Data Migration · Hive · MetaStore

12 min read

DataFunTalk

Mar 10, 2021 · Big Data

Hive MetaStore Challenges and Optimizations at Kuaishou

Kuaishou's Hive MetaStore service, which stores Hive metadata, faced scalability and performance challenges caused by massive numbers of dynamic partitions and high query volume. This article describes the resulting architectural optimizations, including read‑write separation, API enhancements, traffic control, and federation, which improved stability and efficiency.

Big Data · Hive · Kuaishou

15 min read

Big Data Technology & Architecture

Apr 25, 2020 · Big Data

Integrating SparkSQL with Hive: Configuration, MetaStore Setup, and Example Scala Code

This article explains the differences between Spark on Hive and Hive on Spark, provides step‑by‑step instructions for configuring the Hive MetaStore and setting up SparkSQL to use Hive, and walks through a complete Scala program that creates a Hive table, loads data, and queries it.

Big Data · Data Integration · Hive

7 min read

Big Data Technology & Architecture

Apr 17, 2019 · Big Data

Step-by-Step Guide to Installing Hive 2.1.0 on a Hadoop 2.7.1 Cluster (Ubuntu 14.04)

This tutorial provides a comprehensive, step-by-step procedure for setting up Hive 2.1.0 on a Hadoop 2.7.1 cluster running Ubuntu 14.04, covering environment preparation, Hive installation, configuration of environment variables, MySQL metastore integration, client setup, service startup, and basic verification commands.

Big Data · Hadoop · Hive

8 min read
