Tagged articles
4 articles
Big Data Technology Architecture
May 22, 2022 · Big Data

Delta Lake Overview, File Structure, Metadata, and Its Integration with Alibaba Cloud EMR, DLF, G‑SCD and CDC Solutions

This article introduces Delta Lake as an open‑source storage layer for lakehouse architectures and explains its key features, file structure, and metadata. It then details how Alibaba Cloud EMR and Data Lake Formation (DLF) integrate with and extend Delta Lake through G‑SCD and CDC solutions and performance optimizations, and closes with the future roadmap.

CDC · DLF · Delta Lake
10 min read
Alibaba Cloud Developer
May 13, 2022 · Big Data

Unlocking Delta Lake: Key Features, Architecture, and EMR Integration

Delta Lake, an open‑source storage layer from Databricks, provides ACID transactions, data versioning, schema evolution, and unified batch and stream processing, backed by a well‑defined file structure and metadata mechanism. Alibaba Cloud EMR extends it with advanced DML, performance optimizations, deep DLF integration, and solutions for G‑SCD and CDC.

CDC · DLF · Data Lakehouse
11 min read
Big Data Technology & Architecture
Dec 18, 2021 · Big Data

Slowly Changing Dimensions (SCD) – Design Principles, Challenges, and Hive Implementation

This article explains the concept of Slowly Changing Dimensions (SCD), discusses practical design questions, compares three change‑tracking requirements, and presents three implementation patterns. It also provides detailed Hive/SQL examples for historical data initialization and incremental updates in large‑scale data warehouses.

Big Data · SCD · data-warehouse
20 min read