Tagged articles

Hive Integration

6 articles

Sep 11, 2021 · Big Data

Deep Dive into Flink Table & SQL Window Functions, UDFs, and Hive Integration

This article is a comprehensive guide to Flink Table and SQL window semantics, covering group windows (tumbling, sliding, and session) and over windows. It demonstrates how to define windows in SQL, explains built‑in functions, shows how to implement scalar, table, aggregate, and table‑aggregate UDFs, and details Flink's integration with Hive, complete with Maven dependencies and runnable examples.

Flink · Hive Integration · SQL

0 likes · 27 min read

Big Data Technology & Architecture

Aug 15, 2021 · Big Data

Spark SQL Interview Guide: Concepts, APIs, Optimization and Common Pitfalls

This article provides a comprehensive overview of Spark SQL, covering its architecture, DataSet/DataFrame APIs, code examples for creating and querying datasets, join strategy selection, handling Hive tables, small‑file issues, inefficient NOT‑IN subqueries, Cartesian products, and a catalog of useful built‑in functions.

Hive Integration · Performance Optimization · Spark SQL

0 likes · 40 min read

DataFunTalk

Jun 29, 2021 · Big Data

In-depth Analysis of Flink SQL 1.13 Features and Improvements

This article provides a comprehensive overview of Apache Flink SQL 1.13, detailing new Window TVF support, cumulate windows, performance optimizations, time‑zone handling, enhanced Hive compatibility, SQL client upgrades, and DataStream‑Table conversion improvements, and outlining the roadmap for the upcoming 1.14 release.

DataStream · Flink · Hive Integration

0 likes · 15 min read

Big Data Technology & Architecture

Aug 23, 2020 · Big Data

Apache Hudi Overview, Core Concepts, and Quick‑Start Guide

This article introduces Apache Hudi, explaining its storage types, query views, timeline feature, typical use cases such as near‑real‑time ingestion and incremental pipelines, and provides a step‑by‑step Scala/Spark quick‑start guide with code examples for compiling, inserting, updating, querying, and syncing data to Hive.

Apache Hudi · Big Data · Data Lake

0 likes · 18 min read

Big Data Technology Architecture

Feb 12, 2020 · Big Data

Apache Flink 1.10 Release: New Features, Optimizations, and Kubernetes Integration

Apache Flink 1.10 introduces major performance and stability improvements, unified memory configuration, native Kubernetes session mode, enhanced Table API/SQL with production‑ready Hive integration, expanded Python UDF support, and a host of important bug fixes and connector updates, marking the largest community‑driven release to date.

Apache Flink · Hive Integration · Python

0 likes · 17 min read

Alibaba Cloud Developer

Aug 23, 2019 · Big Data

What’s New in Apache Flink 1.9.0? Deep Dive into Architecture, Table API & Hive Integration

Apache Flink 1.9.0, released on August 22, 2019, merges Alibaba's Blink engine, introduces a major architecture overhaul, enriches Table API & SQL, adds batch and stream processing enhancements, and integrates tightly with Hive, marking a significant milestone for large‑scale data processing.

Apache Flink · Batch Processing · Hive Integration

0 likes · 14 min read
