Apache Kyuubi 1.8: New Features and Enhancements Overview

Apache Kyuubi 1.8 introduces a range of enhancements including multi‑tenant serverless SQL support on Spark and Flink, expanded batch and streaming capabilities, improved resource scheduling with database‑backed queues, stronger Kerberos/LDAP security, Flink YARN integration, and a new web UI for management.

DataFunSummit

Dec 17, 2023

Apache Kyuubi is a distributed, multi‑tenant enterprise‑grade data gateway built on Spark, Flink, Trino and other engines, providing serverless SQL services on lakehouse platforms.

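Version 1.8 serves both online analytical workloads, which connect over the Kyuubi Thrift API, and offline batch jobs, which are submitted over its REST API. The Kyuubi Server itself stays lightweight and launches engines on demand. Two concepts structure every interaction: a Session, which corresponds to a JDBC connection on the client side and a Spark session on the engine side, and an Operation, which corresponds to a statement executed within that session.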
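The relationship between the server, engines, sessions and operations can be illustrated with a small in-memory model. This is a minimal sketch, not Kyuubi's implementation: every class and method name here (`GatewayServer`, `Engine`, `Session`, `Operation`, `open_session`) is hypothetical, and it assumes one engine shared per user purely for illustration.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Engine:
    """Hypothetical stand-in for a Spark/Flink engine started on demand."""
    owner: str
    executed: List[str] = field(default_factory=list)

    def run(self, sql: str) -> str:
        self.executed.append(sql)
        return f"result of: {sql}"


@dataclass
class Operation:
    """One statement executed inside a session."""
    op_id: int
    sql: str
    state: str = "PENDING"
    result: Optional[str] = None


class Session:
    """One client connection, bound to the engine that serves it."""

    def __init__(self, session_id: int, user: str, engine: Engine) -> None:
        self.session_id = session_id
        self.user = user
        self.engine = engine
        self.operations: List[Operation] = []
        self._op_ids = count(1)

    def execute(self, sql: str) -> Operation:
        op = Operation(next(self._op_ids), sql)
        op.state = "RUNNING"
        op.result = self.engine.run(sql)
        op.state = "FINISHED"
        self.operations.append(op)
        return op


class GatewayServer:
    """Holds no compute itself; creates an engine the first time a user needs one."""

    def __init__(self) -> None:
        self.engines: Dict[str, Engine] = {}
        self._session_ids = count(1)

    def open_session(self, user: str) -> Session:
        engine = self.engines.get(user)
        if engine is None:  # lazy launch (assumption: one engine per user)
            engine = Engine(owner=user)
            self.engines[user] = engine
        return Session(next(self._session_ids), user, engine)
```

Opening two sessions for the same user reuses one engine, while a different user triggers a second launch; each `execute` call produces its own `Operation` record with a state that a client could poll.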

Batch processing is enhanced with a new Batch V2 design that introduces a database‑backed queue and a dedicated Submitter thread pool, enabling fine‑grained concurrency control, global FIFO and priority scheduling, and better HA handling.
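The queue-plus-submitter idea can be sketched with an in-memory SQLite table standing in for the shared database and a thread pool standing in for the Submitter pool. This is an illustrative sketch only: the table layout, the state names and the `BatchQueue` / `run_submitters` helpers are assumptions, not Kyuubi's actual schema or API. A process-local lock plays the role that an atomic conditional update would play in a real shared database.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor


class BatchQueue:
    """Hypothetical database-backed queue: higher priority first, FIFO within a priority."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:", check_same_thread=False)
        self._lock = threading.Lock()
        self._db.execute(
            "CREATE TABLE batch ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " name TEXT NOT NULL,"
            " priority INTEGER NOT NULL DEFAULT 0,"
            " state TEXT NOT NULL DEFAULT 'INITIALIZED')"
        )

    def enqueue(self, name, priority=0):
        with self._lock:
            cur = self._db.execute(
                "INSERT INTO batch (name, priority) VALUES (?, ?)", (name, priority)
            )
            self._db.commit()
            return cur.lastrowid

    def claim_next(self):
        """Pick the next waiting batch and mark it as taken; None if the queue is empty."""
        with self._lock:
            row = self._db.execute(
                "SELECT id, name FROM batch WHERE state = 'INITIALIZED' "
                "ORDER BY priority DESC, id ASC LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            self._db.execute("UPDATE batch SET state = 'PENDING' WHERE id = ?", (row[0],))
            self._db.commit()
            return row

    def mark(self, batch_id, state):
        with self._lock:
            self._db.execute("UPDATE batch SET state = ? WHERE id = ?", (state, batch_id))
            self._db.commit()


def run_submitters(queue, submit, workers=2):
    """Drain the queue with a fixed number of submitter threads (the concurrency limit)."""

    def loop():
        while (row := queue.claim_next()) is not None:
            batch_id, name = row
            submit(name)  # in a real system: launch the batch application
            queue.mark(batch_id, "FINISHED")

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for future in [pool.submit(loop) for _ in range(workers)]:
            future.result()  # surface any worker exception
```

Because the ordering lives in the database rather than in any one server's memory, every server instance polling the same table sees the same global order, and the size of the pool caps how many submissions run at once.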

Streaming capabilities are strengthened with Flink engine integration, including support for Flink YARN Application Mode, compatibility with Flink 1.16‑1.18, extended time‑type handling, and result‑set transmission for unbounded streams.
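As a rough illustration of how a client might request a Flink engine in YARN Application Mode, the helper below assembles a JDBC URL that carries session configuration. It is a sketch under assumptions: the helper `kyuubi_jdbc_url` is hypothetical, the host and port are placeholders, and the keys `kyuubi.engine.type` and `flink.execution.target` with the values shown should be checked against the Kyuubi and Flink documentation for the version in use.

```python
def kyuubi_jdbc_url(host, port, session_confs):
    """Build a Hive-style JDBC URL with session configs appended after '#'.

    Hypothetical helper; the URL layout is an assumption to verify against
    the Kyuubi JDBC driver documentation.
    """
    conf_part = ";".join(f"{key}={value}" for key, value in session_confs.items())
    return f"jdbc:hive2://{host}:{port}/;#{conf_part}"


# Assumed configuration keys: select the Flink SQL engine and ask Flink to
# run it as a YARN application.
url = kyuubi_jdbc_url(
    "localhost",
    10009,
    {
        "kyuubi.engine.type": "FLINK_SQL",
        "flink.execution.target": "yarn-application",
    },
)
print(url)
```

The resulting string can be passed to any Hive-compatible JDBC client; the same keys could instead be set server-side as defaults so that clients need no URL changes.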

Enterprise‑level security features are expanded: deep Kerberos adaptation, simultaneous Kerberos/LDAP authentication, Hadoop delegation token renewal, and Authz plugins for Iceberg and other lakehouse tables.

The 1.8 release also brings a new web UI for monitoring engines, sessions and operations, and updates for Java 17, Scala 2.13, Hive 3.1, and compatibility with Spark 3.1‑3.4 and Flink 1.16‑1.18.

Community contributions from NetEase, Baidu, Cisco and others continue to grow. The project graduated to an Apache top‑level project in 2023 and has been adopted by many enterprises on public clouds.

The Q&A covered ways to keep large queries from destabilizing an engine, including history‑based optimization (HBO) and exposing resource‑isolation options to end users.
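One way to picture history-based optimization is a lookup from a normalized query text to what earlier runs of that query cost, which then decides whether the next run should go to an isolated engine. The sketch below is a loose illustration of that idea, not the approach described in the talk: `fingerprint`, `QueryHistory`, the scanned-bytes metric and the `"shared"` / `"isolated"` labels are all hypothetical.

```python
import re
from collections import defaultdict
from statistics import median


def fingerprint(sql):
    """Replace literals and collapse whitespace so repeated runs share one key."""
    text = re.sub(r"'[^']*'", "?", sql)   # string literals
    text = re.sub(r"\b\d+\b", "?", text)  # standalone numbers
    return re.sub(r"\s+", " ", text).strip().lower()


class QueryHistory:
    """Remembers how much data past runs scanned and routes new runs accordingly."""

    def __init__(self, large_scan_bytes):
        self.large_scan_bytes = large_scan_bytes
        self._scans = defaultdict(list)

    def record(self, sql, scanned_bytes):
        self._scans[fingerprint(sql)].append(scanned_bytes)

    def route(self, sql):
        past = self._scans.get(fingerprint(sql))
        if not past:
            return "shared"  # no history yet: treat as an ordinary query
        if median(past) >= self.large_scan_bytes:
            return "isolated"  # historically heavy: keep it off the shared engine
        return "shared"
```

A query that scanned several terabytes yesterday would be routed to `"isolated"` today even though its date literal changed, because both runs normalize to the same fingerprint.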
