What’s New in Apache Kyuubi 1.6.0? Server, Client, and Engine Enhancements

Apache Kyuubi 1.6.0 brings major server-side upgrades, including batch (JAR) task submission through RESTful APIs and a metadata store for HA. On the client side it ships a lighter, unified JDBC driver and an enhanced Beeline, alongside updates to the Spark, Flink, Trino, and Hive engines. The article closes with the community's status and roadmap.

ITPUB

Mar 13, 2023

Overview

Apache Kyuubi is an open‑source enterprise data‑lake exploration platform that acts as a distributed, multi‑tenant gateway providing SQL query services for engines such as Spark, Flink, and Trino. It supports multi‑tenant isolation, high availability, and diverse workloads for ETL, BI reporting, interactive analytics, and batch processing.

Server‑Side Enhancements

Batch (JAR) Task Submission

Kyuubi 1.6.0 adds a RESTful API that allows users to submit batch JAR tasks. A POST request creates a batch, returning a BatchId that is attached to the Spark submit configuration and propagated to YARN. The BatchId links the Kyuubi server, YARN, and Spark, enabling log retrieval, status queries, and graceful termination via DELETE requests.
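
The create, query, and delete calls described above can be sketched as plain HTTP requests. This is a minimal illustration, not the project's client: the host, port, the /api/v1/batches path, and the JSON field names (batchType, resource, className, name, conf, args) are assumptions that should be checked against the Kyuubi REST documentation for your version. The functions only build the requests; nothing is sent.

```python
import json
import urllib.request

# Hypothetical server address and API prefix (assumptions, not from the article).
BASE_URL = "http://kyuubi.example.com:10099/api/v1"


def build_create_batch(resource, class_name, name, conf=None, args=None):
    """Build (but do not send) a POST request that creates a Spark batch."""
    payload = {
        "batchType": "SPARK",     # assumed field name and value
        "resource": resource,     # location of the application JAR
        "className": class_name,  # main class inside the JAR
        "name": name,
        "conf": conf or {},       # Spark configuration entries
        "args": args or [],       # arguments passed to the main class
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_batch(batch_id):
    """Build a GET request that queries the status of an existing batch."""
    return urllib.request.Request(f"{BASE_URL}/batches/{batch_id}", method="GET")


def build_delete_batch(batch_id):
    """Build a DELETE request that asks the server to terminate a batch."""
    return urllib.request.Request(f"{BASE_URL}/batches/{batch_id}", method="DELETE")
```

Passing any of these objects to urllib.request.urlopen would send it; the BatchId returned by the create call would then be used for the later GET and DELETE requests.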

Metadata Store for HA

A new metadata store records batch metadata (BatchId, configuration, creator node) and makes it visible to all Kyuubi nodes. In HA deployments, a load‑balancer forwards requests to any node; if the target node lacks the batch locally, it queries the metadata store to locate the originating node and fetch logs. The store also supports recovery after server restarts and asynchronous retry when the underlying MySQL store is temporarily unavailable, falling back to YARN for status when needed.
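
The lookup path in an HA deployment can be modeled with a small in-memory sketch. It is a toy model of the behavior described above, not Kyuubi's implementation: the class names, method names, and the dictionary standing in for the MySQL-backed store are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Toy stand-in for the shared store: maps batch id -> originating node name."""
    owner_by_batch: dict = field(default_factory=dict)

    def record(self, batch_id, node_name):
        self.owner_by_batch[batch_id] = node_name

    def owner_of(self, batch_id):
        return self.owner_by_batch.get(batch_id)


@dataclass
class KyuubiNode:
    """Hypothetical server node that keeps batch logs locally."""
    name: str
    store: MetadataStore
    peers: dict = field(default_factory=dict)       # node name -> KyuubiNode
    local_logs: dict = field(default_factory=dict)  # batch id -> list of log lines

    def create_batch(self, batch_id, log_lines):
        # The creating node keeps the logs and registers itself as the owner.
        self.local_logs[batch_id] = list(log_lines)
        self.store.record(batch_id, self.name)

    def get_logs(self, batch_id):
        # Serve locally when possible; otherwise ask the store who owns the batch
        # and fetch the logs from that node.
        if batch_id in self.local_logs:
            return self.local_logs[batch_id]
        owner = self.store.owner_of(batch_id)
        if owner is None:
            raise KeyError(f"unknown batch: {batch_id}")
        return self.peers[owner].get_logs(batch_id)
```

With two nodes sharing one store, a log request that the load balancer routes to the "wrong" node is still answered, because that node resolves the owner through the store first.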

Unified API and Authentication

Kyuubi 1.6.0 unifies its interfaces (Thrift, REST, JDBC, and ODBC) and supports both Kerberos and password authentication on all of them. Previously, authentication was available only over Thrift.
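
For password authentication over REST, a client would typically attach credentials to each request. The sketch below builds a standard HTTP Basic header; that the Kyuubi REST frontend accepts this scheme is an assumption here, and the function name is hypothetical. Kerberos would instead use a negotiated token, which needs a Kerberos library and is not shown.

```python
import base64


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header for the given credentials."""
    # Standard scheme: base64("username:password"), prefixed with "Basic ".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The returned dictionary can be merged into the headers of any HTTP request sent to the server.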

RESTful CLI and SDK

A new command‑line tool kyuubi-ctl and accompanying SDK simplify batch management. Commands follow the pattern kyuubi-ctl <action> batch <yaml> where actions include create, get, logs, delete, and a composite submit command.

Client‑Side Enhancements

Improved Built‑in JDBC Driver

Removed Hive and Hadoop dependencies, making the driver lighter.

Added support for Kerberos authentication via keytab.

Enhanced Beeline

Beeline now displays a Spark progress bar, showing stage‑wise execution details and overall progress.

RESTful CLI Usage

The kyuubi-ctl tool accepts a YAML file specifying the JAR location, batch type (currently Spark, with Flink support in progress), main class, arguments, and configuration, enabling one‑line batch submission without extensive local Spark setup.
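
To make the shape of such a file concrete, the sketch below renders a batch definition as YAML text. The key names (apiVersion, username, request, batchType, name, resource, className, args, configs) are assumptions based on the fields listed above and should be checked against the kyuubi-ctl documentation; the function name and sample values are hypothetical. Values are written unquoted, so it only suits simple strings and numbers.

```python
def render_batch_yaml(name, resource, class_name, args, configs, username):
    """Render a batch definition as YAML text (simple, unquoted values only)."""
    lines = [
        "apiVersion: v1",
        f"username: {username}",
        "request:",
        "  batchType: Spark",          # the article says Spark is the supported type
        f"  name: {name}",
        f"  resource: {resource}",     # JAR location
        f"  className: {class_name}",  # main class
        "  args:",
    ]
    lines += [f"    - {arg}" for arg in args]
    lines.append("  configs:")
    lines += [f"    {key}: {value}" for key, value in configs.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_batch_yaml(
        name="SparkPi",
        resource="/path/to/spark-examples.jar",  # hypothetical path
        class_name="org.apache.spark.examples.SparkPi",
        args=[10],
        configs={"spark.master": "yarn"},
        username="alice",
    ))
```

Saving the output to a file and handing it to kyuubi-ctl with the create or submit action is the one-line submission the article describes.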

Engine Plugins

Spark Engine

Supports Spark 3.0‑3.3 across all deployment modes (local, standalone, YARN, K8s) and both client and cluster modes.

Enterprise‑grade plugins: automatic small‑file merging, max partition scan limit, result size limit, Z‑Order optimization, TPC‑DS/TPC‑H connectors, and Authz plugin.

Ongoing development of lineage plugins.
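
Of the plugins listed, Z-Order optimization is the least self-explanatory. The general idea is to interleave the bits of several column values into a single sort key, so that rows close together in every dimension end up close together on disk. The sketch below shows that bit interleaving for two non-negative integers; it illustrates the general technique only and is not Kyuubi's implementation.

```python
def z_order_key(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


# Sorting points by this key visits a 2x2 grid in the characteristic "Z" pattern.
points = [(1, 1), (0, 1), (1, 0), (0, 0)]
ordered = sorted(points, key=lambda p: z_order_key(p[0], p[1]))
```

Sorting a table by such a key before writing tends to cluster similar values in the same files, which is what lets a query engine skip files when filtering on either column.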

Flink Engine

Supports Flink 1.14 and 1.15 (1.16 pending).

Deployment modes: Local, YARN (PerJob and Session); Application mode on YARN/K8s slated for 1.7.0.

Trino and Hive/JDBC Engines

Trino Engine is production‑ready and widely used; Hive and JDBC engines are available in beta, inviting community feedback.

Community Outlook

Kyuubi is preparing to graduate from the Apache Incubator and is currently in the graduation voting phase. Since 2021 the project has released four major versions; version 1.6.0 completes the planned roadmap and delivers most of the enterprise-grade features.

Community statistics: 12 PPMC members, 17 committers, more than 96 contributors, 65+ dev mailing list subscribers, four community meetups, and participation in more than 20 events such as ApacheCon.

To date, eight releases have been published, encompassing over 1,400 merged pull requests and the resolution of more than 900 issues.

The project continues to evolve toward a “Serverless SQL on Lakehouse” vision, with future plans to add more enterprise‑level capabilities.
