Alibaba Cloud Big Data AI Platform
Apr 17, 2026 · Big Data

What Spark 4.0 Brings: VARIANT Type, Native SQL UDFs, and Serverless Enhancements

Apache Spark 4.0 introduces a high‑performance VARIANT data type for semi‑structured JSON, native SQL UDFs that avoid Python UDF bottlenecks, a richer Python DataSource API, a new pipeline syntax, and upgraded Structured Streaming state management. Combined with Alibaba Cloud EMR Serverless optimizations, these changes deliver up to 30% speed gains and seamless migration from Spark 3.x.

Apache Spark · Python API · SQL UDF
12 min read
Big Data Technology & Architecture
Oct 13, 2025 · Databases

Apache Doris 3.1 Unveiled: Variant, Index, and Lakehouse Boosts

The Apache Doris 3.1 release strengthens lakehouse capabilities with major upgrades to the VARIANT data type, vertical compaction, inverted index storage, new tokenizers, and enhanced materialized view support for Iceberg/Paimon/Hudi. It also adds numerous query‑performance optimizations, such as faster partition pruning and dynamic partition clipping, and handles thousands of columns and large‑scale semi‑structured data more smoothly.

Apache Doris · Index · Query Performance
8 min read