Tag

ByteHouse


ByteDance Data Platform
Jan 9, 2025 · Databases

Why ByteHouse’s GIS Engine Beats Traditional Spatial Databases in Real‑World Analytics

This article explains how ByteHouse integrates high‑performance GIS capabilities into its OLAP engine, describes its spatial indexing architecture, showcases benchmark results against ClickHouse, StarRocks, PostGIS and DuckDB using the NYC Taxi dataset, and outlines when to choose ByteHouse versus other spatial database solutions.

ByteHouse · GIS · Geospatial Analytics
0 likes · 11 min read
Rare Earth Juejin Tech Community
Jan 8, 2025 · Databases

ByteHouse GIS: High‑Performance Geospatial Analytics and Benchmark Comparison with ClickHouse, StarRocks, PostGIS, and DuckDB

The article explains ByteHouse's GIS capabilities, describing its R‑Tree and Google S2 spatial index implementation, OGC‑compatible data types and functions, and presents benchmark results that show ByteHouse outperforming ClickHouse, StarRocks, PostGIS, and DuckDB on key geospatial queries.

ByteHouse · GIS · OLAP
0 likes · 13 min read
ByteDance Data Platform
Oct 16, 2024 · Databases

How ByteHouse Boosted Sales Data Platform Queries Up to 16× with ACL and Optimizer

This article examines a fast‑growing company's sales data platform, outlines the data‑access pain points caused by ACL permissions, describes the migration from ClickHouse to ByteHouse, details the optimizer’s rule‑based, cost‑based, and distributed‑plan enhancements, and presents benchmark results showing query speedups of up to sixteen times.

ACL · ByteHouse · OLAP
0 likes · 16 min read
DataFunTalk
May 9, 2024 · Databases

ByteHouse Vector Search Technical Guide: Architecture, Design, and Performance Optimizations

This guide explains ByteHouse’s high‑performance vector search capabilities, covering the background of vector retrieval for LLMs, the limitations of its existing skip‑index architecture, the new vector‑index design with HNSW and IVF, query‑time optimizations, performance benchmarks against Milvus, and future development plans.

ByteHouse · Indexing · LLM
0 likes · 8 min read
DataFunTalk
Apr 15, 2024 · Databases

ByteHouse Cloud‑Native Data Warehouse Performance Whitepaper: Architecture, Optimizations, and Benchmark Results

The ByteHouse performance whitepaper details the cloud‑native data warehouse’s architecture, rule‑based and cost‑based optimizer enhancements, exchange runtime, runtime filters, parallelism and wide‑table optimizations, and presents benchmark comparisons on TPC‑DS, TPC‑H and SSB datasets demonstrating orders‑of‑magnitude query speed improvements.

ByteHouse · Data Warehouse · OLAP
0 likes · 17 min read
ByteDance Data Platform
Mar 6, 2024 · Databases

How ByteHouse Boosted Douyin’s Interest Circle Queries by 100×

This article explains how Douyin rebuilt its interest‑circle platform by replacing MySQL with the columnar OLAP engine ByteHouse, achieving roughly a hundred‑fold improvement in query speed, lower hardware costs, and seamless horizontal scalability for massive daily data volumes.

ByteHouse · ColumnarStorage · DataWarehouse
0 likes · 10 min read
ByteDance Data Platform
Dec 27, 2023 · Databases

How ByteHouse Redefines Cloud‑Native Data Warehousing for Real‑Time Analytics

This article details ByteHouse's evolution from a ClickHouse‑based OLAP engine to a cloud‑native, massively parallel data warehouse, highlighting its distributed and cloud‑native architectures, enhanced table engines, HaKafka and Materialized MySQL extensions, and real‑world use cases in short‑video, marketing and gaming analytics.

Big Data · ByteHouse · HaKafka
0 likes · 20 min read
ByteDance Data Platform
Jul 5, 2023 · Cloud Native

How to Seamlessly Integrate ByteHouse Cloud Data Warehouse with Apache Airflow

This guide explains how to combine ByteHouse's cloud‑native data warehouse with Apache Airflow to build scalable, automated, and easy‑to‑manage data pipelines, covering business scenarios, data flow, and step‑by‑step installation and DAG creation.

Apache Airflow · ByteHouse · DAG
0 likes · 10 min read
DataFunTalk
Jul 4, 2023 · Big Data

Integrating Apache Airflow with ByteHouse: A Step‑by‑Step Guide

This guide explains how to integrate Apache Airflow with ByteHouse, highlighting scalability, automated workflow management, and simple deployment, and provides a step‑by‑step tutorial—including prerequisites, installation, configuration, DAG creation, and execution commands—to build a robust data pipeline for analytics and machine learning.

Apache Airflow · ByteHouse · ETL
0 likes · 10 min read
DataFunSummit
May 30, 2023 · Big Data

DataFunCon Conference – OLAP, StarRocks, ClickHouse, and ByteHouse Technical Sessions

The DataFunCon conference showcases leading experts from Ctrip, Didi, Bilibili, and ByteDance presenting next‑generation OLAP technologies such as StarRocks, ClickHouse, and ByteHouse, covering architecture, materialized views, ELT practices, and performance optimization to guide practitioners in big‑data platform selection and implementation.

Big Data · ByteHouse · ClickHouse
0 likes · 7 min read
DataFunTalk
Mar 29, 2023 · Big Data

Evolution of ByteHouse Real‑Time Ingestion: From Internal Demands to a Cloud‑Native Architecture

This article details the motivation, architectural evolution, and technical implementation of ByteHouse's real‑time ingestion pipeline, covering internal business requirements, distributed‑system challenges, the custom HaKafka engine, memory‑table optimizations, and the transition to a cloud‑native design that delivers high availability, low latency, and exactly‑once semantics.

ByteHouse · High Availability · Kafka
0 likes · 13 min read
ByteDance Data Platform
Feb 15, 2023 · Databases

How ByteHouse Powers Real‑Time Data Warehousing at Scale

ByteHouse, a cloud‑native data warehouse built on ClickHouse, delivers ultra‑fast real‑time and massive offline analytics with elastic scaling, addressing business needs in ByteDance and the financial sector through optimized architecture, ROI‑driven monitoring, and comprehensive operational tools.

Big Data · ByteHouse · ClickHouse
0 likes · 16 min read
ByteDance Data Platform
Jan 4, 2023 · Databases

How ByteHouse Enhances ClickHouse with Resource Isolation and High Availability

This article explains how ByteHouse, an enhanced version of ClickHouse used at ByteDance, adds full upsert support, multi‑table joins, high‑availability features, and, most importantly, a Resource Group mechanism that provides fine‑grained CPU, memory, and concurrency isolation to improve query performance and stability.

ByteHouse · ClickHouse · Resource Isolation
0 likes · 8 min read
DataFunTalk
Nov 10, 2022 · Big Data

Enhancing ClickHouse Resource Isolation with ByteHouse Resource Group

This article explains how ByteHouse extends ClickHouse with a Resource Group mechanism that provides fine‑grained concurrency, memory, and CPU isolation, improving query latency, reducing variance, and increasing cluster stability for large‑scale ad‑tech workloads.

Big Data · ByteHouse · ClickHouse
0 likes · 8 min read
DataFunTalk
Oct 25, 2022 · Databases

Design and Implementation of ByteHouse Query Optimizer

The article explains how ByteHouse extends ClickHouse with a full‑featured query optimizer—including rule‑based and cost‑based techniques, analyzer modules, plan construction, and distributed optimization—to overcome ClickHouse limitations and achieve significant performance gains on complex OLAP workloads.

ByteHouse · CBO · Distributed Query
0 likes · 10 min read
DataFunTalk
Oct 11, 2022 · Databases

Enhancing ClickHouse Multi‑Table Join Capability with ByteHouse

This article explains the limitations of ClickHouse for multi‑table joins, describes ByteHouse’s staged execution model, various join strategies (Shuffle, Broadcast, Colocate) and runtime filters, and presents performance benchmarks that show significant speed‑ups over the original ClickHouse engine.

Big Data · ByteHouse · ClickHouse
0 likes · 10 min read
DataFunSummit
Oct 7, 2022 · Databases

Optimizing Complex Queries in ClickHouse: Multi‑Stage Execution, Exchange Management, and Performance Enhancements

This article explains how ByteHouse (a heavily optimized ClickHouse variant) tackles complex query challenges by introducing a multi‑stage execution model, exchange mechanisms, runtime filters, and network optimizations, and it presents performance results and future directions for large‑scale OLAP workloads.

ByteHouse · ClickHouse · Database Optimization
0 likes · 21 min read
DataFunTalk
Sep 5, 2022 · Databases

Optimizing Complex Queries in ClickHouse: Multi‑Stage Execution, Exchange Management, and Runtime Filters

This article explains how ByteHouse, a heavily optimized ClickHouse variant, addresses complex query challenges by introducing a multi‑stage execution model, sophisticated exchange management, various join strategies, runtime filters, and diagnostic metrics to improve performance, scalability, and resource utilization in large‑scale data environments.

ByteHouse · ClickHouse · Distributed Query
0 likes · 21 min read
ByteDance Data Platform
Aug 22, 2022 · Databases

How ByteHouse Supercharges ClickHouse with Upsert, Joins, and High Availability

ByteHouse, built on ClickHouse, addresses key limitations such as missing upsert/delete, weak multi‑table joins, scalability issues, and lack of resource isolation by introducing a modular, stage‑based execution engine, advanced join strategies, runtime filters, and a custom optimizer, delivering dramatically faster query performance.

ByteHouse · ClickHouse · Database Optimization
0 likes · 11 min read