Tagged articles
57 articles
Architect-Kip
Mar 2, 2026 · Big Data

How to Build a Scalable Tiered Archive & Query System for MySQL Data

This article presents a comprehensive design for a layered storage and unified scheduling platform that archives MySQL historical data, reduces storage costs, ensures high‑performance queries, and enables efficient data analysis through tiered hot, warm, and cold storage using big‑data technologies.

Flink, Hive, Spark
0 likes · 13 min read
ITPUB
Feb 13, 2026 · Big Data

Real‑Time Sync of New MySQL Tables to Doris Using Flink CDC

This article explains how to extend a Flink CDC job that already syncs an entire MySQL database to Doris so that newly created tables are automatically created in Doris in real time, using the CdcTools utility, side‑output streams, and asynchronous I/O.

CDC, CdcTools, Flink
0 likes · 9 min read
ITPUB
Feb 9, 2026 · Databases

ClickHouse vs Doris vs Redis: Real‑World Query Performance Test with Flink

Using a 600k‑record IP range dataset, we built identical tables in ClickHouse and Doris, and a Redis skip‑list store, then ran three Flink‑Kafka streaming jobs to compare query latency across the three databases under varying traffic rates, revealing Redis as fastest, ClickHouse second, Doris slowest.

ClickHouse, Database Performance, Flink
0 likes · 8 min read
ITPUB
Jan 22, 2026 · Backend Development

Sync New MySQL Tables to Doris in Real‑Time with Flink CDC and CdcTools

This article explains how to use Flink CDC together with the CdcTools utility to automatically capture newly created MySQL tables and synchronize both their schema and data to a Doris database in real time, covering the required code, side‑output handling, async execution, and a special delete‑sign field.

Async IO, CDC, Flink
0 likes · 10 min read
ITPUB
Jan 15, 2026 · Databases

How to Migrate ClickHouse Data to Doris: Three Practical Strategies Tested

Facing a ClickHouse cluster shutdown, the author explores three migration methods—using Doris’s ClickHouse catalog, exporting to files with Broker/Stream Load, and Spark—to transfer ~10 billion rows to Doris, evaluating each for simplicity, bugs, and performance, and sharing detailed steps, code snippets, and benchmark results.

ClickHouse, Data Migration, SQL
0 likes · 9 min read
ITPUB
Dec 26, 2025 · Databases

How to Migrate 100 Billion ClickHouse Rows to Doris: Three Practical Strategies

When a ClickHouse cluster needed to be decommissioned, the author evaluated three migration approaches—using Doris' ClickHouse catalog, exporting to files with Broker/Stream Load, and leveraging Spark—to move roughly 100 billion rows to Doris, comparing their complexity, reliability, and performance.

Catalog, ClickHouse, SQL
0 likes · 9 min read
dbaplus Community
Dec 8, 2025 · Databases

Which Database Wins IP Range Lookups? ClickHouse vs Doris vs Redis Benchmarks

This article presents a systematic benchmark comparing ClickHouse, Doris, and Redis for IP‑range dimension lookups using Flink‑Kafka pipelines, detailing test design, result table schema, query interfaces, and performance results across varying data rates, concluding that Redis offers the fastest and most stable query latency.

ClickHouse, Database Benchmark, Flink
0 likes · 7 min read
Code Ape Tech Column
Oct 8, 2025 · Databases

Boost Your Data Ingestion: A High‑Performance Java Stream Load Architecture for Doris

This article presents a complete Java‑based architecture for high‑throughput Doris stream loading, covering project structure, Maven dependencies, configuration properties, field‑mapping annotations, automatic mapper utilities, a robust parallel loader with retry and compression, plus performance tuning recommendations.

Annotation Mapping, Java, Performance Optimization
0 likes · 23 min read
Big Data Technology & Architecture
Mar 17, 2025 · Big Data

Lakehouse Implementations at Leading Companies: Challenges, Solutions, and Benefits

This article reviews how major tech firms such as Alibaba, Tencent, ByteDance, and Kuaishou tackled lakehouse challenges—including architecture fragmentation, cost, scalability, and complex multimodal data—by adopting real‑time lakehouse solutions like Flink + Paimon, Iceberg + StarRocks, Hudi + LAS, and Doris + Alluxio, and outlines the resulting performance and cost gains.

Flink, Lakehouse, Paimon
0 likes · 9 min read
DataFunSummit
Mar 2, 2025 · Artificial Intelligence

Lightweight Algorithm Service Architecture Based on Offline Tag Knowledge Base and Real‑time Data Warehouse

This article presents a lightweight algorithm service solution that combines an offline pre‑computed tag knowledge base with a real‑time data warehouse using Flink, Doris, Hive SQL and Python to achieve short development cycles, agile iteration, low cost, and scalable deployment for classification and clustering tasks.

Flink, algorithm service, doris
0 likes · 16 min read
Big Data Technology & Architecture
Nov 4, 2024 · Databases

Detailed Analysis of Doris SQL Execution Process: Optimizer, Scheduler, and Executor

This article provides a comprehensive walkthrough of Doris's SQL execution pipeline, covering the query optimizer's parsing, rewriting, and plan generation, the scheduler's fragment distribution, and the executor's fragment processing, including code examples of expression rewrite rules, join strategies, and data flow between FE and BE nodes.

Distributed Execution, Query Optimizer, SQL
0 likes · 30 min read
Big Data Technology & Architecture
Oct 22, 2024 · Big Data

Key Frameworks and Characteristics of Lakehouse Architecture: A Ground‑Level Perspective

This article reviews the emerging lakehouse architecture, outlines its core frameworks such as Hudi, Iceberg, Paimon, Flink, and Doris, discusses their storage‑compute separation, read‑write optimizations, and highlights how companies of different sizes adopt these technologies based on cost, efficiency, and specific business scenarios.

Data Architecture, Flink, Lakehouse
0 likes · 6 min read
Big Data Technology & Architecture
Sep 18, 2024 · Databases

Doris Performance Optimization: OLAP Query, Indexes, Vectorized Execution, and High‑Concurrency Point Queries

This article explains how Apache Doris achieves high‑concurrency OLAP and point‑query performance through MPP architecture, columnar storage, partition‑bucket pruning, various indexes, materialized views, vectorized execution, runtime filters, short‑circuit planning, and prepared‑statement caching.

OLAP, doris, high concurrency
0 likes · 12 min read
DataFunSummit
Aug 26, 2024 · Big Data

Building a Doris‑Based Lakehouse Integrated Analytics System at Kuaishou

This article presents Kuaishou's experience of designing and implementing a Doris‑driven lakehouse integrated analytics system, covering the current OLAP landscape, challenges of data duplication and governance, the new architecture with caching and auto‑materialization, implementation details, performance impact, and future work.

Auto Materialization, Big Data, Data Warehouse
0 likes · 24 min read
Big Data Technology & Architecture
Jul 3, 2024 · Databases

Optimizing High-Concurrency Point Queries in Doris with Row Store, Short Query Path, and PreparedStatement

This guide explains how to enable row store, configure short query path, and use PreparedStatement in Doris to reduce I/O and CPU overhead for high‑concurrency primary‑key point queries, including DDL examples, JDBC usage, row cache settings, performance tips, and verification methods.

PreparedStatement, Row Store, SQL
0 likes · 9 min read
ITPUB
Jun 9, 2024 · Databases

Doris vs ClickHouse: Which Database Fits Your Workload?

This article compares Doris and ClickHouse across architecture, table creation, ecosystem integration, management tools, query performance, and join capabilities, offering practical guidance on how to choose the right database based on your specific data processing and operational requirements.

ClickHouse, Data Warehouse, SQL
0 likes · 10 min read
DataFunSummit
Dec 16, 2023 · Databases

Optimizing Precise Deduplication with Doris Bitmap: Architecture, Performance Enhancements, and Practical Practices

This article presents a comprehensive overview of precise deduplication in Meituan's Doris database, detailing the underlying bitmap data structures, aggregation bottlenecks, and a series of optimizations—including memory management, fast union, orthogonal encoding, and vectorized engine integration—that together achieve significant performance gains in high‑cardinality scenarios.

Bitmap, OLAP, database
0 likes · 20 min read
DataFunTalk
Dec 15, 2023 · Big Data

Zhihu Bridge Platform: Internal Marketing Architecture, Challenges, and Optimizations

This article presents a comprehensive overview of Zhihu's Bridge Platform internal marketing module, detailing its background, business logic, product components such as CDP, activity and delivery platforms, architectural layers, performance bottlenecks, optimization techniques—including distributed transactions, bitmap indexing, and vectorized query execution—and future directions toward marketing automation and intelligence.

CDP, Distributed Systems, Performance Optimization
0 likes · 28 min read
ITPUB
Nov 1, 2023 · Databases

Doris 2.0.2 vs 1.2.3: Real‑World Query Performance Comparison

After upgrading a Doris cluster from version 1.2.3 to 2.0.2, the author runs a series of SQL benchmarks—including PK lookups, top‑client queries, distinct counts on low‑ and high‑cardinality columns, minute‑level session analysis, and full‑table deduplication—to measure execution times, revealing mixed performance gains and regressions across the seven test scenarios.

Database Upgrade, SQL, doris
0 likes · 9 min read
dbaplus Community
Oct 18, 2023 · Databases

Doris vs ClickHouse: Which Database Delivers Faster Writes and Queries?

This article presents a systematic performance comparison between Doris and ClickHouse, covering data ingestion speed, SQL syntax differences, hardware impact, and detailed query benchmarks across multiple scenarios, ultimately revealing that each system excels in different use cases.

Big Data, ClickHouse, SQL
0 likes · 15 min read
DataFunTalk
Sep 24, 2023 · Databases

Insights into the Design and Challenges of Doris' New Optimizer (Nereids)

The article explains why Doris needed a new optimizer, describes its architecture—including rule‑based and cost‑based stages, early data‑size reduction techniques, dynamic‑programming join‑reorder methods, and practical challenges such as statistics errors and runtime filters—while sharing performance results and a Q&A session.

Database Performance, Join Reorder, Query Optimizer
0 likes · 17 min read
ITPUB
Sep 15, 2023 · Databases

Importing Billions of Kafka Rows into Doris and Benchmarking Against ClickHouse

This article explains Doris's various data import methods, focuses on the routine load approach for Kafka streams, describes how to handle mixed‑schema topics using the max_error_number parameter, and compares query performance of a 130 million‑row dataset against ClickHouse, highlighting each system's strengths and limitations.

ClickHouse, Kafka, Routine Load
0 likes · 10 min read
Huolala Tech
Sep 7, 2023 · Big Data

How Huolala Ensures Doris Stability: Real-World Big Data Practices

This article details Huolala's big‑data architecture and the practical measures—ranging from background analysis and stability challenges to case studies, discovery mechanisms, capacity planning, high‑availability, and automation—that the company employs to guarantee Doris's reliability and performance across its rapidly growing logistics platform.

Big Data, OLAP, capacity planning
0 likes · 15 min read
DataFunTalk
Aug 28, 2023 · Big Data

Practical Experience of an E‑commerce Platform’s Offline and Real‑time Data Warehouse

This article shares the practical architecture, technology selection, implementation details, and evolution of an e‑commerce platform’s offline and real‑time data warehouses, covering data modeling, processing pipelines, system components such as Hive, Spark, Flink, ClickHouse, Doris, and Hudi, and the lessons learned from multiple production deployments.

Big Data, ClickHouse, Data Warehouse
0 likes · 18 min read
Big Data Technology & Architecture
Aug 7, 2023 · Big Data

Using Doris for Real‑Time Data Warehousing: Benefits, Drawbacks, and Comparison with Flink

The article examines Doris‑based real‑time data warehousing, outlining why teams choose this approach, contrasting its low barrier to entry and operational simplicity with Flink’s higher‑cost streaming, and highlighting its latency and scale limits and the strict monitoring required for production use.

Big Data, Data Warehouse, Flink
0 likes · 5 min read
ByteDance Data Platform
May 29, 2023 · Databases

Which Open‑Source OLAP Engine Wins the TPC‑DS Benchmark? A Deep Performance Comparison

Using the TPC‑DS benchmark’s 99 queries on a 1 TB dataset, this study evaluates the performance of four open‑source OLAP engines—ClickHouse, Doris, Presto, and ByConity—across basic, join, aggregation, subquery, and window‑function scenarios, revealing ByConity’s superior speed and the limitations of ClickHouse.

ByConity, ClickHouse, OLAP
0 likes · 12 min read
DataFunTalk
Dec 19, 2022 · Big Data

Evolution of OLAP: Key Technologies, Engine Comparison, and Future Trends

This article provides a comprehensive overview of OLAP technology evolution, covering its origins, modern requirements for massive and real‑time data, detailed comparisons of major open‑source OLAP engines such as Druid, Elasticsearch, Kylin, Doris/StarRocks, and ClickHouse, core architectural and storage techniques, and emerging trends like federated queries, hybrid storage, and lakehouse integration.

ClickHouse, Druid, OLAP
0 likes · 22 min read
DataFunSummit
Nov 2, 2022 · Big Data

Evolution and Construction of Huolala's Doris‑Based OLAP System

This article details Huolala's journey from a MySQL‑centric analytics pipeline to a multi‑engine OLAP platform built on Doris, covering system architecture, data flow, stage‑wise evolution, engine selection, POC validation, performance tuning, stability measures, and future roadmap for self‑service analytics.

Big Data, OLAP, doris
0 likes · 15 min read
DataFunTalk
Sep 22, 2022 · Big Data

Architecture and Practices of Zhihu DMP System Based on Doris

This article presents a comprehensive overview of Zhihu's Data Management Platform (DMP), covering its business background, three core business modes, detailed architecture, offline and real‑time data pipelines, feature storage design, performance optimization techniques, and future iteration directions.

DMP, Data Platform, doris
0 likes · 14 min read
DataFunTalk
Sep 1, 2022 · Big Data

Evolution and Construction of Huolala's OLAP System Based on Doris

This presentation details Huolala's journey from its initial OLAP architecture to a multi‑engine platform, describing background, data‑flow layers, technical research, engine selection (Druid, ClickHouse, Doris), POC validation, performance tuning, stability measures, production rollout, problem analysis, and future roadmap.

ClickHouse, Druid, Huolala
0 likes · 17 min read
Big Data Technology & Architecture
May 30, 2022 · Big Data

Doris Architecture, Principles, and Key Features Overview

This article provides a comprehensive overview of Doris's architecture—including its FE and BE components, metadata management, data organization, execution planning—and details its major features such as adaptive join aggregation, vectorized execution, materialized views, and Elasticsearch integration, supplemented with example DDL and query code.

Big Data, Database Architecture, Elasticsearch
0 likes · 7 min read
dbaplus Community
May 11, 2022 · Big Data

How JD Logistics Tackled Billion-Scale Data Challenges with Doris

This article details JD Logistics' journey from fragmented, massive‑scale data to a unified, real‑time analytics platform, covering business needs, pain points, tool evaluation, a new Doris‑based architecture, table management, data import procedures, automation scripts, and future roadmap for data engineering.

BI Tools, Big Data, Data Warehouse
0 likes · 16 min read
dbaplus Community
Oct 26, 2021 · Databases

Scaling JD.com Customer Service with Doris OLAP: Architecture & Caching

JD.com’s customer service team leverages the open‑source MPP database Doris to power real‑time and offline OLAP dashboards, detailing data ingestion pipelines, full‑link monitoring, dual‑stream high‑availability design, dynamic partition management, multi‑layer caching strategies, and performance optimizations applied during the 2020 11.11 shopping festival.

Big Data, OLAP, Real-time analytics
0 likes · 15 min read
DataFunSummit
Oct 16, 2021 · Databases

Practical Use Cases of Materialized Views and Indexes in Doris

This article shares practical experiences with Doris, covering materialized view concepts, typical use cases, index principles, performance optimizations, and real‑world scenarios such as order analysis, PV/UV aggregation, and detailed queries, while also providing operational tips and Q&A insights.

Big Data, OLAP, doris
0 likes · 16 min read
DataFunTalk
Sep 23, 2021 · Databases

Practical Use Cases of Materialized Views and Indexes in Doris

This article shares practical experiences with Doris, covering materialized view concepts, typical use cases, advantages, creation syntax, prefix index principles, performance‑boosting scenarios such as order analysis, PV/UV counting, detail queries, and operational tips for high‑throughput and low‑latency workloads.

Big Data, OLAP, Performance Optimization
0 likes · 18 min read
DataFunTalk
Sep 4, 2021 · Big Data

High‑Availability Practices of ClickHouse in JD.com: Architecture, Deployment, and Operations

The article details JD.com’s large‑scale OLAP strategy using ClickHouse as the primary engine and Doris as a secondary engine, covering application scenarios, component selection criteria, cluster deployment models, high‑availability architecture, fault‑handling procedures, performance tuning, and future cloud‑native plans.

Big Data, ClickHouse, Cluster Deployment
0 likes · 19 min read
Meituan Technology Team
Aug 26, 2021 · Big Data

How Meituan Built a Scalable Real‑Time Data Warehouse: Architecture & Lessons

Meituan Waimai’s data intelligence team outlines a universal real‑time data‑warehouse methodology that combines a production platform with an interactive analytics engine, detailing scenarios, technology choices, architectural designs, platformization, SLA management, and a practical Lambda‑style case study.

Flink, Kappa architecture, Lambda architecture
0 likes · 18 min read
JD Retail Technology
Jun 9, 2021 · Big Data

JD OLAP High‑Availability Practices: ClickHouse and Doris Deployment, Architecture, and Future Plans

This article details JD's OLAP implementation using ClickHouse as the primary engine and Doris as a secondary engine, covering business scenarios, selection criteria, multi‑tenant deployment, high‑availability architecture, encountered challenges, and future roadmap for cloud‑native, scalable analytics.

ClickHouse, Cloud Native, Cluster Management
0 likes · 17 min read
Baidu Geek Talk
May 24, 2021 · Big Data

Real-Time Quantile Computation Using TDigest: Architecture and Solutions

The article presents a real‑time quantile solution using the TDigest data structure, which clusters data into centroids and stores digests in Redis or Doris, pre‑computes quantiles for all dimension combinations, and provides a reusable API that delivers fast, accurate, low‑memory quantile statistics for diverse business scenarios.

data aggregation, doris, real-time quantile
0 likes · 11 min read
DataFunTalk
May 9, 2021 · Big Data

User Segmentation and Growth Practices for Mini‑Programs Based on Doris

This article presents a comprehensive case study of how Baidu’s senior R&D engineer Zhao Yuyang built a Doris‑based user‑segmentation system for mini‑programs, detailing the product’s private‑domain fine‑grained operation capabilities, the four technical challenges, the architecture and solutions—including global dictionaries, bitmap storage, partitioning, tag optimization, dynamic‑static query handling, and rapid user‑package generation—along with future roadmap plans.

User Segmentation, bitmap indexing, doris
0 likes · 20 min read
JD Retail Technology
Apr 28, 2021 · Databases

Real‑Time Analytics with Doris for JD Customer Service: Architecture, Caching, and Optimization

This article describes how JD.com leverages the open‑source MPP analytical database Doris for real‑time and offline OLAP on customer‑service data, covering data ingestion pipelines, dual‑stream high‑availability design, dynamic partition management, multi‑level caching, monitoring with Prometheus‑Grafana, and performance optimizations applied during major sales events.

JD.com, OLAP, Real-time analytics
0 likes · 13 min read
dbaplus Community
Feb 18, 2021 · Big Data

How JD Search Scaled Real‑Time Analytics with Flink and Doris

This article details JD Search's journey from a Storm‑based pipeline to a Flink‑driven architecture backed by Apache Doris, covering business requirements, technical challenges, design trade‑offs, performance optimizations for massive traffic spikes, and future plans for their real‑time OLAP data warehouse.

Big Data, Flink, OLAP
0 likes · 12 min read
dbaplus Community
Aug 4, 2020 · Databases

How Doris Powers Meituan’s Real‑Time Data Warehouse: ROLAP vs MOLAP Lessons

This article examines Meituan’s data warehouse evolution, detailing the limitations of MOLAP with Kylin, the adoption of Doris‑driven ROLAP using MPP technology, and the practical optimizations—such as join predicate pushdown, concurrent execution, colocate join, and bitmap aggregation—that improve real‑time analytics and reduce costs.

Data Warehouse, MOLAP, MPP
0 likes · 19 min read