Tagged articles
107 articles
Page 1 of 2
DataFunSummit
May 20, 2026 · Databases

Apache Doris 4.1: A Unified Data Store and Retrieval Engine for AI & Search

Apache Doris 4.1 introduces a systematic evolution for AI and search workloads, adding low‑cost massive vector storage, unified structured, full‑text and vector search, 100 MB JSON document support, Segment V3 metadata decoupling, sparse column optimizations, lakehouse lifecycle management, and a suite of performance‑boosting features such as aggregate push‑down, condition cache, and spill‑to‑disk, all backed by detailed benchmark results.

AI · Apache Doris · Lakehouse
0 likes · 30 min read
DataFunTalk
Apr 18, 2026 · Databases

How Will Apache Doris Evolve in 2026 to Power AI‑Driven Data Workloads?

The article outlines Apache Doris's 2026 roadmap, detailing how the database will shift from pure analytics to a unified AI‑enabled platform with enhanced semi‑structured data support, vector and hybrid search, agent‑focused capabilities, and expanded storage and lakehouse integrations to meet emerging AI workloads.

AI integration · Apache Doris · Data Lake
0 likes · 14 min read
JD Tech
Apr 16, 2026 · Industry Insights

How JD Revolutionized Coupon Search with a Stream‑Batch Unified Architecture

This article analyzes JD's end‑to‑end upgrade of its retail coupon search infrastructure, detailing the business drivers, data‑skew challenges, the shift from dual KV and batch pipelines to a unified stream‑batch model built on Apache Doris, and the resulting performance, resource and stability gains across multiple scenarios.

Apache Doris · Batch Processing · Coupon Search
0 likes · 12 min read
Tech Musings
Feb 12, 2026 · Databases

From MySQL to Apache Doris: Key Design Shifts for OLAP Migration

This article explains how backend engineers should rethink table design, indexing, partitioning, and key strategies when migrating attendance data from MySQL's OLTP model to Apache Doris 2.1.7's OLAP architecture, providing concrete DDL examples and practical tips.

Apache Doris · OLAP · Partitioning
0 likes · 12 min read
Java Companion
Dec 6, 2025 · Backend Development

Replacing MySQL with Apache Doris in Spring Boot for Real‑Time Analytics

This article demonstrates how to integrate Apache Doris, a high‑performance MPP analytical database, into a Spring Boot application as a drop‑in replacement for MySQL, detailing environment setup, Maven dependencies, configuration, entity mapping, repository, service and controller code, and performance testing that shows Doris’s superior real‑time query speed.

Apache Doris · Java · MPP database
0 likes · 15 min read
Huolala Tech
Oct 17, 2025 · Big Data

How HuoLala Accelerated User Profiling 30× Faster with Apache Doris

This article details how HuoLala built a high‑performance user profiling platform on Apache Doris, redesigning data models, leveraging bitmap storage, and applying query‑level optimizations to achieve up to 30‑fold speed gains, lower memory usage, and scalable real‑time analytics.

Apache Doris · Big Data · Bitmap
0 likes · 17 min read
Big Data Technology & Architecture
Oct 13, 2025 · Databases

Apache Doris 3.1 Unveiled: Variant, Index, and Lakehouse Boosts

The Apache Doris 3.1 release strengthens lake‑house capabilities with major upgrades to the VARIANT data type, vertical compaction, inverted index storage, new tokenizers, enhanced materialized view support for Iceberg/Paimon/Hudi, and numerous query‑performance optimizations such as faster partition pruning and dynamic partition clipping, offering smoother handling of thousands of columns and large‑scale semi‑structured data.

Apache Doris · Lakehouse · VARIANT
0 likes · 8 min read
JD Tech Talk
Jul 16, 2025 · Databases

How JD Ads Cut Storage Costs 87% with Apache Doris Hot‑Cold Data Tiering

JD Advertising built a massive ad‑data warehouse on Apache Doris, reaching nearly 1 PB and 18 trillion rows, then implemented a hot‑cold data tiering strategy—first a lake‑based approach, later a native tiering solution in Doris 2.0—reducing storage costs by 87% and boosting query performance over tenfold.

Apache Doris · Schema Change · cold-hot tiering
0 likes · 18 min read
JD Cloud Developers
Jul 16, 2025 · Databases

How JD Ads Cut Storage Costs 87% with Apache Doris Hot‑Cold Tiering

This article details JD Advertising's journey from a 1 PB Apache Doris data lake to a multi‑level hot‑cold tiering architecture, describing two tiering strategies, the performance and schema‑change challenges faced during the upgrade to Doris 2.0, and the optimizations that reduced storage costs by about 87% while boosting query throughput.

Apache Doris · Schema Change · cold data
0 likes · 19 min read
DataFunSummit
Jun 22, 2025 · Databases

Unlocking Apache Doris: How Lakehouse Integration Supercharges Data Analytics

This article walks through Apache Doris’s integrated lakehouse architecture: its core value and paradigm, the system’s components and use cases, and technical challenges such as file‑format diversity and I/O stability. It then presents a suite of optimizations, from predicate push‑down and partition pruning to metadata caching and dynamic scheduling, that improve query performance and resource utilization, and outlines the future roadmap.

Apache Doris · Big Data · Data Warehouse
0 likes · 22 min read
Big Data Technology & Architecture
Apr 2, 2025 · Databases

Replacing Elasticsearch with Apache Doris for Real‑Time Big Data Analytics: Architecture, Performance, and Enterprise Cases

This article analyzes why Elasticsearch struggles with large‑scale, complex real‑time analytics and demonstrates how Apache Doris’s MPP, columnar storage, and native SQL support provide a cost‑effective, high‑performance alternative, illustrated with detailed enterprise case studies.

Apache Doris · Big Data · Elasticsearch
0 likes · 11 min read
JD Retail Technology
Feb 20, 2025 · Big Data

Cold‑Hot Data Tiering Solutions for JD Advertising Using Apache Doris

JD Advertising built a petabyte‑scale ad analytics service on Apache Doris, identified a hot‑cold access pattern, and implemented a native cold‑hot tiering solution (upgrading to Doris 2.0 and optimizing schema changes) that cut storage costs by ~87% and boosted concurrent query capacity over tenfold while simplifying operations.

Apache Doris · Big Data · Performance Optimization
0 likes · 18 min read
JD Tech
Feb 11, 2025 · Big Data

Cold‑Hot Data Tiering and Performance Optimization in Apache Doris for JD Advertising

This article presents JD Advertising's engineering experience with Apache Doris, describing the evolution from a data‑lake cold‑data solution to a native cold‑hot tiering approach, detailing performance regressions after upgrading to Doris 2.0, and outlining a series of optimizations for query speed, CPU and memory usage, schema‑change efficiency, and automated data migration and restoration.

Apache Doris · Big Data · Data Lake
0 likes · 17 min read
Big Data Technology & Architecture
Oct 21, 2024 · Big Data

Key New Features of Apache Doris 3.0: Storage‑Compute Separation, Lakehouse Integration, Semi‑Structured Data, ETL Enhancements, Materialized Views, and Java UDTF

Apache Doris 3.0 introduces storage‑compute separation, native lakehouse write‑back, optimized Variant handling for semi‑structured data, stronger ETL transaction support, enhanced multi‑table materialized views, and Java UDTF capabilities, providing developers with more flexible, cost‑effective, and high‑performance analytics solutions.

Apache Doris · Data Warehouse · ETL
0 likes · 7 min read
Big Data Technology & Architecture
Oct 16, 2024 · Databases

Kuaishou's Lakehouse‑Integrated OLAP Architecture with Apache Doris: Design, Migration, and Optimization

The article describes how Kuaishou transformed its high‑traffic OLAP system from a separated lake‑and‑warehouse architecture using Hive/Hudi and ClickHouse into a unified lakehouse solution powered by Apache Doris, detailing the challenges, design choices, caching and automatic materialization mechanisms, and the resulting performance and governance improvements.

Apache Doris · Big Data · Data Caching
0 likes · 18 min read
Baidu Intelligent Cloud Tech Hub
Sep 25, 2024 · Big Data

How Cold‑Hot Data Separation Boosts Cost Efficiency in Baidu Palo for Apache Doris

This article explains the principles, configuration steps, monitoring metrics, leader selection, data migration granularity, compaction, invalid data cleanup, and cache mechanisms of cold‑hot data separation in Baidu Intelligent Cloud's Palo for Apache Doris, illustrating how tiered storage reduces costs while maintaining query performance.

Apache Doris · Data Tiering · Palo
0 likes · 21 min read
Big Data Technology & Architecture
Sep 12, 2024 · Databases

MemTable Optimization and Single‑Replica Load in Apache Doris 2.0

The article explains how Apache Doris 2.0 improves data import performance by redesigning MemTable handling, introducing write‑path optimizations, parallel segment flushing, and a single‑replica load mode that reduces resource consumption and boosts throughput for both single‑ and multi‑concurrent workloads.

Apache Doris · Memtable · Single Replica Load
0 likes · 10 min read
Big Data Technology & Architecture
Jul 26, 2024 · Databases

Apache Doris Architecture and Common Q&A: Read/Write Flow, Replication Consistency, Storage, and High Availability

This article provides a comprehensive overview of Apache Doris, explaining its frontend and backend nodes, storage structures such as tablets, rowsets, and segments, replication mechanisms, partitioning versus bucketing, indexing types, compaction processes, and high‑availability strategies through a detailed Q&A format.

Apache Doris · Big Data · Database Architecture
0 likes · 22 min read
DataFunTalk
Jun 4, 2024 · Databases

From Lambda Architecture to an All‑in‑One Apache Doris Real‑Time/Offline Data Platform for 5G Connected Factories

The article explains how China Unicom transformed its 5G fully‑connected factory data pipeline from a complex Lambda architecture into a streamlined, real‑time and offline‑integrated solution built on Apache Doris, detailing system requirements, architectural redesign, performance gains, and future plans.

5G · Apache Doris · Big Data
0 likes · 15 min read
macrozheng
May 22, 2024 · Big Data

How to Install and Use DataEase: An Open‑Source Big Data Visualization Tool

This guide introduces DataEase, an open‑source BI platform built with SpringBoot, Apache Doris, and Kettle, walks through its architecture, provides step‑by‑step Docker‑based installation, and demonstrates how to create datasets, visualizations, and dashboards from Excel and MySQL sources.

Apache Doris · BI · Data visualization
0 likes · 13 min read
Big Data Technology & Architecture
Mar 18, 2024 · Databases

Apache Doris 2.1.0 Release: Major Performance Boosts, New Data Types, Optimizer Enhancements and Operational Features

The Apache Doris 2.1.0 release introduces over 100% query performance improvements on TPC‑DS, up to 230% gains on ARM platforms, new Variant and IP data types, async materialized views, auto‑increment columns, auto‑partitioning, group commit, hardened workload groups, TopSQL monitoring, a built‑in job scheduler, and several behavior changes, all aimed at delivering faster, more flexible and more reliable OLAP processing.

ARM Optimization · Apache Doris · SQL
0 likes · 42 min read
DataFunSummit
Feb 7, 2024 · Big Data

Evolution of OLAP with Apache Doris at Xingyun Retail Credit

Facing rapid data growth, Xingyun Retail Credit transitioned from traditional OLTP systems to an Apache Doris‑based OLAP solution, detailing the data demand generation, OLAP engine selection challenges, multi‑stage implementation, performance gains, data‑warehouse construction, and future roadmap for scalable analytics.

Apache Doris · Big Data · Data Warehouse
0 likes · 17 min read
Zhuanzhuan Tech
Dec 14, 2023 · Big Data

Design and Implementation of a Data Service Platform for New Media Business

This article details the background, challenges, design principles, and implementation of a unified data service platform—including data modeling, multi-source governance, real-time processing, and a Doris-based storage solution—to support large‑scale video data for a new media operation.

Apache Doris · Data Governance · Data Platform
0 likes · 7 min read
DataFunSummit
Dec 7, 2023 · Databases

Apache Doris: A High‑Performance Real‑Time Analytical Database for Online High‑Concurrency Reporting

This article introduces Apache Doris, a real‑time analytical database built on an MPP architecture, explains its suitability for massive data workloads and online high‑concurrency reporting scenarios, and details the core technologies—storage models, vectorized query engine, materialized views, partitioning, indexing, row‑store and prepared statements—that enable sub‑second query latency and high QPS, while also showing a real‑world case study and how to join the Doris community.

Apache Doris · Data Warehouse · Materialized Views
0 likes · 13 min read
DataFunTalk
Oct 25, 2023 · Databases

Apache Doris Summit Asia 2023: Highlights, Innovations, and Industry Use Cases

The Apache Doris Summit Asia 2023 showcased the milestone 2.0 release, impressive performance gains, rapid community growth, and diverse industry deployments, while outlining future cloud‑native and unified analytics directions that position Doris as a leading real‑time data warehouse solution.

Apache Doris · Big Data · Cloud Native
0 likes · 13 min read
DataFunTalk
Sep 6, 2023 · Databases

Large Model + OLAP: Enabling a New Data Service Platform

This article details how Tencent Music combines large language models with an Apache Doris‑based OLAP engine, introduces a semantic layer, manual‑experience routing, schema mapping and plugin integration, and outlines the evolution of its data architecture through four versions to achieve real‑time, cost‑effective, and scalable intelligent data services.

Apache Doris · Data Warehouse · OLAP
0 likes · 24 min read
DataFunTalk
Sep 3, 2023 · Big Data

Evolution of OLAP at Xingyun Retail Credit Using Apache Doris

This article details how Xingyun Retail Credit transitioned from traditional data warehouses to an Apache Doris‑based OLAP solution, covering data demand generation, OLAP engine selection challenges, multi‑stage implementation, performance optimizations, data‑warehouse construction, real‑world use cases, and future roadmap.

Apache Doris · Big Data · Data Warehouse
0 likes · 16 min read
DataFunTalk
Aug 21, 2023 · Databases

Case Study: Building a Real‑Time Log Data Analysis Platform with Apache Doris at China Unicom

This article describes how China Unicom’s Western Innovation Research Institute designed and deployed a centralized, real‑time log analytics platform using Apache Doris, detailing the migration from Hive and ClickHouse, performance optimizations, storage cost reductions, and the resulting improvements in data ingestion, query speed, and operational efficiency.

Apache Doris · Big Data · Cold‑Hot Data Management
0 likes · 18 min read
DataFunTalk
Aug 15, 2023 · Databases

Apache Doris 2.0.0 Release Highlights and New Features

Apache Doris 2.0.0, released on August 11, 2023, introduces a new Cascades‑based optimizer, inverted index, point‑query acceleration, pipeline execution engine, multi‑tenant resource isolation, cloud‑native compute nodes, and extensive performance gains of up to ten‑fold in benchmark queries and dozens of times in real‑world workloads.

Apache Doris · Cloud Native · Performance Boost
0 likes · 24 min read
DataFunTalk
Jul 25, 2023 · Databases

Building an Integrated Metric Data Service Platform with Apache Doris: Architecture Evolution and Millisecond‑Level Query Performance

This article describes how Financial One Account, a technology service arm of Ping An, migrated from a Hadoop‑Presto‑Kylin stack to an Apache Doris‑based data platform, detailing the architectural evolution, OLAP engine selection, metric system design, performance optimizations, and future roadmap for real‑time analytics.

Apache Doris · Big Data · Data Warehouse
0 likes · 15 min read
DataFunSummit
Jul 18, 2023 · Databases

Apache Doris Data Lake Federation Features Overview

This article introduces Apache Doris’s data lake federation capabilities, detailing its lake‑warehouse integration design, supported data sources such as Hive, Iceberg, Hudi, and Elasticsearch, performance optimizations for metadata and file access, case studies, community roadmap, and Q&A on replacing Presto.

Apache Doris · Data Lake · SQL Engine
0 likes · 21 min read
DataFunTalk
Jul 7, 2023 · Databases

Apache Doris 2.0-beta Release: New Query Optimizer, Pipeline Execution Engine, Workload Management and Major Performance Improvements

Apache Doris 2.0-beta, released on July 3, 2023, introduces a new Cascades‑based query optimizer, adaptive pipeline execution engine, workload‑aware resource isolation, enhanced memory management, partial column updates, multi‑catalog support, and numerous performance gains across real‑time analytics, ETL, and high‑concurrency point queries.

Apache Doris · Database Performance · Pipeline Execution
0 likes · 25 min read
Big Data Technology Architecture
Jul 4, 2023 · Databases

Apache Doris 2.0‑beta Release: New Query Optimizer, Pipeline Engine, Workload Management and Performance Enhancements

Apache Doris 2.0‑beta, released on July 3, 2023, introduces a modern Cascades‑based query optimizer, a data‑driven pipeline execution engine, fine‑grained workload groups, enhanced memory management, partial‑column updates, compute nodes, cold‑hot tiering and cross‑cluster replication, delivering up to tenfold speedups and significant cost reductions for real‑time analytics.

Apache Doris · Pipeline Execution · SQL Engine
0 likes · 24 min read
Big Data Technology & Architecture
Jun 22, 2023 · Databases

Apache Doris 2.0 New Features: High‑Concurrency Data Serving Optimizations

Apache Doris 2.0 introduces a suite of high‑concurrency data‑serving enhancements, including row‑store format, partition‑bucket pruning, advanced indexing, materialized views, runtime filters, TOPN optimization, short‑circuit point‑query paths, prepared statements, and row cache, enabling tens of thousands of QPS on a single node and dramatically reducing query latency.

Apache Doris · Benchmark · Data Serving
0 likes · 23 min read
DataFunTalk
Jun 20, 2023 · Databases

Hot and Cold Data Tiering in Apache Doris 2.0: Architecture, Configuration, and Performance Evaluation

This article explains the hot‑cold data tiering technique in Apache Doris 2.0, covering its motivation, storage‑layer design, configuration steps (resource, storage policy, table/partition settings), cost‑saving calculations, query performance impact, cold‑data compaction, and cache mechanisms, with practical code examples.

Apache Doris · Cold Data Tiering · Storage Policy
0 likes · 18 min read
ITPUB
Jun 10, 2023 · Databases

How Apache Doris 2.0 Cuts Storage Costs with Hot‑Cold Data Tiering

The article explains how Apache Doris 2.0 introduces hot‑cold data tiering to move infrequently accessed data from expensive SSDs to cheaper object storage, dramatically reducing storage costs while maintaining query performance through automatic lifecycle management, storage policies, and cache mechanisms.

Apache Doris · Cost reduction · SQL
0 likes · 19 min read
DataFunSummit
Jun 4, 2023 · Databases

From Apache Doris to SelectDB: Evolution Towards the Next‑Generation Cloud‑Native Data Warehouse

This presentation introduces Apache Doris, examines changing data analysis demands in the cloud era, explains why SelectDB was created, and details SelectDB’s cloud‑native architecture, performance, unified capabilities, ease of use, cost efficiency, open‑source nature, and its application scenarios for modern data warehousing and log analytics.

Analytics · Apache Doris · Cloud-native
0 likes · 15 min read
DataFunTalk
May 17, 2023 · Databases

Evolution of 360 Commercial Real-Time Data Warehouse and Apache Doris Deployment

This article details the three‑stage evolution of 360's real‑time data warehouse—from Storm + Druid + MySQL to Flink + Druid + TiDB and finally to Flink + Apache Doris—explaining architectural pain points, the reasons for choosing Doris, and how the new system delivers sub‑second query latency, strong consistency, and simplified operations across advertising scenarios.

Apache Doris · Big Data · Data Consistency
0 likes · 17 min read
DataFunTalk
May 9, 2023 · Databases

High‑Performance Inverted Index in Apache Doris for Log Data Storage and Analysis

This article explains how Apache Doris implements a high‑performance, column‑oriented inverted index to address the challenges of massive, real‑time log data storage and analysis, delivering dramatically higher write throughput, lower storage costs, and faster query performance than traditional Elasticsearch and Loki solutions.

Apache Doris · Big Data · Log Analytics
0 likes · 19 min read
DataFunTalk
May 6, 2023 · Databases

Apache Doris: Overview, Data Lake Analysis Architecture, Community Development and Future Roadmap

This article provides a comprehensive overview of Apache Doris, detailing its origins, MPP‑based analytical capabilities, data‑lake integration techniques, recent architectural enhancements, performance optimizations, community growth, and upcoming development plans, while also addressing common user questions.

Analytical Database · Apache Doris · Big Data
0 likes · 20 min read
Big Data Technology & Architecture
May 6, 2023 · Databases

Design and Implementation of Real‑Time OLAP with Apache Doris at Dingdong Maicai

This article details Dingdong Maicai’s business‑driven requirements for a real‑time OLAP platform, evaluates Apache Doris versus ClickHouse, describes the end‑to‑end architecture—including data ingestion, modeling, and query optimization techniques such as colocate joins, array_contains, broker load, bitmap, prefix and bloom‑filter indexes, and materialized views—and shares practical performance experiences and best‑practice recommendations.

Apache Doris · OLAP · broker load
0 likes · 18 min read
DataFunTalk
Apr 18, 2023 · Big Data

Real-time OLAP with Apache Doris: Architecture, Use Cases, and Optimization at Dingdong Maicai

This article details Dingdong Maicai's adoption of Apache Doris as a real‑time OLAP engine, covering business requirements, comparative evaluation with ClickHouse, system architecture, practical applications such as real‑time analytics, B‑end queries, tag systems, and performance‑boosting techniques like Colocate Join, bitmap, prefix and Bloom‑filter indexes, materialized views, and streamlined Broker Load workflows.

Apache Doris · Big Data · Data Warehouse
0 likes · 19 min read
DataFunTalk
Apr 4, 2023 · Big Data

Upgrading Hangzhou Bank Consumer Finance Big Data Platform with Apache Doris 1.2: Architecture, Performance Gains, and Integration

This article details how Hangzhou Bank Consumer Finance modernized its big‑data platform by introducing Apache Doris 1.2, replacing the original Greenplum + CDH architecture, unifying data sources via Multi‑Catalog, bringing query latency down to the order of seconds, reducing storage and compute costs, and outlining the integration workflow with DolphinScheduler, SeaTunnel, and Spark.

Apache Doris · Big Data · Data Integration
0 likes · 20 min read
DataFunTalk
Mar 21, 2023 · Databases

Design and Technical Details of Apache Doris for Lakehouse Architecture

This article explains how Apache Doris extends its real‑time OLAP capabilities to support Lakehouse architectures, covering unified metadata, query acceleration, elastic compute, performance benchmarks, and future roadmap for richer data‑source integration and resource isolation.

Apache Doris · Big Data · Data Warehouse
0 likes · 20 min read
DataFunTalk
Mar 1, 2023 · Databases

Evolution and Optimization of Tencent Music Content Library Data Platform: From Architecture 1.0 to 4.0

This article details the evolution of Tencent Music's content library data platform from version 1.0 to 4.0, describing business requirements, architectural redesigns—including migration from ClickHouse to Apache Doris, introduction of a semantic layer, and extensive write, query, and cost optimizations—while sharing practical lessons and future directions.

Apache Doris · Big Data · Data Warehouse
0 likes · 21 min read
macrozheng
Feb 28, 2023 · Big Data

How Tencent Music Scaled Its Content Data Platform with Apache Doris: From ClickHouse to 4.0 Architecture

This article details the evolution of Tencent Music's content data platform from version 1.0 to 4.0, describing the migration from ClickHouse to Apache Doris, the introduction of a semantic layer, optimization of data ingestion, query performance, and cost reduction strategies that dramatically improved data timeliness, operational efficiency, and storage costs.

Apache Doris · Big Data · Data Architecture
0 likes · 23 min read
DataFunTalk
Feb 21, 2023 · Databases

Building a Stream‑Batch Integrated Data Architecture with Apache Doris at SelectDB

This article details how SelectDB’s data technology architect designed and implemented a new stream‑batch unified data platform using Apache Doris, covering the shortcomings of the early CDH‑based architecture, the selection process, data modeling, ingestion pipelines, performance testing, operational optimizations, and future plans.

Apache Doris · Batch Processing · Big Data
0 likes · 17 min read
ITPUB
Feb 13, 2023 · Databases

How Apache Doris Enables Cloud‑Native Real‑Time Data Warehousing for Log Analytics

Based on a DTCC2022 presentation, this article explains Apache Doris's high‑performance MPP architecture, its cloud‑native extensions in SelectDB, and how they solve large‑scale log storage and analysis with superior write throughput, storage efficiency, and interactive query speed.

Apache Doris · MPP · Real-time analytics
0 likes · 11 min read
DataFunSummit
Jan 24, 2023 · Big Data

Building a Real-Time Data and User Profiling Architecture with Apache Doris at Zhihu

The article details Zhihu's data empowerment team's design and implementation of a low‑cost, high‑response real‑time data platform built on Apache Doris, covering real‑time business metrics, algorithm features, and user profiling, and explains the challenges, architectural choices, tooling, performance gains, and future directions.

Apache Doris · Data Integration · Data Quality
0 likes · 22 min read
DataFunTalk
Jan 9, 2023 · Databases

What Does a Decade Mean for Apache Doris? – Highlights from Doris Summit 2022

The Doris Summit 2022 recap outlines a ten‑year journey from an internal Baidu project to a top‑level Apache OLAP database, detailing explosive community growth, 2022 milestones, major feature releases up to version 1.2, and an ambitious 2023 roadmap focused on performance, lakehouse integration, multi‑modal analysis, cost efficiency, and enhanced usability.

Apache Doris · OLAP · Roadmap
0 likes · 21 min read
DataFunTalk
Dec 6, 2022 · Databases

Performance Optimization of Apache Doris for A/B Experiment Queries at Xiaomi

This article analyzes the performance bottlenecks of A/B experiment report queries on Apache Doris at Xiaomi, presents data-driven insights on query latency, field usage, and experiment ID matching, and details a series of optimizations—including pre‑aggregation, materialized views, bitmap deduplication, and schema redesign—that reduced query times by up to 60× and lowered cluster load.

A/B testing · Apache Doris · Bitmap
0 likes · 17 min read
DataFunTalk
Nov 28, 2022 · Databases

Optimizing Real‑Time Data Warehouse with Apache Doris at 360 DataTech

Facing stricter security, accuracy, and latency demands, 360 DataTech rebuilt its real‑time data warehouse on Apache Doris, chosen for its high‑performance writes, SQL compatibility, low operational complexity, and active community. The article details the architecture, ingestion, query acceleration, monitoring, troubleshooting, and future plans.

Apache Doris · Data Import Optimization · SQL acceleration
0 likes · 19 min read
DataFunTalk
Nov 14, 2022 · Databases

Performance Optimization and Tuning of Apache Doris Vectorized Version for Xiaomi's A/B Experiment Platform

Xiaomi upgraded the Apache Doris deployment behind its A/B experiment platform from version 0.13 to the vectorized 1.1.2 release. The team ran extensive single‑SQL and concurrent tests, identified CPU, memory, and fragment timeout issues, and applied tuning such as memory decommit settings, string matching improvements, and patches to achieve up to 5× query speed gains and enhanced stability.

Apache Doris · Database Optimization · performance tuning
0 likes · 20 min read
DataFunSummit
Oct 27, 2022 · Databases

Vectorized Storage Layer Refactoring in Apache Doris: Design, Implementation, and Performance Evaluation

This article explains the motivation, design, and implementation of vectorizing Apache Doris's storage layer using SIMD techniques, covering engine overview, vectorized programming concepts, storage architecture, index and predicate optimizations, delayed materialization, output improvements, and performance test results.

Apache Doris · OLAP · SIMD
0 likes · 13 min read
DataFunSummit
Sep 21, 2022 · Big Data

Practical Implementation of NetEase Yanxuan DMP Tag System: Architecture, Tag Production, Storage, and High‑Performance Query

This article details NetEase Yanxuan's DMP tag system, covering platform overview, tag definitions, production pipelines, multi‑layer storage architecture, high‑performance query techniques, and future roadmap, illustrating how data from various sources is transformed into actionable user tags for refined operations.

Apache Doris · Big Data · DMP
0 likes · 10 min read
dbaplus Community
Sep 14, 2022 · Databases

How Apache Doris Enables Real‑Time Analysis of Hudi Data Lakes

This article explains the architecture of Apache Doris, introduces Apache Hudi as a data‑lake format, compares Lambda and Kappa approaches, and details the design, implementation steps, and future roadmap for querying Hudi tables directly from Doris.

Apache Doris · Apache Hudi · Big Data
0 likes · 10 min read
DataFunSummit
Sep 7, 2022 · Big Data

Integrating Apache Doris with Hudi: Architecture, Design, and Implementation

This article explains the background, architecture, design choices, and step‑by‑step implementation for enabling Apache Doris to query Hudi data lake tables, covering Doris features, Hudi formats, Lambda/Kappa architectures, solution alternatives, and future roadmap for real‑time analytics.

Apache Doris · Big Data · Data Lake
0 likes · 10 min read
DataFunTalk
Aug 14, 2022 · Big Data

NetEase Yanxuan DMP Tag System Construction Practice

This article details NetEase Yanxuan’s DMP tag system, covering its platform overview, tag production workflow, storage architecture, high‑performance query techniques, and future plans, illustrating how data from multiple sources is processed through ODS, DWD, DM layers and leveraged via Spark, Hive, and Apache Doris for real‑time and offline analytics.

Apache Doris · DMP · Hive
0 likes · 11 min read
Big Data Technology Architecture
Aug 13, 2022 · Big Data

Apache Doris at Xiaomi: Architecture Evolution, Performance Optimizations, and Production Practices

This article details Xiaomi's three‑year journey of adopting Apache Doris across dozens of internal services, describing the transition from a Spark‑SQL‑based Lambda architecture to a unified MPP database, performance benchmarks, data ingestion pipelines, compaction tuning, two‑phase commit, single‑replica writes, monitoring, and community contributions.

Apache Doris · Data Warehouse · MPP
0 likes · 19 min read
DataFunTalk
Aug 2, 2022 · Databases

Apache Doris 1.0: Features, Architecture, Performance Improvements and Future Roadmap

This article introduces Apache Doris 1.0, detailing its simplified architecture, high‑concurrency support, MPP execution engine, vectorized engine, memory‑controlled stability, multi‑source integration, upcoming lake‑house unification, storage‑compute separation, real‑time ingestion, and community growth.

Analytical Database · Apache Doris · MPP
0 likes · 18 min read
ITPUB
Jul 24, 2022 · Databases

How Apache Doris Enables Real‑Time Queries on Hudi Data Lakes

This article explains Apache Doris’s architecture, introduces the Hudi data‑lake format, compares Lambda and Kappa approaches, and details the design and implementation of Doris’s Hudi external table support, including practical steps, code examples, and future roadmap.

Apache Doris · Big Data · Data Lake
0 likes · 10 min read
DataFunTalk
Jul 18, 2022 · Big Data

Integrating Apache Doris with Hudi: Design, Implementation, and Future Plans

This article introduces Apache Doris, an MPP analytical database, and explains how it integrates with the Hudi data lake format, covering architectural features, design choices, implementation steps including external table creation and query processing, and outlines future enhancements for supporting MOR snapshots and incremental queries.

Apache Doris · Data Lake · Hudi
0 likes · 12 min read
Youzan Coder
Jul 7, 2022 · Big Data

Optimizing Apache Doris Performance: A Case Study in Query Processing

Youzan replaced ClickHouse and Druid with Apache Doris and refined its vectorized engine by eliminating deserialization overhead in the merge‑aggregation phase, achieving roughly a 30% query‑time improvement. The team validated compatibility through SQL rewriting and traffic replay, and plans further SIMD‑based optimizations and broader adoption.

Apache Doris · ClickHouse · Druid
0 likes · 8 min read
dbaplus Community
Jul 6, 2022 · Big Data

Building Real‑Time User Profiles at Zhihu with Apache Doris: A Practical Guide

Zhihu's data‑empowerment team designed a low‑cost, high‑response real‑time data architecture on Apache Doris that powers business analytics, algorithm features, and user profiling, dramatically improving timeliness, reducing targeting costs, and boosting key performance metrics across multiple services.

Apache Doris · Performance Optimization · real-time data
0 likes · 23 min read
Big Data Technology & Architecture
Jul 1, 2022 · Big Data

Curated List of Big Data Resources: ClickHouse, Apache Doris, and Apache Hudi

This article compiles a comprehensive set of Chinese-language resources covering major big-data technologies such as ClickHouse, Apache Doris, and Apache Hudi, including series on distributed tables, MergeTree, replication, optimization techniques, and practical tutorials, with direct links to each detailed guide.

Apache Doris · Apache Hudi · Big Data
0 likes · 6 min read
Big Data Technology & Architecture
Jun 20, 2022 · Databases

Apache Doris Installation, Cluster Deployment, Operations Manual, and Integration with Spark & Flink

This guide provides step‑by‑step instructions for downloading Apache Doris, configuring and deploying FE, BE, and Broker nodes, performing scaling operations, managing users and tables, importing and exporting data, and integrating Doris with Spark and Flink using code examples.

Apache Doris · Database Deployment · Flink Integration
0 likes · 17 min read
NetEase Game Operations Platform
Jun 10, 2022 · Databases

Apache Doris Deployment and Optimization at NetEase Interactive Entertainment

This article details NetEase Interactive Entertainment's adoption of Apache Doris for large‑scale game data analytics, covering background, Doris architecture, cluster governance, tablet and compaction tuning, scaling strategies, monitoring, alerting, and fault‑handling practices to improve performance and stability.

Apache Doris · Big Data · Cluster Management
0 likes · 22 min read
Big Data Technology Architecture
Jun 9, 2022 · Databases

Building a Real‑Time Data Warehouse with Apache Doris: Architecture, Benefits, and Lessons Learned

This article details how a fast‑growing supply‑chain platform migrated from MySQL and Hive to Apache Doris for real‑time analytics, describing the architectural evolution, the advantages of the new design, practical implementation steps, encountered challenges, and the performance and cost benefits achieved.

Apache Doris · Data Integration · Flink CDC
0 likes · 12 min read
Zuoyebang Tech Team
Jun 7, 2022 · Big Data

How Doris Powered Zuoyebang’s Real‑Time Data Warehouse for Faster Insights

Zuoyebang’s data team replaced fragmented, slow query solutions with Apache Doris, building a unified real‑time data warehouse that dramatically cut query latency from hours to seconds, streamlined data modeling, and improved reliability across diverse business scenarios, while integrating with Flink, Kafka, and ES via a unified API.

Apache Doris · Elasticsearch · Flink
0 likes · 20 min read
Big Data Technology & Architecture
Jun 2, 2022 · Operations

Common Operational, Data, and SQL Issues in Apache Doris – FAQs and Solutions

This article compiles frequently asked questions and detailed solutions covering Apache Doris operational problems, data handling errors, and SQL query issues, providing step‑by‑step guidance, configuration tips, and command examples to help administrators troubleshoot and maintain a stable Doris cluster.

Apache Doris · Configuration · Operations
0 likes · 28 min read
Big Data Technology & Architecture
May 24, 2022 · Databases

Apache Doris Basics: Creating Databases, Tables, Partitioning, Data Import, and Rollup

This article provides a comprehensive guide to Apache Doris, covering how to create databases and tables with single and composite partitions, import data via broker and routine loads, understand its aggregate, uniq, and duplicate data models, and leverage rollup and prefix index features for optimized querying.

Apache Doris · Partitioning · Rollup
0 likes · 20 min read
DataFunTalk
Apr 20, 2022 · Databases

Apache Doris 1.0 Release: New Vectorized Engine, Hive External Tables, Z‑Order Indexing and More

Apache Doris (incubating) announced its 1.0 release on April 18, 2022, featuring a new vectorized execution engine, Hive external tables, Lateral View syntax, Z‑Order indexing, SeaTunnel integration, numerous performance optimizations, new bitmap functions, security enhancements, and detailed upgrade instructions for users.

Apache Doris · Hive External Table · Release 1.0
0 likes · 10 min read
DataFunSummit
Mar 21, 2022 · Databases

Vectorization in Apache Doris: Design, Implementation, and Future Roadmap

This article explains how Apache Doris adopts CPU‑level vectorization and columnar storage to boost query performance, details the design and current status of its vectorized engine, and outlines future work such as JOIN acceleration, storage‑layer vectorization, import optimization, and extensive SQL function support.

Apache Doris · Columnar Storage · Performance Optimization
0 likes · 21 min read
DataFunTalk
Feb 27, 2022 · Databases

Vectorization in Apache Doris: Design, Implementation, Current Status, and Future Plans

This article explains how Apache Doris adopts CPU vectorization techniques—such as SIMD, columnar storage, and cache‑friendly designs—to boost query performance, detailing its current vectorized engine architecture, recent benchmarks, ongoing work on JOIN, storage, import, and future enhancements.

Apache Doris · Columnar Storage · Database Performance
0 likes · 22 min read
IT Services Circle
Feb 5, 2022 · Big Data

DataEase: Open‑Source Data Visualization Tool Based on SpringBoot, Apache Doris, and Kettle – Installation and Usage Guide

This article introduces DataEase, an open‑source BI platform built with SpringBoot, Apache Doris, and Kettle, explains its system and functional architecture, provides step‑by‑step installation commands and configuration details, and demonstrates how to create datasets, views, and dashboards for data analysis.

Apache Doris · BI · Data visualization
0 likes · 11 min read
macrozheng
Dec 30, 2021 · Big Data

How to Install and Use DataEase: Open‑Source BI with Apache Doris and Docker

This guide walks you through installing the open‑source BI platform DataEase, explains its architecture built on SpringBoot, Apache Doris, and Kettle, and demonstrates how to create data sources, datasets, and visual dashboards from Excel and MySQL using Docker containers.

Apache Doris · BI · Data visualization
0 likes · 12 min read
DataFunSummit
Dec 18, 2021 · Big Data

Fast OLAP Forum – Latest Practices and Innovations in Real‑Time OLAP

The Fast OLAP Forum held on December 19 at DataFunCon gathers leading experts from Baidu, Tencent, JD, and FreeWheel to share cutting‑edge techniques in vectorized execution, cloud‑native ClickHouse, large‑scale OLAP architectures, and Presto optimizations, offering deep insights for practitioners dealing with massive real‑time data workloads.

Apache Doris · Big Data · ClickHouse
0 likes · 7 min read
Baidu Geek Talk
Nov 24, 2021 · Big Data

Building Big Data Infrastructure at Baidu Aifanfan: Architecture Practices and Lessons Learned

At Baidu Aifanfan, the data team built a unified real‑time and offline big‑data platform—leveraging Watt, Bigpipe, Fengge, AFS and Palo within Lambda/Kappa patterns and a fast‑slow parallel rollout—that cut OLAP query latency from 18 minutes to under 15 seconds, enabled self‑service analytics, and standardized metrics across 15 agile teams.

Apache Doris · Big Data Architecture · Data Governance
0 likes · 23 min read
dbaplus Community
Nov 23, 2021 · Databases

Doris vs ClickHouse: Which MPP Database Wins for Large‑Scale OLAP?

This article compares Apache Doris and ClickHouse across architecture, deployment, multi‑tenant management, data import, storage, query capabilities, performance testing, and cost, providing practical guidance for selecting the most suitable analytical database in large‑scale OLAP scenarios.

Analytical Database · Apache Doris · ClickHouse
0 likes · 26 min read
JD Retail Technology
Oct 13, 2021 · Databases

Comparative Analysis of Apache Doris and ClickHouse for OLAP Workloads

This article presents a detailed technical comparison between Apache Doris and ClickHouse, covering their architecture, deployment, distributed capabilities, transaction support, data import, storage design, query performance, cost, and future development, and provides guidance on selecting the appropriate engine for specific OLAP scenarios.

Apache Doris · ClickHouse · OLAP
0 likes · 26 min read
DataFunTalk
Jul 27, 2021 · Big Data

Building a Real‑Time Data Warehouse with Apache Doris at Shuhai Supply Chain

This article describes how Shuhai Supply Chain upgraded its data warehouse from a complex, high‑cost 1.0 architecture to a streamlined, real‑time solution built around Apache Doris, detailing the motivations, design choices, zero‑code ingestion, metadata management, Flink connector, and the resulting performance gains.

Apache Doris · Big Data · Flink
0 likes · 13 min read