Tag

storage-compute separation


DataFunTalk
Feb 20, 2025 · Big Data

From Integrated Storage‑Compute to Decoupled Architecture: Practical Exploration of Kubernetes, Kyuubi, Celeborn, Blaze, and Hue in Big Data Platforms

This article analyzes the transition from a tightly coupled storage‑compute architecture to a decoupled model, detailing how Kubernetes, Kyuubi, Celeborn, Blaze, and Hue together solve resource inefficiencies, improve scalability, and boost query performance in modern big‑data environments.

Big Data, Blaze, Celeborn
16 min read
DataFunSummit
Dec 21, 2024 · Big Data

Big Data Implementation Practices and Architecture in a Foreign Bank

This article shares a foreign bank's big data implementation journey, covering background and goals, overall planning and architecture, practical insights, phased rollout, data governance, security, and Q&A. It illustrates how a unified data platform, storage‑compute separation, and AI‑driven tools drive business innovation.

AI, Big Data, banking
19 min read
Baidu Geek Talk
Apr 3, 2024 · Databases

Cloud-Native Database: Market Trends, Technical Evolution and Accessibility

Cloud-native databases, now backed by major providers and projected to power 95% of digital business by 2025, are rapidly evolving from traditional systems into flexible, Kubernetes-compatible, MySQL/PostgreSQL-compatible, HTAP-enabled, serverless platforms. Baidu's GaiaDB exemplifies the trend with advanced consensus, low-latency networking, columnar storage, and AI-driven operations, while enterprises balance adoption benefits against deployment, maturity, and sustainability concerns.

AI4DB, GaiaDB, HTAP
15 min read
DataFunSummit
Feb 1, 2024 · Databases

StarRocks 3.0 Storage‑Compute Separation Architecture: Design, Implementation, and Evaluation

This article explains the storage‑compute separation architecture introduced in StarRocks 3.0, presents industry case studies, details the design of StarOS and compute nodes, discusses technical challenges and key techniques, and evaluates cost, reliability, elasticity, and performance through benchmarks and user feedback.

Cloud Native, Distributed Databases, StarRocks
11 min read
DataFunTalk
Jan 24, 2024 · Databases

Kuaishou Graph Database Storage‑Compute Separation Architecture and Its Application in Real‑Time Recommendation

This article presents Kuaishou's graph database storage‑compute separation architecture, detailing its application in real‑time recommendation scenarios, core requirements of cost, performance and usability, the layered service design, memory‑compact models, edge structures, snapshot isolation, and key performance optimizations such as Share‑Nothing and columnar data flow.

graph database, performance optimization, real-time recommendation
11 min read
Sohu Tech Products
Nov 1, 2023 · Databases

Engineering Practices of Douyin's Vector Database: From Retrieval Challenges to Cloud‑Native Solutions

Douyin tackled vector‑retrieval challenges by optimizing HNSW, creating a high‑performance IVF algorithm, and implementing custom scalar quantization, SIMD acceleration, and a DSL‑driven engine that merges filtering with search. It then built a cloud‑native, storage‑compute‑separated vector database (VikingDB) that delivers sub‑10 ms latency, real‑time updates, multi‑tenant support, and secure, scalable retrieval for LLM‑driven applications.

ANN, Cloud Native, LLM integration
18 min read
DataFunTalk
Sep 17, 2023 · Cloud Native

REDck: A Cloud‑Native Real‑Time Data Warehouse Built on ClickHouse

REDck is a cloud‑native, storage‑compute separated real‑time OLAP data warehouse derived from ClickHouse that addresses scalability, operational cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, and two‑phase commit transactions.

ClickHouse, Cloud Native, Real-time OLAP
18 min read
Xiaohongshu Tech REDtech
Sep 6, 2023 · Databases

REDck: A Cloud‑Native Real‑Time OLAP Data Warehouse Built on ClickHouse

REDck is a cloud‑native, real‑time OLAP data warehouse built on ClickHouse that adds elastic compute and storage scaling, object‑storage optimizations, multi‑level caching, and exactly‑once ingestion, delivering petabyte‑scale interactive analytics with ten‑fold CPU efficiency, ten‑fold cost reduction, and 99.9% availability.

Big Data, ClickHouse, Cloud Native
21 min read
Baidu Geek Talk
Jul 1, 2022 · Big Data

Evolution of Data Platform Technology: From Data Warehouse to Lakehouse Architecture

The article traces the evolution of data platforms from early data warehouses—using schema‑on‑write, columnar storage, and MPP engines—to data lakes that retain raw data with schema‑on‑read, and finally to lakehouse architectures that merge storage and compute, offering unified metadata, versioning, and support for BI, big‑data, AI, and HPC workloads.

Data Warehouse, Lakehouse, MPP
25 min read
DataFunSummit
May 14, 2022 · Databases

Design of Cloud‑Native ClickHouse: Architecture, Storage‑Compute Separation, and MPP Query Layer

This article presents the cloud‑native redesign of ClickHouse, covering its current technical limitations in storage and computation, the proposed storage‑compute separation with DDL task management, multi‑replica and CommitLog mechanisms, and a new MPP query layer to meet future data‑warehouse demands such as real‑time analytics, flexibility, high throughput, low cost, and support for semi‑structured data.

Big Data, ClickHouse, Cloud Native
15 min read
Tencent Cloud Developer
Feb 28, 2022 · Big Data

GooseFS: Distributed Caching System for Storage-Compute Separation Architecture

GooseFS, Tencent Cloud’s distributed caching system for storage‑compute separation, links compute frameworks to underlying storage (COS, CHDFS, COSN) and boosts big‑data and AI workloads by 2‑10× through transparent acceleration, robust master‑worker architecture, Raft‑based HA, tiered caching, and metadata optimizations, delivering up to 50% cost savings and 29% faster compute jobs.

GooseFS, Metadata Optimization, Raft consensus
18 min read
Tencent Architect
Sep 10, 2021 · Databases

Design and Advantages of a Cloud‑Native ClickHouse OLAP System

This article presents the architecture, key features, and operational benefits of a cloud‑native ClickHouse OLAP platform, describing how storage‑compute separation, a unified master node, and shared storage reduce cost, improve availability, and simplify management while remaining fully compatible with the open‑source ClickHouse ecosystem.

ClickHouse, Cloud Native, Database Architecture
18 min read
Tencent Cloud Developer
Dec 20, 2018 · Databases

CynosDB Architecture and Optimization: A PostgreSQL-Compatible NewSQL Database

CynosDB, Tencent’s PostgreSQL‑compatible NewSQL service, separates compute and storage, uses a log‑based distributed CynosStore with idempotent logs, offloads CRC checks, and implements async table extension, eliminating full‑page writes and dirty‑page flushing to deliver scalable, cost‑effective performance while preserving PostgreSQL features.

CynosDB, Database Architecture, Log System Optimization
12 min read