Tag

Data Partitioning


Architect's Guide
Sep 5, 2024 · Databases

Strategies for Fast Import of 1 Billion Records into MySQL

To import one billion 1 KB log records stored in HDFS or S3 into MySQL efficiently, the article examines data partitioning, B‑tree index limits, batch insertion, storage engine choice, concurrency control, file‑reading methods, and task scheduling with Redis, Redisson, and ZooKeeper to achieve reliable, ordered, high‑throughput loading.

Data Partitioning, MySQL, Redis
18 min read
DataFunTalk
Jul 14, 2023 · Databases

Implementing Real‑Time Materialized Views to Accelerate Large‑Scale Time‑Series Queries

This article explains how to implement real‑time materialized views to accelerate large‑scale time‑series data queries, covering the need for materialized views, their definition, storage, incremental updates, pre‑computation, query partitioning, performance testing, and future directions.

Data Partitioning, Pre-aggregation, Query Acceleration
16 min read
Architect
Dec 30, 2022 · Databases

Database Sharding and Partitioning Strategy for High‑Volume Order Systems

The article explains how to handle billions of daily orders by classifying data into hot and cold segments, storing them in MySQL, Elasticsearch, and Hive, and applying sharding and partitioning techniques at both table and database levels to achieve scalable performance.

Data Partitioning, Elasticsearch, Hive
9 min read
Aikesheng Open Source Community
Dec 1, 2022 · Databases

Understanding Redis Cluster Architecture: High Availability, Data Partitioning, and Proxy Strategies

This article explains the fundamental concepts of Redis cluster architecture, covering high‑availability with Sentinel, data partitioning methods, proxy‑based sharding techniques, the mechanics of Redis Cluster without a central node, and practical considerations for multi‑key operations in a distributed environment.

Cluster, Data Partitioning, High Availability
9 min read
Top Architect
May 18, 2022 · Databases

Evolution of JD Baitiao Backend Architecture: From MySQL to ShardingSphere

This article chronicles the architectural evolution of JD Baitiao’s backend—from early MySQL monoliths through Solr‑HBase, MongoDB, and DBRep—to the adoption of Apache ShardingSphere, highlighting the motivations, technical trade‑offs, decoupling strategies, and performance outcomes for a high‑throughput financial service.

Data Partitioning, Database Architecture, JD Baitiao
15 min read
vivo Internet Technology
Feb 28, 2022 · Databases

Distributed Database Sorting Solutions

In distributed databases, proxies must merge sorted results from multiple shards, but large result sets exceed memory limits; the article proposes a batch‑fetching approach using per‑shard sort buffers and a priority‑queue merge, eliminating disk I/O and reducing network waste while preserving global order.
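The batch-fetching merge described in this summary can be illustrated with a minimal Python sketch. It is not the article's implementation: `fetch_batches` is a hypothetical stand-in for a per-shard cursor that returns already-sorted rows a few at a time, and the batch size and integer rows are assumptions chosen for illustration. The proxy keeps only one small buffer per shard in memory and uses a priority queue to emit rows in global order.

```python
import heapq
from typing import Iterator, List, Sequence


def fetch_batches(rows: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Hypothetical shard cursor: yields the shard's sorted rows in batches."""
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


def merge_sorted_shards(shards: Sequence[Sequence[int]], batch_size: int = 2) -> Iterator[int]:
    """Merge per-shard sorted results while holding one batch per shard."""
    cursors = [fetch_batches(rows, batch_size) for rows in shards]
    buffers = [next(cursor, []) for cursor in cursors]  # per-shard sort buffers
    positions = [0] * len(shards)                        # read offset in each buffer

    # The heap holds the current head row of every non-empty shard buffer.
    heap = [(buf[0], i) for i, buf in enumerate(buffers) if buf]
    heapq.heapify(heap)

    while heap:
        value, i = heapq.heappop(heap)
        yield value
        positions[i] += 1
        if positions[i] == len(buffers[i]):
            # Buffer drained: fetch the next batch from that shard only.
            buffers[i] = next(cursors[i], [])
            positions[i] = 0
        if buffers[i]:
            heapq.heappush(heap, (buffers[i][positions[i]], i))


if __name__ == "__main__":
    print(list(merge_sorted_shards([[1, 4, 7], [2, 5, 8, 9], [], [3, 6]])))
```

Because a shard is asked for more rows only when its own buffer runs dry, memory use in this sketch is bounded by the number of shards times the batch size rather than by the size of the full result set.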

Data Partitioning, Database Architecture, Distributed Databases
15 min read
vivo Internet Technology
Oct 20, 2021 · Databases

Database Sharding Strategies: Common Approaches, Pitfalls, and Best Practices

Effective MySQL sharding requires sustainable, low‑skew designs, favoring hash‑based methods with proper coprime counts, two‑stage partitioning, routing tables, or consistent hashing, while supporting expansion via doubling or flexible consistent‑hash growth to avoid hot spots and uneven data distribution.
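One reason doubling is a convenient expansion step for modulo-based hash sharding can be shown with a short Python sketch. The function names, the `order-N` keys, and the use of CRC32 as the hash are assumptions made for illustration, not details taken from the article. When the shard count goes from N to 2N, a key on shard s either stays on s or moves to s + N, so each old shard splits its data with exactly one new shard.

```python
import zlib
from typing import Dict, Iterable, Tuple


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard with a stable hash (CRC32 is an assumed choice)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def plan_doubling(keys: Iterable[str], old_count: int) -> Dict[str, Tuple[int, int]]:
    """Return {key: (old_shard, new_shard)} for keys that move when shards double."""
    new_count = old_count * 2
    moves = {}
    for key in keys:
        old_shard = shard_for(key, old_count)
        new_shard = shard_for(key, new_count)
        if new_shard != old_shard:
            moves[key] = (old_shard, new_shard)
    return moves


if __name__ == "__main__":
    sample_keys = [f"order-{i}" for i in range(10)]  # hypothetical sharding keys
    for key, (old, new) in plan_doubling(sample_keys, 4).items():
        print(f"{key}: shard {old} -> shard {new}")
```

Under this scheme a migration only has to copy rows from each old shard to its single paired new shard; no rows travel between existing shards.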

Data Partitioning, Hash Sharding, MySQL
23 min read
Architect
Dec 27, 2020 · Big Data

Optimizing Billion‑Scale Hive Queries: Partitioning, Indexing, Bucketing, Active‑User Segmentation, and Data Structure Refactoring

This article walks through the challenges of querying a 300‑billion‑row Hive table, analyzes why traditional partitioning, indexing, and bucketing fall short, and presents a practical solution that combines active‑user segmentation and a redesigned array‑based data model to cut query time from hours to minutes.

Big Data, Data Modeling, Data Partitioning
10 min read
Aikesheng Open Source Community
Sep 3, 2020 · Databases

Understanding ClickHouse MergeTree Partitioning and Merge Rules

This article explains how ClickHouse's MergeTree engine creates partition directories based on a partition key, details the naming convention PartitionID_MinBlockNum_MaxBlockNum_Level, and describes the automatic and manual merge processes that consolidate partitions for efficient storage.
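The naming convention PartitionID_MinBlockNum_MaxBlockNum_Level can be made concrete with a small Python sketch. The helper names and the example directory names are hypothetical. The merge rule used here (smallest MinBlockNum, largest MaxBlockNum, highest Level plus one) is the commonly described behavior and is stated as an assumption; the sketch does not handle the extra mutation-version suffix that some ClickHouse versions append.

```python
from typing import List, NamedTuple


class PartName(NamedTuple):
    partition_id: str
    min_block: int
    max_block: int
    level: int


def parse_part_name(name: str) -> PartName:
    """Split a directory name of the form PartitionID_MinBlockNum_MaxBlockNum_Level."""
    # Split from the right so the partition ID is kept whole.
    partition_id, min_block, max_block, level = name.rsplit("_", 3)
    return PartName(partition_id, int(min_block), int(max_block), int(level))


def merged_part_name(names: List[str]) -> str:
    """Name of the part produced by merging parts of one partition (assumed rule)."""
    parts = [parse_part_name(name) for name in names]
    partition_ids = {part.partition_id for part in parts}
    if len(partition_ids) != 1:
        raise ValueError("only parts from the same partition can be merged")
    return "{}_{}_{}_{}".format(
        partition_ids.pop(),
        min(part.min_block for part in parts),
        max(part.max_block for part in parts),
        max(part.level for part in parts) + 1,
    )


if __name__ == "__main__":
    print(parse_part_name("202009_1_1_0"))
    print(merged_part_name(["202009_1_1_0", "202009_2_2_0", "202009_3_3_0"]))
```

Reading names this way shows at a glance which block range a directory covers and how many rounds of merging produced it.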

ClickHouse, Data Partitioning, Database
8 min read
Full-Stack Internet Architecture
Aug 9, 2020 · Databases

MySQL Passive Performance Optimization Principles and Practices

This article explains the principles of MySQL performance optimization, distinguishes active and passive approaches, and provides concrete solutions for slow single queries, partially slow queries, and overall slow queries through proper indexing, data partitioning, slow‑query‑log configuration, and read‑write splitting.

Data Partitioning, MySQL, Read-Write Splitting
12 min read
Sohu Tech Products
Jan 8, 2020 · Databases

Understanding Distributed Database Scenarios and Data Partitioning

This article explains the primary use cases for distributed databases, contrasts them with traditional databases, and describes how data partitioning and metadata enable clients to locate data without scanning all nodes, highlighting both external user benefits and internal implementation challenges.

Data Partitioning, Distributed Databases, High Availability
3 min read
Architecture Digest
Apr 27, 2019 · Backend Development

Scalable Distributed System Design Using the Cube Model (X/Y/Z Axis Expansion)

The article introduces the Cube Model for scalable microservice architectures, explaining how X‑axis (horizontal scaling), Y‑axis (functional decomposition) and Z‑axis (data partitioning and isolation) expansions address capacity, complexity, and differentiated service demands in high‑traffic distributed systems.

Data Partitioning, Load Balancing, Backend Architecture
9 min read
Tencent Cloud Developer
Feb 20, 2019 · Databases

Understanding Database Sharding: Concepts, Benefits, Drawbacks, and Strategies

Database sharding, a horizontal partitioning technique that splits a table’s rows across multiple nodes, enables scalable performance and fault isolation for high‑traffic applications, but introduces complexity, potential data imbalance, and recovery challenges, so it should be adopted only after simpler optimizations are exhausted.

Data Partitioning, Database Sharding, Horizontal Scaling
15 min read
Architecture Digest
Dec 22, 2017 · Big Data

Redesign and Optimization of the WeChat Pay Transaction Record System

This article presents a comprehensive case study of how WeChat Pay rebuilt its transaction record storage system to handle massive data volumes, improve performance, ensure data completeness, support flexible queries, and strengthen security through distributed key‑value storage, data partitioning, and operational safeguards.

Big Data, Data Partitioning, Transaction Storage
11 min read
Architecture Digest
Dec 5, 2017 · Big Data

Redesign and Optimization of WeChat Pay Transaction Record System

The article presents a comprehensive case study of how WeChat Pay rebuilt its transaction record storage to handle massive data growth, improve performance, ensure data completeness, support flexible queries, and strengthen security through distributed key‑value storage, hierarchical partitioning, holiday traffic shaping, and strict access controls.

Big Data, Data Partitioning, Transaction Storage
11 min read