Tag

Storage Optimization

360 Zhihui Cloud Developer
Jun 6, 2025 · Fundamentals

How Erasure Coding Cuts Storage Costs in Ozone: A Deep Dive

This article explains how Erasure Coding (EC) improves data reliability and dramatically reduces storage overhead in Ozone by leveraging hot‑cold data characteristics, intelligent tiering, dynamic EC ratios, and repair throttling, while also discussing performance trade‑offs and limitations.

Data Reliability · Erasure Coding · Ozone
9 min read
Kuaishou Tech
May 28, 2025 · Databases

Optimizing Kuaishou's Photo Object Storage: Reducing Size and Boosting Cache Hit Rate

This article details how Kuaishou dramatically cut storage costs and improved cache efficiency for its core Photo data object by cleaning up redundant JSON fields, applying selective serialization, and performing large‑scale data cleaning, achieving a 25% size reduction, a 2% cache‑hit increase, and multi‑hundred‑TB savings.

Big Data · Cache Hit Rate · Database
20 min read
DataFunSummit
Jun 3, 2024 · Big Data

Data Governance and Active Metadata Practices at JD Retail

The article outlines JD Retail's data management challenges—including asset awareness, architectural agility, development quality, and rising resource costs—and presents a comprehensive data governance framework that leverages data standards, agile architecture, development isolation, resource optimization, and active metadata to achieve intelligent lifecycle evaluation, automated back‑fill, and future‑oriented data fabric improvements.

Active Metadata · Big Data · Storage Optimization
18 min read
JD Tech
Feb 21, 2024 · Operations

Storage Model Optimization and Performance Testing for Hot SKU Inventory Pre‑occupancy

This article explores practical performance testing and tuning techniques, focusing on storage model optimization and call‑chain analysis to improve hot‑SKU inventory pre‑occupancy throughput, presenting detailed pressure‑testing scenarios, results, cache‑layer redesign, and strategies for identifying and mitigating system bottlenecks.

Storage Optimization · caching · inventory pre-occupancy
15 min read
NetEase Cloud Music Tech Team
Jan 25, 2024 · Backend Development

Cloud Music RTA Advertising and User Acquisition System: Architecture and Optimization Practices

NetEase Cloud Music’s RTA advertising system delivers real‑time, personalized ads at massive scale by using isolated Nginx clusters, layered decoupling, asynchronous Netty/Redis processing, and optimized storage with hash‑based key compression and Protostuff serialization, while supporting automated audience selection and in‑app attribution to boost user acquisition.

Advertising Technology · High Performance Computing · RTA advertising
12 min read
Didi Tech
Jan 9, 2024 · Big Data

Introducing Apache Pulsar: Technical Benefits and Solutions for Didi Big Data Messaging System

Apache Pulsar, a cloud‑native distributed messaging platform, solves Didi Big Data’s DKafka bottlenecks by separating compute and storage and by using sequential log writes, heterogeneous disks, multi‑level caching, bundle‑based load balancing, and automatic scaling. The result is markedly better stability, at the cost of more complex monitoring.

Apache Pulsar · Big Data · Cluster Management
17 min read
JD Retail Technology
Dec 28, 2023 · Databases

Methods and Practices for Reducing MySQL Database Storage Costs

This article outlines the background, challenges, systematic methods, benefit calculations, data‑safety and stability checks, verification steps, rollback strategies, and gray‑deployment practices for lowering MySQL storage expenses in large‑scale billing systems while maintaining system reliability.

Data Safety · Database Cost Reduction · MySQL
12 min read
Tencent Cloud Developer
Sep 20, 2023 · Operations

Storage Governance and Optimization Practices for Meeting Control Systems

The article explains how a meeting control system tackled severe storage pressure from high concurrent traffic by introducing a proxy layer, multi‑active disaster‑recovery, identity‑based data isolation, dynamic‑static key separation, multi‑level caching, overload protection, sharding with dual‑write migration, and extensive monitoring to meet 100k QPS and ensure reliability.

Redis · Storage Optimization · database sharding
49 min read
iQIYI Technical Product Team
Aug 25, 2023 · Big Data

Venus Log Platform Architecture Evolution: From ELK to Data Lake

The Venus log platform at iQIYI migrated from an Elasticsearch‑Kibana architecture to an Iceberg‑based data lake with Trino, cutting storage and compute costs by over 70%, boosting stability by 85%, and efficiently supporting billions of daily logs through write‑heavy, low‑query workloads.

Big Data · Elasticsearch · Iceberg
22 min read
Didi Tech
May 26, 2023 · Big Data

Design and Optimization of Didi's Spatial‑Temporal Supply‑Demand System

Didi’s redesigned Spatial‑Temporal Supply‑Demand System replaces a single‑Redis bottleneck with a multi‑cluster routing layer, semantic sharding, multi‑level caching, and delayed queues. The result is better horizontal scalability and fault isolation, roughly 30% lower latency, higher cache hit rates, fewer query nodes, and faster, code‑free feature configuration.

Big Data · Configuration Management · Performance Tuning
19 min read
DataFunTalk
May 22, 2023 · Big Data

Alibaba Cloud Data Lake: Unified Metadata and Storage Management Practices

This article explains Alibaba Cloud's data lake architecture, unified metadata services, storage management optimizations, and format handling techniques, illustrating how lakehouse concepts, multi‑engine support, and lifecycle policies enable efficient, secure, and cost‑effective big data processing in the cloud.

Big Data · Cloud Services · Lakehouse
22 min read
Coolpad Technology Team
Apr 27, 2023 · Cloud Computing

EROFS Cluster Mode Analysis in Linux Kernel 6.x

This article analyzes the EROFS cluster modes (INFLIGHT, HOOKED, FOLLOWED, FOLLOWED_NOINPLACE) in Linux kernel 6.x, explaining how they determine whether in-place I/O can be used based on the current status of pclusters in the chain.

Cluster Modes · EROFS · File System
6 min read
Bilibili Tech
Apr 11, 2023 · Big Data

Bilibili Big Data Governance: From Reactive Storage Management to Proactive Multi‑Dimensional Governance

Bilibili’s exabyte‑scale big‑data platform, after rapid growth created fragmented ownership and costly storage, launched the Wanglou project to build a metadata‑driven, indicator‑based governance framework that cut storage use by half, introduced compliance scoring and automation, and now plans to extend proactive, multi‑dimensional governance to compute, traffic and lake‑house resources.

Big Data · Bilibili · Cost Management
21 min read
Bilibili Tech
Mar 14, 2023 · Big Data

Bilibili HDFS Erasure Coding Strategy and Implementation

Bilibili reduced petabyte‑scale storage costs by back‑porting erasure‑coding patches to its HDFS 2.8.4 cluster, deploying a parallel EC‑enabled cluster, adding a data‑proxy service, intelligent routing and block‑checking, and automating cold‑data migration, while noting write overhead and planning native acceleration.

Big Data · Data Reliability · Erasure Coding
14 min read
DeWu Technology
Feb 15, 2023 · Backend Development

E-commerce Product Ranking System Migration: Technical Implementation and Storage Optimization

The article describes how an e‑commerce product ranking system was migrated to the new “Liao Yue” platform: decoupling it from the search module, introducing new metrics and Elasticsearch‑based sorting, and separating B‑end and C‑end data to cut storage costs by 60%. A gray‑scale rollout with dual‑read validation and rollback safeguards completed the migration in two weeks with zero failures, delivering a closed‑loop system that iterates faster.

Elasticsearch · Storage Optimization · backend development
15 min read
Architecture Digest
Aug 14, 2022 · Big Data

Replacing Classic Data Warehouse Dimensional Model with a Single Wide Table: Architecture, Benefits, and Challenges

This article analyzes the shift from traditional multi‑layer data warehouse dimensional modeling to a single-layer wide‑table approach, detailing business drivers, technical architecture, storage and query performance gains, as well as the development, maintenance, and operational challenges involved.

Big Data · Data Warehouse · Storage Optimization
10 min read
vivo Internet Technology
Jun 29, 2022 · Big Data

Lossless Image Compression Overview and Lepton Optimization for Large‑Scale Storage

The article explains JPEG’s lossy fundamentals, introduces Lepton’s lossless layer and its optimizations, such as arithmetic coding and multithreaded Huffman switching, and describes how vivo’s hybrid physical‑server and Kubernetes deployment achieves roughly 22% storage reduction across petabytes of JPEG images despite high CPU demands.

Huffman coding · JPEG · Lepton
13 min read
Baidu Geek Talk
Jun 15, 2022 · Big Data

Replacing Classic Data Warehouse with a One‑Layer Wide Table Model: Architecture, Benefits, and Challenges

The article proposes replacing the traditional multi‑layered data‑warehouse architecture (ODS‑DWD‑DWS‑ADS) with a single, column‑store wide table per business theme, achieving roughly 30% storage savings and faster queries, while acknowledging higher ETL complexity, back‑tracking costs, and production timing challenges.

Big Data · Data Warehouse · ETL
11 min read
ByteDance Data Platform
Apr 27, 2022 · Big Data

How ByteDance Built a Scalable Data Catalog: Key Technologies and Future Plans

This article details ByteDance’s Data Catalog system: its unified metadata model, standardized ingestion connectors, search optimization techniques, lineage capabilities, and storage layer enhancements. It highlights key technical designs, performance improvements, and future work to advance data governance and asset utilization.

Big Data · Data Catalog · Storage Optimization
12 min read
DeWu Technology
Apr 18, 2022 · Artificial Intelligence

Warehouse Storage Location Recommendation: Architecture, Recall, and Ranking Strategies

The article outlines DeWu’s warehouse‑management recommendation system, which combines an online‑near‑line‑offline architecture to quickly recall viable shelf slots and rank them by space utilization, travel time, and sales potential, enabling automated, constraint‑aware placement that cuts picking time and inventory costs.

AI · Big Data · Ranking
16 min read