Tagged articles

data storage

85 articles · Page 1 of 1

Jun 20, 2026 · Databases

Why Using DELETE in MySQL Can Get You Fired: Risks and Safer Alternatives

The article explains MySQL's storage architecture, compares DELETE, TRUNCATE and DROP for removing data, shows how DELETE leaves hidden rows and generates logs that waste space, and advises using TRUNCATE, DROP or OPTIMIZE TABLE to reclaim disk space and avoid irreversible mistakes.

DROP · TRUNCATE · Transaction

0 likes · 8 min read

Big Data Technology Tribe

Jun 6, 2026 · Big Data

What Is Lance BlobV2 and How Does It Improve Large Binary Data Storage?

Lance BlobV2 introduces a PyArrow ExtensionType for large binary objects, enabling lazy streaming, external URI references, range slicing, and flexible ingest or reference modes, while providing clear APIs and schema helpers that address the limitations of previous blob implementations.

BlobV2 · ExtensionType · Lance

0 likes · 24 min read

Java Tech Enthusiast

May 22, 2026 · Industry Insights

When Hard Drives Were the Size of Washing Machines: A 60‑Year Storage Evolution

From the 1962 IBM 1311, an appliance‑sized drive whose 9‑pound disk pack held just 1.5 MB, to today's pocket‑sized terabyte USB drives, this article traces six decades of magnetic storage breakthroughs, cost drops, and the societal impact of ever‑shrinking storage devices.

IBM 1311 · data storage · hard drive history

0 likes · 12 min read

php Courses

Nov 24, 2025 · Backend Development

How to Build Secure User Registration and Data Storage with PHP

This guide shows how to implement a PHP user‑registration function that validates input, hashes passwords, and stores user details in a MySQL database, along with a separate function for generic data storage, while highlighting key security considerations.

PHP · data storage · mysql

0 likes · 4 min read

Architect's Tech Stack

Nov 8, 2025 · Databases

Why Store IPv4 as Unsigned INT in MySQL? Benefits, Drawbacks & Java Conversion

The article explains why MySQL recommends storing IPv4 addresses as unsigned 32‑bit integers instead of strings, detailing space savings, faster range queries, and indexing benefits, while also noting readability drawbacks and providing MySQL functions and Java code for converting between string and integer representations.

IPv4 · Unsigned Integer · data storage

0 likes · 5 min read
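The string-to-integer conversion this summary refers to can be sketched briefly. The article itself uses MySQL functions and Java; the Python below is only an illustration, the function names `ip_to_int` and `int_to_ip` are hypothetical, and well-formed dotted-quad input is assumed.

```python
import ipaddress

def ip_to_int(ip: str) -> int:
    """Pack a dotted-quad IPv4 string into an unsigned 32-bit integer."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def int_to_ip(value: int) -> str:
    """Unpack an unsigned 32-bit integer back into dotted-quad form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

# Cross-check against the standard library's own conversion.
assert ip_to_int("192.168.1.1") == int(ipaddress.IPv4Address("192.168.1.1"))

print(ip_to_int("192.168.1.1"))  # 3232235777
print(int_to_ip(3232235777))     # 192.168.1.1
```

Because the integer form preserves address order, a range check such as "is this address inside 192.168.0.0/16" becomes a plain numeric comparison between two bounds.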

JD Tech Talk

Jul 16, 2025 · Databases

How JD Ads Cut Storage Costs 87% with Apache Doris Hot‑Cold Data Tiering

JD Advertising built a massive ad‑data warehouse on Apache Doris, reaching nearly 1 PB and 18 trillion rows, then implemented a hot‑cold data tiering strategy—first a lake‑based approach, later a native tiering solution in Doris 2.0—reducing storage costs by 87% and boosting query performance over tenfold.

Apache Doris · Schema Change · cold-hot tiering

0 likes · 18 min read

Python Programming Learning Circle

Jul 1, 2025 · Operations

How to Scrape Weibo Data with Python: Complete Guide & Code

This tutorial walks through using Python to crawl Weibo, covering environment setup, three login methods, data extraction functions for user info, posts and comments, anti‑crawling strategies, storage to CSV or MySQL, a full example script, and legal considerations.

Selenium · Web Scraping · Weibo

0 likes · 12 min read

dbaplus Community

Jun 15, 2025 · Databases

Why MySQL DELETE Is Discouraged and How DROP/TRUNCATE Offer Faster, Safer Alternatives

This article explains MySQL's storage architecture, why using DELETE to remove rows is discouraged, compares DELETE, DROP, and TRUNCATE in terms of speed, space reclamation, transaction behavior, and auto‑increment handling, and provides practical commands and optimization tips.

DROP · Database Management · TRUNCATE

0 likes · 9 min read

DataFunSummit

Jun 3, 2025 · Big Data

BiFang: A Unified Lake‑Stream Storage Engine for Real‑Time and Batch Data Processing

BiFang is a lake‑stream integrated storage engine that merges Apache Pulsar message‑queue capabilities with Iceberg data‑lake features, providing a single unified data store with full‑incremental queries, sub‑second visibility, exactly‑once semantics, and seamless integration with Flink, Spark, and StarRocks for both real‑time analytics and batch processing.

Apache Iceberg · Apache Pulsar · Lakehouse

0 likes · 13 min read

JD Retail Technology

Feb 20, 2025 · Big Data

Cold‑Hot Data Tiering Solutions for JD Advertising Using Apache Doris

JD Advertising built a petabyte‑scale ad analytics service on Apache Doris, identified a hot‑cold access pattern, and implemented a native cold‑hot tiering solution (upgrading to Doris 2.0 and optimizing schema changes) that cut storage costs by ~87% and boosted concurrent query capacity over tenfold while simplifying operations.

Apache Doris · Big Data · Performance Optimization

0 likes · 18 min read

Architect's Guide

Jan 21, 2025 · Databases

Why Store IPv4 Addresses as UNSIGNED INT in MySQL: Benefits, Drawbacks, and Conversion Techniques

The article explains that using a 32‑bit UNSIGNED INT to store IPv4 addresses in MySQL saves space and improves index and range‑query performance, outlines the storage savings compared to VARCHAR, mentions the need for manual conversion, and provides MySQL and Java code examples for converting between string and integer representations.

IPv4 · SQL · UNSIGNED INT

0 likes · 5 min read

DevOps

Jan 8, 2025 · Artificial Intelligence

Designing Generative AI Agents: Models, Tools, Extensions, Function Calls, and Data Storage

The article explains how generative AI agents combine language models, tool integration, self‑guided planning, prompt‑engineering frameworks, extensions, function calls, and vector‑based data storage to create adaptable, retrieval‑augmented systems that can interact with real‑world APIs and perform complex tasks.

Extensions · RAG · data storage

0 likes · 12 min read

Python Programming Learning Circle

Nov 19, 2024 · Databases

Overview of Lightweight Python Databases and Their Usage

This article surveys several lightweight Python databases—including PickleDB, TinyDB, ZODB, Durus, Buzhug, Gadfly, and PyTables—detailing their main features, typical use cases, cautions, and providing basic code examples to help developers choose and apply the right storage solution for small‑scale or prototype projects.

Databases · NoSQL · Object Persistence

0 likes · 23 min read

Architect

Nov 6, 2024 · Databases

Storing IPv4 as Unsigned Int in MySQL: Benefits, Drawbacks & Code

Using an unsigned INT to store IPv4 addresses in MySQL saves space and enables efficient range queries, while strings are larger and slower; the article explains these advantages, outlines conversion functions INET_ATON/INET_NTOA, shows equivalent handling for IPv6, and provides Java utilities for bidirectional conversion.

IPv4 · Java · SQL

0 likes · 6 min read

php Courses

Sep 27, 2024 · Backend Development

Implementing User Registration and Data Storage with PHP Functions

This tutorial explains how to implement user registration and data storage in web applications using PHP functions, covering input validation, password hashing, MySQL connection, SQL insertion, and providing complete code examples for both registering users and storing generic data.

backend · data storage · user registration

0 likes · 4 min read

DataFunSummit

Aug 31, 2024 · Big Data

Apache Hudi Clustering: Workflow and Layout Optimization Strategies (Part 6)

This article explains Apache Hudi's clustering service, detailing its workflow, three execution modes, and layout optimization strategies—including linear, Z‑order, and Hilbert space‑filling curves—to improve storage locality and query performance in large‑scale data lake environments.

Apache Hudi · Big Data · Clustering

0 likes · 8 min read
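As an illustration of the Z‑order idea named in this summary, the sketch below interleaves the bits of two integer keys into a single sort key (a Morton code). It is a generic textbook construction, not Hudi's implementation; the function name `z_order` and the sample points are hypothetical.

```python
def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code

# Sorting by the interleaved key keeps points that are close in both
# dimensions close together in the resulting order.
points = [(3, 1), (0, 0), (1, 1), (2, 2), (0, 1)]
ordered = sorted(points, key=lambda p: z_order(*p))
print(ordered)
```

A linear sort on (x, y) clusters well only on x; interleaving gives both columns a share of the high-order bits, which is why such curves help queries that filter on either column.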

Java Backend Technology

Jul 26, 2024 · Databases

Why SQLite Powers Billions of Devices: A Deep Dive into Its Uses and Benefits

This article explains why SQLite, the lightweight embedded relational database, is the most widely deployed database in the world, detailing its origins, core features, and diverse usage scenarios across mobile, embedded, desktop, data analysis, and web acceleration contexts.

Embedded Database · Mobile Development · SQLite

0 likes · 6 min read
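To make the "embedded" point concrete, here is a minimal sketch using Python's standard `sqlite3` module: the database runs inside the application process, with no server to install or connect to. The `notes` table and its rows are hypothetical, and an in-memory database is used so the example leaves no file behind.

```python
import sqlite3

# ":memory:" creates a throwaway database; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany("INSERT INTO notes (body) VALUES (?)", [("first",), ("second",)])
conn.commit()

rows = conn.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
print(rows)  # [(1, 'first'), (2, 'second')]
conn.close()
```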

php Courses

Jul 25, 2024 · Backend Development

Implementing User Registration and Data Storage with PHP Functions

This article explains how to use PHP functions to create a user registration system with input validation, password hashing, and MySQL insertion, and also demonstrates a generic data‑storage function that connects to a database and saves arbitrary data, highlighting key security and implementation steps.

Web Development · data storage · mysql

0 likes · 4 min read

Volcano Engine Developer Services

Jun 14, 2024 · Operations

How ByteDance Built an EB‑Scale Log Service: Design & Optimization

This article details the evolution of ByteDance's TLS (Tinder Log Service) from a Loki‑based prototype to an EB‑scale, cloud‑native log system, covering its core properties, data organization, architecture, caching, hybrid storage, private codec, ecosystem compatibility, intelligent features, and real‑world case studies.

ByteDance · Cloud Native · Scalable Architecture

0 likes · 24 min read

Architect's Guide

May 25, 2024 · Fundamentals

How Data Is Stored: An Overview of RAM, DRAM, and Memory Controllers

This article explains the fundamentals of data storage in computers, covering the concepts of RAM and DRAM, the role of capacitors and transistors, and how memory controllers and CPU caches work together to manage and accelerate access to binary information.

Computer Architecture · DRAM · Memory Controller

0 likes · 7 min read

php Courses

May 21, 2024 · Backend Development

Implementing User Registration and Data Storage with PHP Functions

This article demonstrates how to use PHP functions to implement user registration and data storage, covering input validation, password hashing, MySQL database connection, SQL insertion, and returning operation results, while highlighting security considerations and practical code examples.

Backend Development · PHP · data storage

0 likes · 4 min read

DataFunTalk

Apr 28, 2024 · Big Data

Ant Group’s Data Governance Practices: Overview, Data Quality, and Data Storage Governance

This article shares Ant Group's extensive experience in big data governance, detailing the overall data governance framework, data quality management, data storage governance, and future considerations, illustrated with practical cases and strategies for ensuring compliance, reliability, and cost efficiency.

Ant Group · Big Data · Data Architecture

0 likes · 17 min read

Didi Tech

Mar 5, 2024 · Databases

Migrating Didi's Log Retrieval from Elasticsearch to ClickHouse: Architecture, Challenges, and Performance Optimizations

Didi replaced its Elasticsearch‑based log platform with ClickHouse, redesigning the architecture into isolated Log and Trace clusters and using hourly‑partitioned MergeTree tables and aggregating views to handle petabyte‑scale writes, diverse low‑latency queries, and high QPS. The deployment grew to over 400 nodes with 40 GB/s throughput, 30% cost savings, and a four‑fold reduction in query latency.

Big Data · ClickHouse · Elasticsearch

0 likes · 15 min read

Sohu Tech Products

Feb 28, 2024 · Big Data

Why Use Zarr? Storing and Accessing Large NumPy Arrays with mmap and Zarr

Zarr provides a modern, chunked and compressed storage format that lets you treat massive NumPy arrays like in‑memory objects, offering on‑demand loading, flexible back‑ends (disk, S3, zip), automatic caching, resizing, parallel reads/writes, and superior performance compared to traditional mmap‑based memmap files.

NumPy · Python · Zarr

0 likes · 18 min read

Open Source Linux

Dec 18, 2023 · Fundamentals

From Bits to Brontobytes: Understanding Data Storage Units

This article explains the hierarchy of digital storage units—from the single bit up to speculative sizes like Brontobyte and beyond—detailing their values, typical examples, and real‑world analogies such as characters per byte and the amount of text each unit can hold.

bits · bytes · data storage

0 likes · 5 min read
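The unit ladder this summary describes can be expressed as a short conversion routine. This is a sketch that assumes binary multiples (each unit is 1024 times the previous one); the function name `human_size` is hypothetical, and the list stops at YB because the larger names the article mentions are informal.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value below 1024."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop once the value fits, or when there is no larger unit left.
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024

print(human_size(1536))       # 1.5 KB
print(human_size(1024 ** 4))  # 1.0 TB
```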

Liangxu Linux

Nov 8, 2023 · Fundamentals

How Floating‑Gate Transistors Store Data: Inside NAND Flash Write/Read Operations

This article explains the structure of NAND flash memory, how MOSFETs with floating‑gate and tunneling layers perform write operations for logical 0 and 1, how read operations detect stored electrons, and how matrix control enables block‑level access.

Floating Gate · NAND · data storage

0 likes · 5 min read

Architects' Tech Alliance

Oct 26, 2023 · Fundamentals

Types of Computer Storage and an Overview of RAID

The article explains the four main categories of computer storage—primary, secondary, tertiary, and offline—detailing their connection to the CPU, typical devices, the concept of direct‑attached storage, and an overview of RAID technology for performance and redundancy.

Direct-Attached Storage · RAID · Storage Hierarchy

0 likes · 5 min read

php Courses

Jul 3, 2023 · Databases

Top 5 Revolutionary Vector Databases Transforming Machine Learning and Similarity Search (2023)

Vector databases store and search large-scale vector data, and in 2023 the five leading solutions—Chroma, Pinecone, Weaviate, Milvus, and Faiss—offer scalable, high-performance options for applications such as LLM-driven services, audio search, recommendation systems, image/video analysis, and semantic retrieval across various industries.

AI · LLM · data storage

0 likes · 4 min read

Big Data Technology Architecture

Apr 19, 2023 · Big Data

Why the Big Data Era Is Over

The article argues that the era of big data is ending, showing that most organizations store only modest amounts of data, that storage costs outweigh benefits, and that modern cloud and analytics tools allow efficient processing without needing massive datasets.

Analytics · Big Data · Data Management

0 likes · 16 min read

DataFunSummit

Mar 11, 2023 · Databases

Graph Database Storage and Knowledge Graph Practices – Forum Overview

The forum explores the rapid growth and complexity of knowledge graphs, addressing storage and computation challenges through expert talks on graph database storage, query languages, practical implementation, and large‑scale financial knowledge graph platforms, offering attendees deep technical insights and hands‑on guidance.

Big Data · Knowledge Graph · data storage

0 likes · 8 min read

DataFunSummit

Feb 12, 2023 · Big Data

Applying Erasure Coding in HDFS: Strategies, Performance, and Repair Techniques

This article explains how Zhihu adopted HDFS erasure coding to reduce storage costs, outlines cold‑hot file tiering policies, describes the EC conversion workflow and the custom EC Worker tool, and details methods for detecting and repairing damaged EC files in a Hadoop environment.

Big Data · HDFS · Repair

0 likes · 16 min read

Architect's Guide

Jan 9, 2023 · Fundamentals

How Data Is Stored: An Introduction to RAM, DRAM, and Memory Controllers

The article explains the fundamentals of data storage in computers, describing how binary data is saved using memory modules, the differences between static SRAM and dynamic DRAM, and the role of memory controllers in translating addresses and managing refresh cycles.

Computer Architecture · DRAM · Memory Controller

0 likes · 6 min read

Baidu Intelligent Cloud Tech Hub

Jan 5, 2023 · Artificial Intelligence

How Baidu’s AI IaaS Supercharges Autonomous Driving: 5× Data Speed & 391% Model Gains

The talk outlines Baidu’s Baige AI IaaS solution for autonomous driving, detailing a low‑cost, high‑efficiency cloud stack that accelerates data access fivefold, boosts model training speed by up to 391%, cuts inference latency by 90%, reduces simulation costs by 60%, and explains the underlying storage, compute, container and GPU virtualization technologies.

AI IaaS · Model Training · autonomous driving

0 likes · 17 min read

ITPUB

Dec 31, 2022 · Databases

Why HBase? Strengths, Weaknesses, Real‑World Scenarios, and Architecture Explained

This article examines HBase’s high reliability and performance as a column‑oriented NoSQL store, outlines its advantages and limitations, presents two practical use cases from e‑commerce, and details its data model, architecture components, and design considerations for effective deployment.

Big Data · HBase · NoSQL

0 likes · 12 min read

Python Crawling & Data Mining

Aug 12, 2022 · Big Data

Master the Big Data Ecosystem: 9 Core Technology Frameworks Explained

This article provides a comprehensive overview of the big data ecosystem, detailing nine essential technology categories—including data collection, storage, computation, analysis, resource management, retrieval, underlying infrastructure, and cluster installation—while comparing popular tools and illustrating their typical use‑cases with diagrams.

cluster management · data collection · data storage

0 likes · 11 min read

Past Memory Big Data

Aug 9, 2022 · Big Data

Master the Complete Big Data Ecosystem in One Article

This article provides a comprehensive overview of the big data ecosystem, detailing nine core technology categories, from data collection and storage to computation, analysis, scheduling, and underlying infrastructure, along with tool comparisons and selection guidelines to help readers quickly build a complete big data knowledge system.

Big Data · Resource Management · Task scheduling

0 likes · 12 min read

Top Architect

Jun 5, 2022 · Databases

Why Store IPv4 Addresses as UNSIGNED INT in MySQL: Benefits, Drawbacks, and Java Conversion

The article explains that using a 32‑bit UNSIGNED INT to store IPv4 addresses in MySQL saves space and improves indexing and range queries, outlines the conversion functions and their performance, discusses readability drawbacks, and provides Java code for bidirectional conversion between string and integer representations.

IP address · Java · UNSIGNED INT

0 likes · 6 min read

Liangxu Linux

Jun 5, 2022 · Databases

MySQL IPv4 Storage: Unsigned INT vs String – Benefits, Drawbacks & Java Example

The article explains why MySQL’s high‑performance guide recommends storing IPv4 addresses as a 32‑bit UNSIGNED INT instead of VARCHAR, detailing space savings, faster range queries, conversion functions (INET_ATON/INET_NTOA), associated drawbacks, and provides Java code to convert between string and integer representations.

IPv4 · Java · Unsigned Integer

0 likes · 5 min read

Tencent Cloud Developer

May 31, 2022 · Industry Insights

What’s Driving the NoSQL Revolution? Key Takeaways from the 5th Techo TVP Summit

The 5th Techo TVP Developer Summit explored the surge of data, the strategic role of NoSQL in digital transformation, presented cutting‑edge trends, performance breakthroughs, cloud‑native multi‑model solutions, and real‑world case studies from finance to gaming, highlighting future directions for database technology.

Cloud Native · Database Trends · NoSQL

0 likes · 18 min read

Architects' Tech Alliance

May 24, 2022 · Databases

Understanding Oracle Database Architecture: From Instances to Multitenancy

This article provides a comprehensive overview of Oracle's database products, explaining why they matter, detailing the physical and logical storage structures, instance processes and memory areas, and tracing the evolution from early versions through RAC, multitenancy, and the latest 18c features.

Database Architecture · Multitenancy · Oracle

0 likes · 17 min read

MaGe Linux Operations

Apr 2, 2022 · Operations

Why Prometheus Uses TSDB: Mastering Scalable Monitoring and Queries

This article explains how Prometheus, a data‑driven monitoring system, leverages a time‑series database (TSDB) to handle massive metric volumes, perform efficient queries, and enable powerful calculations such as recording rules for pre‑computed results.

Query Optimization · TSDB · Time-series

0 likes · 8 min read

IT Architects Alliance

Mar 23, 2022 · Big Data

How Elasticsearch’s Cluster Architecture Powers Scalable Search and Analytics

This article explains Elasticsearch’s distributed cluster design, covering core concepts such as nodes, indices, shards, and replicas, compares mixed and tiered deployment models, examines data‑layer storage options, and discusses two typical distributed system architectures with their trade‑offs.

Big Data · Cluster Architecture · Elasticsearch

0 likes · 15 min read

NetEase LeiHuo UX Big Data Technology

Mar 23, 2022 · Databases

Overview of Database Architecture, Storage Engines, and Data Layouts

This article explains the core components of database systems, including their client‑server architecture, query processing, storage engine modules, classification by storage media (memory vs. disk), and the differences between row‑oriented and column‑oriented data layouts, concluding with future topics to explore.

Column Layout · Row Layout · Storage Engine

0 likes · 8 min read

Architect

Jan 2, 2022 · Backend Development

Efficient Read/Unread Tracking for Group Chat Messages Using Bitmaps

The article examines how to efficiently store read/unread status for group chat messages by replacing per‑user lists with compact bitmap structures, discusses handling member exits, presents C++‑style struct definitions, and quantifies storage savings compared to the naïve 8‑byte per‑user approach.

Optimization · bitmap · data storage

0 likes · 7 min read
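The bitmap approach in this summary can be sketched as one bit per group member. The article presents C++‑style structs; the Python class below is only an illustration, and the class name `ReadBitmap`, its methods, and the 500‑member group are hypothetical.

```python
class ReadBitmap:
    """One bit per group member: 1 means the member has read the message."""

    def __init__(self, member_count: int) -> None:
        self.member_count = member_count
        # Round up to whole bytes: 8 members per byte.
        self.bits = bytearray((member_count + 7) // 8)

    def mark_read(self, member_index: int) -> None:
        self.bits[member_index // 8] |= 1 << (member_index % 8)

    def has_read(self, member_index: int) -> bool:
        return bool(self.bits[member_index // 8] & (1 << (member_index % 8)))

    def read_count(self) -> int:
        return sum(bin(byte).count("1") for byte in self.bits)

bitmap = ReadBitmap(member_count=500)
bitmap.mark_read(0)
bitmap.mark_read(499)
print(bitmap.has_read(499), bitmap.read_count(), len(bitmap.bits))  # True 2 63
```

For this hypothetical 500‑member group the bitmap occupies 63 bytes per message, against 4,000 bytes if every member cost 8 bytes. The sketch assumes each member keeps a stable index, which is why member exits need the separate handling the article discusses.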

Architecture Digest

Nov 11, 2021 · Databases

Elasticsearch Cluster Architecture and Distributed Data System Design Overview

This article explains the core concepts of Elasticsearch—including nodes, indices, shards, and replicas—covers its cluster and data‑layer architectures, compares mixed and tiered deployment models, and discusses the advantages and drawbacks of replica‑based distributed storage systems.

Cluster Architecture · Elasticsearch · data storage

0 likes · 15 min read

Python Crawling & Data Mining

Oct 8, 2021 · Big Data

Why Feather Beats CSV for Large-Scale Data: Speed, Size, and Simplicity

This article explains the limitations of CSV for big datasets, introduces the Feather binary format, shows how to install and use it with Python and pandas, and compares its saving/loading speed and storage size against CSV, highlighting Feather's advantages for efficient data handling.

Big Data · Feather · Pandas

0 likes · 7 min read

Sohu Tech Products

Oct 6, 2021 · Databases

Elasticsearch Cluster Architecture and Distributed System Design

This article explains the architecture of Elasticsearch clusters, covering node roles, index, shard and replica concepts, deployment models, data storage mechanisms, and compares two distributed system designs—local‑file‑system and shared‑file‑system—highlighting their advantages and trade‑offs.

Cluster Architecture · Elasticsearch · data storage

0 likes · 14 min read

MaGe Linux Operations

Sep 18, 2021 · Operations

Why Prometheus’s TSDB Makes Massive Monitoring Data Manageable

The article explains how Prometheus, a data‑driven monitoring system, handles massive time‑series data using its TSDB storage engine, detailing concepts, query examples, storage characteristics, indexing mechanisms, and the benefits of pre‑computing rules for efficient monitoring at scale.

TSDB · data storage · prometheus

0 likes · 8 min read

Architects' Tech Alliance

Jul 22, 2021 · Fundamentals

Evolution of Next-Generation Data Storage Technologies: Media, Architecture, Protocols, Applications, and Operations

This article reviews the evolution of next‑generation data storage technologies, covering advances in storage media such as all‑flash and non‑volatile memory, modern storage architectures like software‑defined and hyper‑converged systems, emerging protocols (NVMe, NVMe‑oF), cloud‑based application models, and intelligent operation approaches.

Hyper-Converged Infrastructure · Intelligent Operations · NVMe

0 likes · 14 min read

Architects Research Society

Jun 23, 2021 · Big Data

Understanding Data Lakes: Concepts, Benefits, and Comparison with Data Warehouses

The article explains what a data lake is, its origins, key characteristics such as storing all raw data, flexible access, and low‑cost storage, compares it with traditional data warehouses, discusses advantages, common criticisms, and the types of users who can benefit from it.

Data Lake · Data Management · Data Warehouse

0 likes · 10 min read

Youzan Coder

Jun 9, 2021 · Mobile Development

Mobile SkyNet Platform: Architecture, Log Collection, Storage, and Alerting Design

The Mobile SkyNet platform adds a dedicated mobile monitoring layer to SaaS services, using Zanlogger for error, warning, and info logs, Kafka‑HBase pipelines for high‑throughput storage, WeChat‑based alerting, and an MPaaS console for issue visualization, reducing mobile‑side incidents by about twenty percent.

Alerting · Backend Integration · Log Monitoring

0 likes · 11 min read

MaGe Linux Operations

May 24, 2021 · Fundamentals

Mastering Python Object Persistence: A Deep Dive into Pickle and Advanced Serialization

This article explains how Python persistence works by serializing objects with pickle and cPickle, compares file‑based and database storage, demonstrates basic and advanced usage—including handling circular references, custom classes, and versioning—and offers practical tips for maintaining pickled data across code changes.

Object Persistence · Pickle · Python

0 likes · 22 min read

Python Programming Learning Circle

May 18, 2021 · Fundamentals

Object Persistence in Python Using Pickle and Related Techniques

This article explains Python object persistence, covering the concepts of serialization with pickle and cPickle, various storage mechanisms, handling of complex objects, reference cycles, class instance pickling, versioning strategies, and advanced techniques such as custom state methods and Pickler/Unpickler usage.

Persistence · Pickle · Python

0 likes · 22 min read

JD Retail Technology

May 11, 2021 · Backend Development

Redesigning JD's C‑End Invoice System: Architecture Upgrade, Performance Optimizations, and Future Roadmap

The article details JD's transition from a fragmented C‑end invoice service to a unified Invoice Center, describing the original user‑experience and architectural flaws, the DDD‑based layered redesign, performance‑boosting data and file‑handling strategies, achieved operational gains, and the planned next‑year initiatives.

DDD · data storage · invoice system

0 likes · 10 min read

Programmer DD

Apr 13, 2021 · Big Data

What Makes HDFS the Backbone of Big Data? Overview, Architecture & Key Features

This article provides a comprehensive overview of HDFS—including its design goals, core components, data read/write workflows, high‑availability mechanisms, federation, storage policies, colocation benefits, and practical usage scenarios—explaining why it is the foundational distributed file system for large‑scale data processing.

Big Data · Federation · HDFS

0 likes · 17 min read

Baidu App Technology

Jan 13, 2021 · Frontend Development

San CLI UI: Architecture and Plugin System

San CLI UI combines a San‑based client, a Node.js/Express GraphQL server, and lowdb file storage, enabling custom routing, component enhancements, and a versatile plugin system—supporting widgets, configurations, tasks, and custom views—managed through ClientAddonApi and PluginManager with IPC communication.

GraphQL · Plugin system · San CLI UI

0 likes · 18 min read

Ctrip Technology

Jan 7, 2021 · Databases

Practical Experience of Data Storage in Ctrip Flight Big Data Platform: From Redis/MySQL to CrateDB

This article shares the Ctrip flight big‑data platform’s journey of evaluating and migrating data storage from Hive, MySQL and Redis to CrateDB, covering performance requirements, query patterns, maintenance challenges, containerization, and production results that reduced interface latency and resource consumption.

CrateDB · Ctrip · data storage

0 likes · 10 min read

Big Data Technology & Architecture

Jan 1, 2021 · Big Data

Deep Dive into Apache Druid V1 Data Storage Format and Architecture

This article provides an in‑depth analysis of Apache Druid V1’s column‑oriented storage format, covering its dictionary, encoded dimension values, bitmap inverted index, array handling, and how these structures are used during query execution, illustrated with diagrams and code examples.

Apache Druid · Bitmap Index · Columnar

0 likes · 9 min read

JD Cloud Developers

Sep 27, 2020 · Databases

How ClickHouse Achieves Billion‑Row Queries in Seconds: Architecture & Cloud Deployment

This article explains why ClickHouse, the high‑performance columnar OLAP database, can return results on billions of rows within seconds, detailing its columnar storage, MergeTree engine, and how JD Cloud deploys and optimizes it on Kubernetes for scalability and reliability.

ClickHouse · Columnar Database · MergeTree

0 likes · 14 min read

Architects' Tech Alliance

Sep 6, 2020 · Fundamentals

Introduction to Data Storage Virtualization Technology (Part 2)

This article explains the concepts, principles, and benefits of storage virtualization, covering host‑level, network‑level, and storage‑level approaches, and compares NAS and SAN technologies while highlighting how virtualization creates unified storage pools and reduces total cost of ownership.

NAS · SAN · Storage Virtualization

0 likes · 5 min read

Big Data Technology & Architecture

Aug 16, 2020 · Big Data

Comprehensive Overview of HDFS: Architecture, Advantages, Limitations, Commands, and Advanced Features

This article provides a detailed introduction to HDFS, covering its application scenarios, core architecture, fault‑tolerance benefits, drawbacks such as high latency and small‑file inefficiency, essential shell and API commands, cluster management procedures, and newer Hadoop 2.0 features like HA, Federation, snapshots, ACLs, and heterogeneous storage.

Big Data · CLI · HA

0 likes · 10 min read

Efficient Ops

Aug 2, 2020 · Fundamentals

Understanding Modern Data Storage: From Hard Disks to NAS and SAN

This article explains the fundamentals of data storage, covering hard‑disk hardware, internal structures, logical volumes, file systems, and the differences between direct‑attached, network‑attached, and storage‑area network solutions.

Logical Volume · NAS · SAN

0 likes · 10 min read

DataFunTalk

Jul 4, 2020 · Databases

Deep Dive into Apache Druid V1 Data Storage Format: Index Structures and Disk Layout

This article provides an in‑depth analysis of Apache Druid V1's column‑oriented storage format, covering dimension structures, dictionaries, variable‑length integer encoding, inverted indexes, array handling, and how these components are used during query execution.

Apache Druid · Columnar Database · OLAP

0 likes · 9 min read

Alibaba Cloud Developer

Jun 4, 2020 · Artificial Intelligence

Can Deep Reinforcement Learning Revolutionize Time-Series Data Compression?

This article reviews the challenges of compressing massive time‑series data, surveys existing methods, and introduces a novel two‑stage deep reinforcement learning framework (AMMMO) that adaptively selects compression modes, demonstrating significant compression ratio improvements and high throughput on large‑scale IoT and server workloads.

adaptive algorithms · data storage · deep reinforcement learning

0 likes · 18 min read

HomeTech

May 7, 2020 · Big Data

Construction and Evaluation of User Profiles: Identification, Tagging, Storage, and Quality Assessment

This article explains how to build user profiles by distinguishing persona from profile, describing the evolution of ID‑mapping techniques, designing a multi‑layer tag system, implementing statistical, interest, and model tags, storing the data in Hive, HBase, Codis and Elasticsearch, and finally evaluating profile timeliness, coverage and accuracy.

Big Data · data storage · data tagging

0 likes · 11 min read

WeChat Client Technology Team

Oct 8, 2019 · Mobile Development

How WeChat Solved Android Storage Challenges with a Virtual File System

This article examines WeChat's approach to Android data storage issues, detailing the limitations of internal and external storage, the gradual migration strategy, and the design of the wechat‑vfs component that provides abstracted file migration, encryption, and cleanup capabilities.

Android · Encryption · File Migration

0 likes · 12 min read

360 Tech Engineering

Sep 19, 2019 · Big Data

Understanding HDFS: Architecture, Read/Write Operations, Component Roles, Commands, and Pros & Cons

This article provides a comprehensive overview of HDFS, covering its purpose, architecture, read/write mechanisms, replication strategies, component responsibilities, common command‑line tools, and the advantages and disadvantages of using Hadoop Distributed File System for large‑scale data storage.

Distributed File System · HDFS · Hadoop

0 likes · 10 min read

Architecture Digest

Aug 19, 2019 · Big Data

Elasticsearch Cluster Architecture and Distributed Data System Design

This article explains Elasticsearch's cluster architecture, including nodes, indices, shards, replicas, deployment models, and data layer storage, and compares two types of distributed data system designs—local file‑system based and shared‑storage based—highlighting their advantages and trade‑offs.

Cluster Architecture · Elasticsearch · data storage

0 likes · 13 min read

MaGe Linux Operations

Aug 10, 2019 · Backend Development

Master Web Scraping in Python: From Basics to Bypassing Anti‑Scraping

Learn how to start web scraping with Python by mastering the three core steps—fetching, analyzing, and storing data—using urllib and requests, handling login, evading anti‑scraping measures like user‑agents and IP proxies, and saving results to JSON, CSV, or MongoDB.

Python · Selenium · anti-scraping

0 likes · 9 min read

NetEase Media Technology Team

Jul 2, 2019 · Backend Development

Design and Implementation of Feed Stream Architecture for NetEase Open Courses

The article details NetEase Open Courses’ feed‑stream architecture, describing how content ingestion, multi‑level filtering, vertically and horizontally split storage, Elasticsearch indexing, two‑tier caching, and micro‑service orchestration combine to deliver personalized, high‑availability course feeds while addressing scalability, consistency, and operational challenges.

Caching · backend-architecture · content ingestion

0 likes · 16 min read

Alibaba Cloud Developer

Apr 18, 2019 · Big Data

How MaxCompute Evolved: 10 Years of Big Data Innovation at Alibaba

This article reviews a decade of MaxCompute development, covering its origins, core technologies, performance gains, ecosystem integration, intelligent features, competitive positioning, and commercialization, while highlighting the platform's role as Alibaba's central big‑data compute engine.

AI integration · Big Data · Cloud Data Warehouse

0 likes · 21 min read

360 Quality & Efficiency

Sep 25, 2018 · Databases

Comparison of Hive, MongoDB, and Redis: Features, Use Cases, and Characteristics

This article provides a concise overview of three data storage solutions—Hive, MongoDB, and Redis—detailing their core concepts, operational principles, typical use cases, and key characteristics to help developers choose the appropriate technology for various workloads.

Hive · NoSQL · Redis

0 likes · 6 min read

High Availability Architecture

Sep 7, 2018 · Databases

Understanding NoSQL and Database Selection in the Big Data Era

This article analyzes the shortcomings of traditional relational databases in big‑data scenarios and introduces five major NoSQL categories—columnar, key‑value, document, full‑text search, and graph databases—detailing their principles, advantages, disadvantages, common implementations, and appropriate use cases to guide storage technology selection.

Columnar · NoSQL · data storage

0 likes · 18 min read

Architects' Tech Alliance

Aug 26, 2018 · Fundamentals

High‑Performance Computing Applications in Oil Exploration: Data Processing, Storage, and Workflow

This article explains how high‑performance computing (HPC) supports oil‑field exploration by detailing the stages of seismic data acquisition, processing, and interpretation, the demanding computational and storage requirements, parallel communication patterns, checkpointing, and data lifecycle management, illustrating the role of HPC in modern geophysical workflows.

HPC · High-performance computing · Oil Exploration

0 likes · 12 min read

dbaplus Community

Aug 25, 2018 · Databases

Why Multi-Model Databases Are the Future of Cloud Data Management

The article explains how cloud-driven demands and diverse data types have spurred the rise of multi-model databases, detailing their architecture, storage structures, compression techniques, and access methods using SequoiaDB as a concrete example.

BSON · Cloud Databases · Database Architecture

0 likes · 14 min read

Meitu Technology

Aug 17, 2018 · Big Data

Meitu Distributed Bitmap System (Naix): Architecture, Implementation, and Performance Evaluation

Meitu’s Naix distributed bitmap system accelerates massive user‑data analytics by using a three‑layer architecture, sharded RoaringBitmap storage, and PalDB, delivering over 600× faster queries than Hive, supporting fast generation plugins, fault‑tolerant replication, and millisecond‑level RPC query responses while reducing storage by 67%.

Big Data · Naix · bitmap

0 likes · 16 min read

Architects' Tech Alliance

May 25, 2018 · Fundamentals

How SMR Drives Boost Disk Density and Challenge Storage Management

Shingled Magnetic Recording (SMR) uses overlapping tracks to dramatically increase disk surface density, lowering cost per gigabyte. The overlap rules out in‑place random writes, so SMR requires zone‑based management, which is either handled inside the drive (drive‑managed) or exposed to the host (host‑aware) through standards such as ZBC and ZAC.

SMR · Shingled Magnetic Recording · ZAC

0 likes · 11 min read

21CTO

Sep 11, 2017 · Backend Development

How We Scaled Headline Recommendation Data with MySQL, Redis, and Pipeline Optimizations

This article details the architecture and evolution of a headline recommendation system, covering data aggregation, storage strategies using MySQL and Redis, challenges with reload latency and memory usage, and the optimizations—including data separation, Redis migration, and query pipeline improvements—that enabled scalable, efficient backend operations.

Redis · data storage · pipeline

0 likes · 14 min read

ITFLY8 Architecture Home

Aug 2, 2017 · Backend Development

Scalable Web Architecture: Layers, Load Balancing, and Storage

This article explains the layered architecture of large‑scale web systems, covering flexible component choices, load distribution strategies, business service and communication layers, storage options from file to object systems, and key evaluation criteria such as cost, scalability, security, and maintainability.

backend · data storage · load balancing

0 likes · 20 min read

Architects' Tech Alliance

Apr 4, 2017 · Big Data

Alluxio: Memory‑Centric Distributed File System for Big Data Storage and Compute

Alluxio, formerly Tachyon, is a memory‑centric distributed file system that unifies heterogeneous big‑data storage backends, optimizes small files, and provides a fast, unified data access layer between storage systems like S3 or HDFS and compute frameworks such as Spark or Hadoop.

Alluxio · Compute Frameworks · Distributed File System

0 likes · 7 min read

Architects' Tech Alliance

Nov 30, 2016 · Big Data

Core Technologies and Challenges of Big Data: ETL, Storage, Analysis, and Cloud Integration

This article examines the core technologies of big data—including data collection, storage, management, analysis, and mining—highlighting architectural challenges, analysis techniques, storage solutions, ETL processes, and the interplay between big data and cloud computing, while emphasizing practical implementation considerations.

Cloud Computing · ETL · data analysis

0 likes · 11 min read

ITFLY8 Architecture Home

Nov 13, 2016 · Backend Development

Designing Scalable Web System Architecture: Layers, Load Balancing, and Storage Strategies

This article explains the layered architecture of a web system, covering flexible component choices, load‑balancing techniques, business service and communication layers, various storage options—including file, block, and object storage—and key evaluation criteria for building robust, cost‑effective solutions.

System Design · data storage · web architecture

0 likes · 20 min read

Architecture Digest

Jun 9, 2016 · Databases

Understanding HBase Architecture and Core Principles

This article provides a comprehensive overview of HBase, covering its distributed architecture, component roles, data organization, read/write mechanisms, and best practices for schema and region design to ensure efficient big‑data storage and retrieval.

Big Data · HBase · RegionServer

0 likes · 17 min read

Java High-Performance Architecture

May 6, 2016 · Backend Development

What Technologies Power Airbnb’s Massive Global Platform?

This article outlines the extensive technology stack behind Airbnb, covering its programming languages, client and server libraries, web server, data storage solutions, server services, and underlying hardware infrastructure that support its operations across 190 countries.

Airbnb · Backend Development · Web Server

0 likes · 2 min read

21CTO

Mar 1, 2016 · Databases

The Evolution of Databases: From 1960s Military Roots to Modern Innovations

Databases trace their roots to the 1960s, when the United States consolidated wartime intelligence into computer‑stored "Data Bases". A review of their development from 1962 to 2016 shows a steady stream of technological breakthroughs that continue to shape everyday life.

Computer Science · Databases · Technology evolution

0 likes · 1 min read
