Tag

data storage

DataFunSummit
Jun 3, 2025 · Big Data

BiFang: A Unified Lake‑Stream Storage Engine for Real‑Time and Batch Data Processing

BiFang is a lake‑stream integrated storage engine that merges Apache Pulsar message‑queue capabilities with Iceberg data‑lake features, providing a single unified data store with full‑incremental queries, sub‑second visibility, exactly‑once semantics, and seamless integration with Flink, Spark, and StarRocks for both real‑time analytics and batch processing.

Apache Iceberg · Apache Pulsar · Big Data
13 min read
JD Retail Technology
Feb 20, 2025 · Big Data

Cold‑Hot Data Tiering Solutions for JD Advertising Using Apache Doris

JD Advertising built a petabyte‑scale ad analytics service on Apache Doris, identified a hot‑cold access pattern, and implemented a native cold‑hot tiering solution (upgrading to Doris 2.0 and optimizing schema changes) that cut storage costs by ~87% and boosted concurrent query capacity over tenfold while simplifying operations.

Apache Doris · Big Data · cold-hot tiering
18 min read
Architect's Guide
Jan 21, 2025 · Databases

Why Store IPv4 Addresses as UNSIGNED INT in MySQL: Benefits, Drawbacks, and Conversion Techniques

The article explains that using a 32‑bit UNSIGNED INT to store IPv4 addresses in MySQL saves space and improves index and range‑query performance, outlines the storage savings compared to VARCHAR, mentions the need for manual conversion, and provides MySQL and Java code examples for converting between string and integer representations.

IPv4 · MySQL · SQL
5 min read
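
The conversion described in the IPv4 summary above can be sketched in a few lines. This is a minimal Python illustration, not the article's own MySQL or Java code; the function names `ipv4_to_int`, `int_to_ipv4`, and `cidr_to_int_range` are hypothetical. MySQL's built-in `INET_ATON()` and `INET_NTOA()` perform the same string/integer conversion on the database side.

```python
import ipaddress


def ipv4_to_int(address: str) -> int:
    """Pack a dotted-quad IPv4 string into one 32-bit unsigned integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four octets, got {address!r}")
    value = 0
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"non-numeric octet in {address!r}")
        octet = int(part)
        if octet > 255:
            raise ValueError(f"octet out of range in {address!r}")
        # Shift the accumulated value left one byte and append this octet.
        value = (value << 8) | octet
    return value


def int_to_ipv4(value: int) -> str:
    """Unpack a 32-bit unsigned integer back into dotted-quad form."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"{value} does not fit in 32 bits")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def cidr_to_int_range(cidr: str) -> tuple:
    """Return (low, high) integers covering a subnet, for a BETWEEN-style query."""
    network = ipaddress.ip_network(cidr)
    return int(network.network_address), int(network.broadcast_address)


print(ipv4_to_int("192.168.1.1"))           # 3232235777
print(int_to_ipv4(3232235777))              # 192.168.1.1
print(cidr_to_int_range("192.168.1.0/24"))  # (3232235776, 3232236031)
```

Because the integer order matches the address order, a subnet becomes one contiguous numeric range, which is why an indexed integer column can serve range queries that a string column handles poorly.
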
DevOps
Jan 8, 2025 · Artificial Intelligence

Designing Generative AI Agents: Models, Tools, Extensions, Function Calls, and Data Storage

The article explains how generative AI agents combine language models, tool integration, self‑guided planning, prompt‑engineering frameworks, extensions, function calls, and vector‑based data storage to create adaptable, retrieval‑augmented systems that can interact with real‑world APIs and perform complex tasks.

AI agents · Function Calling · RAG
12 min read
Python Programming Learning Circle
Nov 19, 2024 · Databases

Overview of Lightweight Python Databases and Their Usage

This article surveys several lightweight Python databases, including PickleDB, TinyDB, ZODB, Durus, Buzhug, Gadfly, and PyTables. It details their main features, typical use cases, and caveats, and provides basic code examples to help developers choose and apply the right storage solution for small‑scale or prototype projects.

Lightweight · NoSQL · Python
23 min read
php中文网 Courses
Sep 27, 2024 · Backend Development

Implementing User Registration and Data Storage with PHP Functions

This tutorial explains how to implement user registration and data storage in web applications using PHP functions, covering input validation, password hashing, MySQL connection, SQL insertion, and providing complete code examples for both registering users and storing generic data.

MySQL · PHP · User Registration
4 min read
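
The registration flow summarized above (validate input, hash the password, insert with a parameterized statement, report the result) can be sketched as follows. This is not the tutorial's PHP code: it is a Python stand-in that swaps MySQL for an in-memory SQLite database so it runs without a server. The table layout, the `register_user` and `verify_password` names, the email pattern, and the PBKDF2 round count are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import re
import sqlite3

# Deliberately loose email check (assumption): something@something.tld
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PBKDF2_ROUNDS = 200_000  # assumed work factor, tune for your hardware


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ROUNDS)


def register_user(conn: sqlite3.Connection, username: str, email: str, password: str) -> bool:
    """Validate the input, then insert the user. Returns True on success."""
    if not username.strip() or not EMAIL_RE.match(email) or len(password) < 8:
        return False
    salt = os.urandom(16)
    try:
        with conn:  # commits on success, rolls back on error
            # Placeholders keep user input out of the SQL text (no injection).
            conn.execute(
                "INSERT INTO users (username, email, salt, password_hash) VALUES (?, ?, ?, ?)",
                (username, email, salt, hash_password(password, salt)),
            )
    except sqlite3.IntegrityError:
        return False  # username already taken
    return True


def verify_password(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(stored, hash_password(password, salt))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "username TEXT PRIMARY KEY, email TEXT NOT NULL, "
    "salt BLOB NOT NULL, password_hash BLOB NOT NULL)"
)
```

Calling `register_user(conn, "alice", "alice@example.com", "correct horse")` returns `True` the first time and `False` on a repeat, because the primary key rejects the duplicate username.
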
High Availability Architecture
Sep 11, 2024 · Backend Development

Evolution of Ctrip Vacation Product Log System: From Single‑Table DB to ES + HBase Platform

This article details the evolution of Ctrip's vacation product log system—from a simple single‑table DB in 2019, through a platformized ES + HBase architecture with custom RowKey design, to a V3.0 version that adds business and supplier empowerment, scalable storage, advanced search, and flexible data presentation for billions of daily change records.

ES · HBase · architecture
13 min read
DataFunSummit
Aug 31, 2024 · Big Data

Apache Hudi Clustering: Workflow and Layout Optimization Strategies (Part 6)

This article explains Apache Hudi's clustering service, detailing its workflow, three execution modes, and layout optimization strategies—including linear, Z‑order, and Hilbert space‑filling curves—to improve storage locality and query performance in large‑scale data lake environments.

Apache Hudi · Big Data · Clustering
8 min read
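
As a rough illustration of the Z‑order layout strategy named in the Hudi clustering summary above, the sketch below interleaves the bits of two non-negative integer column values into a single sort key (a Morton code). It is a simplified two-column example, not Hudi's implementation; the names `z_order` and `sort_by_z_order` and the 16-bit width are assumptions.

```python
def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton code.

    Bit i of x lands at position 2*i, bit i of y at position 2*i + 1.
    """
    limit = 1 << bits
    if not (0 <= x < limit and 0 <= y < limit):
        raise ValueError(f"coordinates must be in [0, {limit})")
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def sort_by_z_order(points):
    """Order (x, y) records along the Z-curve instead of by x alone."""
    return sorted(points, key=lambda p: z_order(p[0], p[1]))


records = [(1, 1), (0, 0), (2, 0), (0, 1), (1, 0)]
print(z_order(3, 5))             # 39
print(sort_by_z_order(records))  # [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]
```

A plain linear sort on `(x, y)` keeps rows close together only on the first column. Sorting by the interleaved key tends to keep rows that are near each other on both columns in the same files, which is what lets a query filtering on either column skip more data.
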
php中文网 Courses
Jul 25, 2024 · Backend Development

Implementing User Registration and Data Storage with PHP Functions

This article explains how to use PHP functions to create a user registration system with input validation, password hashing, and MySQL insertion, and also demonstrates a generic data‑storage function that connects to a database and saves arbitrary data, highlighting key security and implementation steps.

MySQL · PHP · User Registration
4 min read
Architect's Guide
May 25, 2024 · Fundamentals

How Data Is Stored: An Overview of RAM, DRAM, and Memory Controllers

This article explains the fundamentals of data storage in computers, covering the concepts of RAM and DRAM, the role of capacitors and transistors, and how memory controllers and CPU caches work together to manage and accelerate access to binary information.

Computer Architecture · DRAM · Memory Controller
7 min read
php中文网 Courses
May 21, 2024 · Backend Development

Implementing User Registration and Data Storage with PHP Functions

This article demonstrates how to use PHP functions to implement user registration and data storage, covering input validation, password hashing, MySQL database connection, SQL insertion, and returning operation results, while highlighting security considerations and practical code examples.

MySQL · PHP · User Registration
4 min read
DataFunTalk
Apr 28, 2024 · Big Data

Ant Group’s Data Governance Practices: Overview, Data Quality, and Data Storage Governance

This article shares Ant Group's extensive experience in big data governance, detailing the overall data governance framework, data quality management, data storage governance, and future considerations, illustrated with practical cases and strategies for ensuring compliance, reliability, and cost efficiency.

Ant Group · Big Data · Data Architecture
17 min read
Didi Tech
Mar 5, 2024 · Databases

Migrating Didi's Log Retrieval from Elasticsearch to ClickHouse: Architecture, Challenges, and Performance Optimizations

Didi replaced its Elasticsearch‑based log platform with ClickHouse, redesigning the architecture into isolated Log and Trace clusters and using hourly‑partitioned MergeTree tables and aggregating views to handle petabyte‑scale writes, diverse low‑latency queries, and high QPS. The deployment grew to over 400 nodes and 40 GB/s of throughput, with 30% cost savings and a four‑fold reduction in query latency.

Big Data · ClickHouse · Distributed Database
15 min read
Sohu Tech Products
Feb 28, 2024 · Big Data

Why Use Zarr? Storing and Accessing Large NumPy Arrays with mmap and Zarr

Zarr provides a modern, chunked and compressed storage format that lets you treat massive NumPy arrays like in‑memory objects, offering on‑demand loading, flexible back‑ends (disk, S3, zip), automatic caching, resizing, parallel reads/writes, and superior performance compared to traditional mmap‑based memmap files.

MMAP · NumPy · Python
18 min read
Architects' Tech Alliance
Oct 26, 2023 · Fundamentals

Types of Computer Storage and an Overview of RAID

The article explains the four main categories of computer storage—primary, secondary, tertiary, and offline—detailing their connection to the CPU, typical devices, the concept of direct‑attached storage, and an overview of RAID technology for performance and redundancy.

Computer Storage · Direct-Attached Storage · RAID
5 min read
php中文网 Courses
Jul 3, 2023 · Databases

Top 5 Revolutionary Vector Databases Transforming Machine Learning and Similarity Search (2023)

Vector databases store and search large-scale vector data, and in 2023 the five leading solutions—Chroma, Pinecone, Weaviate, Milvus, and Faiss—offer scalable, high-performance options for applications such as LLM-driven services, audio search, recommendation systems, image/video analysis, and semantic retrieval across various industries.

AI · LLM · data storage
4 min read
Big Data Technology Architecture
Apr 19, 2023 · Big Data

Why the Big Data Era Is Over

The article argues that the era of big data is ending, showing that most organizations store only modest amounts of data, that storage costs outweigh benefits, and that modern cloud and analytics tools allow efficient processing without needing massive datasets.

Big Data · Cloud Computing · analytics
16 min read
DataFunSummit
Mar 11, 2023 · Databases

Graph Database Storage and Knowledge Graph Practices – Forum Overview

The forum explores the rapid growth and complexity of knowledge graphs, addressing storage and computation challenges through expert talks on graph database storage, query languages, practical implementation, and large‑scale financial knowledge graph platforms, offering attendees deep technical insights and hands‑on guidance.

Big Data · data storage · graph database
8 min read
DataFunSummit
Feb 12, 2023 · Big Data

Applying Erasure Coding in HDFS: Strategies, Performance, and Repair Techniques

This article explains how Zhihu adopted HDFS erasure coding to reduce storage costs, outlines cold‑hot file tiering policies, describes the EC conversion workflow and the custom EC Worker tool, and details methods for detecting and repairing damaged EC files in a Hadoop environment.

Big Data · Erasure Coding · HDFS
16 min read
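
To make the erasure-coding idea in the summary above concrete, here is a minimal sketch of encoding and repairing blocks with a single XOR parity block. It is a deliberately simplified stand-in: XOR parity survives only one lost block, whereas Reed-Solomon policies such as HDFS's RS-6-3 tolerate several. The function names are hypothetical, and nothing here reflects Zhihu's EC Worker tool.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("need at least one block, all of equal length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks: List[bytes]) -> bytes:
    """Compute the parity block stored alongside the data blocks."""
    return xor_blocks(data_blocks)


def repair(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing block (marked None) from the survivors."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity cannot recover more than one block")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        # XOR of the parity and all surviving blocks yields the lost block.
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


def storage_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage as a fraction of the raw data size."""
    return parity_units / data_units


data = [b"abcd", b"efgh", b"ijkl"]
parity = encode(data)
damaged = [data[0], None, data[2]]
print(repair(damaged, parity) == data)  # True
print(storage_overhead(6, 3))           # 0.5 for a 6-data, 3-parity layout
```

The overhead figure is where the cost saving comes from: three-way replication stores two extra copies (an overhead of 2.0), while a 6-data, 3-parity layout adds only 0.5.
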
Architect's Guide
Jan 9, 2023 · Fundamentals

How Data Is Stored: An Introduction to RAM, DRAM, and Memory Controllers

The article explains the fundamentals of data storage in computers, describing how binary data is saved using memory modules, the differences between static SRAM and dynamic DRAM, and the role of memory controllers in translating addresses and managing refresh cycles.

Computer Architecture · DRAM · Memory Controller
6 min read