Tagged articles

34 articles

Nov 21, 2025 · Backend Development

How Uber Uses H3 Hexagonal Indexing to Power Real‑Time Driver Matching

This article explains how Uber solves the "nearby driver" problem by employing the open‑source H3 hexagonal spatial index, hierarchical grids, Cassandra for persistent storage, and Redis caching to deliver fast, accurate, and scalable real‑time location services.

Backend Development · Geospatial Indexing · H3

0 likes · 14 min read

360 Zhihui Cloud Developer

Jul 1, 2025 · Backend Development

How We Cut a 15‑Billion Image Migration from 120 Days to 40 Days

Facing the challenge of moving 15 billion image files from Cassandra to S3, we iteratively redesigned the migration pipeline—from a single‑process approach to multi‑process, queue‑driven, and multi‑cluster deployments—reducing the projected 120‑day effort to just 40 days while ensuring reliability and performance.

Data Migration · Go · Performance Optimization

0 likes · 16 min read

dbaplus Community

Apr 20, 2025 · Databases

Why Wide Tables Fail and How to Design Them Efficiently

This article explains what wide tables are, why they are controversial, outlines three common design pitfalls with practical avoidance tips, and introduces three key technologies—ClickHouse, Cassandra, and Hudi/Iceberg—to help engineers build performant, maintainable wide‑table solutions in data warehouses.

Big Data · Database design · Hudi

0 likes · 7 min read

dbaplus Community

Jul 7, 2024 · Operations

How Instagram Scales to 2.5B Users: Architecture, Consistency & Performance

Instagram grew from a simple photo‑sharing app to over 2.5 billion users, prompting engineers to adopt horizontal scaling, replace Python code with Cython, use region‑specific Cassandra clusters, employ the Akkio data‑placement service, and optimize PostgreSQL and Memcache handling to improve resource utilization, data consistency, and latency.

asyncio · cassandra · instagram

0 likes · 11 min read

Architect

May 13, 2024 · Backend Development

Push vs Pull: Designing a Scalable Feed Timeline with Redis and Cassandra

This article analyzes feed‑timeline architectures, compares pull‑based and push‑based models, proposes an online‑push/offline‑pull hybrid, and details practical implementations using Redis SortedSets, multi‑level caching with Cassandra, cursor‑based pagination, and large‑scale push task sharding.

Backend · caching · cassandra

0 likes · 15 min read

dbaplus Community

Dec 26, 2023 · Databases

How Discord Scaled to Billions of Messages: From MongoDB to Cassandra to ScyllaDB

Discord’s rapid growth forced a series of massive database migrations, from MongoDB to Cassandra in 2017 and then to ScyllaDB in 2023. This article details the motivations, requirements, data‑model design, performance challenges, migration tooling, and the resulting operational improvements.

Discord · Rust · Scalability

0 likes · 25 min read

21CTO

May 16, 2023 · Databases

How Cassandra’s New Vector Search Transforms AI Applications

This article explains how Cassandra’s newly added vector data type and ANN search capabilities empower AI developers to store, index, and query high‑dimensional embeddings at scale, enabling use cases such as image retrieval, recommendation, and large‑language‑model integration.

AI · ANN · cassandra

0 likes · 10 min read

Architects Research Society

Apr 15, 2023 · Databases

Cassandra Time‑Series Data Modeling at Massive Scale Using Bucketing

This article explains how to model massive time‑series data in Cassandra by using bucketing techniques to control partition size, avoid hotspots, and improve write and read performance, including practical CQL schema examples and Python code for concurrent queries.

Bucketing · CQL · DataModeling

0 likes · 13 min read

Aikesheng Open Source Community

Jan 10, 2023 · Databases

Cassandra Multi‑Data‑Center Fault Tolerance Experiment and Analysis

This article presents a step‑by‑step experiment on a Cassandra cluster spanning two data centers, demonstrating how token ownership, data distribution, and fault‑tolerance behave when nodes fail or are removed, and explains the observed owns percentages and replication effects.

Distributed Systems · NoSQL · cassandra

0 likes · 15 min read

ITPUB

Jan 4, 2023 · Databases

Can Cassandra Beat RDBMS Distributed Bottlenecks? A Deep Dive into Decentralized Databases

The article traces the evolution from Codd's relational model to modern RDBMS scaling limits, explains why centralized Hadoop/HBase architectures struggle with high‑concurrency workloads, and shows how Cassandra’s decentralized design—using consistent hashing, gossip, and virtual nodes—overcomes these bottlenecks while offering flexible consistency guarantees.

Consistency · HBase · HDFS

0 likes · 22 min read

Aikesheng Open Source Community

Dec 14, 2022 · Databases

Understanding User and Role Management in Cassandra Clusters Across Data Centers

This article explains how Cassandra clusters organize nodes, racks, and data centers, describes the gossip and snitch protocols, token ring architecture, replication strategies, and provides step‑by‑step commands to create, list, and delete users and roles while highlighting cross‑data‑center visibility constraints and common errors.

CQL · Data center · NoSQL

0 likes · 12 min read

Architect

Jul 8, 2022 · Backend Development

Understanding Feed Stream Architecture: Models, Storage, and Optimization

This article explains the concept of feed streams, compares push and pull implementation models, discusses storage options such as MySQL, Redis SortedSet and Cassandra, and presents optimization techniques including online‑push/offline‑pull strategies, pagination methods, and deep‑paging solutions for large‑scale systems.

Backend Architecture · System optimization · cassandra

0 likes · 9 min read

MaGe Linux Operations

Feb 7, 2022 · Cloud Native

Why K8ssandra Is Switching from Helm to Its Own Operator

The article explains how K8ssandra, an Apache Cassandra distribution for Kubernetes, evolved from using Helm charts to developing a dedicated Operator to overcome Helm's limitations, improve multi‑cluster support, and align more closely with Kubernetes best practices.

Cloud Native · Go · K8ssandra

0 likes · 13 min read

Dada Group Technology

Sep 10, 2021 · Operations

Design and Implementation of JD Daojia Log System Based on Loki

This document details the motivation, architecture, components, query language, and deployment of a Loki‑based log collection and analysis platform for JD Daojia, comparing it with ELK, describing ingestion, real‑time and historical log handling, technical challenges, configuration examples, and future scaling plans.

Grafana · Log Management · Loki

0 likes · 15 min read

Laravel Tech Community

Jul 29, 2021 · Databases

Apache Cassandra 4.0 Released: New Features, Performance Boosts, and Enhanced Security

Apache Cassandra 4.0 has been officially released after a long development cycle, offering higher speed, better scalability, improved consistency, stronger security, reduced latency, and more efficient compression, while supporting enterprise compliance and a new annual release cadence for future stability.

NoSQL · Security · cassandra

0 likes · 5 min read

Architects Research Society

Aug 17, 2020 · Databases

Interview with JanusGraph PMC Members on Graph Database Landscape, Neo4j Comparison, and Deployment Best Practices

In this interview, JanusGraph PMC members Florian Hockmann and Jason Plurad discuss the project's origins, compare JanusGraph with Neo4j, share advice for production deployments, outline future expectations for JanusGraph and TinkerPop, and provide practical tips for graph modeling and community contribution.

Elasticsearch · Graph Database · Gremlin

0 likes · 16 min read

Cloud Native Technology Community

Jun 12, 2020 · Cloud Native

Monzo’s Approach to Managing 1,600 Backend Microservices with Kubernetes and Cloud‑Native Practices

Monzo, the UK digital bank, shares how it built a Kubernetes‑based, cloud‑native platform to run over 1,600 Go‑written microservices backed by Cassandra, implements fine‑grained service isolation with network policies, and creates internal tooling to automate security and deployment at massive scale.

Cloud Native · Go · Kubernetes

0 likes · 7 min read

dbaplus Community

Feb 4, 2020 · Databases

Understanding Cassandra’s Row‑Oriented Storage, Write Path, and Consistency

This article explains Cassandra’s row‑oriented storage model, the multi‑step write and read processes, how tombstones and compaction manage data growth, and the impact of its distributed architecture on high availability, fault tolerance, and configurable consistency levels.

Consistency Levels · Database Architecture · Read Path

0 likes · 25 min read

DataFunTalk

Jan 9, 2020 · Databases

Exploring Spatiotemporal Data Management with Cassandra, GeoMesa, and GeoTrellis

This article presents a comprehensive overview of handling spatiotemporal data using Cassandra, covering data types, space‑filling curves, GeoHash encoding, the GeoMesa and GeoTrellis ecosystems, Cassandra storage schemas, and practical Spark integration for large‑scale geospatial analytics.

Big Data · GeoMesa · GeoTrellis

0 likes · 8 min read

DataFunTalk

Dec 30, 2019 · Databases

Cassandra: Past, Present, and Future – History, Architecture, Features, and Use Cases

This article summarizes a Cassandra meetup presentation that traces the database's origins from BigTable and Dynamo, outlines its key milestones, explains its peer‑to‑peer and LSM architecture, highlights current features, real‑world deployments, performance advantages, and previews upcoming 4.0 releases and community projects.

Big Data · Gossip Protocol · LSM

0 likes · 14 min read

DataFunTalk

Dec 23, 2019 · Databases

Cassandra Deployment and Optimization at 360 Cloud Storage

This article details how 360 adopted Cassandra for its cloud drive, describing Cassandra’s decentralized architecture, the reasons for its selection over HBase, large‑scale deployment challenges, performance optimizations, reliability improvements, disk utilization techniques, and the evolution of the system from 2010 to present.

Big Data · Data Reliability · Scalability

0 likes · 15 min read

DataFunTalk

Dec 12, 2019 · Databases

ScyllaDB Row‑Level Repair: Design, Implementation, and Performance Evaluation

ScyllaDB, a high‑performance C++ implementation of Apache Cassandra, introduces row‑level repair to replace the traditional partition‑level repair, reducing data transfer and I/O by operating on individual rows; the presentation details its architecture, multi‑stage process, experimental results, and the resulting six‑fold speedup.

Database Performance · NoSQL · Row-level repair

0 likes · 15 min read

Big Data Technology & Architecture

Oct 30, 2019 · Big Data

Building a Real‑Time Data Processing Pipeline with Apache Kafka, Spark Streaming, and Cassandra

This tutorial explains how to create a highly scalable, fault‑tolerant real‑time data processing platform by configuring a Kafka topic, a Cassandra keyspace, adding Spark and connector dependencies, developing a Java‑based Spark Streaming pipeline, enabling checkpoints, and deploying the application with spark‑submit.

Big Data · Java · Kafka

0 likes · 8 min read

Alibaba Cloud Developer

Mar 21, 2018 · Databases

Consistency Models in Distributed Storage: Cosmos DB, Cassandra, OceanBase

This article explains the fundamentals of consistency in distributed storage systems, contrasts it with database transaction consistency, and details the various consistency levels offered by Azure Cosmos DB, Apache Cassandra, and OceanBase, highlighting their guarantees, configurations, and the performance‑availability trade‑offs involved.

OceanBase · cassandra · consistency models

0 likes · 22 min read

ITFLY8 Architecture Home

Feb 25, 2018 · Big Data

Building Scalable Data Platforms with SMACK: Spark, Mesos, Akka, Cassandra & Kafka

Learn how to construct a scalable data processing platform using the SMACK stack—Spark, Mesos, Akka, Cassandra, and Kafka—covering storage design, processing workflows, resource management, deployment options, and fault‑tolerant task execution for both batch and streaming workloads.

Akka · Kafka · Mesos

0 likes · 14 min read

Qunar Tech Salon

Nov 14, 2017 · Backend Development

Designing Distributed Systems Inspired by McDonald’s Restaurant Operations

The article uses everyday observations from a McDonald’s restaurant to illustrate core distributed system concepts such as master‑slave architecture, two‑phase commit, microservice decomposition, task queues, and container orchestration, showing how these principles apply to backend engineering.

HBase · Master‑Slave · cassandra

0 likes · 15 min read

High Availability Architecture

Apr 19, 2017 · Databases

Discord’s Migration from MongoDB to Cassandra: Architecture, Data Modeling, and Lessons Learned

This article details how Discord scaled from tens of millions to over a hundred million daily messages by replacing MongoDB with Cassandra, covering the motivations, data model design, handling of tombstones, performance results, unexpected issues, and future scalability plans.

Discord · cassandra · data modeling

0 likes · 14 min read

dbaplus Community

Aug 1, 2016 · Databases

How Facebook Scaled Its Data Storage with NoSQL: Cassandra, HBase, and Beyond

This article traces Facebook's evolution from a small social site to a global platform, explains how its massive data‑storage challenges led to the adoption of NoSQL solutions like Cassandra and HBase, and breaks down the core patterns, consistency models, and scaling techniques that power such large‑scale systems.

Consistency · Facebook · HBase

0 likes · 15 min read

dbaplus Community

Feb 28, 2016 · Databases

Choosing the Right Database: Relational vs NoSQL, MySQL, PostgreSQL, MongoDB, Cassandra, Neo4j and More

This guide compares common relational databases and NoSQL solutions, outlines ideal use cases for MySQL, PostgreSQL, MongoDB, key‑value stores, Cassandra, and Neo4j, and maps specific business scenarios to the most suitable database technology.

MongoDB · Neo4j · Relational vs NoSQL

0 likes · 13 min read

21CTO

Nov 28, 2015 · Databases

Choosing the Right NoSQL Database: MongoDB, Cassandra, or HBase?

While Hadoop enjoys a strong reputation in big‑data applications, the article argues that NoSQL databases—specifically MongoDB, Cassandra, and HBase—are more widely deployed, comparing their strengths, use cases, and market popularity to help developers decide which technology best fits their needs.

HBase · NoSQL · cassandra

0 likes · 10 min read

Qunar Tech Salon

Oct 16, 2015 · Databases

Choosing the Right NoSQL Database: MongoDB, Cassandra, and HBase Compared

The article examines why enterprises should consider NoSQL over Hadoop for big data storage, compares the three leading NoSQL databases—MongoDB, Cassandra, and HBase—based on market popularity, technical strengths, scalability, and use‑case suitability, and concludes with guidance on selecting the most appropriate solution.

Big Data · MongoDB · NoSQL

0 likes · 11 min read

Efficient Ops

Oct 14, 2015 · Big Data

Spark vs Hadoop, Flink, HBase/Cassandra, Kafka & Tachyon: Expert Q&A

During a lively “Sit and Discuss” session, experts compared Spark and Hadoop, evaluated Flink against Spark, contrasted HBase with Cassandra, explained why Kafka (and sometimes Flink) is preferred for distributed messaging, and shared insights on Tachyon’s role in modern big‑data ecosystems.

Flink · HBase · Hadoop

0 likes · 10 min read

MaGe Linux Operations

Aug 29, 2014 · Databases

Which NoSQL Database Is Right for You? A Detailed Comparison of 8 Popular Options

An in‑depth overview compares eight leading NoSQL databases—CouchDB, Redis, MongoDB, Riak, Membase, Neo4j, Cassandra, and HBase—detailing their languages, licensing, protocols, features, replication models, and ideal application scenarios to help architects choose the most suitable solution.

CouchDB · MongoDB · NoSQL

0 likes · 14 min read

MaGe Linux Operations

Jul 9, 2014 · Databases

Top 15 NoSQL Databases You Should Know in 2024

An extensive overview of fifteen popular NoSQL databases—including MongoDB, CouchDB, HBase, Cassandra, Redis, and more—detailing their architectures, key features, performance characteristics, and typical applications, helping readers choose the right solution for high‑scale, high‑concurrency data storage needs.

MongoDB · NoSQL · cassandra

0 likes · 33 min read
