Tagged articles
34 articles
ITPUB
Nov 21, 2025 · Backend Development

How Uber Uses H3 Hexagonal Indexing to Power Real‑Time Driver Matching

This article explains how Uber solves the "nearby driver" problem by employing the open‑source H3 hexagonal spatial index, hierarchical grids, Cassandra for persistent storage, and Redis caching to deliver fast, accurate, and scalable real‑time location services.

Backend Development · Geospatial Indexing · H3
14 min read
360 Zhihui Cloud Developer
Jul 1, 2025 · Backend Development

How We Cut a 15‑Billion Image Migration from 120 Days to 40 Days

Facing the challenge of moving 15 billion image files from Cassandra to S3, we iteratively redesigned the migration pipeline—from a single‑process approach to multi‑process, queue‑driven, and multi‑cluster deployments—reducing the projected 120‑day effort to just 40 days while ensuring reliability and performance.

Data Migration · Go · Performance Optimization
16 min read
dbaplus Community
Apr 20, 2025 · Databases

Why Wide Tables Fail and How to Design Them Efficiently

This article explains what wide tables are, why they are controversial, outlines three common design pitfalls with practical avoidance tips, and introduces three key technologies—ClickHouse, Cassandra, and Hudi/Iceberg—to help engineers build performant, maintainable wide‑table solutions in data warehouses.

Big Data · Database design · Hudi
7 min read
dbaplus Community
Jul 7, 2024 · Operations

How Instagram Scales to 2.5B Users: Architecture, Consistency & Performance

Instagram grew from a simple photo‑sharing app to over 2.5 billion users, prompting engineers to adopt horizontal scaling, replace Python code with Cython, use region‑specific Cassandra clusters, employ the Akkio data‑placement service, and optimize PostgreSQL and Memcache handling to improve resource utilization, data consistency, and latency.

asyncio · cassandra · instagram
11 min read
Architect
May 13, 2024 · Backend Development

Push vs Pull: Designing a Scalable Feed Timeline with Redis and Cassandra

This article analyzes feed‑timeline architectures, compares pull‑based and push‑based models, proposes an online‑push/offline‑pull hybrid, and details practical implementations using Redis SortedSets, multi‑level caching with Cassandra, cursor‑based pagination, and large‑scale push task sharding.

Backend · caching · cassandra
15 min read
21CTO
May 16, 2023 · Databases

How Cassandra’s New Vector Search Transforms AI Applications

This article explains how Cassandra’s newly added vector data type and ANN search capabilities empower AI developers to store, index, and query high‑dimensional embeddings at scale, enabling use cases such as image retrieval, recommendation, and large‑language‑model integration.

AI · ANN · cassandra
10 min read
ITPUB
Jan 4, 2023 · Databases

Can Cassandra Beat RDBMS Distributed Bottlenecks? A Deep Dive into Decentralized Databases

The article traces the evolution from Codd's relational model to modern RDBMS scaling limits, explains why centralized Hadoop/HBase architectures struggle with high‑concurrency workloads, and shows how Cassandra’s decentralized design—using consistent hashing, gossip, and virtual nodes—overcomes these bottlenecks while offering flexible consistency guarantees.

Consistency · HBase · HDFS
22 min read
Aikesheng Open Source Community
Dec 14, 2022 · Databases

Understanding User and Role Management in Cassandra Clusters Across Data Centers

This article explains how Cassandra clusters organize nodes, racks, and data centers, describes the gossip and snitch protocols, token ring architecture, replication strategies, and provides step‑by‑step commands to create, list, and delete users and roles while highlighting cross‑data‑center visibility constraints and common errors.

CQL · Data center · NoSQL
12 min read
Architect
Jul 8, 2022 · Backend Development

Understanding Feed Stream Architecture: Models, Storage, and Optimization

This article explains the concept of feed streams, compares push and pull implementation models, discusses storage options such as MySQL, Redis SortedSet and Cassandra, and presents optimization techniques including online‑push/offline‑pull strategies, pagination methods, and deep‑paging solutions for large‑scale systems.

Backend Architecture · System optimization · cassandra
9 min read
MaGe Linux Operations
Feb 7, 2022 · Cloud Native

Why K8ssandra Is Switching from Helm to Its Own Operator

The article explains how K8ssandra, an Apache Cassandra distribution for Kubernetes, evolved from using Helm charts to developing a dedicated Operator to overcome Helm's limitations, improve multi‑cluster support, and align more closely with Kubernetes best practices.

Cloud Native · Go · K8ssandra
13 min read
Dada Group Technology
Sep 10, 2021 · Operations

Design and Implementation of JD Daojia Log System Based on Loki

This document details the motivation, architecture, components, query language, and deployment of a Loki‑based log collection and analysis platform for JD Daojia, comparing it with ELK, describing ingestion, real‑time and historical log handling, technical challenges, configuration examples, and future scaling plans.

Grafana · Log Management · Loki
15 min read
Architects Research Society
Aug 17, 2020 · Databases

Interview with JanusGraph PMC Members on Graph Database Landscape, Neo4j Comparison, and Deployment Best Practices

In this interview, JanusGraph PMC members Florian Hockmann and Jason Plurad discuss the project's origins, compare JanusGraph with Neo4j, share advice for production deployments, outline future expectations for JanusGraph and TinkerPop, and provide practical tips for graph modeling and community contribution.

Elasticsearch · Graph Database · Gremlin
16 min read
Cloud Native Technology Community
Jun 12, 2020 · Cloud Native

Monzo’s Approach to Managing 1,600 Backend Microservices with Kubernetes and Cloud‑Native Practices

Monzo, the UK digital bank, shares how it built a Kubernetes‑based, cloud‑native platform to run over 1,600 Go‑written microservices backed by Cassandra, implements fine‑grained service isolation with network policies, and creates internal tooling to automate security and deployment at massive scale.

Cloud Native · Go · Kubernetes
7 min read
DataFunTalk
Jan 9, 2020 · Databases

Exploring Spatiotemporal Data Management with Cassandra, GeoMesa, and GeoTrellis

This article presents a comprehensive overview of handling spatiotemporal data using Cassandra, covering data types, space‑filling curves, GeoHash encoding, the GeoMesa and GeoTrellis ecosystems, Cassandra storage schemas, and practical Spark integration for large‑scale geospatial analytics.

Big Data · GeoMesa · GeoTrellis
8 min read
DataFunTalk
Dec 30, 2019 · Databases

Cassandra: Past, Present, and Future – History, Architecture, Features, and Use Cases

This article summarizes a Cassandra meetup presentation that traces the database's origins from BigTable and Dynamo, outlines its key milestones, explains its peer‑to‑peer and LSM architecture, highlights current features, real‑world deployments, performance advantages, and previews upcoming 4.0 releases and community projects.

Big Data · Gossip Protocol · LSM
14 min read
DataFunTalk
Dec 23, 2019 · Databases

Cassandra Deployment and Optimization at 360 Cloud Storage

This article details how 360 adopted Cassandra for its cloud drive, describing Cassandra’s decentralized architecture, the reasons for its selection over HBase, large‑scale deployment challenges, performance optimizations, reliability improvements, disk utilization techniques, and the evolution of the system from 2010 to present.

Big Data · Data Reliability · Scalability
15 min read
DataFunTalk
Dec 12, 2019 · Databases

ScyllaDB Row‑Level Repair: Design, Implementation, and Performance Evaluation

ScyllaDB, a high‑performance, Cassandra‑compatible database written in C++, introduces row‑level repair to replace the traditional partition‑level repair, reducing data transfer and I/O by operating on individual rows. The presentation details its architecture, multi‑stage process, and experimental results, including a reported six‑fold speedup.

Database Performance · NoSQL · Row-level repair
15 min read
Big Data Technology & Architecture
Oct 30, 2019 · Big Data

Building a Real‑Time Data Processing Pipeline with Apache Kafka, Spark Streaming, and Cassandra

This tutorial explains how to create a highly scalable, fault‑tolerant real‑time data processing platform by configuring a Kafka topic, a Cassandra keyspace, adding Spark and connector dependencies, developing a Java‑based Spark Streaming pipeline, enabling checkpoints, and deploying the application with spark‑submit.

Big Data · Java · Kafka
8 min read
Alibaba Cloud Developer
Mar 21, 2018 · Databases

Consistency Models in Distributed Storage: Cosmos DB, Cassandra, OceanBase

This article explains the fundamentals of consistency in distributed storage systems, contrasts it with database transaction consistency, and details the various consistency levels offered by Azure Cosmos DB, Apache Cassandra, and OceanBase, highlighting their guarantees, configurations, and the performance‑availability trade‑offs involved.

OceanBase · cassandra · consistency models
22 min read
Qunar Tech Salon
Nov 14, 2017 · Backend Development

Designing Distributed Systems Inspired by McDonald’s Restaurant Operations

The article uses everyday observations from a McDonald’s restaurant to illustrate core distributed system concepts such as master‑slave architecture, two‑phase commit, microservice decomposition, task queues, and container orchestration, showing how these principles apply to backend engineering.

HBase · Master‑Slave · cassandra
15 min read
dbaplus Community
Aug 1, 2016 · Databases

How Facebook Scaled Its Data Storage with NoSQL: Cassandra, HBase, and Beyond

This article traces Facebook's evolution from a small social site to a global platform, explains how its massive data‑storage challenges led to the adoption of NoSQL solutions like Cassandra and HBase, and breaks down the core patterns, consistency models, and scaling techniques that power such large‑scale systems.

Consistency · Facebook · HBase
15 min read
21CTO
Nov 28, 2015 · Databases

Choosing the Right NoSQL Database: MongoDB, Cassandra, or HBase?

While Hadoop enjoys a strong reputation in big‑data applications, the article argues that NoSQL databases—specifically MongoDB, Cassandra, and HBase—are more widely deployed, comparing their strengths, use cases, and market popularity to help developers decide which technology best fits their needs.

HBase · NoSQL · cassandra
10 min read
Qunar Tech Salon
Oct 16, 2015 · Databases

Choosing the Right NoSQL Database: MongoDB, Cassandra, and HBase Compared

The article examines why enterprises should consider NoSQL over Hadoop for big data storage, compares the three leading NoSQL databases—MongoDB, Cassandra, and HBase—based on market popularity, technical strengths, scalability, and use‑case suitability, and concludes with guidance on selecting the most appropriate solution.

Big Data · MongoDB · NoSQL
11 min read
Efficient Ops
Oct 14, 2015 · Big Data

Spark vs Hadoop, Flink, HBase/Cassandra, Kafka & Tachyon: Expert Q&A

During a lively “Sit and Discuss” session, experts compared Spark and Hadoop, evaluated Flink against Spark, contrasted HBase with Cassandra, explained why Kafka (and sometimes Flink) is preferred for distributed messaging, and shared insights on Tachyon’s role in modern big‑data ecosystems.

Flink · HBase · Hadoop
10 min read
MaGe Linux Operations
Jul 9, 2014 · Databases

Top 15 NoSQL Databases You Should Know

An extensive overview of fifteen popular NoSQL databases—including MongoDB, CouchDB, HBase, Cassandra, Redis, and more—detailing their architectures, key features, performance characteristics, and typical applications, helping readers choose the right solution for high‑scale, high‑concurrency data storage needs.

MongoDB · NoSQL · cassandra
33 min read