Tagged articles

kafka

1310 articles · Page 14 of 14
Architect
Jul 6, 2015 · Big Data

Understanding Logs: The Core of Distributed Systems and Data Integration

This article explains how logs (simple, append‑only, time‑ordered records) serve as the fundamental abstraction behind databases, distributed systems, data integration pipelines, and stream processing, with systems such as Kafka and Hadoop as examples, and illustrates their role in ordering, replication, scalability, and real‑time analytics.

Data Integration · Hadoop · distributed systems
0 likes · 48 min read

Designing a Scalable Real‑Time Mobile Analytics Platform with Kafka, Storm, and Amazon EMR

The article describes how a mobile analytics service processes billions of events daily using a Lambda‑style architecture that combines Kafka, Storm, Amazon EMR, and S3 to achieve scalable, fault‑tolerant batch and real‑time computation, while ensuring reliable event ingestion and graceful degradation.

AWS · Big Data · Storm
0 likes · 8 min read
MaGe Linux Operations
Apr 28, 2015 · Big Data

How LinkedIn Scales Kafka to Billions of Messages Every Day

This article explains how LinkedIn uses Apache Kafka as a high‑throughput, fault‑tolerant messaging backbone, detailing its architecture, message categories, layered replication, audit mechanisms, and the engineering practices that keep billions of daily messages reliable and fast.

Big Data · LinkedIn · distributed systems
0 likes · 11 min read

Understanding Kafka High Availability: Data Replication and Leader Election

The article explains why Kafka introduced high availability starting with version 0.8, detailing the need for data replication and leader election, describing replica distribution algorithms, replication mechanics, ISR handling, ZooKeeper structures, and the broker failover process to ensure fault‑tolerant streaming.

High Availability · Leader Election · Zookeeper
0 likes · 19 min read
Meituan Technology Team
Jan 14, 2015 · Big Data

Kafka File Storage Mechanism and Architecture

Kafka stores each topic as partitions that are divided into sequential segment files containing paired .log data and .index files, using global offsets and sparse memory‑mapped indexes to enable fast offset‑based lookups, efficient deletions, and minimal disk I/O in real‑world deployments.

Message Queue · Partition · Segment
0 likes · 9 min read
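
The offset-based lookup described in the summary above can be sketched roughly as follows. This is a minimal illustration, not Kafka's actual code: it assumes each segment is identified by the first offset it holds and that the sparse index is a sorted list of (relative offset, byte position) pairs. The function names and all sample values are hypothetical.

```python
from bisect import bisect_right


def find_segment(base_offsets, target_offset):
    """Pick the segment whose base offset is the largest one <= target_offset.

    base_offsets: sorted list of the first offset stored in each segment
    (hypothetical values, chosen only for illustration).
    """
    i = bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise KeyError(f"offset {target_offset} precedes the oldest segment")
    return base_offsets[i]


def find_scan_start(sparse_index, base_offset, target_offset):
    """Return the byte position in the .log file to start scanning from.

    sparse_index: sorted list of (relative_offset, byte_position) pairs
    that covers only some of the messages in the segment.
    """
    relative = target_offset - base_offset
    keys = [rel for rel, _ in sparse_index]
    i = bisect_right(keys, relative) - 1
    # No entry at or before the target: scan from the start of the segment.
    return sparse_index[i][1] if i >= 0 else 0


if __name__ == "__main__":
    base_offsets = [0, 1000, 2000]                    # assumed segment starts
    sparse_index = [(0, 0), (40, 4096), (85, 8192)]   # assumed index entries
    base = find_segment(base_offsets, 1050)
    print(base, find_scan_start(sparse_index, base, 1050))
```

Two binary searches narrow the lookup to a short sequential scan of the .log file, which is why a sparse index can stay small enough to keep in memory while still bounding disk reads.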