Tagged articles
1273 articles
ITFLY8 Architecture Home
Dec 13, 2016 · Big Data

Umeng’s Mobile Big Data Platform: Architecture, Challenges & Insights

The article details Umeng’s mobile big‑data platform architecture, describing its Lambda‑style hybrid design, data ingestion pipeline with dual Kafka clusters, offline and real‑time processing using Hadoop, Spark, Storm, and storage layers such as HDFS, HBase, MongoDB and Elasticsearch, while also discussing challenges in data collection, cleaning, computation, security, and value‑added services.

Data Architecture · Hadoop · Kafka
0 likes · 13 min read
Meituan Technology Team
Nov 4, 2016 · Big Data

Design and Implementation of a Low-Latency App Exception Monitoring Platform Using Spark Streaming, Kafka, and Elasticsearch

The paper presents a production‑grade, low‑cost mobile‑app exception monitoring platform built on Spark Streaming, Kafka, and Elasticsearch that achieves high availability through exactly‑once processing and checkpointing, minute‑level latency by decoupling raw and symbolized logs, high throughput via reservoir sampling, and dynamic scalability without code changes.

Big Data · Elasticsearch · Exception Monitoring
0 likes · 11 min read
Efficient Ops
Oct 27, 2016 · Information Security

Tech World Shake‑Up: DNS Outage, Apple ARM Support, Google Strategy, Kafka Updates

A roundup of recent tech developments covering a massive US DNS outage caused by IoT‑based DDoS attacks, Apple’s addition of ARM support to macOS Sierra, Google’s evolving 20% time policy, new multi‑data‑center features in Confluent Kafka, MariaDB’s new member, a critical OpenSSL flaw, China Mobile’s OpenStack award, and Tencent’s rapid Nexus 6P hack.

Apple · DNS · Google
0 likes · 8 min read
dbaplus Community
Oct 19, 2016 · Backend Development

When to Use Kafka, RabbitMQ, or ZeroMQ: A Practical MQ Guide

This article explains the true purpose of message queues, classifies them into broker‑based and broker‑less families, compares Kafka, RabbitMQ, and ZeroMQ in terms of performance, flexibility, and lightweight distribution, and clarifies that MQs can support both asynchronous and synchronous communication.

Kafka · Message Queue · RabbitMQ
0 likes · 8 min read
GF Securities FinTech
Sep 28, 2016 · Backend Development

How Event Sourcing and a Go DSL Power a Scalable Points System

This article explains how a financial e‑commerce platform uses the Event Sourcing architecture pattern, an asynchronous message bus, and a Go‑based domain‑specific language to build a flexible, exactly‑once points system that decouples business rules from application code and simplifies operations.

DSL · Event Sourcing · Go
0 likes · 17 min read
Architecture Digest
Sep 12, 2016 · Artificial Intelligence

Design and Implementation of a Real‑Time, Highly Available General Recommendation Platform at YHD

The article describes how YHD's precision recommendation team built a real‑time, highly available, traceable general recommendation platform, detailing its background, overall architecture, visual configuration and traceability subsystems, and reporting significant improvements in development speed, reuse and user satisfaction.

HBase · Kafka · Real-Time
0 likes · 8 min read
dbaplus Community
Sep 6, 2016 · Big Data

Choosing the Right Log Collection Framework for Massive Data Streams

This article reviews major open‑source log collection tools—Chukwa, Scribe, Flume, Logstash, Kafka, and TT—examining their architectures, strengths, and limitations to help engineers select the most suitable solution for high‑volume, low‑latency data pipelines.

Apache Flume · Distributed Systems · Kafka
0 likes · 13 min read
Architecture Digest
Aug 17, 2016 · Backend Development

Design and Optimization of Bilibili Live Chat (GOIM) System

The article presents a detailed overview of Bilibili's GOIM live chat architecture, covering its high‑stability, high‑availability, low‑latency design, component breakdown, memory and module optimizations, network improvements, and performance testing results to achieve scalable real‑time messaging.

Backend Architecture · Go · Kafka
0 likes · 13 min read
Ctrip Technology
Aug 12, 2016 · Big Data

Ctrip's Real-Time Data Platform: Architecture, Practices, and Lessons Learned

This article details Ctrip's journey building a unified real-time data platform—covering business motivations, architectural requirements, technology choices like Kafka and Storm, implementation of Avro schemas, monitoring, alerting, operational lessons, and future explorations such as Streaming CQL and JStorm.

Alerting · Big Data · Kafka
0 likes · 15 min read
Meituan Technology Team
Aug 5, 2016 · Big Data

Design and Implementation of a Large-Scale User Behavior Analytics Platform

The article outlines Meituan‑Dianping’s “Sensors Analytics” platform, a privately‑deployed, open‑PaaS solution that collects full‑stack user events from iOS, Android, Web and WeChat, maps IDs in near real‑time, stores detailed records in Kudu (real‑time) and Parquet (offline), and serves low‑latency queries via Impala, addressing the architectural and operational challenges of high‑throughput ingestion and data‑security requirements.

Impala · Kafka · Kudu
0 likes · 8 min read
Architect
Jun 15, 2016 · Backend Development

Understanding Kafka's SocketServer: Acceptor, Processor, and RequestChannel Architecture

This article explains the internal design of Kafka's SocketServer, detailing its NIO‑based thread model with Acceptor, Processor, and Handler threads, the startup sequence, how connections are accepted and processed, and the role of RequestChannel in routing requests and responses between processors and handlers.

Backend · Kafka · Scala
0 likes · 17 min read
Architecture Digest
May 22, 2016 · Big Data

Design and Architecture of Youzan Unified Log Platform

The article details the design, components, and operational challenges of Youzan's unified log platform, describing its multi‑layer architecture, ingestion methods using rsyslog/logstash and Flume‑NG, Kafka‑based log center, processing pipelines with Storm/Spark, and storage in HDFS and Elasticsearch.

Distributed Systems · Flume · Kafka
0 likes · 10 min read
21CTO
May 16, 2016 · Operations

How to Centralize Logs from Dockerized Services Using Flume and Kafka

This article explains a practical architecture for aggregating logs from distributed Docker containers by employing Flume NG as a lightweight log collector, Kafka as a high‑throughput message bus, and custom sinks to store logs per service, module and day with low latency and minimal resource impact.

Docker · Flume · Kafka
0 likes · 17 min read
Architect
May 16, 2016 · Operations

Centralized Log Collection for Distributed Docker Services Using Flume and Kafka

This article presents a practical architecture for centrally collecting dispersed logs from Docker‑based services in a distributed environment by leveraging Flume NG as a non‑intrusive log agent, Kafka as a high‑throughput message bus, and custom sinks to partition logs by service, module, and day.

Distributed Systems · Docker · Kafka
0 likes · 15 min read
Architect
Apr 28, 2016 · Big Data

Design and Architecture of Youzan Unified Log Platform

The article describes the design, components, and implementation details of Youzan's unified log platform, covering log ingestion via rsyslog, Logstash, and Flume, centralized processing with Kafka, real‑time analysis using Storm/Spark, and storage in HDFS, Elasticsearch, and Hawk, while also discussing challenges and future improvements.

Elasticsearch · HDFS · Kafka
0 likes · 10 min read
21CTO
Apr 14, 2016 · Cloud Computing

How Netflix’s EVCache Powers Global Low‑Latency Caching Across Regions

This article explains how Netflix uses the open‑source EVCache system, built on Memcached and Kafka, to provide highly reliable, low‑latency caching for its micro‑services architecture across multiple AWS regions, handling billions of objects and millions of requests per second.

Distributed Systems · EVCache · Kafka
0 likes · 9 min read
Architecture Digest
Mar 28, 2016 · Big Data

Overview of the Hadoop Ecosystem and Modern Big Data Technologies

This article provides a comprehensive overview of Hadoop and its surrounding ecosystem, detailing core components, storage principles, key algorithms, and a wide range of modern big‑data technologies such as Spark, Flink, Kafka, NoSQL databases, and cloud‑based processing platforms.

Big Data · Hadoop · Kafka
0 likes · 11 min read
MaGe Linux Operations
Mar 28, 2016 · Backend Development

Understanding JMS: Message Models, Consumption, and Popular Middleware

This article explains the JMS standard, its two messaging models (Point‑to‑Point and Publish/Subscribe), how messages are consumed synchronously or asynchronously, the core JMS programming objects, and provides an overview of common middleware such as ActiveMQ, RabbitMQ, ZeroMQ, and Kafka.

ActiveMQ · JMS · Kafka
0 likes · 17 min read
Architect
Mar 22, 2016 · Backend Development

Youzan Search Engine Practice – Engineering Part: Architecture, Indexing, and Performance Optimization

This article describes the practical architecture of Youzan's commercial e‑commerce search engine, covering data source integration, distributed real‑time indexing with Elasticsearch, Hadoop and Kafka, advanced search modules, and several performance‑tuning techniques for large‑scale deployments.

Backend Architecture · Elasticsearch · Kafka
0 likes · 13 min read
Architecture Digest
Mar 22, 2016 · Backend Development

Evolution of LinkedIn’s Backend Architecture: From the Leo Monolith to a Scalable Service‑Oriented Platform

The article chronicles LinkedIn’s journey from a single‑server Leo monolith to a highly distributed, service‑oriented backend architecture, detailing the introduction of member graphs, read‑only replicas, caching layers, Kafka pipelines, Rest.li APIs, super‑blocks, and multi‑data‑center deployments to support billions of daily requests.

Backend Architecture · Distributed Systems · Kafka
0 likes · 9 min read
21CTO
Mar 20, 2016 · Backend Development

How LinkedIn Scaled to 350 Million Users: From Leo Monolith to 750+ Microservices

LinkedIn grew from a single monolithic Leo server handling all web requests to a complex ecosystem of over 750 independent services, employing graph databases, read replicas, caching layers, Kafka pipelines, Rest.li APIs, and multi‑data‑center deployments to support billions of daily queries.

Distributed Systems · Kafka · Microservices
0 likes · 9 min read
Architect
Mar 12, 2016 · Backend Development

Design and Evolution of Ctrip's Hermes Message Queue System

This article presents a detailed overview of Ctrip's Hermes message queue system, covering its architectural evolution from a simple Mongo‑based design to a broker‑centric, multi‑storage solution with meta‑server coordination, and discusses practical techniques for building high‑performance, scalable messaging infrastructure.

Cluster Management · Ctrip · Distributed Systems
0 likes · 21 min read
Architect
Mar 8, 2016 · Big Data

In‑Depth Analysis of Apache Kafka: Architecture, Core Concepts, and Benchmark

This article provides a comprehensive technical overview of Apache Kafka, covering its architecture, core concepts, design goals, comparison with other message queues, replication, consumer groups, delivery guarantees, and performance benchmarking, making it a valuable resource for big‑data engineers.

Big Data · Kafka · Replication
0 likes · 30 min read
21CTO
Mar 7, 2016 · Backend Development

When to Choose Kafka Over RabbitMQ: A Practical Comparison

This article compares Kafka and RabbitMQ, examining their design philosophies, throughput capabilities, consumer diversity, message ordering, and handling of individual messages, to help engineers decide which system suits high-volume or flexible-consumer scenarios and understand the trade-offs of each technology.

Kafka · RabbitMQ · Streaming
0 likes · 7 min read
Architecture Digest
Mar 6, 2016 · Backend Development

Message Queue Overview, Application Scenarios, and Middleware Examples

This article introduces the fundamentals of message queues, explains common use cases such as asynchronous processing, system decoupling, traffic shaping, and log handling, and reviews popular middleware implementations including JMS, ActiveMQ, RabbitMQ, ZeroMQ, and Kafka.

Backend · Distributed Systems · JMS
0 likes · 18 min read
Java High-Performance Architecture
Feb 29, 2016 · Backend Development

How Kafka Stores and Retrieves Messages: Inside Partitions, Segments, and Index Files

Kafka persists messages on disk by organizing each topic into multiple partitions, which are further divided into segment files containing paired .index and .log files; this structure enables efficient storage, offset-based lookup, and fast retrieval of specific messages through binary search across segment indexes.

Kafka · Message Queue · storage architecture
0 likes · 5 min read
Architecture Digest
Feb 25, 2016 · Backend Development

Ctrip's Hermes Asynchronous Messaging System: Architecture, Evolution, and High‑Performance Practices

The article presents a detailed overview of Ctrip's Hermes asynchronous messaging system, describing its architectural evolution from a simple Mongo‑based queue to a broker‑centric design with MySQL and Kafka back‑ends, and explains optimization techniques for single‑node performance, clustering, lease‑based management, and reliable delivery.

Broker · Ctrip · Hermes
0 likes · 22 min read
Architect
Feb 23, 2016 · Big Data

Kafka High Availability Design: Data Replication and Leader Election

This article explains why Kafka introduced high‑availability features starting with version 0.8, detailing the necessity of data replication and leader election, describing Kafka’s replica distribution algorithm, replication mechanics, acknowledgment requirements, leader‑election strategies, ZooKeeper structures, and the broker failover process.

Kafka · Replication · ZooKeeper
0 likes · 19 min read
21CTO
Feb 23, 2016 · Big Data

Why Kafka Dominates Modern Data Pipelines: Architecture, Benefits, and Guarantees

Kafka, the open‑source distributed messaging system from LinkedIn, offers O(1) persistence, high throughput, partitioned topics, and flexible delivery guarantees, making it a cornerstone for modern big‑data pipelines and real‑time processing alongside Hadoop, Spark, and Storm.

Big Data · Consumer · Delivery Guarantees
0 likes · 21 min read
21CTO
Feb 14, 2016 · Backend Development

Unlocking High‑Performance Systems: How Message Queues Transform Backend Architecture

This article provides a comprehensive overview of message queues, covering their core concepts, key application scenarios such as asynchronous processing, system decoupling, traffic shaping, log handling, and communication, and examines popular middleware like ActiveMQ, RabbitMQ, ZeroMQ, and Kafka, along with JMS models and programming details.

JMS · Kafka · Message Queue
0 likes · 22 min read
21CTO
Feb 6, 2016 · Backend Development

How LinkedIn Scaled to 300M Users: Lessons from a Decade of Backend Architecture

This article chronicles LinkedIn's evolution from a monolithic Leo application to a massive micro‑service ecosystem, detailing the introduction of member graphs, read‑only replicas, caching layers, Kafka pipelines, Rest.li APIs, super‑blocks, and multi‑data‑center strategies that enable handling billions of requests daily.

Backend Architecture · Kafka · LinkedIn
0 likes · 8 min read
21CTO
Jan 9, 2016 · Big Data

How We Scaled Real‑Time Log Analysis to 2 TB Daily with ELK

This article shares the author's practical experience building a real‑time log analysis platform at Sina, covering service scope, ELK architecture, performance optimizations, usability improvements, new features, common pitfalls, and a concise Q&A for engineers handling massive log streams.

ELK · Elasticsearch · Kafka
0 likes · 12 min read
Architect
Dec 30, 2015 · Big Data

Real-Time Big Data Processing with Storm and Kafka on Alibaba Cloud

This article explains how to build a large‑scale, real‑time vehicle monitoring system using Apache Storm and Kafka on Alibaba Cloud, covering the challenges of big‑data ingestion, system architecture, deployment steps, performance testing, and practical lessons learned.

Alibaba Cloud · Big Data · Kafka
0 likes · 12 min read
Qunar Tech Salon
Dec 15, 2015 · Big Data

Real-Time Computing with Apache Storm: Architecture, Code Samples, and Fault Tolerance

This article explains the principles of real-time computing, compares it with offline batch processing, and demonstrates a practical solution using Kafka for ingestion, Apache Storm for continuous computation, and various storage options, while also covering streaming concepts and Storm's high‑availability mechanisms.

Apache Storm · Kafka · Real‑Time Computing
0 likes · 8 min read
21CTO
Dec 14, 2015 · Backend Development

How Wacai Built a Scalable FinTech Architecture: 6 Key Design Strategies

Wacai’s architects outline six critical design decisions—including system layer separation, message passing, asynchronous processing, comprehensive data storage, robust security, and storage redundancy—that together enable a resilient, reactive financial platform capable of handling massive concurrent workloads.

Akka · FinTech · Kafka
0 likes · 8 min read

LinkedIn’s Kafka at Scale: Architecture, Optimizations, and Operational Practices

The article details how LinkedIn has scaled Kafka from handling billions to trillions of messages daily, describing quota enforcement, a ZooKeeper‑free consumer, reliability enhancements, security plans, monitoring frameworks, fault‑injection testing, cluster balancing, and integration with other internal data systems.

Big Data · Kafka · LinkedIn
0 likes · 12 min read
21CTO
Nov 21, 2015 · Big Data

Why Build a Kafka System? Core Use Cases and Design Principles

This article explains why Kafka is essential for activity and operational data pipelines, outlines key use cases such as news feeds, relevance ranking, security, monitoring, and reporting, and details its deployment topology, design decisions, and message persistence strategies.

Distributed Messaging · Kafka · Real-time Processing
0 likes · 14 min read
21CTO
Nov 19, 2015 · Big Data

Beyond Hadoop: Modern Big Data Platforms and Technologies Explained

This article surveys the evolution of Hadoop and its ecosystem, explains core storage and processing concepts, and introduces contemporary big‑data technologies such as Spark, Flink, Kafka, Lambda architecture, NoSQL databases, and cloud‑native solutions, highlighting their roles and trade‑offs.

Big Data · Flink · Hadoop
0 likes · 17 min read
Efficient Ops
Oct 14, 2015 · Big Data

Spark vs Hadoop, Flink, HBase/Cassandra, Kafka & Tachyon: Expert Q&A

During a lively “Sit and Discuss” session, experts compared Spark and Hadoop, evaluated Flink against Spark, contrasted HBase with Cassandra, explained why Kafka (and sometimes Flink) is preferred for distributed messaging, and shared insights on Tachyon’s role in modern big‑data ecosystems.

Flink · HBase · Hadoop
0 likes · 10 min read
21CTO
Sep 30, 2015 · Operations

How LinkedIn Scaled Kafka to Process Over 1 Trillion Messages Daily

Since 2011, LinkedIn has expanded its Kafka deployment from handling billions to over a trillion messages per day, focusing on quotas, a new ZooKeeper‑free consumer, reliability enhancements, security, monitoring frameworks, fault‑injection testing, cluster balancing, and ecosystem integrations, offering valuable lessons for large‑scale streaming systems.

Kafka · LinkedIn · Reliability
0 likes · 12 min read
21CTO
Sep 27, 2015 · Big Data

How Weidian Built a Scalable Big Data Platform for Mobile Commerce

This article outlines the design and implementation of Weidian’s end‑to‑end big data processing platform, covering dataset definition, data collection via Flume‑based DataAgent, transmission through Databus, storage options such as HDFS, Kafka and Elasticsearch, and the monitoring and resource‑integration strategies that support massive mobile commerce logs.

Elasticsearch · Flume · Hadoop
0 likes · 18 min read
Art of Distributed System Architecture Design
Sep 14, 2015 · Industry Insights

Why Kafka Dominates Distributed Messaging: Architecture, Features, and Best Practices

This article provides an in‑depth examination of Apache Kafka’s origins, design goals, core concepts such as brokers, topics, partitions, producers and consumers, compares it with other message queues, explains its storage format, configuration options, delivery guarantees, and includes practical Java code examples for partitioning and consumption.

Distributed Systems · Kafka · Message Queue
0 likes · 22 min read
21CTO
Aug 10, 2015 · Backend Development

How Kafka’s File Storage Mechanism Achieves High Performance

Kafka’s distributed log architecture stores messages in partitioned segments with indexed data files, enabling efficient sequential writes, rapid deletions, and fast offset-based lookups, as detailed through its broker, topic, partition, segment structures, file naming rules, and real‑world performance experiments.

Kafka · file storage
0 likes · 11 min read
Qunar Tech Salon
Jul 8, 2015 · Big Data

Understanding Logs: The Foundation of Distributed Systems, Data Integration, and Stream Processing

This article explains how logs—simple, append‑only, time‑ordered records—serve as the core abstraction behind databases, distributed systems, data integration pipelines, and modern stream‑processing platforms such as Kafka and Hadoop, illustrating their design, scalability, and practical challenges.

Big Data · Data Integration · Distributed Systems
0 likes · 45 min read
Architect
Jul 6, 2015 · Big Data

Understanding Logs: The Core of Distributed Systems and Data Integration

This article explains how logs—simple, append‑only, time‑ordered records—serve as the fundamental abstraction behind databases, distributed systems, data integration pipelines, and stream‑processing platforms like Kafka and Hadoop, illustrating their role in ordering, replication, scalability, and real‑time analytics.

Data Integration · Distributed Systems · Hadoop
0 likes · 48 min read

Designing a Scalable Real‑Time Mobile Analytics Platform with Kafka, Storm, and Amazon EMR

The article describes how a mobile analytics service processes billions of events daily using a Lambda‑style architecture that combines Kafka, Storm, Amazon EMR, and S3 to achieve scalable, fault‑tolerant batch and real‑time computation, while ensuring reliable event ingestion and graceful degradation.

AWS · Big Data · Kafka
0 likes · 8 min read
MaGe Linux Operations
Apr 28, 2015 · Big Data

How LinkedIn Scales Kafka to Billions of Messages Every Day

This article explains how LinkedIn uses Apache Kafka as a high‑throughput, fault‑tolerant messaging backbone, detailing its architecture, message categories, layered replication, audit mechanisms, and the engineering practices that keep billions of daily messages reliable and fast.

Big Data · Distributed Systems · Kafka
0 likes · 11 min read

Understanding Kafka High Availability: Data Replication and Leader Election

The article explains why Kafka introduced high availability starting with version 0.8, detailing the need for data replication and leader election, describing replica distribution algorithms, replication mechanics, ISR handling, ZooKeeper structures, and the broker failover process to ensure fault‑tolerant streaming.

Kafka · ZooKeeper · high availability
0 likes · 19 min read
Meituan Technology Team
Jan 14, 2015 · Big Data

Kafka File Storage Mechanism and Architecture

Kafka stores each topic as partitions that are divided into sequential segment files containing paired .log data and .index files, using global offsets and sparse memory‑mapped indexes to enable fast offset‑based lookups, efficient deletions, and minimal disk I/O in real‑world deployments.

Kafka · Message Queue · Partition
0 likes · 9 min read