Tagged articles
1273 articles
Efficient Ops
Aug 3, 2020 · Backend Development

Mastering Kafka Producer API: Tips, Configurations, and Common Pitfalls

This article provides a comprehensive guide to Kafka's producer API, covering core concepts, client‑side workflow, essential configurations, idempotent and transactional producers, and practical Java code examples to help developers avoid common pitfalls and optimize message publishing.

Distributed Systems · Idempotent Producer · Kafka
0 likes · 21 min read
JavaEdge
Aug 1, 2020 · Backend Development

How to Choose the Right Message Queue: RabbitMQ vs RocketMQ vs Kafka

This guide outlines key criteria for selecting a message queue—open source, ecosystem, reliability, clustering, and performance—and compares RabbitMQ, RocketMQ, and Kafka, highlighting each system's strengths, weaknesses, and ideal use‑cases.

Kafka · RabbitMQ · RocketMQ
0 likes · 10 min read
Full-Stack Internet Architecture
Jul 31, 2020 · Backend Development

Why Kafka Is Fast: Partition Parallelism, Sequential Disk Writes, Page Cache, Zero‑Copy, Batching and Compression

The article explains how Kafka achieves high throughput by using partition‑level parallelism, sequential disk writes with segment files, extensive use of the OS page cache, zero‑copy data paths, request batching and optional compression, while also discussing the underlying disk I/O principles.

Backend · Kafka · Partitioning
0 likes · 14 min read
Tencent Cloud Developer
Jul 29, 2020 · Big Data

Case Study: Optimizing Tencent Cloud Elasticsearch for High‑Volume Game Log Analytics

To handle a gaming company's million‑QPS log stream, the team built a hot‑cold Tencent Cloud Elasticsearch cluster with ILM‑driven tiering, scaled CPU/heap, reduced shard count via shrink and replica tweaks, tuned Logstash‑Kafka pipelines, and employed COS snapshots and searchable snapshots, achieving stable performance and lower cost.

Big Data · Elasticsearch · ILM
0 likes · 29 min read
Big Data Technology & Architecture
Jul 25, 2020 · Big Data

Kafka Transactions, Replication Issues, HW/LEO Evolution, and Reliability Mechanisms

This article explains how Kafka implements transactions, handles under‑replicated partitions, manages high‑watermark and log‑end‑offset evolution, uses leader epochs for consistency, discusses read‑committed isolation, explains why read‑write separation is not supported, and describes delay queues, dead‑letter/retry queues, auditing, tracing, lag calculation, key metrics, and performance‑optimising design features.

DelayQueue · HighWatermark · Kafka
0 likes · 25 min read
Big Data Technology & Architecture
Jul 24, 2020 · Big Data

Key Concepts and Internal Mechanisms of Apache Kafka

This article provides an in‑depth overview of Apache Kafka’s internal topics, preferred replicas, partition allocation mechanisms, log directory structure, index files, offset and timestamp lookup, log retention and compaction policies, storage architecture, delayed operations, controller role, consumer rebalance process, and producer idempotence.

Consumer Rebalance · Distributed Systems · Idempotence
0 likes · 18 min read
Java Captain
Jul 24, 2020 · Operations

Enterprise Log Monitoring System Architecture and Implementation

To address the challenges of managing logs across hundreds of microservices in production, the article presents an enterprise log monitoring solution that centralizes collection via Filebeat, processes logs with Kafka Streams, visualizes data using Grafana and Kibana, and integrates Elastic APM for tracing and performance metrics.

ELK · Kafka · Log Monitoring
0 likes · 8 min read
Big Data Technology & Architecture
Jul 23, 2020 · Big Data

Comprehensive Kafka FAQ: Uses, Architecture, Offsets, and Partition Management

This article provides an extensive overview of Apache Kafka, covering its use cases, key concepts such as ISR, AR, HW, LEO, and LW, message ordering, the roles of partitioners, serializers and interceptors, producer and consumer client architecture, offset handling, multithreaded consumption, and topic partition management.

Big Data · Kafka · Message Queue
0 likes · 16 min read
Big Data Technology & Architecture
Jul 22, 2020 · Big Data

Kafka Architecture and Core Concepts: Producers, Brokers, and Consumers

This article explains Kafka's fundamental architecture, including the roles of producers, brokers, and consumers, key concepts such as topics, partitions, replicas, ISR, and controller, as well as detailed mechanisms of producer client structure, interceptors, serializers, partitioners, and consumer group rebalancing strategies.

Big Data · Distributed Systems · Kafka
0 likes · 22 min read
Programmer DD
Jul 22, 2020 · Big Data

How to Sync Billions of MySQL Records to HBase: 3 Powerful Methods Using Hadoop, Kafka, and Flink

This comprehensive guide walks you through setting up a pseudo‑distributed Hadoop environment, loading massive MySQL data with LOAD DATA, Python scripts, and multithreading, and then synchronizing the data to HBase using three approaches—Sqoop, a Kafka‑Thrift pipeline, and a real‑time Kafka‑Flink pipeline—while also comparing query performance of HBase and Phoenix.

Flink · HBase · Kafka
0 likes · 28 min read
Tencent Cloud Developer
Jul 20, 2020 · Cloud Native

Tencent Eagle Eye Distributed Logging System Cloud Migration Practice

Tencent’s Eagle Eye distributed real‑time monitoring and log analysis platform was migrated to the cloud by rebuilding its LogSender and Kafka‑to‑ES components, switching to cloud CKafka and Elasticsearch, which boosted throughput fourfold, cut resource usage by about half, saved roughly 20 million RMB annually, and set the stage for further enhancements such as comprehensive monitoring and exactly‑once delivery.

Elasticsearch · Kafka · Tencent
0 likes · 9 min read
Architects Research Society
Jul 19, 2020 · Backend Development

Comparing Kafka and Mosquitto for Microservice Communication

This article examines the challenges of microservice communication, explains why REST APIs are unsuitable, and compares two messaging broker solutions—Kafka and Mosquitto—highlighting their architectures, persistence, scalability, and suitability for high‑traffic, reliable event‑driven systems.

Backend · Event-driven · Kafka
0 likes · 6 min read
Big Data Technology & Architecture
Jul 15, 2020 · Big Data

Root Causes and Solutions for Kafka Duplicate Consumption

This article analyzes the common causes of Kafka duplicate consumption, such as uncommitted offsets due to forced thread termination, auto‑commit settings, session timeouts, rebalancing, and slow processing, and provides practical solutions including disabling auto‑commit, adjusting consumer configurations, and using new consumer groups.

Consumer Configuration · Duplicate Consumption · Kafka
0 likes · 7 min read
Architects Research Society
Jul 15, 2020 · Big Data

Introduction to Apache Kafka: A Distributed Streaming Platform

This article provides a comprehensive overview of Apache Kafka, explaining its distributed, fault‑tolerant architecture, horizontal scalability, disk‑based commit log, replication mechanisms, Streams API, KSQL, and why it is widely adopted as the backbone of event‑driven, high‑throughput systems.

Distributed Systems · Kafka · Message Queue
0 likes · 15 min read
Selected Java Interview Questions
Jul 10, 2020 · Backend Development

Message Queue Interview Questions and Technical Guide

This article provides a comprehensive overview of message queue concepts, covering usage scenarios, advantages, drawbacks, technology selection, high‑availability architectures, duplicate handling, data loss prevention, ordering guarantees, latency management, and design principles, supplemented with interview‑style questions and code examples.

Kafka · MQ · Message Queue
0 likes · 20 min read
Selected Java Interview Questions
Jul 8, 2020 · Backend Development

Message Queue Applications and Comparison of Common MQs (ActiveMQ, RabbitMQ, RocketMQ, Kafka)

This article explains the role of message queues in distributed systems, illustrates four typical scenarios—asynchronous processing, application decoupling, traffic shaping, and message communication—and compares the features of popular middleware such as ActiveMQ, RabbitMQ, RocketMQ, and Kafka.

Kafka · Message Queue · RabbitMQ
0 likes · 9 min read
Big Data Technology Architecture
Jun 29, 2020 · Fundamentals

Kafka Storage Mechanism and Reliability Guarantees

This article explains Kafka's internal storage architecture—including topics, partitions, segments, .log and .index files—how data is read, and the various reliability mechanisms such as ISR/OSR, LEO/HW, producer acknowledgment levels, leader election strategies, and delivery semantics.

Kafka · Producer Acks · Reliability
0 likes · 9 min read
Full-Stack Internet Architecture
Jun 23, 2020 · Backend Development

Common Kafka Interview Questions and Answers

This article reviews common Kafka interview questions, covering delay queues, idempotence, replica states, offsets, message ordering, and handling duplicate consumption, and includes example code for enabling idempotent producers along with explanations of timing‑wheel mechanisms and practical solutions to consumer rebalance issues.

Consumer · Idempotence · Kafka
0 likes · 9 min read
Suning Technology
Jun 19, 2020 · Big Data

How Suning’s Big Data Engine Powered a Record‑Breaking 618 Sale

Suning’s 618 shopping festival showcased a massive sales surge backed by its big‑data platform, which processed over 200 billion requests, handled 38.5 PB of daily data, and delivered 31.5 trillion computations, while Kafka and HBase sustained tens of millions of TPS to ensure a seamless consumer experience.

618 Sale · HBase · Kafka
0 likes · 5 min read
High Availability Architecture
Jun 19, 2020 · Backend Development

Design and Implementation of a Traffic Replay System for Bilibili Membership Purchase Service

This article describes how Bilibili's Membership Purchase team built a Java‑based traffic replay platform using JVM‑Sandbox AOP, Kafka, and MySQL to capture real‑world request/response data, serialize it with JSON, and replay it for comprehensive regression testing of complex backend services.

Kafka · Microservices · Spring Cloud
0 likes · 14 min read
Architect
Jun 18, 2020 · Backend Development

Applying Message Queues for Decoupling in E‑commerce Architecture

The article explains why and how to use message queues to achieve low‑coupling, better performance, fault tolerance, and eventual consistency in an e‑commerce order‑processing flow, discusses common pitfalls such as message loss and duplication, and compares popular queue products like RabbitMQ, Kafka, and RocketMQ.

Backend Architecture · Decoupling · Distributed Systems
0 likes · 10 min read
Full-Stack Internet Architecture
Jun 18, 2020 · Big Data

Kafka Interview Questions: High Availability, Reliability, Consistency, Performance, and Usage Rationale

This article explains common Kafka interview questions by analyzing the system's high‑availability design, reliability mechanisms, consistency model, performance tricks such as sequential writes and zero‑copy, and the reasons for using Kafka and message queues, providing both conceptual insight and practical details.

Consistency · Distributed Systems · Kafka
0 likes · 12 min read
58 Tech
Jun 10, 2020 · Big Data

Real‑time Data Warehouse Practices at 58 Tongcheng Bao: From Spark Streaming 1.0 to Flink‑based 2.0

This article details the evolution of 58 Tongcheng Bao's real‑time data warehouse, describing the initial Spark‑Streaming architecture, its limitations, and the redesign using Flink with a layered ODS‑DWD‑DWS‑APP model, data‑quality monitoring, join techniques, and the resulting improvements in latency and accuracy.

Big Data · Data Quality · Flink
0 likes · 9 min read
Big Data Technology & Architecture
Jun 4, 2020 · Big Data

Kafka for Data Ingestion and Event Distribution: Production‑Consumer and Publish‑Subscribe Patterns

This article explains how Kafka can be used for data ingestion and event distribution by illustrating producer‑consumer and publish‑subscribe models, describing core concepts such as topics, partitions and consumer groups, and offering practical design options for handling different event scenarios.

Big Data · Event Distribution · Kafka
0 likes · 9 min read
Programmer DD
Jun 4, 2020 · Backend Development

How I Traced a Sudden Data Drop After a Feature Release: 14 Debugging Steps

After a new feature caused a sharp decline in data volume, I walked through a fourteen‑step troubleshooting process—verifying the issue, inspecting code, consulting DBAs, testing locally, checking configurations, logging, packet capture, load testing, and finally identifying a Kafka partition bottleneck—to restore normal operation.

Kafka · troubleshooting
0 likes · 9 min read
Big Data Technology Architecture
Jun 3, 2020 · Big Data

Comprehensive Kafka Interview Questions and Answers

This article compiles essential Kafka interview topics, covering cluster sizing, partition and replica configuration, offset management, topic creation, log structure, election mechanisms, partition assignment strategies, handling data backlog, exactly‑once semantics, idempotence, transactions, and performance tuning with practical command examples.

Kafka · Messaging · interview
0 likes · 15 min read
MaGe Linux Operations
May 30, 2020 · Big Data

What Is Kafka? A Deep Dive into Distributed Streaming and Messaging

Kafka is an Apache‑hosted distributed streaming platform that provides high‑throughput, durable, publish‑subscribe messaging, originally developed by LinkedIn; this article explains its core concepts, message system classifications, architecture components, APIs, replication, consumer groups, and guarantees, comparing it with other messaging solutions.

Big Data · Distributed Streaming · Kafka
0 likes · 17 min read
Architects' Tech Alliance
May 27, 2020 · Fundamentals

Message Queue Overview, Models, and Comparison of ActiveMQ, RabbitMQ, RocketMQ, and Kafka

This article introduces the fundamentals of message queues, explains their characteristics, delivery models, transmission modes, and push‑pull patterns, then compares four popular implementations—ActiveMQ, RabbitMQ, RocketMQ, and Kafka—highlighting each system's strengths, weaknesses, deployment requirements, and typical use cases.

ActiveMQ · Kafka · Message Queue
0 likes · 17 min read
MaGe Linux Operations
May 27, 2020 · Operations

Key DevOps Interview Q&A: Git, MySQL Replication, Kafka, Kubernetes

This article compiles essential DevOps interview questions covering version control differences between Git and SVN, MySQL master‑slave replication mechanics, Kafka versus traditional MQ, Kubernetes service types, pod communication, health checks, resource limits, link types, and permanent mounting techniques.

DevOps · Git · Kafka
0 likes · 17 min read
Big Data Technology & Architecture
May 24, 2020 · Big Data

Analyzing and Resolving Kafka Consumer Rebalance Errors Caused by max.poll.interval.ms

The article examines a Kafka consumer rebalance error caused by exceeding max.poll.interval.ms, explains the underlying mechanics of poll intervals, offset handling, and provides practical solutions such as adjusting max.poll.interval.ms, limiting poll records, and committing offsets per message to prevent frequent rebalances.

Kafka · java · max.poll.interval.ms
0 likes · 9 min read
macrozheng
May 21, 2020 · Big Data

Mastering Kafka: Core Concepts, Architecture, and Reliability Guarantees

This comprehensive guide covers Kafka's definition, publish/subscribe model, key components, storage mechanisms, producer and consumer strategies, and reliability features such as ACK levels, ISR, and exactly‑once semantics, providing a solid foundation for real‑time big‑data processing.

Big Data · Distributed Systems · Kafka
0 likes · 16 min read
Big Data Technology Architecture
May 19, 2020 · Big Data

Design and Implementation of a Unified Data Lake Platform Using HBase, Kafka, and Elasticsearch

This article summarizes the design, architecture, and key modules of a company-wide data lake platform—named “Tianchi”—built on HBase, Kafka, and Elasticsearch, detailing data ingestion, strategy output, metadata management, indexing, monitoring, and offline analysis, and shares lessons learned and future plans.

Data Platform · Elasticsearch · HBase
0 likes · 11 min read
Selected Java Interview Questions
May 16, 2020 · Big Data

How Reddit Counts Page Views at Scale Using HyperLogLog and Kafka

The article explains Reddit's large‑scale page‑view counting system, detailing its real‑time requirements, the challenges of naive hash‑set storage, and how a hybrid approach using linear probabilistic counting and HyperLogLog together with Kafka, Redis, and Cassandra achieves accurate, low‑memory, near‑real‑time analytics.

Big Data · HyperLogLog · Kafka
0 likes · 7 min read
Top Architect
May 14, 2020 · Big Data

Kafka Overview, Architecture, Installation, and Operational Guide

This article provides a comprehensive introduction to Kafka, covering its definition, message queue concepts, architecture components, installation steps, configuration details, startup procedures, operational commands, producer and consumer mechanisms, reliability guarantees, partition strategies, offset management, and performance optimizations.

Big Data · Consumer · Installation
0 likes · 22 min read
Architecture Digest
May 3, 2020 · Big Data

Kafka Concept Overview

This article provides a comprehensive introduction to Kafka, covering its definition, message‑queue models, architecture components, installation steps, configuration details, producer and consumer mechanisms, reliability guarantees, partition assignment strategies, offset management, and high‑performance read/write techniques.

Big Data · Consumer · Kafka
0 likes · 20 min read
Tencent Cloud Developer
Apr 28, 2020 · Big Data

Evolution of Ctrip Vacation Pricing Engine: Architecture, Challenges, and Optimizations

Ctrip’s vacation pricing engine evolved from a MySQL‑based synchronous queue to a Kafka‑driven, Spark‑parallelized architecture using HBase, dramatically cutting task generation from five hours to 1.5 hours, boosting price‑accuracy above 90 % while handling billions of calculations and external API constraints.

Distributed Systems · Kafka · Spark
0 likes · 18 min read
Tencent Cloud Developer
Apr 24, 2020 · Backend Development

Mask Reservation Mini‑Program: From Perfect Experience to Lossy Service – Architecture and Design

During the COVID‑19 pandemic, Tencent and Guangzhou built the “Suikang” mask‑reservation mini‑program in two days, handling 1.7 billion visits by shifting from real‑time inventory checks to a four‑layer “lossy” architecture—CDN caching, batch releases, Redis, Kafka queues, and asynchronous processing—to trade consistency for high availability and rapid response.

CAP theorem · Kafka · Lossy Service
0 likes · 23 min read
Big Data Technology Architecture
Apr 22, 2020 · Big Data

Key Kafka Producer Configuration Parameters and Tuning Recommendations

This article explains the most important Kafka producer configuration parameters, including acks, max.request.size, retries, compression.type, buffer.memory, batch.size, linger.ms, request.timeout.ms, and max.in.flight.requests.per.connection. It also provides practical tuning advice and presents a recommended setup to achieve high throughput without message loss.

Configuration · Kafka · Message Reliability
0 likes · 9 min read
Big Data Technology & Architecture
Apr 19, 2020 · Big Data

Understanding the Backpressure Mechanism in Spark Streaming

This article explains Spark Streaming's backpressure mechanism, detailing how batch intervals can cause data accumulation, the role of Receivers versus DirectKafkaInputDStream, configuration to enable backpressure, and the internal workings of RateController, ReceiverRateController, ReceiverSupervisor, BlockGenerator, and rate calculations for Kafka streams.

Big Data · Kafka · RateController
0 likes · 12 min read
Big Data Technology Architecture
Apr 15, 2020 · Big Data

Real-Time Data Warehouse Practices: Case Studies from Meituan, NetEase, Zhihu, and OPPO

This article reviews the evolution of data warehouses from traditional offline models to modern real‑time architectures, presenting detailed case studies of Meituan, NetEase, Zhihu, and OPPO, and discusses layer designs, technology choices such as Flink, Kafka, and storage options, and key lessons for building scalable real‑time warehouses.

Big Data · Flink · Kafka
0 likes · 13 min read
Big Data Technology Architecture
Apr 13, 2020 · Backend Development

Understanding Kafka Producer: Architecture, Data Structures, Serialization, Partitioning, and Buffering

This article provides a comprehensive overview of Kafka's Producer side, covering its architecture, the ProducerRecord data structure, serialization mechanisms, partitioning logic, and the accumulator buffer, while comparing old and new Producer clients and illustrating key configurations with code examples.

Accumulator · Kafka · Partitioning
0 likes · 9 min read
dbaplus Community
Apr 12, 2020 · Databases

Why and How to Migrate from MongoDB to Elasticsearch: A Practical Guide

This article explains the motivations for moving a high‑volume operation‑log system from MongoDB to Elasticsearch, outlines the existing architecture, details capacity planning, index design, and a step‑by‑step migration process using Kafka, DataX, and Spring Boot, and shares the performance gains and lessons learned.

Data Migration · DataX · Database Architecture
0 likes · 14 min read
Tencent Cloud Middleware
Apr 9, 2020 · Operations

Scaling Kafka to Support Millions of Partitions Without Downtime

This article explains the metadata, controller, and Zookeeper challenges of supporting a million‑plus Kafka partitions and presents practical solutions such as parallel ZK fetching, metadata‑via‑topic redesign, logical cluster assembly, and physical cluster splitting to achieve large‑scale, stable Kafka deployments.

Kafka · ZooKeeper · cluster operations
0 likes · 15 min read
Qunar Tech Salon
Apr 9, 2020 · Backend Development

RabbitMQ vs Kafka: Key Differences and How to Choose Between Them

This article compares RabbitMQ and Apache Kafka across architecture, message ordering, routing, timing, retention, fault handling, scalability, and consumer complexity, providing guidance on when to prefer each system based on functional and non‑functional requirements.

Comparison · Kafka · Message Queue
0 likes · 17 min read
Programmer DD
Mar 28, 2020 · Backend Development

Why Is Kafka So Fast? Uncover the 11 Performance Secrets

Kafka achieves its remarkable speed by combining sequential I/O, batch processing, compression, zero‑copy, careful client‑side work, and a design that avoids costly fsync and garbage collection, while maintaining durability, ordering, and at‑least‑once delivery, making it a high‑throughput, low‑latency event streaming platform.

Batch Processing · Distributed Systems · Kafka
0 likes · 15 min read
Programmer DD
Mar 26, 2020 · Backend Development

How Zhihu Built a Scalable Long‑Connection Gateway for Real‑Time Messaging

Zhihu’s infrastructure team designed a high‑performance, scalable long‑connection gateway that decouples business logic via publish‑subscribe, leverages OpenResty, Kafka, and Redis, implements fine‑grained ACL, sliding‑window flow control, and ensures message reliability and horizontal scalability for millions of concurrent devices.

Kafka · Message Reliability · OpenResty
0 likes · 15 min read
Qunar Tech Salon
Mar 19, 2020 · Big Data

Apache Kafka Overview: Architecture, Features, and Usage

This article provides a comprehensive introduction to Apache Kafka, covering its high‑throughput distributed architecture, core concepts such as topics, partitions, brokers, producers and consumers, design goals, performance characteristics, deployment steps, configuration, and example code for producers, consumers, and Spring Boot integration.

Big Data · Distributed Systems · Kafka
0 likes · 39 min read
Top Architect
Mar 13, 2020 · Big Data

Three Billion‑Scale MySQL‑to‑HBase Synchronization Solutions and Practical Implementation

This article presents a comprehensive guide for synchronizing massive MySQL datasets to HBase, covering environment preparation, fast MySQL data loading techniques, and three practical pipelines—Sqoop, Kafka‑Thrift, and Kafka‑Flink—along with performance comparisons and optimization tips for large‑scale data processing.

Big Data · Flink · HBase
0 likes · 24 min read
Big Data Technology & Architecture
Mar 13, 2020 · Backend Development

Kafka Idempotent and Transactional Messaging Overview

This article explains how Kafka implements idempotent producers and transactional messaging to achieve exactly‑once semantics, detailing the producer session identifiers, sequence numbers, broker checks, two‑phase commit workflow, consumer isolation levels, and the limitations of atomic reads.

Idempotent Producer · Kafka · Transactional Messaging
0 likes · 9 min read
ITFLY8 Architecture Home
Mar 7, 2020 · Big Data

How Kafka’s Reactor Thread Model Powers High‑Throughput Messaging

Kafka’s high‑throughput network architecture—built on NIO, a Reactor thread model, and a TCP‑based protocol—evolves from a simple synchronous processor design to a decoupled handler‑pool system, offering valuable lessons for designing scalable backend communication layers in big‑data applications.

High Throughput · Kafka · Reactor Model
0 likes · 7 min read
ITFLY8 Architecture Home
Mar 6, 2020 · Big Data

Mastering Kafka: Build Producers, Consumers, and Custom Partitioners

This extensive tutorial walks through Kafka producer and consumer fundamentals, demonstrates how to configure key‑based and custom partitioning, explains offset management, consumer group coordination, and essential configuration parameters, and includes complete Java code examples for real‑world e‑commerce scenarios.

Consumer · Kafka · Partitioning
0 likes · 24 min read
58 Tech
Mar 4, 2020 · Big Data

Applying Flink State Management to Real‑Time Recommendation Scenarios

This article explains how Flink's flexible state management, including Broadcast, Keyed, and Operator states, can be used to solve real‑time recommendation challenges such as per‑minute UV, click, and exposure counting, while addressing locality mapping and data‑delay issues with Druid as the downstream store.

Broadcast State · Druid · Flink
0 likes · 13 min read
ITFLY8 Architecture Home
Mar 4, 2020 · Big Data

Understanding Kafka: Core Concepts, Architecture, and Performance Secrets

This article introduces Kafka's role as a message system, explains its fundamental components such as topics, partitions, producers, consumers, and replicas, and dives into cluster architecture, consumer groups, Zookeeper coordination, and performance optimizations like sequential writes, zero‑copy, log segmentation, and network design.

Kafka · Message Queue · performance
0 likes · 13 min read
dbaplus Community
Mar 3, 2020 · Big Data

How MaFengWo Scaled Kafka for Real‑Time Big Data: Lessons and Best Practices

This article details MaFengWo's practical experience with Kafka in its big‑data platform, covering three core usage scenarios, a four‑stage evolution roadmap—including version upgrades, resource isolation, security and monitoring—and future plans such as transaction‑based deduplication and consumer throttling.

Big Data · Kafka · Resource Isolation
0 likes · 17 min read
Mafengwo Technology
Feb 28, 2020 · Backend Development

How We Achieve Real‑Time MySQL‑to‑Elasticsearch Sync with Binlog and Kafka

This article explains how a large e‑commerce platform replaced a MySQL‑centric intermediate table with a binlog‑driven pipeline that streams changes through Kafka into Elasticsearch, ensuring ordered, complete, and low‑latency data synchronization while addressing schema evolution and operational monitoring.

Backend · Binlog · Elasticsearch
0 likes · 11 min read
Big Data Technology & Architecture
Feb 27, 2020 · Fundamentals

Message Queue Usage, Advantages, Disadvantages, and High‑Availability Design

The article explains why message queues are used, outlines their core scenarios—decoupling, asynchronous processing, and traffic shaping—compares Kafka, ActiveMQ, RabbitMQ, and RocketMQ, and details how to ensure high availability, reliability, idempotency, ordering, and scalability in production systems.

Idempotency · Kafka · Message Queue
0 likes · 34 min read
Big Data Technology Architecture
Feb 26, 2020 · Big Data

Comprehensive Guide to Kafka Architecture, Messaging Mechanisms, Replication, Controllers, and Consumer Rebalance

This article provides an in‑depth yet approachable overview of Kafka's core concepts—including its architecture, terminology, message‑sending pipeline, replication strategy, controller role, and consumer group rebalance mechanisms—helping readers quickly grasp how Kafka works as a high‑throughput distributed messaging and streaming platform.

Consumer Rebalance · Distributed Messaging · Kafka
0 likes · 21 min read
Tencent Cloud Developer
Feb 18, 2020 · Backend Development

Technical Overview of Tencent Cloud CKafka for High-Scale Online Classroom Messaging

Tencent Cloud CKafka powers Tencent Classroom’s pandemic‑era online teaching by replacing a custom queue with a high‑performance, highly available, partition‑based message bus that scales to millions of real‑time interactions, offers configurable replication and tuning for reliability, and integrates with big‑data and streaming tools for analytics.

CKafka · Kafka · Message Queue
0 likes · 15 min read
Java Captain
Feb 12, 2020 · Backend Development

Comprehensive Comparison of Kafka, RabbitMQ, ZeroMQ, RocketMQ, and ActiveMQ

This article provides a detailed side‑by‑side comparison of five popular message‑queue systems—Kafka, RabbitMQ, ZeroMQ, RocketMQ and ActiveMQ—covering documentation, supported languages, protocols, storage, transactions, load balancing, clustering, management interfaces, availability, throughput, subscription models, ordering, acknowledgments, replay, retry, concurrency and more.

ActiveMQ · Comparison · Kafka
0 likes · 21 min read
Big Data Technology & Architecture
Feb 11, 2020 · Backend Development

Message Queue Interview Guide: Benefits, Drawbacks, High Availability, Idempotency, Ordering and Design Strategies

This article provides a comprehensive interview‑oriented overview of message queues, explaining why they are used, their core advantages and disadvantages, comparing Kafka, RabbitMQ, ActiveMQ and RocketMQ, and detailing high‑availability, reliability, idempotency, ordering, backlog handling and architectural design considerations.

Idempotency · Kafka · Message Queue
0 likes · 33 min read
Big Data Technology & Architecture
Feb 10, 2020 · Big Data

Real‑time MySQL Binlog Capture with Canal: Principles, Architecture, Deployment and Comparison with Maxwell

This article explains how to use Alibaba's Canal to capture MySQL binlog changes in real time, covering its underlying protocol, component architecture, HA design with ZooKeeper, configuration steps, deployment examples, and a detailed comparison with alternative tools such as Maxwell and mysql_streamer.

Big Data · Binlog · Canal
0 likes · 17 min read
Yanxuan Tech Team
Feb 10, 2020 · Backend Development

How Yanxuan’s Unified Message Center Scales with RocketMQ, Kafka, and K8s

This article details Yanxuan's evolution from a chaotic, multi‑queue setup to a unified, cloud‑native message center built on RocketMQ and Kafka, describing current services, scheduling mechanisms, publish‑subscribe implementation, and future plans for platformization and Kubernetes‑based resource management.

Kafka · Kubernetes · Messaging
0 likes · 13 min read
Big Data Technology Architecture
Feb 1, 2020 · Big Data

Apache Hudi 0.5.1 Release Highlights and Upgrade Guide

The Apache Hudi 0.5.1 release introduces upgraded Spark, Avro, Parquet and Kafka dependencies, new Scala support, timeline layout changes, CLI enhancements, DeltaStreamer parameter updates, Kafka offset enum revisions, key‑generator package relocation, Hive sync options, dynamic Bloom filter, bulk‑insert support, and AWS cloud storage compatibility.

Apache Hudi · DeltaStreamer · Kafka
0 likes · 6 min read
Architecture Digest
Jan 18, 2020 · Backend Development

Comprehensive Comparison of Kafka, RabbitMQ, ZeroMQ, RocketMQ, and ActiveMQ Across 17 Aspects

This article provides a detailed side‑by‑side comparison of five popular message‑queue systems—Kafka, RabbitMQ, ZeroMQ, RocketMQ and ActiveMQ—covering documentation, language support, protocols, storage, transactions, load balancing, clustering, management UI, availability, duplication handling, throughput, subscription models, ordering, acknowledgments, replay, retry mechanisms and concurrency.

Kafka · Message Queue · RabbitMQ
0 likes · 25 min read
Big Data Technology & Architecture
Jan 16, 2020 · Big Data

Kafka Interview Guide: Core Concepts, Architecture, and Practical Tips

This article compiles essential Kafka interview material, covering its role as a message queue, usage scenarios, architectural components, storage mechanisms, consumer group rebalancing, high‑availability features, replication details, ordering guarantees, producer/consumer client design, topic management, log retention, performance optimizations, and key monitoring metrics.

Big Data · Distributed Systems · Kafka
0 likes · 16 min read
Programmer DD
Jan 13, 2020 · Backend Development

How Kafka’s Broker Controller Keeps Your Data Flowing – Inside the Replication Engine

This article dives deep into Kafka’s internal mechanics, explaining how brokers replicate data, how the controller coordinates the cluster via ZooKeeper, the roles of leader and follower replicas, ISR management, request handling, fail‑over strategies, and consumer group rebalancing, all illustrated with diagrams.

Backend · Broker Controller · ISR
0 likes · 36 min read
Architect's Tech Stack
Jan 12, 2020 · Backend Development

Comprehensive Guide to Spring‑Kafka Integration and Advanced Features

This article provides a systematic tutorial on using Spring‑Kafka, covering basic setup, embedded Kafka for testing, topic creation methods, message sending with KafkaTemplate, transactional messaging, request‑reply patterns, advanced @KafkaListener configurations, manual acknowledgment, listener lifecycle control, SendTo forwarding, and retry with dead‑letter queues, all illustrated with complete code examples.

EmbeddedKafka · Kafka · Messaging
0 likes · 19 min read
ITPUB
Jan 10, 2020 · Big Data

How MaFengWo Scales Kafka for Real‑Time Big Data: Lessons and Best Practices

This article details MaFengWo’s practical experience using Kafka across three core scenarios—real‑time storage, analytical data source, and business data subscription—while describing a four‑stage evolution that includes version upgrades, resource isolation, security and monitoring enhancements, and a comprehensive subscription platform, followed by future improvement plans.

Big Data · Data Replay · Kafka
0 likes · 16 min read
Java High-Performance Architecture
Jan 7, 2020 · Backend Development

How to Build a Scalable Reporting Service in a Microservice Architecture

To generate a user‑enriched order report in a microservice system, the article compares four approaches—direct DB access, REST data aggregation, batch pulling, and an event‑driven model—highlighting their trade‑offs in coupling, performance, scalability, and resilience, and recommends the event‑push solution.

Data Integration · Event-driven · Kafka
0 likes · 5 min read