Tagged articles
1273 articles
Mike Chen's Internet Architecture
Jan 17, 2024 · Backend Development

Mastering Message Middleware: From Basics to Choosing the Right Solution

This article explains what message middleware is, outlines its key use cases such as asynchronous communication and decoupling, details core principles and models like point‑to‑point and publish/subscribe, reviews popular solutions (Kafka, RabbitMQ, RocketMQ, Pulsar, etc.), and offers selection guidance.

Distributed Systems · Kafka · Message Middleware
0 likes · 8 min read
Rare Earth Juejin Tech Community
Jan 13, 2024 · Big Data

What Is Kafka? Overview, Architecture, Features, Deployment, and Sample Code

Kafka, an Apache‑developed distributed publish/subscribe messaging system, provides reliable, high‑throughput real‑time data streaming with producers, consumers, brokers, streams, and connectors, and the article explains its core concepts, architecture, advantages, deployment methods, use cases, and includes Java code examples for producers and consumers.

Big Data · Kafka · Message Queue
0 likes · 8 min read
Architect
Jan 4, 2024 · Backend Development

RabbitMQ vs Kafka: How to Choose the Right Messaging System

This article explains asynchronous messaging patterns, compares RabbitMQ and Apache Kafka in depth, and provides a step‑by‑step decision guide that highlights their architectural differences, strengths, weaknesses, and suitable use‑cases for modern software systems.

Backend Architecture · Kafka · RabbitMQ
0 likes · 11 min read
Programmer DD
Jan 2, 2024 · Backend Development

How to Seamlessly Upgrade Spring Boot 2.x to 3.x – Step-by-Step Guide

This guide walks developers through upgrading from Spring Boot 2.x to 3.x, covering JDK 17 migration, pom.xml changes, configuration property updates, Jakarta EE transition, security reconfiguration, Kafka template adjustments, and OpenAPI integration, with code examples for each step.

Kafka · Spring Boot · jakarta-ee
0 likes · 9 min read
Architect
Dec 30, 2023 · Big Data

Designing a Scalable Log Collection Agent: Lessons from Vivo’s Bees‑Agent

This article details the end‑to‑end design of Vivo’s custom log‑collection agent, covering file discovery with inotify, unique file identification using inode and content hash, real‑time reading via RandomAccessFile, checkpointing, Kafka integration, offline HDFS ingestion, resource throttling, and platform‑wide management, while comparing it with open‑source alternatives.

Agent Design · Big Data · Kafka
0 likes · 26 min read
Cloud Native Technology Community
Dec 29, 2023 · Cloud Native

Mastering Strimzi Kafka Operator: Architecture, Deployment & Tuning on K8s

This article provides an in‑depth analysis of the Strimzi Kafka Operator, covering its core features, multi‑layer architecture, detailed installation steps on Kubernetes, Kafka cluster creation, production/consumption workflows, and the internal reconciliation mechanisms that enable automated scaling, storage tuning, and fault‑recovery.

Cloud Native · Deployment · Kafka
0 likes · 11 min read
Alibaba Cloud Native
Dec 28, 2023 · Cloud Computing

How to Set Up No‑Code Data Dump from Alibaba Cloud Kafka to OSS

This guide explains how to use Alibaba Cloud Message Queue Kafka's no‑code, fully managed, serverless dump feature to transfer data to OSS, covering its benefits, typical scenarios, required prerequisites, step‑by‑step configuration, testing, and verification of the resulting objects.

Alibaba Cloud · Data Integration · Kafka
0 likes · 9 min read
Efficient Ops
Dec 27, 2023 · Big Data

Why ClickHouse Beats Elasticsearch for Log Analytics – Performance, Cost & Deployment

This article compares ClickHouse and Elasticsearch for log analytics, highlighting ClickHouse’s superior write throughput, query speed, and lower server costs, then details a cost‑effective deployment architecture (including ZooKeeper, Kafka, Filebeat, and ClickHouse setup) and shares optimization tips and visualization with ClickVisual.

Big Data · Elasticsearch · Kafka
0 likes · 13 min read
ITPUB
Dec 24, 2023 · Backend Development

Why Kafka Is the Backbone of Modern Messaging, Streaming, and Data Pipelines

This article explains how Kafka serves as a high‑throughput, durable messaging system, a reliable storage layer, a log‑aggregation hub, a stream‑processing engine, and a core component for CDC, system migration, monitoring, and event‑sourcing architectures.

CDC · Event Sourcing · Kafka
0 likes · 9 min read
dbaplus Community
Dec 20, 2023 · Operations

Scaling Kafka to 1000+ Nodes: Governance, Auto‑Balancing & Tiered Storage

This article outlines how a large‑scale Kafka deployment of over a thousand machines across dozens of clusters was engineered for stability and efficiency through a custom Guardian controller that adds partition‑level throttling, automatic balancing, multi‑tenant isolation, cross‑IDC management, tiered storage, audit capabilities, and fully automated operational workflows.

Cluster Management · Kafka · Operations
0 likes · 21 min read
MaGe Linux Operations
Dec 18, 2023 · Backend Development

How to Prevent Kafka Message Loss in Critical Transaction Systems

This article explains why Kafka can lose messages at the producer, broker, and consumer stages, analyzes root causes such as asynchronous batch sends, JVM crashes, and network failures, and provides practical solutions including callbacks, retry mechanisms, replication settings, and manual offset commits to ensure reliable delivery.

Distributed Systems · Kafka · Message Reliability
0 likes · 10 min read
IT Xianyu
Dec 16, 2023 · Backend Development

Understanding CQRS and Event Sourcing with Spring Microservices

This article explains the CQRS pattern, its origins, benefits, and pitfalls, then details how to implement CQRS and event sourcing in Spring‑based microservices using Axon and Kafka, while discussing architectural considerations, scalability, consistency, and tooling.

Axon · CQRS · Event Sourcing
0 likes · 12 min read
Tencent Cloud Middleware
Dec 12, 2023 · Cloud Native

How Tencent Cloud Implements Tiered Storage for Kafka: Architecture, Challenges, and Evolution

This article examines the challenges of Kafka's traditional architecture, explains why local‑state heavy deployments cause operational difficulty and resource waste, and details Tencent Cloud's elastic, storage‑compute‑separated designs—including tiered storage, segment state machines, offset constraints, and performance optimizations—while sharing practical implementation insights and future directions.

Cloud Native · Data Lifecycle · Distributed Systems
0 likes · 17 min read
Architect's Guide
Dec 12, 2023 · Backend Development

Understanding Kafka Consumer: Delivery Guarantees, Rebalance Mechanisms, Partition Assignment, and Thread Safety

This article provides a comprehensive guide to KafkaConsumer, covering message delivery semantics (at‑most‑once, at‑least‑once, exactly‑once), practical exactly‑once implementations, consumer rebalance triggers and strategies, partition assignment algorithms, thread‑safety considerations, and detailed Java code examples of the consumer workflow.

Consumer · Kafka · Message Delivery
0 likes · 14 min read
ITPUB
Dec 2, 2023 · Backend Development

Why Did My Flink Kafka Job Lose Data? Uncovering Misconfigured Bootstrap Servers

A Flink job that reads from Kafka and writes to Elasticsearch was losing data because the bootstrap.servers list mixed production and pre‑release clusters, causing random server selection, partition discovery failures, and offset mismatches, which were resolved by correcting the server configuration.

Bootstrap Servers · Data loss · Flink
0 likes · 8 min read
JavaEdge
Nov 24, 2023 · Backend Development

Why Kafka Is the Ultimate Backbone for Modern Backend Systems

This article explores how Kafka serves as a versatile backbone for messaging, durable storage, log aggregation, monitoring, commit logs, recommendation pipelines, stream processing, CDC, system migration, and event sourcing, highlighting its performance, reliability, and practical deployment patterns.

Backend · Kafka · Message Queue
0 likes · 10 min read
Programmer DD
Nov 24, 2023 · Backend Development

What’s New in Spring Boot 3.2? Explore Java 21 Features and Virtual Threads

Spring Boot 3.2, released shortly after Java 21, brings a host of enhancements such as virtual thread support, CRaC checkpoint restore, SSL bundle reloading, improved observability, new RestClient and JdbcClient, Jetty 12, Pulsar, Kafka and RabbitMQ SSL, redesigned nested JAR handling, Docker image build upgrades, and a comprehensive video walkthrough by Josh Long.

Docker · Java 21 · Kafka
0 likes · 7 min read
Selected Java Interview Questions
Nov 21, 2023 · Backend Development

Design and Implementation of a Generic Asynchronous Processing SDK for Spring Applications

This article introduces a generic asynchronous processing SDK for Spring-based backend systems, explaining its purpose, advantages, underlying principles, components such as Kafka, XXL‑Job, MySQL, and Vue, design patterns employed, database schema, configuration via Apollo, usage examples, and deployment details.

Kafka · SDK · design-patterns
0 likes · 8 min read
IT Architects Alliance
Nov 17, 2023 · Backend Development

Design and Implementation of a Generic Asynchronous Processing SDK

This article introduces a generic asynchronous processing SDK for Java Spring applications, covering its purpose, advantages, underlying principles, component choices such as Kafka, XXL‑Job, and MySQL, design patterns, database schema, configuration via Apollo, usage instructions, and safety considerations, and provides code examples and a GitHub repository link.

Async · Kafka · SDK
0 likes · 9 min read
DeWu Technology
Nov 15, 2023 · Backend Development

Thread Profiling: Design and Implementation of Client‑Server Performance Analysis

Thread profiling uses threshold‑triggered tasks on business threads to capture stack snapshots, which a dedicated profiler thread sends via high‑performance gRPC to a server that queues them in Kafka, enriches and stores them in ClickHouse, correlates with OpenTelemetry traces, and provides metrics that let developers quickly pinpoint latency bottlenecks and improve system stability.

Go · Kafka · OpenTelemetry
0 likes · 11 min read
Architect
Nov 13, 2023 · Backend Development

Designing a Robust Asynchronous Processing SDK with Spring, Kafka, and MySQL

This article explains why asynchronous processing is needed in evolving systems, outlines the goals of guaranteeing execution without blocking the main flow, and walks through a complete SDK design that uses Spring transaction events, Kafka, XXL‑Job, MySQL, and a Vue UI, including configuration, code snippets, and deployment details.

Asynchronous · Backend · Design
0 likes · 10 min read
Selected Java Interview Questions
Nov 10, 2023 · Backend Development

Design and Implementation of a Generic Asynchronous Processing SDK for Java Backend Systems

This article introduces a generic asynchronous processing SDK for Java back‑ends, explaining its purpose, advantages, underlying principles, components, design patterns, database schema, configuration, usage, and operational considerations, and provides complete code and configuration examples.

Kafka · backend-development · java
0 likes · 9 min read
macrozheng
Nov 9, 2023 · Big Data

7 Real-World Kafka Use Cases Every Engineer Should Know

This article explains Kafka's core components and features, then details seven practical scenarios—including log processing, recommendation streams, monitoring, CDC, system migration, event sourcing, and message queuing—showing how Kafka powers modern distributed systems.

Big Data · Kafka · Message Queue
0 likes · 12 min read
ITPUB
Nov 7, 2023 · Big Data

7 Real-World Kafka Use Cases That Power Modern Distributed Systems

This article introduces Apache Kafka’s core components and key features, then details seven practical use cases—including log processing, recommendation streams, monitoring, CDC, system migration, event sourcing, and message queuing—illustrated with diagrams and step‑by‑step workflows for distributed systems.

Big Data · Kafka · Message Queue
0 likes · 10 min read
Selected Java Interview Questions
Nov 5, 2023 · Backend Development

Design and Implementation of a High‑Performance Distributed Reconciliation System for Large‑Scale Payment Orders

This article presents a comprehensive design of a distributed reconciliation system that handles tens of millions of daily payment orders by using a six‑module architecture, Kafka for decoupled state transitions, Hive for large‑scale data processing, and Java‑based plug‑in patterns to achieve six‑nines (99.9999%) accuracy and significant operational cost savings.

Big Data · Distributed Systems · Kafka
0 likes · 15 min read
Bilibili Tech
Nov 3, 2023 · Big Data

Comprehensive Governance and Optimization Strategies for Large‑Scale Kafka Clusters

To tame a petabyte‑scale Kafka deployment of over 1,000 brokers, the team built a Raft‑based federation controller (Guardian) that adds per‑partition I/O throttling, disk‑aware automatic balancing, multi‑tenant isolation, cross‑IDC migration, request‑queue splitting, tiered storage, auditing, and fully automated rolling upgrades, enabling stable, self‑healing operations.

Big Data · Cluster Governance · Distributed Systems
0 likes · 21 min read
Top Architect
Nov 2, 2023 · Big Data

Understanding Distributed Systems and Kafka: Concepts, Architecture, and Ensuring Ordered Message Consumption

This article introduces the fundamentals of distributed systems, provides an overview of Apache Kafka’s architecture and core components, explains how Kafka ensures message ordering within partitions, and outlines Java‑based strategies to guarantee ordered consumption, including single‑partition consumption, partition assignment, and key‑based partitioning.

Big Data · Kafka · Message Ordering
0 likes · 10 min read
HelloTech
Oct 31, 2023 · Big Data

Investigation of Data Loss in a Flink Kafka Consumer Caused by Mixed Kafka Cluster Configuration

The data loss in a Flink‑Kafka job was caused by a mis‑configured bootstrap.servers list that mixed production and pre‑release Kafka clusters, leading different subtasks to connect to different clusters, resulting in inconsistent partition discovery and offset fetching, which omitted several partitions until the list was corrected.

Cluster Configuration · Data loss · Elasticsearch
0 likes · 8 min read
JD Cloud Developers
Oct 25, 2023 · Backend Development

Master Kafka: Core Concepts, Architecture, and Practical Tips

This article explains Kafka's fundamentals, including topics, partitions, brokers, replication, producer‑consumer workflow, consumer groups, offset management, and common exception handling, while providing code examples and diagrams to help developers understand and effectively use this distributed messaging system.

Distributed Systems · Kafka · Message Queue
0 likes · 21 min read
Architect
Oct 19, 2023 · Industry Insights

How Vivo Built a Highly Available Push System: Multi‑Region Architecture, Real‑Time Traffic Scheduling, and Disaster‑Recovery Strategies

This article analyzes the design of Vivo's push notification platform, detailing its high‑concurrency requirements, three‑region long‑connection deployment, traffic‑scheduling bypass layer, and layered storage disaster‑recovery solutions, while explaining the trade‑offs and performance metrics behind each architectural decision.

Cloud Native · Kafka · System Architecture
0 likes · 14 min read
Wukong Talks Architecture
Oct 13, 2023 · Backend Development

7 Common Message Queue Scenarios and Their Implementations

This article explains seven typical message‑queue use cases—including ordinary, ordered, delayed, transactional, trace, dead‑letter, and priority messages—detailing their business motivations, implementation challenges, and concrete code examples for Kafka, RocketMQ, Pulsar, and RabbitMQ.

Distributed Systems · Kafka · Message Queue
0 likes · 11 min read
MaGe Linux Operations
Oct 8, 2023 · Big Data

Understanding Kafka: Core Concepts, Architecture, and Performance Secrets

This article explains Kafka’s fundamental role as a messaging system, detailing topics, partitions, producers, consumers, replica management, consumer groups, the controller, ZooKeeper coordination, and performance optimizations such as sequential writes, zero‑copy, log segmentation, and network design, providing a comprehensive overview for big‑data practitioners.

Big Data · Distributed Systems · Kafka
0 likes · 11 min read
Efficient Ops
Oct 7, 2023 · Big Data

Master Kafka Basics: Topics, Partitions, Producers, and Cluster Architecture

This article explains Kafka's role as a messaging system, covering core concepts such as topics, partitions, producers, consumers, messages, cluster architecture, replicas, consumer groups, controller coordination with ZooKeeper, and performance optimizations like sequential writes and zero‑copy networking.

Big Data · Distributed Systems · Kafka
0 likes · 11 min read
Sanyou's Java Diary
Oct 6, 2023 · Backend Development

Inside Kafka Broker: How Its Network Architecture Handles Millions of Requests

This article deeply dissects Kafka Broker's network architecture and request‑processing pipeline, covering sequential, multithreaded, and event‑driven designs, the Reactor pattern, Acceptor and Processor threads, core request flow, and practical tuning parameters for high‑throughput, low‑latency deployments.

Kafka · Reactor Pattern · backend-development
0 likes · 22 min read
Java High-Performance Architecture
Sep 28, 2023 · Databases

How to Use Debezium for MySQL CDC in Spring Boot Without Adding Extra Middleware

Learn how to capture MySQL data changes using Debezium's CDC capabilities within a Spring Boot application, avoiding heavyweight message brokers by leveraging binlog monitoring, configuring connectors, handling snapshots, and processing change events for use cases like cache invalidation, data integration, and simplifying monolithic architectures.

CDC · Data Integration · Debezium
0 likes · 24 min read
HomeTech
Sep 27, 2023 · Backend Development

Design and Evolution of a High‑Availability SMS Platform at AutoHome

This article details the architectural evolution, high‑availability strategies, fault‑monitoring mechanisms, and performance optimizations of AutoHome's enterprise SMS platform, covering its migration from .NET to Java, service decomposition with Kafka, multi‑datacenter deployment, and operational safeguards for large‑scale events.

Backend · Kafka · SMS
0 likes · 9 min read
Top Architect
Sep 25, 2023 · Backend Development

RabbitMQ vs Kafka: Detailed Comparison and When to Use Each

This article provides an in‑depth technical comparison of RabbitMQ and Apache Kafka, covering their core architectural differences, message ordering, routing, timing, retention, fault handling, scalability, consumer complexity, and offers guidance on selecting the appropriate platform for various backend scenarios.

Kafka · Message Queue · RabbitMQ
0 likes · 18 min read
Efficient Ops
Sep 24, 2023 · Big Data

Mastering Kafka: From Basics to Advanced Operations and Performance Tuning

This article provides a comprehensive overview of Apache Kafka, covering its architecture, core concepts such as topics, partitions, and replicas, common operational commands, and practical performance‑tuning tips for high‑throughput, low‑latency streaming workloads.

Distributed Systems · Kafka · Operations
0 likes · 23 min read
Sanyou's Java Diary
Sep 21, 2023 · Big Data

Understanding Kafka: Core Concepts, Architecture, and Reliability Explained

This article provides a comprehensive overview of Kafka, covering its overall architecture, key components such as brokers, producers, consumers, topics, partitions, replicas, and ZooKeeper, as well as logical and physical storage mechanisms, producer and consumer workflows, configuration parameters, partition assignment strategies, rebalancing, and the replication model that ensures data reliability.

Data Streaming · Distributed Systems · Kafka
0 likes · 18 min read
Wukong Talks Architecture
Sep 21, 2023 · Backend Development

Detecting and Preventing Message Loss in Kafka Message Queues

This article explains how to detect, diagnose, and prevent message loss in Kafka-based message queue systems by covering system decoupling, traffic control, data consistency challenges, producer, broker, and consumer issues, and offering configuration, monitoring, and operational best‑practice solutions.

Data Consistency · Distributed Systems · Kafka
0 likes · 12 min read
ITPUB
Sep 15, 2023 · Databases

Importing Billions of Kafka Rows into Doris and Benchmarking Against ClickHouse

This article explains Doris's various data import methods, focuses on the routine load approach for Kafka streams, describes how to handle mixed‑schema topics using the max_error_number parameter, and compares query performance of a 130 million‑row dataset against ClickHouse, highlighting each system's strengths and limitations.

Kafka · Routine Load · clickhouse
0 likes · 10 min read
ITPUB
Sep 13, 2023 · Backend Development

Why Is Kafka So Fast? 7 Core Techniques Behind Its High Throughput

This article explains how Kafka achieves million‑message‑per‑second throughput by leveraging zero‑copy I/O, an append‑only log, batch processing, compression, consumer pull optimization, unflushed memory buffers, and JVM garbage‑collection tuning, detailing each mechanism and its impact on performance.

Batch Processing · GC optimization · Kafka
0 likes · 14 min read
MaGe Linux Operations
Sep 7, 2023 · Backend Development

Why Message Order Matters: Solving MQ Chaos in MySQL Binlog Sync

This article explains how ordering issues in message queues like RabbitMQ and Kafka can break MySQL binlog synchronization, illustrates common pitfalls, and offers practical solutions to guarantee correct processing order in high‑throughput backend systems.

Kafka · Message Queue · RabbitMQ
0 likes · 5 min read
Sanyou's Java Diary
Sep 7, 2023 · Operations

How to Keep Kafka Stable: Proven Practices for Prevention, Monitoring, and Recovery

This comprehensive guide explains how to ensure Kafka stability by applying proactive prevention, continuous runtime monitoring, and effective fault‑resolution strategies, covering producer and consumer tuning, cluster configuration, performance optimization, alerting, and idempotent consumption to prevent message loss and service disruption.

Kafka · fault-recovery · performance tuning
0 likes · 30 min read
Mike Chen's Internet Architecture
Sep 7, 2023 · Backend Development

Comprehensive Overview of Message Queues: Types, Core Concepts, and Comparison of Kafka, RocketMQ, and RabbitMQ

This article provides a detailed overview of popular message queue systems, explains their core concepts such as decoupling and eventual consistency, and compares the advantages and disadvantages of Kafka, RocketMQ, RabbitMQ, and other notable MQ solutions for high‑concurrency scenarios.

Backend Architecture · Distributed Systems · Kafka
0 likes · 7 min read
ITPUB
Aug 26, 2023 · Operations

When to Choose Kafka Over RabbitMQ? A Detailed Comparison

This article compares Kafka and RabbitMQ across scalability, durability, latency, data flow, ordering, reliability, persistence, extensibility, and complexity, then outlines ideal use cases for each and offers practical guidance on selecting the right message queue for a project.

Comparison · Kafka · Message Queue
0 likes · 8 min read
Bilibili Tech
Aug 11, 2023 · Backend Development

Designing a High‑Performance Asynchronous Event System for Bilibili’s Like Service

Bilibili’s railgun platform transforms its high‑traffic like service into a scalable, fault‑tolerant system by moving writes to an asynchronous, Kafka‑driven pipeline, applying CQRS, partitioned processing, idempotency, hot‑key isolation, rate‑limiting, and unified SDKs, dramatically reducing database load and achieving ten‑fold throughput gains.

Asynchronous · Backend · CQRS
0 likes · 21 min read
37 Interactive Technology Team
Jul 26, 2023 · Backend Development

Investigation and Resolution of CPU Spike in a Kafka-Go Consumer Using pprof

Using Go’s pprof, the team traced a gradual CPU spike in a high‑throughput kafka‑go consumer to a saturated commit queue and repeatedly nested context values, which forced costly lookups; eliminating the unnecessary trace‑id context injection (or recreating a fresh context each loop) resolved the issue and reduced CPU usage to under 2%.

CPU profiling · Consumer · Go
0 likes · 10 min read
Code Ape Tech Column
Jul 21, 2023 · Backend Development

Implementing Distributed WebSocket Messaging with Redis and Kafka in Spring

This article explains how to enable cross‑node WebSocket communication in a distributed Spring application by publishing messages to a Redis or Kafka topic, tracking user connections, and routing messages to the appropriate server instance, complete with full code examples and configuration details.

Distributed Messaging · Kafka · WebSocket
0 likes · 16 min read
Tencent Cloud Middleware
Jul 20, 2023 · Operations

Why CKafka Cross‑Region Sync Stalled at 64KB/s: TCP Window Scaling & Kernel Tuning

This article details a real‑world investigation of severe latency in CKafka cross‑region data synchronization, tracing the issue from high message backlog through network bandwidth tests, kernel parameter adjustments, and finally uncovering a TCP window‑scaling failure caused by SYN‑cookie protection and missing timestamp options.

CKafka · Cross-Region Sync · Kafka
0 likes · 15 min read
Sanyou's Java Diary
Jul 17, 2023 · Backend Development

How Kafka’s Broker Handles Millions of Requests: Inside Its Network Architecture

This article deeply analyzes Kafka broker’s network architecture and request‑handling pipeline, walking through simple sequential models, multithreaded async designs, the Reactor pattern with Java NIO, key thread roles, core processing flow, and practical tuning parameters for high‑throughput, low‑latency deployments.

Java NIO · Kafka · Reactor Pattern
0 likes · 21 min read
Architects Research Society
Jul 16, 2023 · Big Data

Four Innovation Phases of Netflix’s Trillion‑Scale Real‑Time Data Infrastructure

The article chronicles Netflix’s evolution from a failing batch pipeline to a cloud‑native, self‑service streaming platform, detailing four development phases, the technical challenges faced, the stream‑processing patterns introduced, key learnings, and future opportunities for real‑time data and machine‑learning workloads.

Data Platform · Flink · Kafka
0 likes · 30 min read
21CTO
Jul 15, 2023 · Backend Development

From ActiveMQ to Pulsar: The Evolution of Message Queues Explained

This article traces the development of message queues from early decoupling solutions like ActiveMQ and RabbitMQ, through high‑throughput designs such as Kafka and RocketMQ, to modern platform‑centric systems like Pulsar, while detailing core concepts, architecture diagrams, storage mechanisms and trade‑offs.

Backend · Kafka · Message Queue
0 likes · 15 min read
Tencent Cloud Developer
Jul 12, 2023 · Backend Development

Evolution and Architecture of Message Queues: Kafka, RocketMQ, and Pulsar

The article traces two decades of message‑queue evolution—from early decoupling tools like ActiveMQ, through Kafka’s high‑throughput log model, to Pulsar’s layered, cloud‑native architecture—explaining core concepts, storage designs, and trade‑offs that guide choosing the right MQ for modern workloads.

Kafka · Pulsar · RocketMQ
0 likes · 16 min read
NetEase Cloud Music Tech Team
Jul 10, 2023 · Big Data

Design and Implementation of the Log Reporting, Collection, and Distribution Pipeline in NetEase Cloud Music's Corona Front‑end Monitoring System

The article details NetEase Cloud Music’s Corona monitoring pipeline, explaining how SDKs report logs via an HTTP service, how a transmission layer normalizes and stores them, how a Flume‑like collector forwards logs to HBase and Kafka, and how Flink tasks shard and filter streams for various monitoring services while handling traffic spikes and offering an independent Node.js channel for other business units.

Distributed Systems · Flink · Kafka
0 likes · 10 min read
Architects Research Society
Jul 8, 2023 · Backend Development

System Design of Hotel Booking Applications (Airbnb, Booking.com, OYO)

This article explains how large hotel‑booking platforms such as Airbnb, Booking.com and OYO use a micro‑service architecture—including hotel management, customer search/booking, and view‑booking services—combined with load balancers, Kafka, Elasticsearch, Redis, Cassandra and Hadoop to achieve a seamless, high‑throughput booking flow.

Backend Architecture · Elasticsearch · Kafka
0 likes · 7 min read
MaGe Linux Operations
Jun 30, 2023 · Operations

How Cloudflare Scaled Kafka to Process Over 1 Trillion Messages: Ops Lessons and Architecture

This article explains how Cloudflare’s engineering teams built and evolved a highly scalable Kafka‑based messaging backbone, adopted protobuf for structured communication, created a generic message‑bus, implemented connectors, added observability with Prometheus and OpenTelemetry, and refined health‑checks and batch processing to support trillions of inter‑service messages.

Cloudflare · Kafka
0 likes · 13 min read
Sanyou's Java Diary
Jun 26, 2023 · Big Data

Master Kafka Interview Questions: Architecture, Partitioning, and Reliability Explained

This article provides a comprehensive overview of Kafka, covering its core architecture, message queue models, communication process, partition selection, consumer groups, rebalancing strategies, partition assignment algorithms, reliability guarantees, replica synchronization, and the reasons for removing ZooKeeper in newer versions.

Kafka · Partitioning · Reliability
0 likes · 20 min read
MaGe Linux Operations
Jun 20, 2023 · Big Data

What Is Kafka? A Beginner’s Guide to Distributed Streaming and Messaging

Kafka is an open‑source, distributed streaming platform that uses a publish/subscribe message queue architecture to provide high‑throughput, fault‑tolerant real‑time data processing, featuring topics, partitions, replicas, consumer groups, and multiple APIs for producers, consumers, streams, connectors, and administration.

Big Data · Distributed Streaming · Kafka
0 likes · 20 min read
FunTester
Jun 19, 2023 · Big Data

Kafka Architecture and Core Concepts: Brokers, Producers, Consumers, Topics, Partitions, Replicas, and Reliability

This article provides a comprehensive overview of Kafka's architecture and fundamental concepts, covering its overall structure, key components such as brokers, producers, consumers, topics, partitions, replicas, leader‑follower synchronization, offset handling, message storage at both logical and physical layers, as well as producer and consumer workflows, partition assignment strategies, rebalancing, log management, zero‑copy I/O, and reliability mechanisms.

Distributed Systems · Kafka · Log Management
0 likes · 22 min read
JD Tech
Jun 16, 2023 · Big Data

Comprehensive Introduction to Apache Kafka: Architecture, Features, and Best Practices

This article provides a detailed overview of Apache Kafka, covering its distributed streaming architecture, storage mechanisms, replication, consumer groups, compression techniques, exactly‑once semantics, configuration tips, and performance optimizations for building reliable high‑throughput data pipelines.

Big Data · Distributed Streaming · Exactly-Once
0 likes · 19 min read
Didi Tech
Jun 14, 2023 · Big Data

Real-Time Data Development Practices and Component Selection at Didi

Didi’s unified real‑time data stack outlines best‑practice component choices for four key scenarios—metric monitoring, BI analysis, online services, and feature/tag systems—detailing pipelines from source to sink, resource‑usage guidelines, and a one‑stop development platform to build stable, high‑performance streaming solutions.

Druid · Flink · Kafka
0 likes · 17 min read
Architect's Guide
Jun 6, 2023 · Backend Development

Kafka Core Concepts, Architecture, Performance Tuning, and Cluster Capacity Planning

This article provides a comprehensive overview of Kafka, covering its core value for decoupling and asynchronous processing, fundamental concepts such as producers, consumers, topics, partitions and replication, high‑performance mechanisms like zero‑copy and OS cache, detailed resource evaluation for CPU, memory, disk and network, operational tools, consumer‑group rebalance strategies, LEO/HW offsets, controller management, and delayed‑task scheduling.

Backend · Cluster Planning · Kafka
0 likes · 29 min read
Top Architect
Jun 5, 2023 · Big Data

Deep Dive into Kafka’s High Reliability and High Performance Mechanisms

This article comprehensively explores Kafka’s core concepts, architecture, and the techniques it employs—such as ack strategies, replica synchronization, high‑watermark, leader‑epoch, zero‑copy, batch sending, compression, and reactor‑based networking—to achieve both strong reliability and high throughput in distributed messaging systems.

Distributed Systems · Kafka · Message Queue
0 likes · 31 min read
Architects Research Society
Jun 5, 2023 · Backend Development

Panel Discussion on Large‑Scale Event‑Driven Architectures and Practical Lessons

A multi‑expert panel shares experiences, challenges, and best practices for building, operating, and evolving large‑scale event‑driven systems using technologies like Kafka, covering architecture decisions, domain modeling, observability, handling unordered events, and advice for day‑two operations.

Event-driven · Kafka · Microservices
0 likes · 29 min read
Architects Research Society
Jun 2, 2023 · Backend Development

Comparing RabbitMQ, Kafka, and Redis for Asynchronous Microservice Communication

This article examines synchronous versus asynchronous microservice communication, outlines the benefits of async messaging, and compares three popular message brokers—RabbitMQ, Kafka, and Redis—by evaluating their scale, persistence, consumer models, and ideal use cases to help developers choose the right solution.

Kafka · RabbitMQ · asynchronous communication
0 likes · 12 min read
Inke Technology
May 25, 2023 · Backend Development

How to Build a Scalable Task System with Kafka and Pub/Sub

This article explains how to design and implement a basic task system—covering task configuration, display, progress tracking, and reward collection—while using Kafka for decoupling services and a publish/subscribe event layer to keep task logic independent of event sources.

Event-driven · Kafka · Publish-Subscribe
0 likes · 8 min read
Top Architect
May 15, 2023 · Backend Development

Comprehensive Guide to Kafka: Architecture, Performance Tuning, and Operational Practices

This article provides an in-depth overview of Kafka, covering its core value as a message queue, fundamental concepts, cluster architecture, producer and consumer configurations, scaling strategies, monitoring tools, and practical operational commands for building and maintaining high‑throughput, highly available streaming systems.

Backend · Kafka · Message Queue
0 likes · 31 min read

How Netflow Powers Real‑Time Network Traffic Monitoring and Analysis

This article explains Netflow’s principles, its three‑component architecture, implementation details using open‑source tools like nfdump and Kafka, and showcases practical applications such as load balancing, anomaly detection, and traffic engineering, providing a comprehensive guide for building robust network flow monitoring solutions.

Kafka · Netflow · Network Monitoring
0 likes · 11 min read
DataFunTalk
May 5, 2023 · Big Data

NetEase Cloud Music Real-Time Data Warehouse Architecture and Low-Code Platform Practices

This article presents NetEase Cloud Music's real-time data warehouse architecture, covering its streaming and batch scenarios, layered design (ODS, CDM, ADS), technology stack choices, consistency mechanisms, the FastX low-code platform, and future development plans, offering a comprehensive technical overview for data engineers and architects.

Big Data · Flink · Kafka
0 likes · 18 min read
FunTester
May 5, 2023 · Big Data

Kafka Client API Performance Testing with Producer and Consumer Examples

This article introduces Kafka, a high‑performance distributed messaging system, and provides step‑by‑step Java/Groovy examples for configuring a producer and consumer, demonstrating how to benchmark their throughput using the FunQpsConcurrent framework, along with necessary Gradle dependencies and server setup instructions.

Consumer · Distributed Messaging · Kafka
0 likes · 8 min read
Top Architect
Apr 30, 2023 · Backend Development

Kafka Core Concepts, Architecture, Performance, and Operational Practices

This article provides a comprehensive overview of Kafka, covering its core value as a message queue, fundamental concepts, cluster architecture, log storage mechanisms, zero‑copy data transfer, high‑throughput and high‑availability design, consumer group behavior, rebalance strategies, and practical operational commands for managing topics, partitions, and offsets.

Backend · Distributed Systems · Kafka
0 likes · 31 min read
DataFunSummit
Apr 28, 2023 · Big Data

Building a Unified Streaming‑Batch Storage Architecture at Xiaohongshu

This article presents Xiaohongshu's design and implementation of a unified streaming‑batch storage system that integrates Lambda architecture, Kafka, Flink, Iceberg, and modern OLAP engines to solve real‑time data warehouse pain points and enable consistent, exactly‑once analytics across streaming and batch workloads.

Batch Processing · Flink · Iceberg
0 likes · 16 min read
vivo Internet Technology
Apr 26, 2023 · Operations

Disaster Recovery Design and Practices for Vivo Push System

Vivo’s push platform achieves high‑availability disaster recovery by deploying multi‑region broker clusters, implementing dual‑active logic nodes across two data centers, adding a Kafka‑backed buffering layer for traffic spikes, and using a hybrid Redis‑plus‑disk KV storage scheme to ensure durable, real‑time message delivery.

Kafka · Push System · disaster recovery
0 likes · 11 min read