Mastering Kafka: Producer‑Consumer vs Pub/Sub Patterns for Scalable Backend Design
This article explains Kafka's core concepts and compares producer‑consumer and publish‑subscribe models, illustrating how to apply each pattern for data ingestion and event distribution in distributed backend systems, and offers practical design alternatives when Kafka’s native capabilities fall short.
In the producer‑consumer model, producers continuously push data to a message hub, and different consumers pull that data and process it in their own way; within the same category, every consumer receives identical data. The publish‑subscribe model is a variant of the producer‑consumer pattern in which subscribers first declare interest in specific data, so each subscriber receives only the subset it cares about. Both patterns are common when message middleware is used for functional decoupling and inter‑service communication.
The article uses two scenarios—"data ingestion" and "event distribution"—to explore high‑level Kafka usage. Understanding Kafka’s basic concepts is a prerequisite for system design; coding implements the design, while debugging and performance tuning occur later. The focus is on application methods, not a one‑size‑fits‑all solution.
Data Ingestion
Imagine a user‑behavior collection system that gathers click data from an app. The reporting API and data processing are separated: the app posts data via a REST API, the backend immediately enqueues the data and returns, and workers later consume the queue for processing. This separation offers functional isolation, buffering of variable reporting rates, and easy scalability by adding more workers. It is an example of the producer‑consumer model, where the reporting API is the producer and the processing workers are the consumers.
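The enqueue‑and‑return flow described above can be sketched in a few lines. This is a minimal in‑process illustration, not Kafka code: a standard‑library queue stands in for the message hub, and the names run_ingestion, report, and worker are hypothetical.

```python
import queue
import threading


def run_ingestion(events, num_workers=3):
    """Enqueue every event, let worker threads drain the queue, return results."""
    buffer = queue.Queue()      # stands in for the message hub
    processed = []              # (worker_id, user) pairs from all workers
    lock = threading.Lock()

    def report(event):
        # Plays the role of the reporting API: enqueue and return at once.
        buffer.put(event)
        return "accepted"

    def worker(worker_id):
        while True:
            event = buffer.get()
            if event is None:   # sentinel: no more work for this worker
                break
            with lock:
                processed.append((worker_id, event["user"]))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for event in events:
        report(event)
    for _ in threads:
        buffer.put(None)        # one sentinel per worker
    for t in threads:
        t.join()
    return processed


clicks = [{"user": f"u{i}", "target": "buy_button"} for i in range(10)]
results = run_ingestion(clicks)
print(len(results), "events processed")
```

Each event is handled by exactly one worker, and raising num_workers adds processing capacity without touching the reporting side.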
Event Distribution
In an e‑commerce system, events such as "favorite", "order", and "payment" trigger additional actions (e.g., SMS notifications, points recording). Rather than embedding these actions in each module, an event distribution system publishes events to a message hub, and interested handlers subscribe to them. This follows the publish‑subscribe model.
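A toy event bus makes the subscribe‑then‑receive idea concrete. The sketch below is in‑memory only and does not use Kafka; EventBus, send_sms, and add_points are hypothetical names chosen for illustration.

```python
class EventBus:
    """Minimal in-memory publish-subscribe hub keyed by event type."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        # Deliver only to handlers that declared interest in this type.
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


sms_log, points_log = [], []


def send_sms(payload):
    sms_log.append(payload["user"])


def add_points(payload):
    points_log.append(payload["user"])


bus = EventBus()
bus.subscribe("order", send_sms)
bus.subscribe("payment", send_sms)
bus.subscribe("payment", add_points)

bus.publish("order", {"user": "alice"})
bus.publish("payment", {"user": "bob"})
unhandled = bus.publish("favorite", {"user": "carol"})  # nobody subscribed
print(sms_log, points_log, unhandled)
```

The modules that raise "order" or "payment" events never reference the SMS or points logic; new handlers are attached by subscribing, not by editing the publisher.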
Kafka Basic Concepts
Kafka is a distributed streaming platform that has traditionally relied on ZooKeeper for cluster management (recent releases can instead run in KRaft mode, without ZooKeeper). Like other messaging systems, it consists of producers, broker servers, and consumers. The article highlights three key concepts:
Topic – a logical category for messages, analogous to an exchange in RabbitMQ, used to separate different streams of data.
Partition – the physical storage unit of a topic; data for a single topic is spread across multiple partitions, which can reside on one or many machines, enabling horizontal scaling and, when partitions are replicated across brokers, fault tolerance.
Consumer Group – a logical grouping of consumers through which Kafka supports both unicast and broadcast delivery. Each group receives a full copy of a topic’s data, but within a group each partition is processed by only one consumer instance, which allows parallelism while preserving order within each partition.
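The three concepts fit together as in the following in‑memory model. It is a simplified sketch, not the Kafka client API: the Topic and ConsumerGroup classes are hypothetical, and zlib.crc32 is only a stand‑in for Kafka's own key hashing.

```python
import zlib


class Topic:
    """A named stream whose records are spread over several partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Same key -> same partition, so per-key order is preserved.
        # crc32 is an assumption here, not Kafka's actual partitioner hash.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index


class ConsumerGroup:
    """Tracks its own read offset per partition, independent of other groups."""

    def __init__(self, group_id, topic):
        self.group_id = group_id
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self):
        records = []
        for p, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[p]:])
            self.offsets[p] = len(log)  # remember how far this group has read
        return records


topic = Topic("clicks", num_partitions=3)
for i in range(6):
    topic.append(f"user-{i % 2}", i)

analytics = ConsumerGroup("analytics", topic)
audit = ConsumerGroup("audit", topic)
a, b = analytics.poll(), audit.poll()
print(len(a), len(b), analytics.poll())
```

Both groups see all six records because each keeps its own offsets, and a second poll from the same group returns nothing new.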
Producer‑Consumer Pattern
After grasping Kafka’s basics, the producer‑consumer pattern can be applied to the "data ingestion" scenario: a producer receives front‑end reports and writes them to a topic, while one or more consumer groups independently read the data. Each group can add workers as load grows, up to the number of partitions in the topic, since a partition is processed by only one worker per group.
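How far a group can scale follows from how partitions are shared among its workers. The function below is a simplified round‑robin sketch under that assumption; it is not one of Kafka's actual assignment strategies, and assign_partitions is a hypothetical name.

```python
def assign_partitions(num_partitions, workers):
    """Spread partition indexes over workers; each partition gets one owner."""
    if not workers:
        raise ValueError("a consumer group needs at least one worker")
    assignment = {worker: [] for worker in workers}
    for partition in range(num_partitions):
        owner = workers[partition % len(workers)]
        assignment[owner].append(partition)
    return assignment


small_group = assign_partitions(4, ["w1", "w2"])
large_group = assign_partitions(4, ["w1", "w2", "w3", "w4", "w5", "w6"])
idle = [w for w, parts in large_group.items() if not parts]
print(small_group)
print(idle)
```

With four partitions, a fifth and sixth worker receive nothing to do, so the partition count chosen for the topic sets the ceiling on parallelism within one group.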
Publish‑Subscribe Pattern
For the "event distribution" scenario, Kafka does not natively support routing events to specific subscribers based on content, so several work‑arounds are suggested:
Option 1: Use the producer‑consumer model and let each consumer group filter out irrelevant events; simple but generates unwanted traffic.
Option 2: Create a separate topic for each event type and have consumers subscribe only to the topics they need; feasible when the number of event types is small.
Option 3: Insert a stream‑processing layer that classifies events according to subscription rules and writes them to dedicated topics, reducing noise for consumers while handling large volumes.
Option 4: Manually manage partition assignment, which is complex and generally discouraged.
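The difference between Option 1 and Option 3 can be shown with plain lists standing in for topics. This is an illustrative sketch only: route_events, filter_for, the "events." topic prefix, and the subscription table are assumptions, and a real routing layer would run as a stream‑processing job rather than a single function.

```python
from collections import defaultdict


def filter_for(events, wanted_types):
    """Option 1: the consumer reads everything and drops what it does not need."""
    return [e for e in events if e["type"] in wanted_types]


def route_events(events, subscriptions):
    """Option 3: copy each event into the dedicated topic of every matching subscriber."""
    topics = defaultdict(list)
    for event in events:
        for subscriber, wanted_types in subscriptions.items():
            if event["type"] in wanted_types:
                topics["events." + subscriber].append(event)
    return dict(topics)


events = [
    {"type": "favorite", "user": "alice"},
    {"type": "order", "user": "bob"},
    {"type": "payment", "user": "bob"},
    {"type": "order", "user": "carol"},
]
subscriptions = {"sms": {"order", "payment"}, "points": {"payment"}}

routed = route_events(events, subscriptions)
points_via_filter = filter_for(events, subscriptions["points"])
print({name: len(items) for name, items in routed.items()})
```

Both approaches hand the points handler the same single payment event. With filtering, that handler still has to read all four events; with routing, it reads only its own dedicated topic, at the cost of running and maintaining the extra layer.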
Source: http://blog.csdn.net/zwgdft/article/details/54633105
