Kafka for Data Ingestion and Event Distribution: Producer-Consumer and Publish-Subscribe Patterns
This article explains how Kafka can be used for data ingestion and event distribution. It illustrates the producer-consumer and publish-subscribe models, describes core concepts such as topics, partitions, and consumer groups, and offers practical design options for handling different event scenarios.
In message-oriented middleware, the producer-consumer model has producers continuously push data to a message hub, from which one or more consumers retrieve and process it. The publish-subscribe model adds a subscription step, so each subscriber receives only the events it is interested in. Both models are common ways to decouple functionality and enable communication in distributed systems.
The article uses two scenarios, data ingestion and event distribution, to demonstrate high-level Kafka usage. Understanding Kafka's basic concepts is a prerequisite for system design; coding then implements the design, and debugging and performance tuning come later.
Data Ingestion
Consider a user-behavior collection system where a mobile app reports click events via a REST API. The API immediately enqueues the data and returns, while a separate worker pool consumes the queue for processing. This separation provides functional isolation, buffering of bursty traffic, and easy scalability by adding more workers, exemplifying the producer-consumer pattern.
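The enqueue-and-return flow can be sketched in a single process. This is only an illustration under stated assumptions: a standard-library queue stands in for the message hub (a Kafka topic in the real design), and the names `report_click`, `worker`, and `run_demo` are hypothetical, not part of any Kafka API.

```python
import queue
import threading

# Hypothetical in-process stand-in for the message hub (a Kafka topic in the real design).
click_queue = queue.Queue()
processed = []                      # (worker_id, event_id) pairs
processed_lock = threading.Lock()


def report_click(payload):
    """Hypothetical API handler: enqueue the event and return at once."""
    click_queue.put(payload)
    return {"status": "accepted"}


def worker(worker_id):
    """Pull events from the queue until a None sentinel arrives."""
    while True:
        event = click_queue.get()
        if event is None:
            break
        with processed_lock:
            processed.append((worker_id, event["event_id"]))


def run_demo(num_events=20, num_workers=3):
    """Start the pool, report events, shut down, and return the processed ids."""
    processed.clear()
    pool = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in pool:
        t.start()
    for event_id in range(num_events):
        report_click({"event_id": event_id, "type": "click"})
    for _ in pool:
        click_queue.put(None)       # one sentinel per worker
    for t in pool:
        t.join()
    return sorted(event_id for _, event_id in processed)
```

Scaling out here means raising `num_workers`; the handler does not change, which is the isolation the scenario is after.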
Event Distribution
In an e-commerce system, actions such as "favorite", "order", and "payment" generate events that trigger additional processes (e.g., SMS notifications, point accrual). Embedding these processes in each service leads to tight coupling; instead, an event-distribution system lets services publish events to a central hub, while interested handlers subscribe to them.
Kafka Basic Concepts
Topic: A logical category of messages, loosely comparable to an exchange in RabbitMQ; producers write to a topic and consumers read from it.
Partition: The physical storage unit of a topic. A topic's partitions are distributed across brokers, which enables horizontal scaling, and each partition can be replicated for fault tolerance. A common rule of thumb is to make the partition count a multiple of the broker count so that load spreads evenly.
Consumer Group: A logical grouping that determines how messages are delivered. Within a group, each record is delivered to only one consumer, while every group subscribed to the topic receives its own copy. A single group therefore gives unicast semantics, and multiple groups give broadcast semantics.
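How the three concepts interact can be shown with a small in-memory model. This is a sketch under assumptions, not Kafka's implementation: `Topic`, `assign`, and `consume` are hypothetical names, `zlib.crc32` is a stand-in for the hash that Kafka's own partitioner uses, and the round-robin assignment is just one simple strategy.

```python
import zlib
from collections import defaultdict


class Topic:
    """In-memory stand-in for a topic: a list of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def send(self, key, value):
        # Records with the same key always land in the same partition.
        # crc32 is an assumption for illustration; Kafka uses a different hash.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p


def assign(num_partitions, members):
    """Round-robin partitions over group members; each partition gets one owner."""
    owners = defaultdict(list)
    for p in range(num_partitions):
        owners[members[p % len(members)]].append(p)
    return dict(owners)


def consume(topic, group_members):
    """Return {member: [values]} for one consumer group reading the whole topic."""
    received = {m: [] for m in group_members}
    for member, parts in assign(len(topic.partitions), group_members).items():
        for p in parts:
            received[member].extend(value for _, value in topic.partitions[p])
    return received


topic = Topic("clicks", 4)
for i in range(12):
    topic.send(f"user-{i % 5}", i)

analytics = consume(topic, ["a1", "a2"])   # one group: records are split between a1 and a2
audit = consume(topic, ["b1"])             # a second group: receives every record again
```

Within `analytics` each record reaches exactly one member, while `audit` independently sees all of them. A group with more members than partitions leaves the extra members idle, which is why the partition count caps the useful parallelism of a group.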
Designing the Producer-Consumer Pattern
With the Kafka fundamentals in place, the data-ingestion scenario can be implemented by having a producer push incoming app data to a topic, while one or more consumer groups (each possibly with multiple workers) process the data according to business needs.
Designing the Publish-Subscribe Pattern
For the event-distribution scenario, three events ("favorite", "order", "payment") may have different interested services. Kafka does not natively support routing keys, so several approaches are possible:
Use the same producer-consumer model and let each consumer group filter out the events it does not want (simple, but every group receives noise).
Create a separate topic per event type; consumers subscribe only to the topics they care about (works for a limited number of event types).
Introduce a stream‑processing layer that routes events to appropriate topics based on subscription rules (suitable for high volume with limited consumer groups).
Manually assign partitions to events, though this is hard to manage and generally discouraged.
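The first and third options can be compared with a short sketch. Everything here is an assumption for illustration: the topic names in `ROUTES` and the functions `filter_in_consumer` and `route` are hypothetical, and plain Python lists stand in for Kafka topics.

```python
from collections import defaultdict

# Hypothetical subscription rules: event type -> destination topics.
ROUTES = {
    "favorite": ["recommendation-events"],
    "order": ["sms-events", "points-events"],
    "payment": ["sms-events", "points-events"],
}


def filter_in_consumer(events, wanted_types):
    """Option 1: the group reads every event and discards what it does not need."""
    return [e for e in events if e["type"] in wanted_types]


def route(events, routes=ROUTES):
    """Option 3: a routing layer copies each event to the topics subscribed to its type."""
    topics = defaultdict(list)
    for event in events:
        # Event types with no rule are dropped by this sketch.
        for destination in routes.get(event["type"], []):
            topics[destination].append(event)
    return dict(topics)


events = [
    {"type": "favorite", "id": 1},
    {"type": "order", "id": 2},
    {"type": "payment", "id": 3},
    {"type": "refund", "id": 4},   # no rule for this type
]
routed = route(events)
sms_only = filter_in_consumer(events, {"order", "payment"})
```

Both paths hand the SMS handler the same two events. The difference is where the work happens: filtering makes every group read the full stream, while routing pays for the fan-out once in the routing layer and writes extra copies of each event.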
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
