Scaling Event‑Driven Messaging at Wix with Kafka: Key Patterns

This article explains how Wix uses Kafka‑based event‑driven messaging to decouple microservices, improve scalability, and achieve exactly‑once processing through patterns such as consume‑and‑project, end‑to‑end event streams, in‑memory KV stores, scheduled jobs, transactional events, and event aggregation.

ITFLY8 Architecture Home

Over the past year I have been part of Wix's event‑driven messaging infrastructure team, built on Kafka and serving more than 1,400 microservices. During this time we implemented several key patterns that make the distributed system robust and able to handle growing traffic and storage demands.

1. Consume and Project

When a service becomes a bottleneck because it stores large domain objects, we create a materialized view that only contains the data needed by downstream services. At Wix the MetaSite service stores site metadata (versions, owners, installed apps) and receives over 1 M RPM. To offload read traffic we stream all site metadata to a Kafka topic, then:

Consume the stream and write the full objects into a database (or capture the database's changes with a CDC tool such as Debezium).

Build a write‑only service that projects only the "installed‑apps" context into a separate database table.

Expose a read‑only service that serves requests by querying the projected view.

This split reduces load on the original service and its database, allows independent scaling of read replicas, and provides an eventually consistent, highly optimized data projection.
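The projection step can be sketched as follows. This is a minimal illustration, not Wix's actual implementation: the record shape (`site_id`, `installed_apps`, and so on) is an assumed schema, and the Kafka consumer loop is simulated with a plain list.

```python
# Minimal sketch of consume-and-project. The record shape below is an
# illustrative stand-in for the MetaSite schema, and the Kafka consumer
# loop is simulated with a plain Python list.

def project_installed_apps(event, projection):
    """Project only the installed-apps context of a full site-metadata
    update into the read-optimized view, dropping everything else."""
    projection[event["site_id"]] = {"installed_apps": event["installed_apps"]}

# Simulated stream of full domain objects from the site-metadata topic.
stream = [
    {"site_id": "s1", "owner": "alice", "version": 7,
     "installed_apps": ["stores", "bookings"]},
    {"site_id": "s1", "owner": "alice", "version": 8,
     "installed_apps": ["stores", "bookings", "blog"]},
]

view = {}
for event in stream:
    project_installed_apps(event, view)

print(view["s1"])  # only the projected context survives
```

Because the projection keeps only one narrow context per site, the read service's table stays small and can be indexed purely for its own query patterns.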

2. End‑to‑End Event‑Driven Flow

By combining Kafka with WebSockets we replace the traditional request‑response polling model with a fully distributed, fault‑tolerant pipeline. In the example of importing all Wix user contacts, the browser opens a WebSocket channel, sends a CSV import request with the channel ID, and receives real‑time status updates from the contacts import service via Kafka‑driven events. This design eliminates stateful polling, improves resilience, and enables geographic replication of each stage.
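The flow above can be sketched as three decoupled pieces: the browser's channel, the import service emitting progress events, and a gateway routing events back to the channel. Topic, channel, and event names here are illustrative assumptions, and both Kafka and the WebSocket server are simulated with in-memory lists.

```python
# Hedged sketch of the end-to-end event-driven flow. Kafka and the
# WebSocket server are simulated with in-memory structures; all names
# are illustrative, not Wix's actual APIs.

channels = {}      # channel_id -> messages "pushed" to the browser
status_topic = []  # simulated Kafka topic of import-status events

def open_websocket(channel_id):
    channels[channel_id] = []

def import_contacts(csv_rows, channel_id):
    # The import service publishes progress events tagged with the
    # caller's channel ID instead of answering repeated polls.
    for i, _row in enumerate(csv_rows, start=1):
        status_topic.append(
            {"channel_id": channel_id,
             "status": f"imported {i}/{len(csv_rows)}"})

def websocket_gateway():
    # A separate consumer routes each event to the right open channel.
    for event in status_topic:
        channels[event["channel_id"]].append(event["status"])

open_websocket("ch-42")
import_contacts(["a@x.com", "b@x.com", "c@x.com"], channel_id="ch-42")
websocket_gateway()
print(channels["ch-42"][-1])  # imported 3/3
```

Because the only state shared between stages is the channel ID carried inside each event, any stage can be restarted or replicated without the browser re-polling.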

3. In‑Memory KV Store

For configuration data that must be read with minimal latency, we use Kafka compacted topics as an in-memory key/value store. A compacted topic retains the latest value for each key, so services such as Wix Business Manager and Wix Bookings consume it once at startup, materialize the map in local memory, and then serve reads without any network call. Unlike external stores (e.g., HBase, DynamoDB), the data is available in process memory as soon as the service finishes catching up on the topic.
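Compaction semantics can be illustrated with a short simulation: replaying a (key, value) stream and keeping only the newest value per key is exactly the view a consumer of a compacted topic ends up with. The keys and config values below are assumed examples.

```python
# Sketch of the compacted-topic KV pattern. Log compaction is simulated
# by replaying the topic and keeping the newest value per key; keys and
# values are illustrative.

def load_compacted(topic_records):
    """Replay a (key, value) stream the way a consumer of a compacted
    topic would, materializing the latest value per key in memory."""
    latest = {}
    for key, value in topic_records:
        latest[key] = value
    return latest

# Simulated compacted topic of per-site configuration updates.
config_topic = [
    ("site-1", {"timezone": "UTC"}),
    ("site-2", {"timezone": "EST"}),
    ("site-1", {"timezone": "PST"}),  # newer value wins after compaction
]

# At startup the service replays the topic once; reads are then served
# from local memory with no network hop.
local_kv = load_compacted(config_topic)
print(local_kv["site-1"])  # {'timezone': 'PST'}
```

The trade-off is startup time (the topic must be replayed) in exchange for reads that never leave the process.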

4. Schedule‑and‑Forget

Recurring jobs (e.g., subscription payment renewals) are scheduled by publishing a request to Kafka instead of repeatedly calling a REST endpoint. Kafka guarantees ordering within a partition, so keying renewal messages by user ID ensures each user's jobs are processed in order without complex locking. Failed messages are re-produced to dedicated retry topics (a pattern Wix implements in its Greyhound consumer library rather than a built-in Kafka feature), and a dead-letter queue captures messages that fail permanently.
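Two mechanisms carry this pattern: key-based partitioning for per-user ordering, and retry/dead-letter routing for failures. The sketch below simulates both; the partition count, topic names, and `charge` stub are illustrative assumptions, not Wix's actual code.

```python
# Sketch of schedule-and-forget dispatch: keying by user ID pins each
# user's renewals to one partition, and failing jobs flow through a
# retry topic and eventually a dead-letter queue. All names and limits
# here are illustrative.
import zlib

NUM_PARTITIONS = 4
MAX_ATTEMPTS = 3
retry_topic, dead_letters = [], []

def partition_for(user_id):
    # Kafka routes equal keys to the same partition (a hash of the key),
    # which is what yields per-user ordering.
    return zlib.crc32(user_id.encode()) % NUM_PARTITIONS

def handle_renewal(message, charge, attempt=1):
    try:
        charge(message)
    except Exception:
        if attempt < MAX_ATTEMPTS:
            retry_topic.append((message, attempt + 1))  # non-blocking retry
        else:
            dead_letters.append(message)                # give up, alert

# The same user always lands on the same partition:
assert partition_for("user-7") == partition_for("user-7")

def always_fails(_msg):
    raise RuntimeError("payment provider down")

handle_renewal({"user_id": "user-7"}, always_fails)             # -> retry
handle_renewal({"user_id": "user-7"}, always_fails, attempt=3)  # -> DLQ
print(len(retry_topic), len(dead_letters))  # 1 1
```

Routing retries through a separate topic keeps the main partition moving, so one slow or failing user does not block the others behind it.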

5. Transactional Events in Order Processing

To avoid duplicate processing in e-commerce checkout flows, the checkout service wraps the emission of the "Order Checkout Completed" event and the consumer offset commit in a single Kafka transaction. Downstream services (delivery, inventory, invoicing) consume with read-committed isolation, so they see the event only after the transaction commits, giving exactly-once semantics.
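The atomicity at the heart of this pattern can be simulated in a few lines: the event and the offset become visible together or not at all. A real implementation would use the Kafka transactional producer (`initTransactions` / `beginTransaction` / `sendOffsetsToTransaction` / `commitTransaction`); the class, topic, and group names below are illustrative.

```python
# In-memory simulation of the transactional pattern: the checkout event
# and the consumer offset commit become visible together or not at all.
# Names are illustrative; real code uses Kafka's transactional producer.

class CheckoutTransaction:
    def __init__(self, topic, offsets, consumer_group):
        self.topic, self.offsets, self.group = topic, offsets, consumer_group
        self._events, self._offset = [], None

    def send(self, event):
        self._events.append(event)     # buffered, not yet visible

    def mark_offset(self, offset):
        self._offset = offset          # committed only alongside the events

    def commit(self):
        self.topic.extend(self._events)          # downstream sees events now
        self.offsets[self.group] = self._offset  # ...and the offset, atomically

    def abort(self):
        self._events, self._offset = [], None    # nothing becomes visible

orders_topic, offsets = [], {}
txn = CheckoutTransaction(orders_topic, offsets, "checkout-service")
txn.send({"type": "Order Checkout Completed", "order_id": "o-1"})
txn.mark_offset(42)
assert orders_topic == []  # invisible before commit (read_committed view)
txn.commit()
print(orders_topic[0]["order_id"], offsets["checkout-service"])  # o-1 42
```

If the service crashes between consuming the checkout request and committing, neither the event nor the offset lands, so the request is simply reprocessed from the old offset with no duplicate event emitted.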

6. Event Aggregation

When large CSV imports are split into many small jobs, we track completion using a Kafka compacted topic as an atomic KV store. Each job writes a "Job Completed" event; a consumer‑producer pair aggregates these updates, increments a counter, and when the total matches the expected job count, a final notification is emitted via WebSocket.
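The aggregation logic reduces to a keyed counter plus a threshold check, sketched below. The `import_id` key, expected-job lookup, and notification callback are illustrative assumptions, and the compacted counter topic is simulated with a dict (the real consumer-producer pair must update it atomically to avoid lost increments).

```python
# Sketch of event aggregation over a compacted counter topic. The
# compacted topic is simulated with a dict; in production the
# consumer-producer pair must read-increment-write atomically.

completed = {}        # simulated compacted topic: import_id -> jobs done
notifications = []    # messages pushed to the browser over WebSocket

def on_job_completed(import_id, expected_jobs):
    # Each "Job Completed" event bumps the aggregate for its import; the
    # final event triggers exactly one end-of-import notification.
    completed[import_id] = completed.get(import_id, 0) + 1
    if completed[import_id] == expected_jobs:
        notifications.append(f"import {import_id} finished")

EXPECTED = 5
for _ in range(EXPECTED):
    on_job_completed("imp-1", EXPECTED)

print(completed["imp-1"], notifications)
```

Because the counter is keyed by import ID, many imports can be aggregated concurrently on the same topic without interfering with one another.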

All of these patterns rely on Kafka’s ability to stream data, provide ordering per key, support compacted topics for KV storage, and enable transactional processing, making the system highly scalable, resilient, and loosely coupled.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: distributed systems, Kafka, event-driven architecture, data streaming
Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
