Mastering Apache Pulsar Geo‑Replication: Modes, Configs, and Common Pitfalls
Apache Pulsar’s built‑in Geo‑Replication lets multiple clusters in different regions synchronize data, in either synchronous or asynchronous mode. This guide explains the three asynchronous patterns (full‑mesh, unidirectional, and failover), covering the required configuration, how each pattern works, and current limitations.
Overview
Apache Pulsar is a multi‑tenant, high‑performance messaging platform that supports low latency, read/write separation, cross‑region replication, rapid scaling, and flexible fault tolerance. Its native Geo‑Replication lets clusters in different physical locations replicate data.
Why Geo‑Replication Matters
With Geo‑Replication, services can be spread across multiple data centers, providing resilience against a whole‑site failure; if one site goes down, traffic can be switched to another site with minimal interruption.
Replication Modes
Based on whether replication is synchronous or asynchronous, two high‑level approaches exist:
Synchronous mode: Guarantees strong durability by writing to replicas in different cities before acknowledging the client, but network jitter can hurt performance.
Asynchronous mode: Writes locally first, then copies to remote sites, preserving producer latency at the cost of extra storage and eventual consistency.
Asynchronous Geo‑Replication Options
The article focuses on asynchronous replication and lists three architectural patterns:
Full‑mesh (all clusters replicate to each other)
Unidirectional replication
Failover mode
These patterns can be further divided by the presence of a global configuration store (configurationStoreServers, i.e., a global ZooKeeper):
With configurationStoreServers – only Full‑mesh is supported.
Without configurationStoreServers – Unidirectional and Failover are available.
Key Configuration Items
When initializing a Pulsar cluster, the following parameters must be supplied:
cluster (cluster name)
zookeeper (local ZooKeeper servers)
configuration-store (global ZooKeeper servers, optional)
web-service-url / web-service-url-tls
broker-service-url / broker-service-url-tls
bin/pulsar initialize-cluster-metadata \
--cluster pulsar-cluster-1 \
--zookeeper zk1.us-west.example.com:2181 \
--configuration-store zk1.us-west.example.com:2181 \
--web-service-url http://pulsar.us-west.example.com:8080 \
--web-service-url-tls https://pulsar.us-west.example.com:8443 \
--broker-service-url pulsar://pulsar.us-west.example.com:6650 \
--broker-service-url-tls pulsar+ssl://pulsar.us-west.example.com:6651
Full‑Mesh Replication
In a full‑mesh, every cluster replicates to every other cluster, so a message published in any region can be consumed in all of them. Data flow is illustrated in the diagram below. To avoid infinite loops, Pulsar tags each replicated message with a replicated_from metadata field naming its origin cluster, so brokers can recognize messages that already arrived through replication and do not send them back out.
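As a rough sketch of the namespace-level setup for a full mesh, the commands below allow a tenant on every cluster and then list all clusters on the namespace. The tenant, namespace, and cluster names (my-tenant, my-namespace, us-west, us-east, us-cent) are placeholders, and the script assumes all clusters already share one configuration store. It only prints the commands unless PULSAR_ADMIN is set.

```shell
# Dry run by default: each command is printed instead of executed.
# Set PULSAR_ADMIN=bin/pulsar-admin to run against a real cluster.
# $PULSAR_ADMIN is left unquoted on purpose so "echo bin/pulsar-admin" splits into words.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo bin/pulsar-admin}"

# enable_full_mesh TENANT NAMESPACE CLUSTER_LIST
# CLUSTER_LIST is comma-separated, e.g. us-west,us-east,us-cent
enable_full_mesh() {
  tenant="$1"
  namespace="$2"
  clusters="$3"
  # The tenant must be allowed to use every cluster in the mesh.
  $PULSAR_ADMIN tenants create "$tenant" --allowed-clusters "$clusters"
  $PULSAR_ADMIN namespaces create "$tenant/$namespace"
  # Topics in the namespace are then replicated among all listed clusters.
  $PULSAR_ADMIN namespaces set-clusters "$tenant/$namespace" --clusters "$clusters"
}

enable_full_mesh my-tenant my-namespace us-west,us-east,us-cent
```

Because the configuration store is shared, these commands would only need to run once, from any cluster in the mesh.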
Unidirectional Replication
When a global ZooKeeper is not used, unidirectional replication can be configured by pointing configurationStoreServers to the local ZooKeeper address. This allows data to flow only from a source cluster to a downstream cluster, reducing network traffic and storage overhead.
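One possible one-way setup from us-west to us-east is sketched below. It assumes each cluster keeps its own metadata, so every command is directed at a specific cluster with --admin-url. The us-east endpoints and the tenant and namespace names are hypothetical, and the exact steps should be checked against the Pulsar version in use. The script only prints the commands unless PULSAR_ADMIN is set.

```shell
# Dry run by default; set PULSAR_ADMIN=bin/pulsar-admin to execute.
# $PULSAR_ADMIN is left unquoted on purpose so "echo bin/pulsar-admin" splits into words.
PULSAR_ADMIN="${PULSAR_ADMIN:-echo bin/pulsar-admin}"

# Hypothetical admin endpoints and names for the two clusters.
SRC_URL="http://pulsar.us-west.example.com:8080"
DST_URL="http://pulsar.us-east.example.com:8080"
NS="my-tenant/my-namespace"

setup_one_way() {
  # 1. Register the destination in the source cluster's metadata so the
  #    source brokers know where to send replicated messages.
  $PULSAR_ADMIN --admin-url "$SRC_URL" clusters create us-east \
    --url "$DST_URL" \
    --broker-url pulsar://pulsar.us-east.example.com:6650

  # 2. The destination gets the same tenant and namespace, listing only itself.
  $PULSAR_ADMIN --admin-url "$DST_URL" tenants create my-tenant --allowed-clusters us-east
  $PULSAR_ADMIN --admin-url "$DST_URL" namespaces create "$NS"
  $PULSAR_ADMIN --admin-url "$DST_URL" namespaces set-clusters "$NS" --clusters us-east

  # 3. Only the source namespace names both clusters, so data flows us-west -> us-east.
  $PULSAR_ADMIN --admin-url "$SRC_URL" tenants create my-tenant --allowed-clusters us-west,us-east
  $PULSAR_ADMIN --admin-url "$SRC_URL" namespaces create "$NS"
  $PULSAR_ADMIN --admin-url "$SRC_URL" namespaces set-clusters "$NS" --clusters us-west,us-east
}

setup_one_way
```

The asymmetry is the point of the sketch: the destination never learns about the source, so nothing is replicated back.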
Failover Mode
Failover is a special case of unidirectional replication. The remote cluster acts as a standby replica with no producers or consumers of its own. If the active cluster fails, producers and consumers are switched to the standby cluster; subscription state is carried over as well, through the replicated subscriptions feature.
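Replicated subscriptions depend on broker-side settings, and each consumer also has to opt in (in the Java client, through replicateSubscriptionState(true) on the consumer builder). As a small sketch, the helper below lists the related settings from a broker.conf file so they can be compared across the active and standby clusters. The conf/broker.conf path is an assumption about the install layout.

```shell
# List the replicated-subscription settings found in a broker.conf file.
# Usage: replicated_subscription_settings conf/broker.conf
replicated_subscription_settings() {
  # Matches enableReplicatedSubscriptions and the
  # replicatedSubscriptionsSnapshot* tuning keys; commented-out lines are skipped.
  grep -E '^(enableReplicatedSubscriptions|replicatedSubscriptionsSnapshot[A-Za-z]+)=' "$1"
}

# Only runs when the (assumed) default config path exists.
if [ -f conf/broker.conf ]; then
  replicated_subscription_settings conf/broker.conf
fi
```

Running it on both clusters and comparing the output is a quick way to spot a standby that would not accept replicated subscription state.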
Current Limitations
Only per‑data‑center message ordering is guaranteed; global ordering across sites is not supported.
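Cursor snapshots are taken periodically, so the subscription position on the standby cluster can lag the active one, and some messages may be redelivered after a switch.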
Only the “mark delete position” is synchronized; individual message acknowledgments are not.
All clusters must be online for a cursor snapshot to succeed.
Snapshotting introduces cache overhead that can affect backlog calculations.
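The mark-delete limitation can be made concrete with a small worked example. The sketch below simplifies message IDs to integers starting at 1 (real Pulsar IDs are ledger and entry pairs) and computes the position that would be synchronized: the highest point below which every message has been acknowledged.

```shell
# mark_delete_position ID...
# Prints the highest N such that every message 1..N has been acknowledged.
mark_delete_position() {
  pos=0
  while :; do
    next=$((pos + 1))
    # " $* " pads the argument list with spaces so whole IDs can be matched.
    case " $* " in
      *" $next "*) pos=$next ;;
      *) break ;;
    esac
  done
  echo "$pos"
}

acked="1 2 3 5 7"
mdp="$(mark_delete_position $acked)"
echo "mark delete position: $mdp"

# Acknowledgments past the mark delete position are not carried over.
for id in $acked; do
  if [ "$id" -gt "$mdp" ]; then
    echo "message $id may be redelivered after failover"
  fi
done
```

With messages 1, 2, 3, 5, and 7 acknowledged, the position stops at 3 because message 4 is still outstanding, so the acknowledgments for 5 and 7 exist only on the active cluster.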
Tencent Cloud Middleware
