How to Build an NSQ Multi‑Data‑Center Deployment with Lookup‑Migrate
This article explains the design and implementation of NSQ dual‑ and multi‑data‑center architectures using a lookup‑migrate proxy, covering deployment scenarios, routing strategies, migration phases, JSON response transformations, and practical lessons learned for reliable message publishing and consumption across data centers.
Overview
This article explains how Youzan extended NSQ to support dual‑ and multi‑data‑center deployments. The solution adds a proxy called lookup-migrate in front of the existing nsqlookupd service‑discovery component, allowing producers and consumers to operate transparently across multiple rooms (data centers) without any code changes in the business layer.
Scenario and Requirements
In a single‑room deployment, producers publish to an NSQ cluster and consumers subscribe via nsqlookupd. In a dual‑room scenario both production and consumption may be spread across two rooms, requiring:
Transparent failover when a room becomes unavailable.
Minimal latency impact for the business side.
Ability to switch topics between rooms without data loss.
Dual‑Data‑Center Design
The design keeps the existing nsqlookupd API but inserts lookup-migrate as a forward proxy. The proxy merges the lookup results from the two rooms and returns a combined list of nsqd nodes. Topics exist in both rooms, but each message is persisted only in the room where it was published, so messages already stored in a room are not lost if that room fails.
Phase 1 – Basic Dual‑Room Routing
All read/write traffic is routed through lookup-migrate to a single (primary) room, while the other room acts as a cold standby. This provides a simple failover path, at the cost of cross‑room latency for clients that are not in the primary room.
Phase 2 – Local Production with Dual‑Room Consumption
After Phase 1 stabilises, write requests are sent directly to the local nsqd nodes. lookup-migrate still returns nsqd nodes from both rooms to consumers. This removes the cross‑room hop from the write path while still letting consumers read messages published in either room.
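The difference between the two phases can be sketched as a small routing function. This is only an illustration of the rule described above, not lookup-migrate's actual code: `Phase`, `targets_for`, and the room labels are hypothetical names introduced here.

```python
from enum import Enum


class Phase(Enum):
    PRIMARY_ONLY = 1  # Phase 1: all traffic goes to the primary room
    LOCAL_WRITE = 2   # Phase 2: write locally, read from both rooms


def targets_for(phase, is_write, local_room, primary_room, all_rooms):
    """Return the rooms whose nsqd nodes a client should be given.

    Hypothetical helper: rooms are plain strings in this sketch.
    """
    if phase is Phase.PRIMARY_ONLY:
        # Reads and writes alike are sent to the single primary room.
        return [primary_room]
    if is_write:
        # Phase 2 producers publish to nsqd nodes in their own room.
        return [local_room]
    # Phase 2 consumers are given nsqd nodes from every room.
    return sorted(all_rooms)
```

For a client in `room-b` with `room-a` as primary, Phase 1 yields only `room-a` for both reads and writes, while Phase 2 yields `room-b` for writes and both rooms for reads.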
Lookup‑Migrate Routing and Migration Strategy
The proxy distinguishes read and write operations by an access flag carried on the lookup request. It then aggregates the JSON responses, merging the partitions field (local room) and the producers field (remote room). Example merged response:
{
  "channels": ["default"],
  "meta": {"extend_support": true, "ordered": false, "partition_num": 1, "replica": 2},
  "partitions": {
    "0": {"broadcast_address": "11.0.0.1", "hostname": "nsq1", "http_port": 4151, "tcp_port": 4150, "version": "0.3.7-HA.1.9.5"},
    "1": {"broadcast_address": "21.0.0.2", "hostname": "nsq2", "http_port": 4151, "tcp_port": 4150, "version": "0.3.7-HA.1.9.5"}
  },
  "producers": [
    {"broadcast_address": "11.0.0.1", "hostname": "nsq1", "http_port": 4151, "tcp_port": 4150, "version": "0.3.7-HA.1.9.5"},
    {"broadcast_address": "21.0.0.2", "hostname": "nsq2", "http_port": 4151, "tcp_port": 4150, "version": "0.3.7-HA.1.9.5"}
  ]
}
Migration proceeds in three steps:
Proxy consumer lookup requests to both rooms’ nsqd nodes.
After the consumer establishes connections, proxy producer requests to the target nsqd (the destination room).
Detach the source‑room connections once the migration succeeds.
For ordered topics the production side is switched first; consumption is moved only after confirming that the source channel has no backlog.
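The step order above, including the reversed order for ordered topics, can be written down as a small planner. This is a sketch under assumptions: `migration_plan` and the step labels are invented for illustration and are not lookup-migrate commands, and the backlog check appears only as a named step, not as a real NSQ call.

```python
def migration_plan(ordered):
    """Return the sequence of migration steps for one topic.

    Step names are hypothetical labels used only in this sketch.
    """
    if ordered:
        # Ordered topics: switch producers first, then move consumers only
        # once the source channel has no backlog left.
        return [
            "route_producers_to_target",
            "wait_for_source_backlog_empty",
            "route_consumers_to_target",
        ]
    # Unordered topics: consumers connect to both rooms before producers
    # are switched, and the source room is detached last.
    return [
        "route_consumers_to_both_rooms",
        "route_producers_to_target",
        "detach_source_connections",
    ]
```

Keeping the plan as data makes the ordering difference between the two topic types easy to review before anything is switched.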
Extension to Multi‑Data‑Center
When more than two rooms are required, the lookup‑migrate configuration uses three fields per topic:
#C: an array of lookup addresses for consumers.
#P: a single lookup address for producers.
#D: a default mapping used when a topic does not have an explicit entry.
Example configuration:
{
  "topics": [
    {
      "topicA": {
        "#C": ["lookup_addr1", "lookup_addr2", "lookup_addr3"],
        "#P": "lookup_addr1",
        "#D": "default lookup_addr"
      }
    }
  ]
}
A lookup‑schema maps short names to full URLs, reducing configuration size:
{
  "lookupSchema": {
    "nsq1": "http://nsq1.example.com:4161",
    "nsq2": "http://nsq2.example.com:4161",
    "nsq3": "http://nsq3.example.com:4161"
  }
}
Practical Experience and Lessons Learned
Proxy deployment: Choose between transparent request forwarding (single port) and a dedicated reverse‑proxy that exposes the lookup-migrate port.
Deterministic ordering: Sort the list of lookup addresses before merging so that partition selection is stable and duplicate reconnects are avoided.
Partial failures: If a lookup request to one room fails, merge the successful results and still return a usable list to the client.
Discovery optimisation: Use listlookup to obtain all nsqlookupd instances in a cluster, enabling load‑balanced proxying.
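The ordering and partial-failure lessons above can be combined in one merge routine. The following is a minimal sketch, not the lookup-migrate implementation: `merge_lookups` and the caller-supplied `fetch` function are hypothetical, and renumbering the partition keys during the merge is an assumption about how keys from different rooms are combined.

```python
def merge_lookups(fetch, lookup_addrs):
    """Merge lookup responses from several rooms into one response.

    `fetch` is a hypothetical caller-supplied function that takes a lookup
    address and returns the parsed JSON dict, or raises on failure.
    Returns (merged_response, failed_addresses).
    """
    merged = {"producers": [], "partitions": {}}
    seen = set()
    failed = []
    # Sort the addresses so repeated lookups yield the same node order.
    addrs = sorted(set(lookup_addrs))
    for addr in addrs:
        try:
            resp = fetch(addr)
        except Exception:
            # One unreachable room must not make the whole lookup fail.
            failed.append(addr)
            continue
        for node in resp.get("producers", []):
            key = (node["broadcast_address"], node["tcp_port"])
            if key not in seen:  # skip nodes reported by more than one lookup
                seen.add(key)
                merged["producers"].append(node)
        for node in resp.get("partitions", {}).values():
            # Assumption: partitions are renumbered in merge order.
            merged["partitions"][str(len(merged["partitions"]))] = node
    if addrs and len(failed) == len(addrs):
        raise RuntimeError("all lookup requests failed")
    return merged, failed
```

Returning the list of failed addresses alongside the merged result lets the proxy log or retry the unreachable room while the client still receives a usable node list.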
Conclusion
The lookup-migrate approach provides a transparent, low‑cost way to run NSQ across dual‑ or multi‑data‑center topologies. It supports seamless failover, smooth migration of producers and consumers, and works for both ordered and unordered topics while keeping existing applications unchanged.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.