How Apache Pulsar Solved Our Financial Messaging Challenges
Limited visibility, inflexible routing, and weak security in a traditional MQ-based financial system led one company to re-examine its requirements: identity control, routing, auditing, low latency, scalability, ordering, and replay. It chose Apache Pulsar for its multi‑cluster support, compute‑storage separation, pluggable authentication, rich API, and built‑in functions. This article outlines the team's practical experience and the solutions it adopted.
Background
Traditional financial systems often use a single unified access component that forwards external requests to internal services via a message queue (MQ). The request is written to the MQ, processed downstream, and the response is sent back through the same queue, resulting in a closed, tightly‑coupled workflow.
Challenges of the traditional MQ architecture
The MQ behaves as a black box, making observability difficult.
Direct‑exchange routing prevents flexible topic‑based routing.
Weak authentication and validation increase security risk.
Custom client libraries support only a few programming languages, limiting extensibility.
Business requirements in financial scenarios
Identity & security control
Clients and producers must be identified, authenticated, and authorized based on system, IP, and business policies. Illegal access must be rejected.
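The policy check described above can be sketched as a small rule evaluator. This is a minimal illustration, not the article's actual implementation; the `AccessPolicy` class and its fields are hypothetical names chosen for this example:

```python
import ipaddress
from dataclasses import dataclass, field

@dataclass
class AccessPolicy:
    """Illustrative policy: which systems, from which source networks, may connect."""
    allowed_systems: set = field(default_factory=set)
    allowed_networks: list = field(default_factory=list)  # CIDR strings

    def authorize(self, system: str, client_ip: str) -> bool:
        # Reject unknown business systems outright.
        if system not in self.allowed_systems:
            return False
        # Then require the client IP to fall inside an allowed subnet.
        ip = ipaddress.ip_address(client_ip)
        return any(ip in ipaddress.ip_network(net) for net in self.allowed_networks)

policy = AccessPolicy(allowed_systems={"settlement"},
                      allowed_networks=["10.0.0.0/8"])
```

In a Pulsar deployment this kind of check would live inside a custom authentication/authorization plugin on the broker, as the article describes later, rather than in application code.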
Routing & distribution
Messages need to be routed from a write queue to appropriate destination queues. The existing MQ lacks native topic routing, stream processing, and efficient distribution, leading to high latency and complex adapters.
Auditing
Full audit of message publishers and consumers is required for compliance, anomaly detection, and downstream traceability.
System requirements for new workloads
High availability & low latency
Financial services demand sub‑second latency and ultra‑high availability, even across multi‑data‑center deployments.
Rapid scaling & recovery
Workloads can spike dramatically; the system must scale horizontally on demand without over‑provisioning.
Ordered delivery & de‑duplication
Some workflows require strict ordering and idempotent processing to reduce downstream pressure.
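Idempotent processing is commonly implemented with a bounded cache of already-seen message IDs. A minimal sketch of that consumer-side technique (the `Deduplicator` class is illustrative, not from the article):

```python
from collections import OrderedDict

class Deduplicator:
    """Drop messages whose ID was already seen, within a bounded LRU window.
    A bounded window keeps memory use constant under high throughput."""
    def __init__(self, max_entries: int = 100_000):
        self.seen = OrderedDict()
        self.max_entries = max_entries

    def accept(self, message_id: str) -> bool:
        if message_id in self.seen:
            self.seen.move_to_end(message_id)
            return False  # duplicate: skip downstream processing
        self.seen[message_id] = True
        if len(self.seen) > self.max_entries:
            self.seen.popitem(last=False)  # evict the oldest entry
        return True

dedup = Deduplicator(max_entries=3)
results = [dedup.accept(m) for m in ["a", "b", "a", "c"]]  # → [True, True, False, True]
```

Pulsar also offers broker-side message deduplication configurable per namespace; the sketch above shows only the consumer-side idempotency the requirement mentions.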
Message replay & serialization
Ability to replay messages from a specific time window aids debugging and gray‑scale testing.
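Replaying a time window reduces to selecting persisted messages by publish timestamp. A minimal sketch of that selection logic (the `StoredMessage` type and `replay_window` helper are illustrative):

```python
from dataclasses import dataclass

@dataclass
class StoredMessage:
    publish_ts_ms: int  # broker-assigned publish time, epoch milliseconds
    payload: bytes

def replay_window(messages, start_ms: int, end_ms: int):
    """Yield persisted messages published within [start_ms, end_ms)."""
    for m in messages:
        if start_ms <= m.publish_ts_ms < end_ms:
            yield m

log = [StoredMessage(1_000, b"x"), StoredMessage(2_000, b"y"), StoredMessage(3_000, b"z")]
window = [m.payload for m in replay_window(log, 1_500, 3_000)]  # → [b"y"]
```

With Pulsar itself, replay from a point in time maps onto seeking a consumer to a timestamp on a persisted topic; the sketch above only models the window-selection logic.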
Why Apache Pulsar was selected
Cluster mode
Supports cross‑cluster synchronization, enabling active‑active deployments with seamless geographic replication.
Compute‑storage separation
Storage and compute can be scaled independently; secondary storage enables analytics and audit use cases.
Pluggable authentication
Custom authentication plugins enforce strict client identity verification.
Rich REST API
Provides observability, metrics, and management capabilities missing in the previous MQ.
Functions
Built‑in serverless functions allow message routing, filtering, and aggregation without external services.
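As a sketch of this feature, a native Python function can filter messages without any external service. The function name, topic names, and threshold below are illustrative, not from the article:

```python
import json

# A Pulsar "native" Python function is a plain module-level function that
# receives the message payload and returns the output payload, or None to
# drop the message. (Deployment specifics are omitted here; topics and the
# function name are hypothetical.)
def filter_large_orders(input):
    """Drop orders below a notional threshold; pass the rest through unchanged."""
    order = json.loads(input)
    if order.get("notional", 0) < 1_000_000:
        return None  # returning None drops the message
    return input
```

Deployed as a Pulsar Function, this runs next to the broker and replaces a standalone filtering service.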
Message replay
Configurable persistence and expiration allow replay of persisted messages.
Multi‑language support
Clients in Java, Go, Python, C++, and other languages can connect easily.
Pulsar in practice
Request routing – simplified architecture
All components write to a fixed topic (e.g., topic-A). A routing module consumes from topic-A and forwards each message to one or more downstream topics based on business rules. Downstream services subscribe only to the topics they need, eliminating point‑to‑point coupling.
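The routing module's business rules can be reduced to a lookup table keyed on a message attribute. A minimal sketch, with topic names and the `route` helper invented for illustration:

```python
# Illustrative rule table: which downstream topics each message type fans out to.
ROUTES = {
    "trade": ["topic-trades", "topic-audit"],
    "quote": ["topic-quotes"],
}
DEFAULT_ROUTE = ["topic-dead-letter"]  # unmatched messages go here for inspection

def route(message_type: str) -> list:
    """Return the downstream topics for one message read from topic-A."""
    return ROUTES.get(message_type, DEFAULT_ROUTE)
```

In a real deployment the loop around `route` would be a Pulsar consumer on topic-A plus one producer per downstream topic, or equivalently a Pulsar Function.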
Drawbacks of this pattern include additional troubleshooting steps (verify write to topic-A, routing success, downstream subscription), increased end‑to‑end latency due to the extra hop, storage redundancy because each hop persists messages, and a learning curve for the new routing module.
Data broadcast – reducing latency
Market‑data distribution switched from a database‑pull model to a publish/subscribe broadcast. Consumers subscribe directly to the relevant market‑data topics, eliminating the database read latency and delivering data in near real‑time.
Secure message notification
An audit module intercepts messages after they are published, applies security policies, filters illegal signals, and logs audit information. The filtered messages are then forwarded to downstream tasks. This separation clarifies service boundaries and enables real‑time policy updates.
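The intercept-then-forward flow can be sketched as a filter that logs every decision before passing legal messages downstream. Function and field names here are illustrative, not the article's implementation:

```python
import logging

logger = logging.getLogger("audit")

def audit_and_filter(messages, is_legal):
    """Apply a security predicate to each message, log an audit record for
    every decision, and forward only the messages that pass."""
    forwarded = []
    for msg in messages:
        ok = is_legal(msg)
        logger.info("msg=%s legal=%s", msg.get("id"), ok)  # audit trail
        if ok:
            forwarded.append(msg)
    return forwarded

batch = [{"id": 1, "signal": "buy"}, {"id": 2, "signal": "forbidden"}]
legal = audit_and_filter(batch, lambda m: m["signal"] != "forbidden")
```

Because the predicate is passed in, security policy can be swapped at runtime without touching the forwarding path, matching the real-time policy updates the article describes.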
Issues discovered and solutions
REQ‑REP pattern over a bus
Implementing request‑reply by broadcasting responses caused every node to receive all replies, inflating traffic. The team evaluated broker‑side filtering so that only the originating node processes its own response. This reduces unnecessary network load and prevents each node from handling the full request rate.
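The filtering idea can be sketched as a correlation check: every request carries the originating node's ID, and a node keeps only replies addressed to it. Field names are illustrative; in the design the team evaluated, this check moves broker-side so other nodes never receive the message at all:

```python
def replies_for_node(replies, node_id: str):
    """Client-side version of the filter: keep only this node's own replies."""
    return [r for r in replies if r.get("origin") == node_id]

# Every reply on the broadcast bus is tagged with the requesting node's ID.
bus = [{"origin": "node-1", "body": "ok"},
       {"origin": "node-2", "body": "ok"}]
mine = replies_for_node(bus, "node-1")
```

Doing the same match on the broker avoids shipping every reply to every node, which is the traffic reduction described above.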
Read‑write separation
Pulsar 2.7.2 did not support explicit read‑write separation, causing all consumers to connect to the same broker that owned the topic. Upgrading to Pulsar 2.8 introduced native read‑write separation, allowing producers and consumers to be balanced across different brokers and improving broadcast scalability.
Multi‑NIC connectivity
Because the internal network spans multiple subnets and NICs, registering brokers by a single IP caused segmentation failures. Deploying a high‑availability proxy layer that abstracts broker addresses resolved cross‑subnet connectivity and simplified client configuration.
Disaster recovery strategy
The initial deployment runs in a single data‑center. The roadmap includes building active‑active Pulsar clusters across multiple cities, with geo‑replication to ensure continuous availability and seamless failover.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.