Design and Implementation of a Distributed Data Transmission System for Financial Services
Baidu built a high‑availability, low‑latency distributed transmission system that connects directly to global securities exchanges over dedicated optical lines with multicast support. It synchronizes data across multiple cloud regions, keeps an Internet fallback for disaster recovery, and delivers market data in about 60 ms to more than a billion financial service requests per day with five‑nines reliability.
Background
In Baidu Search, financial services generate tens of millions of user search requests daily. Before 2021, financial data was fetched via traditional Internet methods, which suffered from poor timeliness, frequent data loss, and high maintenance costs. To address these issues, a securities data direct‑connect project was launched to build a high‑availability, low‑latency distributed transmission system that connects to global securities exchanges.
Design Goals
Business : Connect to Level‑1 market data from major exchanges (FIX/FAST, TXT, binary streams) to cover stocks, forex, futures, ETFs, etc., achieving competitive timeliness.
Technical :
Infrastructure – Deploy physical dedicated lines between the exchange sites (Shanghai, Shenzhen, Hong Kong, Nasdaq) and Baidu Cloud data centers, and adapt unicast and multicast protocols.
Latency & Stability – 99th‑percentile query time ≤200 ms, stability >99.99%.
Data Security – Use Baidu security capabilities to enforce strict firewall and security‑group policies.
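As a rough illustration of how the latency goal could be checked against measured query times, the sketch below computes a percentile with the nearest‑rank method. The function names and the sample data are hypothetical and are not taken from Baidu's monitoring stack.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (0 < pct <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_goal(samples_ms, pct=99.0, limit_ms=200.0):
    """True if the chosen percentile of the samples is within the limit."""
    return percentile_nearest_rank(samples_ms, pct) <= limit_ms
```

With 100 samples, the 99th percentile is the 99th smallest value, so a single slow outlier does not break the goal but two do.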
Architecture Layers
Access Layer : Adapt exchange protocols (unicast, multicast) and ingest binary/text streams via physical lines or Internet.
Network Layer : Build VPCs in Baidu Cloud (subnets, routes, gateways) for South‑China, North‑China, East‑China, and Hong‑Kong regions.
Transport Layer : Parse, store, synchronize, and forward data within each data‑management cluster (raw, decoded, processed).
Application Layer : Implement load/traffic scheduling, monitoring, and user‑facing services.
Key Challenges & Solutions
Challenge 1 – Public vs. Private Network Integration
Public Internet offers low cost and quick deployment (HTTP/HTTPS, RPC, FTP) but suffers from instability and security risks. Private network (dedicated optical fiber) provides point‑to‑point, high‑security, low‑latency transmission at higher cost and longer deployment time.
Solution: Use dedicated lines for primary data flow and retain Internet as a disaster‑recovery backup.
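A minimal sketch of this primary/backup choice, assuming each link reports a simple healthy flag. The `Link` type, the link names, and the priority values are illustrative only; the article does not describe how the real system detects failures.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    priority: int          # lower value is preferred
    healthy: bool = True


def pick_active_link(links):
    """Return the healthy link with the lowest priority value, or None if all are down."""
    healthy = [link for link in links if link.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda link: link.priority)


# Hypothetical setup: dedicated line first, Internet only as a fallback.
links = [Link("dedicated-line", priority=0), Link("internet", priority=1)]
```

Marking the dedicated line unhealthy makes `pick_active_link(links)` return the Internet link, and restoring it switches traffic back.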
Challenge 2 – Multiprotocol Adaptation in Private Networks
Private networks must support unicast, broadcast, and multicast. Unicast is straightforward and needs only static routes. Multicast requires physical routers or software that implements PIM and IGMPv3. Where multicast cannot be carried end to end, an application‑level proxy can convert the stream to unicast, with IGMP snooping on the switches limiting where multicast traffic is flooded.
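One way to picture an application‑level multicast‑to‑unicast proxy is the sketch below: it joins a multicast group on a UDP socket and re‑sends every datagram to a list of unicast targets. The group address, port, and target list are placeholders, and the article does not say how Baidu's proxy is implemented. This version uses a plain any‑source join (`IP_ADD_MEMBERSHIP`); source‑specific IGMPv3 joins are not shown.

```python
import socket


def build_membership_request(group_ip, interface_ip="0.0.0.0"):
    """Pack the 8-byte ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group_ip) + socket.inet_aton(interface_ip)


def open_multicast_receiver(group_ip, port, interface_ip="0.0.0.0"):
    """Bind a UDP socket and join the multicast group on the given interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group makes the kernel send an IGMP membership report.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group_ip, interface_ip),
    )
    return sock


def relay_forever(receiver, unicast_targets):
    """Forward each received datagram to every (host, port) unicast target."""
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        payload, _source = receiver.recvfrom(65535)
        for target in unicast_targets:
            sender.sendto(payload, target)
```

A caller would run something like `relay_forever(open_multicast_receiver("239.1.1.1", 5000), [("10.0.0.8", 6000)])`, where all addresses are made up for the example.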
Challenge 3 – Cross‑Region Data Synchronization
Data must be replicated across data centers with minimal latency. Direct peer connections are used when two regions have physical links; otherwise, a bridge region employs NAT and tunneling to create a logical mesh.
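The direct‑or‑bridge rule can be sketched as a shortest‑path search over the physical links between regions. The region names and link list below are hypothetical, and the NAT and tunneling that the bridge region performs are not modelled; the sketch only decides which regions a replication stream would pass through.

```python
from collections import deque


def build_adjacency(physical_links):
    """Turn a list of (region_a, region_b) pairs into an undirected adjacency map."""
    adjacency = {}
    for a, b in physical_links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def find_sync_path(physical_links, src, dst):
    """Breadth-first search for the shortest region path, or None if unreachable."""
    if src == dst:
        return [src]
    adjacency = build_adjacency(physical_links)
    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        for neighbor in sorted(adjacency.get(path[-1], ())):
            if neighbor in visited:
                continue
            if neighbor == dst:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


# Hypothetical topology: east-china has links to both other regions.
links = [("east-china", "south-china"), ("east-china", "hong-kong")]
```

Here `find_sync_path(links, "south-china", "hong-kong")` goes through `east-china`, which plays the bridge role, while two directly linked regions get a two‑element path.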
Disaster Recovery & Latency Improvements
Deploy dual dedicated lines per exchange (primary + backup) plus an Internet fallback.
Achieve SLA >99.999% (five nines) on dedicated lines versus ~99% on Internet.
End‑to‑end latency is ~60 ms at the 99.99th percentile.
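To put the availability figures above in concrete terms, the following sketch converts an availability percentage into an annual downtime budget. It assumes a 365‑day year and continuous service, which is a simplification for a market‑data feed that mostly matters during trading hours.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_per_year_minutes(availability_pct):
    """Return the maximum downtime per year, in minutes, allowed by an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR
```

Under these assumptions, 99.999% allows about 5.3 minutes of downtime per year, while 99% allows about 5,256 minutes, or roughly 3.65 days.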
Traffic Scheduling
Single‑region: Use Baidu Load Balance (BLB) with weighted backend clusters.
Multi‑region: Deploy per‑region BLB instances and coordinate health checks across regions, or build a custom VIP‑based scheduler.
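Weighted scheduling with health checks can be sketched with a smooth weighted round‑robin selector, as below. This is a generic technique chosen for illustration, not BLB's actual algorithm, and the class name, backend names, and weights are hypothetical.

```python
class WeightedScheduler:
    """Smooth weighted round-robin over backends that are currently healthy."""

    def __init__(self, weights):
        self.weights = dict(weights)                  # backend name -> positive weight
        self.current = {name: 0 for name in self.weights}
        self.healthy = set(self.weights)

    def mark(self, name, is_healthy):
        """Record the latest health-check result for a backend."""
        if is_healthy:
            self.healthy.add(name)
        else:
            self.healthy.discard(name)

    def next_backend(self):
        """Pick the next backend, or None if no healthy backend has a positive weight."""
        candidates = [
            name for name in self.weights
            if name in self.healthy and self.weights[name] > 0
        ]
        if not candidates:
            return None
        total = sum(self.weights[name] for name in candidates)
        for name in candidates:
            self.current[name] += self.weights[name]
        chosen = max(candidates, key=lambda name: self.current[name])
        self.current[chosen] -= total
        return chosen
```

With weights 3 and 1, every four picks send three requests to the heavier cluster, and a cluster that fails its health check simply drops out of the candidate list until it recovers.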
Overall System Design
The system consists of six main clusters:
Source Data Ingestion Cluster – supports Internet and dedicated‑line inputs, various protocols.
Source Data Forwarding Cluster – ensures consistency across regions.
Data Parsing Cluster – normalizes raw streams.
Business Data Cluster – provides real‑time and delayed streams to downstream services.
Gateway Cluster – handles user traffic.
Monitoring Cluster – aggregates logs and health metrics.
Only a few hundred machines are needed to support over a billion requests per day.
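As an illustration of what the Data Parsing Cluster's normalization step might look like for a text feed, the sketch below parses a FIX tag=value message (fields separated by the SOH byte, `\x01`) and maps two standard tags, 55 (Symbol) and 31 (LastPx), into a plain record. The sample message and the output field names are hypothetical. Real exchange feeds also use FAST‑encoded and binary formats and repeating groups, none of which this sketch handles.

```python
from decimal import Decimal

SOH = "\x01"  # FIX field delimiter


def parse_fix_message(raw):
    """Split a FIX tag=value string into a {tag_number: value} dict."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, sep, value = pair.partition("=")
        if not sep or not tag.isdigit():
            raise ValueError(f"malformed field: {pair!r}")
        # Note: a repeated tag overwrites the earlier value (no repeating-group support).
        fields[int(tag)] = value
    return fields


def normalize_quote(fields):
    """Map selected FIX tags to a normalized record (output keys are illustrative)."""
    return {
        "symbol": fields[55],                 # tag 55 = Symbol
        "last_price": Decimal(fields[31]),    # tag 31 = LastPx
    }


# Hypothetical sample message, not taken from any real exchange feed.
sample = "8=FIX.4.4\x0135=W\x0155=TEST\x0131=12.34\x01"
```

Calling `normalize_quote(parse_fix_message(sample))` yields a record with the symbol and a `Decimal` price, which avoids binary floating‑point rounding on prices.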
Summary & Outlook
The distributed transmission system has dramatically improved the timeliness of financial data (from minutes to <60 ms) and its reliability (SLA from two nines to more than five nines). It now powers multiple Baidu financial products and is being further enhanced with AI capabilities to support smarter, faster investment decisions.
Baidu Geek Talk