Practical Big Data Architecture Evolution and Lessons Learned
The article reviews the evolution of big‑data architectures from a simple RDB‑centric pipeline to a SaaS‑based solution, highlighting common bottlenecks such as scaling, integration, cost, and operational complexity, and shares practical experiences and best‑practice recommendations for building efficient, maintainable data platforms.
Big data has become mainstream, with many enterprises establishing dedicated big‑data teams even though their actual data volumes are often small or medium; the real challenge is applying big‑data techniques to create business value, a topic explored by chief architect Zhang Song at the Qiniu Cloud "Architect Practice Day".
Zhang Song, who holds a master's degree in computer science from Xiamen University, is chief architect at Qiniu, focusing on micro-service health, continuous delivery, Go, Kubernetes, and machine-learning applications.
While traditional RDB‑based analytics (using logs, point‑in‑time metrics, and Redis) can meet many needs, they encounter bottlenecks such as difficult onboarding of new metrics, exploding result sets, high query latency, and complex cluster maintenance.
A three-step exploration is proposed to simplify the journey from data source to business insight.
The first-generation architecture routes server logs to an RDB/Redis layer, performs secondary computation, and finally presents the results. This simple design makes new metrics hard to add, produces massive intermediate datasets, incurs high time complexity, and depends on cumbersome cluster scripts.
The second-generation architecture adopts Alibaba EMR, MaxCompute, LogStash, LogHub, and OSS. Although it reduces some manual effort, new problems appear: high indexing costs, complex component integration, persistent resource contention, and rapid product deprecation.
The third-generation architecture leverages Qiniu's Pandora SaaS big-data platform, offloading operational responsibilities to the provider and dramatically simplifying the pipeline.
The practical experience shared comes from a platform with daily data growth of 500 GB and billions of log entries. Its workloads include e-commerce, community, live streaming, and operations, and the data is used for analysis, statistics, monitoring, and alerting.
Data sources include Nginx, Tomcat, FPM, gateways, and business logs, as well as SLA and audit logs; a unified log format and a unique request ID are essential for end-to-end tracing across micro-services.
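As a minimal sketch of the unified-format idea, the Python snippet below renders every log record as one JSON object that always carries the same core fields, including a request ID generated once at the edge and passed along to downstream services. The field names (ts, service, request_id, message) and the helper names are illustrative assumptions, not the layout used in the talk.

```python
import json
import uuid
from datetime import datetime, timezone


def new_request_id() -> str:
    """Generate a request ID once, at the edge (for example the gateway)."""
    return uuid.uuid4().hex


def log_line(service: str, request_id: str, message: str, **fields) -> str:
    """Render one log record in a single shared JSON layout (hypothetical fields)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "service": service,
        "request_id": request_id,
        "message": message,
    }
    record.update(fields)  # service-specific extras, e.g. status or path
    return json.dumps(record, sort_keys=True)


def trace(lines, request_id):
    """Return the records belonging to one request, in the order they were emitted."""
    records = (json.loads(line) for line in lines)
    return [r for r in records if r["request_id"] == request_id]
```

Because every service writes the same core fields, a single filter on request_id is enough to reconstruct the path of one request through all of them.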
Log query interfaces enable filtering by device ID or request ID, while dashboards visualize API status codes, SOA call metrics, and slow‑query statistics, helping identify performance regressions after each release.
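One way to surface a regression after a release is to compare error and slow-request rates per release. The sketch below assumes each parsed log record is a dict with hypothetical keys release, status, and duration_ms, and that 500 ms is the slow threshold; both are assumptions for illustration only.

```python
from collections import defaultdict


def summarize_by_release(records, slow_ms=500):
    """Aggregate per-release request count, 5xx rate, and slow-request rate."""
    totals = defaultdict(lambda: {"requests": 0, "errors": 0, "slow": 0})
    for r in records:
        bucket = totals[r["release"]]
        bucket["requests"] += 1
        if r["status"] >= 500:
            bucket["errors"] += 1
        if r["duration_ms"] > slow_ms:
            bucket["slow"] += 1
    return {
        release: {
            "requests": b["requests"],
            "error_rate": b["errors"] / b["requests"],
            "slow_rate": b["slow"] / b["requests"],
        }
        for release, b in totals.items()
    }
```

A jump in error_rate or slow_rate between two consecutive releases is the kind of signal a dashboard panel would plot.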
Maintaining multiple API versions in micro-services creates maintenance overhead; analyzing traffic per API version shows which obsolete versions can be retired, which reduces the patch workload. Historical bandwidth trends can likewise drive automated capacity scaling.
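The per-version analysis can be sketched roughly as follows, assuming the version appears as the first path segment (such as /v2/users) and that a version carrying less than 1% of traffic is a retirement candidate. Both the URL convention and the threshold are assumptions chosen for illustration.

```python
from collections import Counter


def version_share(paths):
    """Return each API version's share of requests, parsed from paths like /v2/users."""
    counts = Counter()
    for path in paths:
        first = path.strip("/").split("/", 1)[0]
        # Only count segments shaped like v1, v2, ...; skip unversioned paths.
        if len(first) > 1 and first[0] == "v" and first[1:].isdigit():
            counts[first] += 1
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()} if total else {}


def retirement_candidates(shares, max_share=0.01):
    """List versions whose traffic share is below the (assumed) threshold."""
    return sorted(v for v, s in shares.items() if s < max_share)
```

In practice the candidate list would be reviewed by the owning team before any version is actually switched off.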
Real‑time attack detection (e.g., CC attacks) uses WAF and rate‑limiting; abnormal spikes in device traffic are identified via live statistics, allowing targeted throttling without impacting legitimate high‑traffic scenarios.
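A per-device sliding window is one simple way to implement this kind of live statistic. The sketch below is not the detection logic described in the talk; the class name, window length, and request limit are hypothetical. It counts requests per device, so a site-wide surge made up of many devices at normal rates does not trip the limit, while a single device sending a burst does.

```python
from collections import defaultdict, deque


class DeviceRateMonitor:
    """Flag devices that exceed max_requests within a sliding time window."""

    def __init__(self, window_seconds=10.0, max_requests=100):
        self.window = window_seconds
        self.max_requests = max_requests
        self._hits = defaultdict(deque)  # device_id -> recent request timestamps

    def observe(self, device_id, ts):
        """Record one request at time ts; return True if the device should be throttled."""
        hits = self._hits[device_id]
        hits.append(ts)
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= ts - self.window:
            hits.popleft()
        return len(hits) > self.max_requests
```

Timestamps are passed in explicitly so the same logic can be replayed over historical logs when tuning the threshold.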
High Availability Architecture
Official account for High Availability Architecture.
