JavaEdge
Feb 20, 2024 · Big Data
Designing a Scalable Data Quality Center for Offline Big‑Data Pipelines
This article describes the design and implementation of a platform‑wide Data Quality Center for offline big‑data pipelines. It covers a survey of existing solutions, the design goals, a system architecture built on DolphinScheduler, the rule definition language, the rule binding and execution mechanisms, and planned enhancements such as lineage monitoring and real‑time checks.
Apache Griffin · Big Data · Data Quality
18 min read
