Tagged articles
3 articles
JavaEdge
Feb 20, 2024 · Big Data

Designing a Scalable Data Quality Center for Offline Big‑Data Pipelines

This article describes the design and implementation of a platform‑wide Data Quality Center for offline big‑data pipelines. It covers a survey of existing solutions, the design goals, a system architecture built on DolphinScheduler, the rule definition language, the rule binding and execution mechanisms, and planned enhancements such as lineage monitoring and real‑time checks.

Apache Griffin · Big Data · Data Quality
18 min read