What Is a Data Middle Platform and How Does It Transform Enterprise Data Management?
This article explains the concept, design principles, and core components of a data middle platform, detailing its overall, functional, layered, logical, and data architectures, as well as the specific platforms for data collection, processing, organization, governance, quality, sharing, and visualization, illustrated with diagrams.
Data Middle Platform Overview
A data middle platform refers to a set of technologies that collect, compute, store, and process massive amounts of data while unifying standards and definitions, turning raw data into standardized assets that can be efficiently served to customers.
Design Principles
Data consistency and standardization
Data usability and service orientation
Data independence and scalability
Data security
Hierarchical data management mechanisms
Core Functions
Overall Architecture
The platform consists of a functional architecture, a layered architecture, a logical architecture, and a data architecture.
Functional Architecture
Key subsystems include a unified data collection and access platform, a centralized processing platform, an organization management platform, a global governance platform, a data sharing platform, an analytics and mining platform, a knowledge graph platform, a unified management platform, and a visualization platform.
Layered Architecture
Based on business needs and vision, the platform integrates a data acquisition perception system, a data resource integration system, and an information sharing service system, embedding security and standards throughout to continuously enhance capabilities for data ingestion, processing, organization, mining, governance, and service.
Logical Architecture
The platform supports standardized data ingestion, modular processing, fine‑grained organization, full‑dimensional fusion, precise sharing, and secure integration, forming three major systems: big‑data perception, data resource integration, and data sharing services.
Data Architecture
For heterogeneous multi‑source scenarios, the data architecture provides stable, efficient support for data access, fusion, and intelligent applications, covering raw, resource, topic, and business libraries, as well as a knowledge base, index library, metadata repository, and experiment space.
Unified Data Collection and Access Platform
Platform Architecture
The platform adopts a unified data access model, providing modular, standardized ingestion for multi‑source heterogeneous data, supporting comprehensive collection, dynamic configuration, task scheduling, encryption, and breakpoint‑resume capabilities while maintaining data catalogs and lineage.
Data Flow
The platform offers one‑stop migration, preliminary cleaning, and visual task scheduling, and it feeds downstream analytics and governance.
Platform Functions
Data ingestion supporting relational databases, NoSQL, distributed storage, streaming systems, message middleware, and file systems.
Configurable ingestion strategies for structured, semi‑structured, and unstructured data, with plugin‑based parsers and adapters.
Breakpoint‑resume for reliable transmission and recovery.
Task management with monitoring, exception handling, and rescheduling.
Data cleaning to filter non‑conforming records.
Statistical reporting on ingestion volume, file counts, latency, sources, and destinations.
Data reconciliation to verify completeness and consistency.
Quality assessment and reporting.
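Several of these functions can be sketched together. The following is a minimal illustration, not the platform's actual implementation: it assumes an in-memory list as the source, a JSON file as the checkpoint, and a hypothetical cleaning rule (`is_valid`) that requires a non-empty `id`. It shows breakpoint-resume after a simulated interruption, filtering of non-conforming records, and a reconciliation report.

```python
import json
import os
import tempfile


def is_valid(record):
    """Cleaning rule (assumed for illustration): a record needs a non-empty 'id'."""
    return bool(record.get("id"))


def ingest(source, sink, checkpoint_path, fail_at=None):
    """Copy records from source to sink, resuming from the last saved offset."""
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    for position in range(offset, len(source)):
        if fail_at is not None and position == fail_at:
            raise ConnectionError("simulated transfer interruption")
        record = source[position]
        if is_valid(record):
            sink.append(record)
        # Persist progress only after the record has been handled.
        with open(checkpoint_path, "w") as f:
            json.dump({"offset": position + 1}, f)


def reconcile(source, sink):
    """Compare the valid records expected against what actually landed."""
    expected = [r["id"] for r in source if is_valid(r)]
    landed = [r["id"] for r in sink]
    return {"expected": len(expected), "landed": len(landed),
            "match": expected == landed}


source = [{"id": "a"}, {"id": ""}, {"id": "b"}, {"id": "c"}]
sink = []
checkpoint = os.path.join(tempfile.mkdtemp(), "ingest.ckpt")

try:
    ingest(source, sink, checkpoint, fail_at=2)  # interrupted before the third record
except ConnectionError:
    pass
ingest(source, sink, checkpoint)                 # resumes at offset 2
report = reconcile(source, sink)
```

A production system would also need to make the write to the sink and the checkpoint update atomic; this sketch ignores that to keep the resume logic visible.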
Centralized Data Processing Platform
Processing standardizes data through extraction, cleaning, linking, identification, and objectification. It supports both real‑time and batch computation and incorporates techniques such as graph computation, in‑memory processing, model inference, and knowledge graph construction.
Platform Architecture
The open architecture enables dynamic workflow orchestration and integrates NLP, multimedia processing, and machine learning for intelligent perception.
Data Flow
Key functions include data extraction (producing EXF files), cleaning, linking heterogeneous sources, comparison (structured and unstructured), identification, error correction, and task scheduling with fault‑tolerant, breakpoint‑resume capabilities.
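The modular processing flow can be sketched as a chain of small step functions. This is an illustrative example only: the step names (`extract`, `clean`, `link`), the record fields, and the phone-based matching rule are assumptions, and only three of the functions listed above are shown.

```python
from functools import reduce


def extract(rows):
    """Pull out the fields of interest from raw rows."""
    return [{"name": r.get("name"), "phone": r.get("phone")} for r in rows]


def clean(rows):
    """Normalize values and drop rows that have no phone number."""
    cleaned = []
    for r in rows:
        if not r["phone"]:
            continue
        cleaned.append({
            "name": (r["name"] or "").strip().title(),
            "phone": r["phone"].replace("-", ""),
        })
    return cleaned


def link(rows, reference):
    """Attach an entity id from a second source by matching on phone."""
    return [{**r, "entity_id": reference.get(r["phone"])} for r in rows]


def run_pipeline(rows, steps):
    """Apply each step to the output of the previous one."""
    return reduce(lambda data, step: step(data), steps, rows)


raw = [
    {"name": " alice ", "phone": "555-0101"},
    {"name": "bob", "phone": None},
]
reference = {"5550101": "E1"}  # hypothetical second source keyed by phone

result = run_pipeline(raw, [extract, clean, lambda rows: link(rows, reference)])
```

Because each step takes and returns a list of records, steps can be reordered or swapped without touching the orchestration function, which is the point of a modular workflow.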
Data Organization Management Platform
Platform Architecture
The platform manages data through layered libraries: raw library (stores original formats), resource library (standardized public datasets), topic library (customized per business domain), and business library (supports specific operational needs). A knowledge library aggregates domain expertise and rules.
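As a rough sketch of how a record might move through these layers, the example below promotes data from a raw library to a resource, topic, and business library. The pipe-delimited payload format, the field names, and the choice of a per-kind count as the business view are all assumptions made for illustration.

```python
from collections import defaultdict

# Raw library: records kept in their original format.
raw_library = [
    {"src": "crm", "payload": "ALICE|2024-01-05|order"},
    {"src": "erp", "payload": "bob|2024-01-06|invoice"},
]


def to_resource(raw):
    """Standardize a raw record into a common schema."""
    name, date, kind = raw["payload"].split("|")
    return {"name": name.lower(), "date": date, "kind": kind, "source": raw["src"]}


# Resource library: standardized records shared across domains.
resource_library = [to_resource(r) for r in raw_library]

# Topic library: standardized records grouped per business domain.
topic_library = defaultdict(list)
for rec in resource_library:
    topic_library[rec["kind"]].append(rec)

# Business library: a view shaped for one operational need (here, counts per kind).
business_library = {kind: len(recs) for kind, recs in topic_library.items()}
```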
Global Data Governance Platform
Platform Architecture
Governance covers standard management, metadata management, asset management, quality control, operations monitoring, and lifecycle management, ensuring data security, classification, and access control throughout the data lifecycle.
Data Flow
Metadata and lineage management enable traceability, while quality management provides monitoring, analysis, and continuous improvement.
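Traceability through lineage amounts to walking a dependency graph. The sketch below assumes lineage is stored as a mapping from each dataset to the datasets it is derived from; the dataset names are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: dataset -> datasets it is derived from.
lineage = {
    "business.sales_report": ["topic.orders"],
    "topic.orders": ["resource.orders_std", "resource.customers_std"],
    "resource.orders_std": ["raw.crm_orders"],
    "resource.customers_std": ["raw.crm_customers"],
}


def upstream(dataset, edges):
    """Return every ancestor of a dataset in breadth-first order."""
    seen, order = set(), []
    queue = deque(edges.get(dataset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        queue.extend(edges.get(node, []))
    return order


sources = upstream("business.sales_report", lineage)
```

The same traversal run over reversed edges gives impact analysis: which downstream datasets are affected when a source changes.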
Data Quality Management Platform
Standard management enforces uniform data definitions; lifecycle management spans acquisition, storage, integration, analysis, presentation, archiving, and destruction. Quality metrics quantify data health, guiding cleaning, integration, and value extraction.
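Two common quality metrics, completeness and uniqueness, can be computed as simple ratios. The article does not say which metrics the platform uses, so the definitions and sample rows below are assumptions chosen for illustration.

```python
def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows, field):
    """Share of distinct values among the non-empty values of the field."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "b@example.com"},
    {"id": 4, "email": None},
]

metrics = {
    "email_completeness": completeness(rows, "email"),
    "id_uniqueness": uniqueness(rows, "id"),
}
```

Tracked over time, ratios like these indicate whether cleaning and integration work is improving data health.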
Data Sharing Service Platform
Platform Architecture
The service bus exposes APIs for data services, supporting registration, approval, publishing, discovery, and secure access. It enables subscription, push, download, and exchange services across nodes.
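The registration, approval, publishing, and discovery steps can be modeled as a small state machine. This is a sketch under assumptions: the `ServiceRegistry` class, its state names, and the service names are hypothetical and are not taken from any real service bus product.

```python
class ServiceRegistry:
    """Tracks data services through registered -> approved -> published."""

    TRANSITIONS = {"registered": "approved", "approved": "published"}

    def __init__(self):
        self._services = {}

    def register(self, name, endpoint):
        self._services[name] = {"endpoint": endpoint, "state": "registered"}

    def advance(self, name):
        """Move a service to its next lifecycle state."""
        service = self._services[name]
        if service["state"] not in self.TRANSITIONS:
            raise ValueError(f"{name} is already {service['state']}")
        service["state"] = self.TRANSITIONS[service["state"]]
        return service["state"]

    def discover(self, keyword=""):
        """List published services whose name contains the keyword."""
        return sorted(
            name for name, s in self._services.items()
            if s["state"] == "published" and keyword in name
        )


registry = ServiceRegistry()
registry.register("orders.query", "/api/orders")
registry.register("customers.query", "/api/customers")

registry.advance("orders.query")     # registered -> approved
registry.advance("orders.query")     # approved -> published
registry.advance("customers.query")  # registered -> approved (not yet visible)

visible = registry.discover()
```

Only published services are discoverable, so an approval step always sits between a provider registering a service and consumers finding it.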
Data Flow
Use cases include service subscription, data push, and download, with authentication and cross‑domain capabilities.
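The subscription and push use cases, with an authentication check, might look like the following. The `SharingNode` class and its token scheme are hypothetical; a real deployment would use a proper identity provider and network transport rather than in-process callbacks.

```python
class SharingNode:
    """Accepts authenticated subscriptions and pushes records to subscribers."""

    def __init__(self, valid_tokens):
        self._valid_tokens = set(valid_tokens)
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, token, topic, callback):
        if token not in self._valid_tokens:
            raise PermissionError("unknown token")
        self._subscribers.setdefault(topic, []).append(callback)

    def push(self, topic, record):
        """Deliver a record to every subscriber of the topic; return the count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(record)
        return len(callbacks)


node = SharingNode(valid_tokens=["token-a"])
received = []

node.subscribe("token-a", "orders", received.append)

try:
    node.subscribe("token-x", "orders", received.append)  # rejected
    rejected = False
except PermissionError:
    rejected = True

delivered = node.push("orders", {"order_id": 1})
```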
Data Visualization Platform
Platform Architecture
The platform offers over 50 visualization components, including charts, maps, dashboards, and 3D graphics, supporting ad‑hoc queries, data insights, mobile reports, large‑screen displays, and interactive analysis.
Data Flow
Features include flexible layout, what‑if analysis, drill‑down, multi‑dimensional queries, 2D/3D linkage, historical playback, and report generation.
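Drill-down and multi-dimensional queries both reduce to grouping a measure by a chosen set of dimensions. The sketch below uses hypothetical sales rows and function names to show a top-level aggregate followed by a drill-down into one region.

```python
from collections import defaultdict


def aggregate(rows, dimensions, measure):
    """Sum the measure for each combination of dimension values."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


def drill_down(rows, parent_dims, parent_key, child_dim, measure):
    """Restrict to one parent group, then break it down by a finer dimension."""
    subset = [r for r in rows if tuple(r[d] for d in parent_dims) == parent_key]
    return aggregate(subset, parent_dims + [child_dim], measure)


sales = [
    {"region": "East", "city": "Shanghai", "amount": 120.0},
    {"region": "East", "city": "Nanjing", "amount": 80.0},
    {"region": "West", "city": "Chengdu", "amount": 50.0},
]

by_region = aggregate(sales, ["region"], "amount")
east_by_city = drill_down(sales, ["region"], ("East",), "city", "amount")
```

Here `by_region` holds one total per region, and `east_by_city` splits the East total into its cities.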
Source: Data Academy
Data Thinking Notes
Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.
