
Understanding Data Middle Platform Architecture and Its Core Components

The article explains the concept of a data middle platform, describing its architecture, the essential big‑data foundation, metadata management, data service components such as BI and tag systems, and how these layers together enable unified data access, governance, and business intelligence across enterprises.

IT Architects Alliance

A data middle platform (DMP) is not a single system or software tool. It is an overall architecture and data flow model that integrates data collection, processing, modeling, storage, and service delivery to support business operations.

The DMP workflow starts with raw data acquisition, followed by data cleaning, transformation, and modeling, then categorized storage, and finally the provision of data services, including data application platforms, so that business teams can put the data to use quickly.
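The workflow above can be sketched as a small in-memory pipeline. This is only an illustration under stated assumptions: the event fields, the storage domain names (`detail.orders`, `summary.user_orders`), and every function name are hypothetical, and a real platform would run these stages on distributed storage and compute rather than Python lists.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation

# Hypothetical raw order events; all field names are illustrative.
RAW_EVENTS = [
    {"user_id": " U1 ", "amount": "19.90"},
    {"user_id": "u2", "amount": "5.00"},
    {"user_id": "", "amount": "7.50"},     # missing user id: dropped
    {"user_id": "u1", "amount": "n/a"},    # unparsable amount: dropped
    {"user_id": "U1", "amount": "0.10"},
]


def clean(records):
    """Cleaning and transformation: normalize ids, parse amounts, drop bad rows."""
    for rec in records:
        user_id = rec["user_id"].strip().lower()
        try:
            amount = Decimal(rec["amount"])
        except InvalidOperation:
            continue
        if user_id:
            yield {"user_id": user_id, "amount": amount}


def build_user_summary(records):
    """Modeling: aggregate cleaned rows into one summary row per user."""
    summary = defaultdict(lambda: {"orders": 0, "total": Decimal("0")})
    for rec in records:
        row = summary[rec["user_id"]]
        row["orders"] += 1
        row["total"] += rec["amount"]
    return dict(summary)


def run_pipeline(raw):
    """Acquisition -> cleaning -> modeling -> categorized storage."""
    cleaned = list(clean(raw))
    # Categorized storage: detail data and summary data live in separate domains.
    return {
        "detail.orders": cleaned,
        "summary.user_orders": build_user_summary(cleaned),
    }


def get_user_summary(store, user_id):
    """Data service: the single entry point a business application calls."""
    return store["summary.user_orders"].get(user_id)


store = run_pipeline(RAW_EVENTS)
print(get_user_summary(store, "u1"))
```

The point of the sketch is the separation of stages: the business application only calls `get_user_summary` and never touches raw or intermediate data.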

Across different enterprises, the overall DMP architecture is largely consistent, typically progressing through four stages: data acquisition & ingestion, data processing & storage, unified data management, and service-oriented applications.

According to the book "Data Middle Platform Product Manager: From Data System to Practical Implementation," this architecture is sufficiently generic to be adapted for both internet and traditional industries, allowing each organization to design and build its own middle‑platform framework.

The functional architecture of a DMP consists of three major layers: a big‑data platform, a data asset management platform, and a data service platform. Within the data service platform, the self‑service analytics platform and the tag management system are the most widely used components.

The big‑data platform serves as the foundation of the DMP, providing capabilities such as data storage, cleaning, computation, query, visualization, and permission management. Most implementations rely on the Hadoop ecosystem, using components such as HBase or Hive (both typically backed by HDFS) for storage and Spark or Flink for distributed computation, though some companies develop proprietary solutions.

A successful big‑data platform is judged not by the number of cutting‑edge technologies it incorporates, but by its ability to solve complex data challenges, break data silos, and offer user‑friendly tools such as self‑service data ingestion and cleaning.

The data asset management platform unifies metadata across diverse components (Hive tables, HBase tables, Druid data sources, Kafka topics) to provide a single view of data resources. It handles three types of metadata: business metadata (describing business meaning and rules), technical metadata (describing data sources, structures, ETL scripts, and SQL), and management metadata (describing ownership and access control).
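A minimal sketch of such a unified view, assuming a simple in-memory catalog: the `AssetMetadata` and `Catalog` classes, the asset names, and the team names are all hypothetical, and a production system would persist this in a dedicated metadata store. Each record carries the three metadata types side by side, so one search spans every component.

```python
from dataclasses import dataclass, field


@dataclass
class AssetMetadata:
    name: str
    component: str                                   # "hive", "hbase", "druid", "kafka", ...
    business: dict = field(default_factory=dict)     # business meaning and rules
    technical: dict = field(default_factory=dict)    # source, structure, ETL script
    management: dict = field(default_factory=dict)   # ownership, access control


class Catalog:
    """In-memory stand-in for a unified metadata store (illustrative only)."""

    def __init__(self):
        self._assets = {}

    def register(self, asset):
        self._assets[(asset.component, asset.name)] = asset

    def search(self, keyword):
        """Find assets by name or business description, across all components."""
        kw = keyword.lower()
        hits = [
            a for a in self._assets.values()
            if kw in a.name.lower()
            or kw in a.business.get("description", "").lower()
        ]
        return sorted(hits, key=lambda a: (a.component, a.name))

    def owned_by(self, owner):
        """List asset names whose management metadata names this owner."""
        return sorted(
            a.name for a in self._assets.values()
            if a.management.get("owner") == owner
        )


catalog = Catalog()
catalog.register(AssetMetadata(
    name="dw.orders", component="hive",
    business={"description": "One row per paid order"},
    technical={"columns": ["order_id", "user_id", "amount"], "etl": "load_orders.sql"},
    management={"owner": "trade-team", "readers": ["analyst"]},
))
catalog.register(AssetMetadata(
    name="order_events", component="kafka",
    business={"description": "Real-time order status changes"},
    technical={"format": "json"},
    management={"owner": "trade-team"},
))
catalog.register(AssetMetadata(
    name="user_profile", component="hbase",
    business={"description": "Latest attributes per user"},
    management={"owner": "growth-team"},
))

for asset in catalog.search("order"):
    print(asset.component, asset.name)
```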

Metadata, often described as "data about data," includes information such as table names, creation details, field definitions, and relationships—essentially a dictionary that enables users to understand and trace data origins.

Data model management builds on metadata to create logical data models that map source tables, define relationships, and serve as implementation blueprints for thematic data domains, thereby ensuring data consistency and reducing redundancy.
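One way to picture a logical model that maps onto source tables is a plain mapping from each logical attribute to a physical table and column, checked against technical metadata. The table names, the `ORDER_MODEL` structure, and both helper functions below are assumptions made for illustration, not a description of any particular product.

```python
# Technical metadata: physical tables and their columns (hypothetical names).
SOURCE_TABLES = {
    "ods.orders": {"order_id", "user_id", "amount"},
    "ods.users": {"user_id", "city"},
}

# Logical model: each attribute maps to exactly one source (table, column).
ORDER_MODEL = {
    "entity": "order",
    "attributes": {
        "order_id": ("ods.orders", "order_id"),
        "buyer_id": ("ods.orders", "user_id"),
        "buyer_city": ("ods.users", "city"),
        "coupon_code": ("ods.orders", "coupon"),  # deliberately broken mapping
    },
}


def validate_model(model, source_tables):
    """Return the attributes whose mapped table or column does not exist."""
    broken = []
    for attr, (table, column) in model["attributes"].items():
        if column not in source_tables.get(table, set()):
            broken.append(attr)
    return broken


def upstream_tables(model):
    """Return the source tables the model depends on, for tracing data origins."""
    return sorted({table for table, _ in model["attributes"].values()})


print(validate_model(ORDER_MODEL, SOURCE_TABLES))
print(upstream_tables(ORDER_MODEL))
```

Validating the mapping before anything is built catches models that reference columns missing from the sources, which is one concrete way a shared model helps keep data consistent.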

The data service platform delivers business value through two key applications: a self‑service analytics (BI) platform and a tag management system. The BI platform supports data source integration, data modeling, processing, analysis, visualization, and content distribution, while also providing role‑based access control and operational management.
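Role‑based access control in a BI platform can be reduced to a lookup from user to roles to permitted actions per resource. The sketch below assumes hypothetical role names, users, and a `sales_dashboard` resource; real platforms usually add row‑level and column‑level rules on top of this.

```python
# Hypothetical role definitions: role -> resource -> allowed actions.
ROLE_PERMISSIONS = {
    "analyst": {"sales_dashboard": {"view"}},
    "admin": {"sales_dashboard": {"view", "edit", "share"}},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {
    "alice": ["analyst"],
    "bob": ["analyst", "admin"],
}


def is_allowed(user, resource, action):
    """A user may act if any of their roles grants that action on the resource."""
    return any(
        action in ROLE_PERMISSIONS.get(role, {}).get(resource, set())
        for role in USER_ROLES.get(user, [])
    )


print(is_allowed("alice", "sales_dashboard", "view"))
print(is_allowed("alice", "sales_dashboard", "edit"))
```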

The tag management system acts as the backbone of user‑profile construction, unifying disparate user identifiers across business lines, managing tag taxonomies, and exposing tag data services via standardized APIs to enable personalized marketing and other downstream applications.
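Identifier unification is often modeled as grouping identifiers that are known to belong to the same person. The sketch below uses a union‑find structure for that grouping, which is a choice made here for illustration rather than something the article prescribes; the identifier formats (`app:1001`, `web:abc`, `phone:555`), tag names, and class names are all hypothetical.

```python
class IdentityGraph:
    """Groups identifiers from different business lines into one user."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        self._parent.setdefault(identifier, identifier)
        while self._parent[identifier] != identifier:
            # Path halving keeps lookups fast as chains grow.
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def link(self, a, b):
        """Record that two identifiers belong to the same user."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # The smaller string becomes the root, so the result is deterministic.
            self._parent[max(root_a, root_b)] = min(root_a, root_b)

    def members(self, identifier):
        """Return every identifier in the same group, sorted."""
        root = self._find(identifier)
        return sorted(i for i in list(self._parent) if self._find(i) == root)


class TagService:
    """Stores tags per raw identifier and serves them per unified user."""

    def __init__(self, identities):
        self._identities = identities
        self._tags = {}  # raw identifier -> {tag name: value}

    def set_tag(self, identifier, tag, value):
        self._tags.setdefault(identifier, {})[tag] = value

    def get_tags(self, identifier):
        """Merge the tags of every identifier linked to this one."""
        merged = {}
        for member in self._identities.members(identifier):
            merged.update(self._tags.get(member, {}))
        return merged


ids = IdentityGraph()
ids.link("app:1001", "phone:555")   # app account bound to a phone number
ids.link("web:abc", "phone:555")    # web account bound to the same number

tags = TagService(ids)
tags.set_tag("app:1001", "membership", "gold")
tags.set_tag("web:abc", "preferred_category", "books")

print(tags.get_tags("phone:555"))
```

Because tags are merged at read time, a caller can query with any one identifier and still receive the full profile; a downstream API would wrap `get_tags` behind whatever interface the platform standardizes on.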

Beyond BI and tag management, enterprises should tailor data applications to their specific industry characteristics to fully exploit the value of the data middle platform.

Tags: big data, metadata, business intelligence, data platform, tag management, data architecture