Formal Naming of Data Schemas, Structures, and Models: Distinctions and Hierarchies
The article explains how to formally name data schemas, structures, and models, clarifies the differences between internal, external, logical, and physical schemas, and proposes a systematic hierarchy—including strategic and tactical schemas—to improve data architecture and reduce confusion in enterprise data design.
After discussing data architecture and data structures, the article asks what distinguishes data architecture from data structure and how data schemas, structures, and models should be formally named.
Historically, many attempts to name specific data structures and the models that contain them—often referencing the Zachman framework—have produced a confusing and chaotic set of terms.
A better method is to formally name each data schema (outline), then name data structures based on the schema name and domain, and finally name data models according to the structures they contain.
A schema is a diagrammatic representation, a structured framework or outline; a data schema is the diagram of a data structure. A data structure represents the arrangement, relationships, and content of data resources and must be documented with formal names, comprehensive definitions, and precise integrity rules.
With the advent of databases in the mid‑20th century, two schemas were identified: an internal (physical) schema describing how data is stored, and an external (view) schema describing how applications use the data.
Because internal and external schemas often differ greatly, a third “conceptual” schema was defined as the common denominator, though its meaning has been ambiguous and misused.
To clarify, the internal schema was renamed physical schema, the external schema business schema, and the conceptual schema logical schema, establishing a development order from business to logical to physical.
As business professionals became involved in data design, they began asking what data was actually being normalized. Traditional normalization techniques lack business context and do not say, which led to the conclusion that business data schemas are normalized into logical schemas.
However, normalizing business data schemas does not directly yield logical schemas. Normalization separates data, and there is no formal technique within it for grouping similar data back together (e.g., all employee data), which leaves a gap between the business schema and the logical schema.
Data view schemas fill that gap. Each data view schema is the result of normalizing one business schema; a separate optimization step then groups the views that describe similar data (employee, student, etc.) into a single logical schema, which prevents the proliferation of disparate structures.
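The grouping step can be pictured with a small sketch. It assumes each data view is reduced to a subject name plus a set of attribute names; the views, attribute names, and the `optimize` function are hypothetical illustrations, not part of the original article.

```python
from collections import defaultdict

# Hypothetical data views, as if produced by normalizing separate
# business schemas. Each view is (subject, set of attribute names).
data_views = [
    ("employee", {"employee_id", "name", "hire_date"}),
    ("employee", {"employee_id", "name", "salary"}),
    ("student", {"student_id", "name"}),
    ("employee", {"employee_id", "department"}),
]


def optimize(views):
    """Merge all views of the same subject into one logical structure."""
    merged = defaultdict(set)
    for subject, attributes in views:
        merged[subject] |= attributes  # union the attributes per subject
    # Sort attribute names so the result is deterministic.
    return {subject: sorted(attrs) for subject, attrs in merged.items()}


logical = optimize(data_views)
```

Here three separate employee views collapse into one employee structure, which is the effect the optimization step is meant to have.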
Before distributed data processing, the four-schema sequence (business → data view → logical → physical) worked well. With distributed processing, going directly from logical to physical causes confusion; adding a deployment schema between the two resolves it, with logical schemas turned into deployment schemas by a “data de-optimization” process.
The resulting sequence is: business schema → normalized to data view schema → optimized to logical schema → de‑optimized to deployment schema → denormalized to physical schema, forming five basic schemas used in formal data resource design.
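As a minimal sketch, the sequence can be written down as an ordered set of schemas with the process that leads from each one to the next. The `Schema` enum, the `TRANSITIONS` table, and `design_path` are hypothetical names chosen for illustration.

```python
from enum import Enum


class Schema(Enum):
    BUSINESS = "business"
    DATA_VIEW = "data view"
    LOGICAL = "logical"
    DEPLOYMENT = "deployment"
    PHYSICAL = "physical"


# Process that turns each schema into the next one in the sequence.
TRANSITIONS = {
    Schema.BUSINESS: ("normalization", Schema.DATA_VIEW),
    Schema.DATA_VIEW: ("optimization", Schema.LOGICAL),
    Schema.LOGICAL: ("de-optimization", Schema.DEPLOYMENT),
    Schema.DEPLOYMENT: ("denormalization", Schema.PHYSICAL),
}


def design_path(start=Schema.BUSINESS):
    """List the remaining design steps from `start` to the physical schema."""
    steps = []
    current = start
    while current in TRANSITIONS:
        process, following = TRANSITIONS[current]
        steps.append(f"{current.value} --{process}--> {following.value}")
        current = following
    return steps


for step in design_path():
    print(step)
```

Keeping the processes in a single table makes it easy to see that five schemas are connected by exactly four processes.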
Because conceptual schema terminology remains problematic, strategic and tactical data schemas are introduced as terms that are meaningful to business users. Strategic schemas represent executive-level perspectives, and tactical schemas represent management-level perspectives. Both sit on the logical schema and can be specialized or generalized.
These seven formally named schemas can be prefixed with domain areas (e.g., facility strategic data structure) to provide clear, standardized names that aid both business and data‑management professionals.
A data model must include formal names, comprehensive definitions, and precise integrity rules in addition to the data structures it contains. Naming the model after those structures yields names such as “facility strategic model” or “employee logical data model”.
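The naming convention can be sketched as a pair of helpers that build a name from a domain and one of the seven schema types. The function names and the fixed "data structure" / "data model" suffixes are assumptions made for this example; the article itself gives both “facility strategic model” and “employee logical data model” as model names.

```python
# The seven formally named schema types described in the article.
SCHEMA_TYPES = (
    "business",
    "data view",
    "logical",
    "deployment",
    "physical",
    "strategic",
    "tactical",
)


def _check(schema_type):
    """Reject schema types that are not one of the seven formal names."""
    if schema_type not in SCHEMA_TYPES:
        raise ValueError(f"unknown schema type: {schema_type!r}")


def structure_name(domain, schema_type):
    """Build a data structure name from a domain and a schema type."""
    _check(schema_type)
    return f"{domain} {schema_type} data structure"


def model_name(domain, schema_type):
    """Build a data model name from the structure it contains."""
    _check(schema_type)
    return f"{domain} {schema_type} data model"


print(structure_name("facility", "strategic"))
print(model_name("employee", "logical"))
```

Restricting the schema type to a closed list is one way to keep ad hoc terms such as "conceptual" out of formal names.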
Data-management professionals must formally name schemas, structures, and models, and must follow rigorous processes of normalization, optimization, de-optimization, and specialization. Otherwise they risk chaos, increased data discrepancies, and insufficient support for business information needs.
Source: http://jiagoushi.pro/node/1019
Architects Research Society