
Formal Naming of Data Schemas, Structures, and Models: Distinctions and Methodology

The article explains the differences between data schemas, data structures, and data models, proposes a systematic naming approach, and outlines a five-schema architecture (business, data view, logical, deployment, and physical schemas), while addressing terminology challenges and the normalization, optimization, and related processes that connect the schemas.

Architects Research Society

After discussing data architecture and data structures, the article raises the question of how data schemas, data structures, and data models should be formally named and distinguished.

Historically, many attempts have been made to name specific data structures and the models that contain them, often with reference to the Zachman framework. The result has been a confusing and chaotic set of names.

A better method is proposed: first formally name each data schema (outline), then name data structures based on the schema and domain, and finally name data models according to the structures they contain.

A schema is defined as a diagrammatic representation, a structured framework or outline; a data schema is essentially a diagram of a data structure. A data structure represents the arrangement, relationships, and content of data resources and is tied to formal names, comprehensive definitions, and precise integrity rules.

With the emergence of databases, two schemas were identified: an internal (physical) schema describing how data is stored, and an external (business) schema describing how applications use the data.

Internal and external schemas often differ, and a single internal schema may have to serve several external schemas. This led to the concept of a conceptual (logical) schema that acts as a common denominator between the two.

The internal schema was renamed to physical schema, the external schema to business schema, and the conceptual schema to logical schema, establishing a development order of business → logical → physical.

The article notes that normalizing a business schema directly into a logical schema separates the data, which makes it difficult to group similar entities (e.g., employees, students). Without a technique for regrouping them, gaps appear in the data design.

To address this, the article introduces the data view schema as the result of normalizing a business schema; a subsequent optimization step recombines similar data into the logical schema. The sequence becomes: business schema → data view schema → logical schema → physical schema.

Before distributed data processing, the four-schema sequence worked well. Once data had to be distributed, a deployment schema was added between the logical and physical schemas to resolve the resulting confusion; logical schemas are turned into deployment schemas through a de-optimization process.

The final sequence is: business schema, normalized to data view schema, optimized to logical schema, de‑optimized to deployment schema, and finally denormalized to physical schema, forming five basic schemas used in formal data resource design.
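The sequence above can be sketched as a small lookup table. This is only an illustrative sketch, not the article's own notation: the names `Schema`, `TRANSITIONS`, and `design_path` are assumptions introduced here, and the process labels are taken from the paragraph above.

```python
from enum import Enum


class Schema(Enum):
    """The five detailed schemas named in the text."""
    BUSINESS = "business"
    DATA_VIEW = "data view"
    LOGICAL = "logical"
    DEPLOYMENT = "deployment"
    PHYSICAL = "physical"


# Hypothetical table: for each schema, the process that produces the next one.
TRANSITIONS = {
    Schema.BUSINESS: ("normalization", Schema.DATA_VIEW),
    Schema.DATA_VIEW: ("optimization", Schema.LOGICAL),
    Schema.LOGICAL: ("de-optimization", Schema.DEPLOYMENT),
    Schema.DEPLOYMENT: ("denormalization", Schema.PHYSICAL),
}


def design_path(start=Schema.BUSINESS):
    """Return (source, process, target) steps from `start` to the physical schema."""
    steps = []
    current = start
    while current in TRANSITIONS:
        process, target = TRANSITIONS[current]
        steps.append((current.value, process, target.value))
        current = target
    return steps


if __name__ == "__main__":
    for source, process, target in design_path():
        print(f"{source} schema --{process}--> {target} schema")
```

Keeping the processes in one table makes the development order explicit and lets a design path start from any intermediate schema, for example `design_path(Schema.LOGICAL)`.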

These five basic schemas are effective for detailed design, but the term "conceptual schema" remains problematic: it can be interpreted in many ways and has no clear meaning for business professionals.

To solve this, the article proposes strategic and tactical data schemas: strategic schemas represent executive‑level perspectives, while tactical schemas represent management‑level perspectives.

Strategic and tactical schemas are logical constructs based on an organization’s perception of its business world; they sit atop the logical schema, can be refined into each other, and through specialization generate logical schemas, while promotion can reverse the process.

The data schema landscape therefore has two main parts: general schemas (strategic and tactical) and detailed schemas (business, data view, logical, deployment, and physical). Counting the strategic, tactical, and detailed levels as layers, with the five schemas in the detailed layer, gives a three-layer, five-schema concept.

The article discusses the possibility of eight generic schemas but concludes they may be meaningless compared to the formal five‑schema approach.

The formal names of the seven schemas (five detailed plus two general) can be prefixed with a domain area to create clear data structure names such as "facility strategic data structure" or "employee logical data structure," which helps both business and data management professionals.
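A minimal sketch of this naming rule follows, assuming a name is simply the domain, the schema name, and the suffix "data structure" joined by spaces, as in the two examples above. The function name `data_structure_name` and the validation behaviour are assumptions made for illustration.

```python
# The seven schema names listed in the text: two general plus five detailed.
GENERAL_SCHEMAS = ("strategic", "tactical")
DETAILED_SCHEMAS = ("business", "data view", "logical", "deployment", "physical")
ALL_SCHEMAS = GENERAL_SCHEMAS + DETAILED_SCHEMAS


def data_structure_name(domain: str, schema: str) -> str:
    """Build a name of the form '<domain> <schema> data structure'.

    Hypothetical helper: rejects schema names outside the seven formal ones
    so that ad hoc names cannot creep into the vocabulary.
    """
    domain = domain.strip().lower()
    schema = schema.strip().lower()
    if not domain:
        raise ValueError("a domain area is required")
    if schema not in ALL_SCHEMAS:
        raise ValueError(f"unknown schema {schema!r}; expected one of {ALL_SCHEMAS}")
    return f"{domain} {schema} data structure"


if __name__ == "__main__":
    print(data_structure_name("facility", "strategic"))
    print(data_structure_name("employee", "logical"))
```

Restricting the schema part to a fixed list is one way to keep names consistent across teams; the article itself does not prescribe an enforcement mechanism.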

A data model must include formal names, comprehensive definitions, and precise integrity rules; when combined with data structures, models follow the same naming conventions (e.g., "facility strategic model").
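The three required parts of a data model can be checked mechanically. The sketch below is an assumption-laden illustration: the class `DataModel`, its fields, and the sample "Facility" entry are all hypothetical, and only the naming pattern ("facility strategic model") comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class DataModel:
    """Hypothetical container for the three parts a data model must include."""
    domain: str
    schema: str
    formal_names: list = field(default_factory=list)
    definitions: dict = field(default_factory=dict)      # formal name -> definition
    integrity_rules: dict = field(default_factory=dict)  # formal name -> list of rules

    @property
    def name(self) -> str:
        # Same convention as data structure names, with the suffix 'model'.
        return f"{self.domain} {self.schema} model"

    def missing_parts(self) -> list:
        """List every formal name that lacks a definition or integrity rules."""
        if not self.formal_names:
            return ["no formal names"]
        problems = []
        for formal_name in self.formal_names:
            if not self.definitions.get(formal_name):
                problems.append(f"{formal_name}: no definition")
            if not self.integrity_rules.get(formal_name):
                problems.append(f"{formal_name}: no integrity rules")
        return problems


if __name__ == "__main__":
    # Illustrative content only; the definition text is invented for the example.
    model = DataModel(
        domain="facility",
        schema="strategic",
        formal_names=["Facility"],
        definitions={"Facility": "A site the organization owns or operates."},
    )
    print(model.name)
    print(model.missing_parts())
```

A model with an empty `missing_parts()` result has a formal name, a definition, and at least one integrity rule for every entry, which is the minimum the paragraph above asks for.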

The article emphasizes that data management professionals must formally name schemas, structures, and models, develop them within a unified architecture, and apply formal processes for normalization, optimization, de‑optimization, and specialization; otherwise, organizations face confusion, increased data discrepancies, and insufficient support for business information needs.

Source: http://jiagoushi.pro/node/1019

Data Modeling, Database Design, Data Structure, normalization, data schema