Why Choosing the Right Data Model Matters: Relational vs Document vs Graph
This article explains how different data models—from relational tables to JSON documents and graph structures—affect software design, storage, querying, and scalability, illustrating concepts with a resume example and discussing trade‑offs such as impedance mismatch, normalization, and multi‑entity relationships.
Data models shape not only how software is written but also how developers conceptualize the problems they solve.
Most applications are built by layering one data model on top of another, with each layer represented in terms of the layer below it.
Developers observe the real world (people, organizations, goods, actions, money flows, sensors) and model it using objects or data structures specific to the application.
When these structures must be stored, a generic data model such as JSON, XML, relational tables, or a graph is used.
The engineers who build the database software decide how to represent that JSON/XML/relational/graph data as bytes in memory, on disk, or on a network, in a way that lets the data be queried, searched, and otherwise manipulated.
Hardware engineers ultimately represent bytes as electrical signals, optical pulses, or magnetic fields.
1 Relational and Document Models
SQL, based on Edgar Codd’s 1970 relational model, organizes data into relations (tables) composed of unordered tuples (rows). Although initially theoretical, relational DBMSs became the dominant tool for storing and querying structured data in the mid‑1980s and have remained prevalent for over three decades.
Typical relational use cases include transaction processing (e.g., sales, banking, ticketing) and batch processing (e.g., invoices, payroll).
The relational model hides implementation details behind a simpler interface, which helped it outlast the competing network and hierarchical models.
Modern relational systems have expanded beyond pure business data to support online publishing, forums, e‑commerce, games, and SaaS.
2 NoSQL
In the 21st century, NoSQL emerged as a challenger to relational dominance. The term originally served as a catchy label for open‑source, distributed, non‑relational databases and now is often interpreted as “Not only SQL.”
Drivers for adopting NoSQL include:
Better scalability for massive datasets or high write throughput.
Preference for free, open‑source software over commercial products.
Inability of relational models to support certain queries efficiently.
Desire for more dynamic and expressive data models.
Because different applications have different requirements, relational databases will likely continue to be used alongside non-relational stores, an approach sometimes called polyglot persistence.
3 Object‑Relational Mismatch
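Most modern applications are written in object-oriented languages, so storing their data in relational tables requires a cumbersome translation layer between objects in code and rows and columns in the database. This disconnect between the two models is known as impedance mismatch. ORM frameworks like Hibernate reduce the boilerplate code needed for that translation but cannot fully hide the differences between the object and relational models.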
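The translation step can be sketched in a few lines. This is a minimal illustration, not how any particular ORM works; the `User` and `Position` classes, the `to_rows` helper, and the two target tables are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Position:
    job_title: str
    organization: str


@dataclass
class User:
    user_id: int
    first_name: str
    last_name: str
    positions: list[Position] = field(default_factory=list)


def to_rows(user: User) -> tuple[tuple, list[tuple]]:
    """Split one object graph into rows for two hypothetical tables."""
    user_row = (user.user_id, user.first_name, user.last_name)
    # The nested list has no direct relational equivalent: each element
    # becomes its own row that points back to the parent via user_id.
    position_rows = [
        (user.user_id, p.job_title, p.organization) for p in user.positions
    ]
    return user_row, position_rows


user = User(1, "Ada", "Lovelace", [Position("Analyst", "Example Corp")])
user_row, position_rows = to_rows(user)
```

In code the positions live inside the user object, while in the tables they are separate rows tied to the user only by a shared `user_id`. That gap is what the conversion layer has to bridge in both directions.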
Resume Case
A LinkedIn‑style resume illustrates the mismatch. In a relational schema, a user is identified by a unique user_id. Core fields such as first_name and last_name appear once in the users table, while each user may have multiple jobs, education entries, and contacts, represented in separate tables linked by foreign keys.
The traditional relational representation (before SQL:1999) normalizes these one-to-many relationships into separate tables with foreign-key references back to the users table. Later versions of the SQL standard added structured types and XML data, and many databases now also offer JSON columns, so multi-valued data can be stored in a single row. How well such columns can be queried and indexed varies from one database to another.
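A normalized layout along these lines might look as follows. This is a sketch using Python's built-in `sqlite3` module; the table and column names beyond `users`, `user_id`, `first_name`, and `last_name` are assumptions made for illustration, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id    INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL
    );
    CREATE TABLE positions (
        position_id  INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL REFERENCES users(user_id),
        job_title    TEXT NOT NULL,
        organization TEXT NOT NULL
    );
    CREATE TABLE education (
        education_id INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL REFERENCES users(user_id),
        school_name  TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users VALUES (1, 'Ada', 'Lovelace')")
conn.executemany(
    "INSERT INTO positions (user_id, job_title, organization) VALUES (?, ?, ?)",
    [(1, "Analyst", "Example Corp"), (1, "Engineer", "Sample Ltd")],
)
conn.execute(
    "INSERT INTO education (user_id, school_name) VALUES (1, 'Example University')"
)

# Reassembling the profile takes one query per child table (or a multi-way join).
positions = conn.execute(
    "SELECT job_title, organization FROM positions "
    "WHERE user_id = ? ORDER BY position_id",
    (1,),
).fetchall()
schools = conn.execute(
    "SELECT school_name FROM education WHERE user_id = ?", (1,)
).fetchall()
```

Each one-to-many relationship gets its own table, and the profile only exists as a whole once the application has gathered the rows that share a `user_id`.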
Storing the resume as a self‑contained JSON document places all related information in one place, enabling a single query to retrieve the entire profile.
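As a rough illustration, a resume stored as one document could look like the JSON below, parsed here with Python's standard `json` module. The field names and values are invented for the example and are not taken from any real profile format.

```python
import json

resume_json = """
{
  "user_id": 1,
  "first_name": "Ada",
  "last_name": "Lovelace",
  "positions": [
    {"job_title": "Analyst", "organization": "Example Corp"},
    {"job_title": "Engineer", "organization": "Sample Ltd"}
  ],
  "education": [
    {"school_name": "Example University"}
  ],
  "contact_info": {"blog": "https://example.com"}
}
"""

# One read returns the whole tree; the one-to-many items are nested lists.
resume = json.loads(resume_json)
job_titles = [p["job_title"] for p in resume["positions"]]
```

The nesting makes the tree structure of the profile explicit, which is why no joins are needed to load it.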
Using IDs instead of raw text reduces redundancy: the meaningful string (e.g., a region name) is stored once, and all references use the numeric ID. This approach simplifies updates—changing the name in one place automatically propagates to all references, avoiding data inconsistency.
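The effect of a single update can be seen in a small sketch, again using `sqlite3`. The `regions` table, its columns, and the sample names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (region_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        region_id INTEGER NOT NULL REFERENCES regions(region_id)
    );
    INSERT INTO regions VALUES (10, 'Greater Example Area');
    INSERT INTO users VALUES (1, 'Ada', 10), (2, 'Grace', 10);
""")

# One write to the lookup table; no user row is touched.
conn.execute("UPDATE regions SET name = 'Example Metro Area' WHERE region_id = 10")

# Every user that references region 10 now resolves to the new name.
rows = conn.execute("""
    SELECT u.name, r.name
    FROM users AS u JOIN regions AS r ON r.region_id = u.region_id
    ORDER BY u.user_id
""").fetchall()
```

Had each user row carried its own copy of the region name, the rename would have required finding and rewriting every copy.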
However, normalizing data this way creates many-to-one relationships (e.g., many users living in the same region). Relational databases express these naturally through joins, whereas document stores often have weak join support or none at all.
If a database does not support joins, the application must perform multiple queries and assemble the results, effectively moving join work from the database to the application layer.
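An application-side join might be sketched like this. The two dictionaries stand in for collections in a store without joins, and `fetch_users`, `fetch_regions`, and `users_with_region_names` are hypothetical helpers, not the API of any real database client.

```python
# Hypothetical in-memory stand-ins for two collections in a store without joins.
users = {
    1: {"name": "Ada", "region_id": 10},
    2: {"name": "Grace", "region_id": 10},
    3: {"name": "Alan", "region_id": 20},
}
regions = {10: {"name": "Example Metro Area"}, 20: {"name": "Sample Valley"}}


def fetch_users(user_ids):
    """First query: load the requested user documents."""
    return [users[uid] for uid in user_ids]


def fetch_regions(region_ids):
    """Second query: load only the regions the users refer to."""
    return {rid: regions[rid] for rid in region_ids}


def users_with_region_names(user_ids):
    """Application-side join: two lookups stitched together in code."""
    found = fetch_users(user_ids)
    needed = {u["region_id"] for u in found}
    by_id = fetch_regions(needed)
    return [(u["name"], by_id[u["region_id"]]["name"]) for u in found]


result = users_with_region_names([1, 3])
```

Against a real store, each fetch would be a network round trip, so collecting the needed IDs first and requesting them together keeps the number of queries at two rather than one per user.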
As applications evolve, new features may require richer relationships. For example, organizations and schools might become entities in their own right, each with its own metadata (logos, news feeds), so a resume would reference them instead of storing their names as plain strings. A "recommendation" feature, where one user endorses another, would likewise require many-to-many links between user records.
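One common relational way to model such many-to-many links is an association table with two foreign keys. The sketch below assumes a hypothetical `recommendations` table and invented sample data; it is one possible layout, not the design of any real product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE recommendations (
        author_id    INTEGER NOT NULL REFERENCES users(user_id),
        recipient_id INTEGER NOT NULL REFERENCES users(user_id),
        body         TEXT NOT NULL,
        PRIMARY KEY (author_id, recipient_id)
    );
    INSERT INTO users VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
    INSERT INTO recommendations VALUES
        (2, 1, 'Great collaborator'),
        (3, 1, 'Sharp analyst'),
        (1, 3, 'Creative thinker');
""")

# A user can write many recommendations and receive many: both sides are "many".
received = conn.execute("""
    SELECT author.name, rec.body
    FROM recommendations AS rec
    JOIN users AS author ON author.user_id = rec.author_id
    WHERE rec.recipient_id = ?
    ORDER BY author.user_id
""", (1,)).fetchall()
```

Because each recommendation stores only the author's ID, a change to the author's name or photo shows up on every profile that displays the recommendation.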
4 Many‑to‑One and Many‑to‑Many
Standardizing identifiers (e.g., region_id, industry_id) offers several benefits:
Consistent style and spelling across all resumes.
Elimination of ambiguity (e.g., distinguishing cities with the same name).
Ease of bulk updates when names change.
Facilitated localization by mapping IDs to language‑specific labels.
Improved search capabilities through structured filters.
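The localization benefit from the list above can be sketched as a lookup keyed by ID and language. The `REGION_LABELS` table, the `region_label` helper, and the labels themselves are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical label table keyed by (region_id, language code).
REGION_LABELS = {
    (10, "en"): "Example Area",
    (10, "de"): "Beispielregion",
}


def region_label(region_id: int, lang: str, fallback: str = "en") -> str:
    """Resolve an ID to a display label, falling back to a default language."""
    label = REGION_LABELS.get((region_id, lang))
    if label is None:
        label = REGION_LABELS[(region_id, fallback)]
    return label


# The stored profile holds only the ID; the label is chosen at display time.
profile = {"name": "Ada", "region_id": 10}
german = region_label(profile["region_id"], "de")
french = region_label(profile["region_id"], "fr")  # no French label, uses "en"
```

Adding a new language then means adding rows to the label table, with no change to any stored resume.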
Storing IDs reduces duplication, ensuring that human-readable information resides in a single location. When that information changes, only the one record that the ID points to needs updating, which prevents stale copies and reduces write overhead.
Conversely, storing raw text in every record leads to redundancy, higher storage costs, and the risk of inconsistent data.
When a database lacks native join support, applications must simulate joins with multiple queries or in‑memory caching, shifting relational logic from the DBMS to the codebase.
JavaEdge
First‑line development experience at multiple leading tech firms; now a software architect at a Shanghai state‑owned enterprise and founder of Programming Yanxuan. Nearly 300k followers online; expertise in distributed system design, AIGC application development, and quantitative finance investing.
