
Mastering One-to-N Relationships in MongoDB: Practical Design Patterns and Tips

This multi‑part guide explains how to model One‑to‑N relationships in MongoDB, covering basic patterns for one‑to‑few, one‑to‑many, and one‑to‑squillions, then advancing to two‑way referencing and denormalization, and finally offering a concise set of rules of thumb for choosing the right schema design.

Aotu Lab

Translated from the official MongoDB blog, this article addresses a common question: how to implement One‑to‑N relationships in MongoDB. There are several valid approaches, and the right one depends on the application.

Part 1

Basics: Modeling One‑to‑Few

Embedding an array of sub‑documents is ideal for small cardinalities such as a person's addresses. Example:

db.person.findOne()
{
  name: 'Kate Monster',
  ssn: '123-456-7890',
  addresses: [
    { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
    { street: '123 Avenue Q', city: 'New York', cc: 'USA' }
  ]
}

Embedding avoids extra queries, since one read returns the person together with all addresses, but the embedded documents cannot be accessed as stand‑alone entities.
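
As a minimal sketch of the single query that embedding allows, written in MongoDB shell style: the helper name getAddresses and the explicit db parameter are assumptions made for illustration and do not come from the original article.

```javascript
// Hypothetical helper: one query returns every address for a person,
// because the addresses live inside the person document itself.
function getAddresses(db, name) {
  // The projection asks for the embedded array only.
  const person = db.person.findOne({ name: name }, { addresses: 1, _id: 0 });
  // findOne returns null when nothing matches; treat that as "no addresses".
  return person && person.addresses ? person.addresses : [];
}
```

In the shell this might be called as getAddresses(db, 'Kate Monster'). There is no second collection to consult, which is the main appeal of the pattern.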

Basics: One‑to‑Many

When a product can have hundreds of parts, store an array of ObjectID references in the product document:

db.parts.findOne()
{
  _id: ObjectID('AAAA'),
  partno: '123-aff-456',
  name: '#4 grommet',
  qty: 94,
  cost: 0.94,
  price: 3.99
}

db.products.findOne()
{
  name: 'left-handed smoke shifter',
  manufacturer: 'Acme Corp',
  catalog_number: 1234,
  parts: [ ObjectID('AAAA'), ObjectID('F17C'), ObjectID('D2AA') ]
}

Application‑level joins retrieve the related parts. An index on products.catalog_number keeps the product lookup efficient; parts._id is always indexed by MongoDB, so the second query needs no extra index.
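
The application‑level join could be sketched as follows in MongoDB shell style. The function name findPartsForProduct and the explicit db argument are illustrative assumptions; the collection and field names are the ones used in the examples above.

```javascript
// Hypothetical helper: application-level join from a product to its parts.
function findPartsForProduct(db, catalogNumber) {
  // Step 1: fetch the product, which holds the array of part ObjectIDs.
  const product = db.products.findOne({ catalog_number: catalogNumber });
  if (product === null) {
    return [];
  }
  // Step 2: fetch every referenced part in one query using $in.
  return db.parts.find({ _id: { $in: product.parts } }).toArray();
}
```

Two round trips are needed instead of one, but each is a simple indexed lookup.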

Basics: One‑to‑Squillions

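For massive log data, an array on the host could overflow MongoDB's 16 MB document size limit even if it held only ObjectIDs. Use parent‑referencing instead: each log entry stores the host's ObjectID, while the host document stores minimal identifying fields.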

db.hosts.findOne()
{
  _id: ObjectID('AAAB'),
  name: 'goofy.example.com',
  ipaddr: '127.66.66.66'
}

db.logmsg.findOne()
{
  time: ISODate('2014-03-28T09:42:41.382Z'),
  message: 'cpu is on fire!',
  host: ObjectID('AAAB')
}

Application‑level joins can fetch the latest 5,000 messages for a host efficiently.
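
A minimal sketch of that lookup in MongoDB shell style follows. The function name latestMessagesForHost, the explicit db argument, and the suggested index are assumptions for illustration.

```javascript
// Hypothetical helper: application-level join from a host to its newest log messages.
// An index such as { host: 1, time: -1 } on logmsg is assumed to support this query.
function latestMessagesForHost(db, ipaddr, limit = 5000) {
  // Step 1: find the parent host document.
  const host = db.hosts.findOne({ ipaddr: ipaddr });
  if (host === null) {
    return [];
  }
  // Step 2: find the children that point at it, newest first.
  return db.logmsg.find({ host: host._id })
    .sort({ time: -1 })
    .limit(limit)
    .toArray();
}
```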

Part 2

This section introduces more advanced designs: two‑way referencing and denormalization.

Intermediate: Two‑Way Referencing

Store references in both directions: a Person document contains an array of Task ObjectIDs, and each Task document also stores the owning Person's ObjectID. This enables fast look‑ups from either side, but reassigning a task means updating both the Task and the Person documents, so the change cannot be made with a single atomic update.

db.person.findOne()
{
  _id: ObjectID('AAF1'),
  name: 'Kate Monster',
  tasks: [ ObjectID('ADF9'), ObjectID('AE02'), ObjectID('AE73') ]
}

db.tasks.findOne()
{
  _id: ObjectID('ADF9'),
  description: 'Write lesson plan',
  due_date: ISODate('2014-04-01'),
  owner: ObjectID('AAF1')
}
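
The cost of reassignment can be sketched as follows. The function name reassignTask and the explicit db argument are assumptions for illustration; the writes are shown as separate, non‑atomic operations, which is the trade‑off this pattern carries.

```javascript
// Hypothetical helper: move a task from one person to another.
// Both sides of the relationship must change: the Task's owner,
// and the tasks arrays of the old and the new Person.
function reassignTask(db, taskId, oldOwnerId, newOwnerId) {
  // Point the task at its new owner.
  db.tasks.updateOne({ _id: taskId }, { $set: { owner: newOwnerId } });
  // Remove the task from the previous owner's array.
  db.person.updateOne({ _id: oldOwnerId }, { $pull: { tasks: taskId } });
  // Add it to the new owner's array ($addToSet avoids duplicates on retry).
  db.person.updateOne({ _id: newOwnerId }, { $addToSet: { tasks: taskId } });
}
```

If a failure occurs between these writes, the two sides can disagree until the remaining updates are retried, so the application has to tolerate or repair that state.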

Intermediate: Denormalizing with One‑to‑Many

Denormalize fields that are read often but updated rarely. For the product‑parts example, embed part names directly in the product document to avoid a join when displaying names, while still keeping ObjectID references for full part details.

db.products.findOne()
{
  name: 'left-handed smoke shifter',
  manufacturer: 'Acme Corp',
  catalog_number: 1234,
  parts: [
    { id: ObjectID('AAAA'), name: '#4 grommet' },
    { id: ObjectID('F17C'), name: 'fan blade assembly' },
    { id: ObjectID('D2AA'), name: 'power switch' }
  ]
}

When the read‑to‑write ratio is high, denormalization improves performance; otherwise the extra update cost outweighs the benefit.
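
Both sides of that trade‑off can be sketched briefly. The helper names partNames and renamePart, and the explicit db argument, are assumptions for illustration; the document shape is the denormalized product shown above.

```javascript
// Read side: part names come straight from the product document, no join needed.
function partNames(product) {
  return product.parts.map((part) => part.name);
}

// Write side (hypothetical helper): renaming a part now touches the part itself
// and every product that carries a denormalized copy of its name.
function renamePart(db, partId, newName) {
  db.parts.updateOne({ _id: partId }, { $set: { name: newName } });
  // The positional $ operator updates the array element matched by 'parts.id'.
  return db.products.updateMany(
    { 'parts.id': partId },
    { $set: { 'parts.$.name': newName } }
  );
}
```

The read is cheaper on every page view, while the rename costs one extra multi‑document update, which is why the pattern suits fields that rarely change.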

Intermediate: Denormalizing with One‑to‑Squillions

For log messages, embed the host's IP address or hostname directly in each log document, allowing a single query to filter by IP without a join.

db.logmsg.findOne()
{
  time: ISODate('2014-03-28T09:42:41.382Z'),
  message: 'cpu is on fire!',
  ipaddr: '127.66.66.66',
  host: ObjectID('AAAB')
}

You can also denormalize into the "one" side. When a host document should hold only its most recent 1,000 messages, use $push with the $each, $sort, and $slice modifiers to maintain a bounded array in the host document.

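// Insert the log entry, then push a copy onto the host's array, keeping only the latest 1000.
// log_message_here, log_ip, and host_id are placeholders supplied by the application.
const logmsg = { time: new Date(), message: log_message_here, ipaddr: log_ip, host: host_id };
db.logmsg.insertOne(logmsg);  // insertOne replaces the deprecated save()

db.hosts.updateOne(
  { _id: host_id },
  { $push: { logmsgs: {
      $each: [ { time: logmsg.time, message: logmsg.message } ],
      $sort: { time: 1 },   // order the array oldest first
      $slice: -1000         // keep the last (most recent) 1000 entries
  } } }
);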
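
To make the effect of those modifiers concrete, here is a plain JavaScript simulation of what the server does to the array. It is only an in‑memory illustration under the assumption that limit is at least 1; the function name pushBounded is hypothetical and nothing here talks to MongoDB.

```javascript
// In-memory illustration of $push with $each, $sort: { time: 1 }, $slice: -limit.
// Assumes limit >= 1. Returns a new array and leaves the input untouched.
function pushBounded(logmsgs, newEntries, limit) {
  // $each: append all new entries.
  const merged = logmsgs.concat(newEntries);
  // $sort: { time: 1 } orders the array by time, oldest first.
  merged.sort((a, b) => a.time - b.time);
  // $slice: -limit keeps only the last `limit` elements, i.e. the newest ones.
  return merged.slice(-limit);
}
```

Because the sort runs before the slice, a late‑arriving entry with an old timestamp is dropped rather than pushing out a newer one.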

Part 3

The final part revisits the three basic patterns (embed, child‑reference, parent‑reference) and expands on two‑way referencing and denormalization, emphasizing that the choice depends on cardinality, independent access needs, and read/write ratios.

Rules of Thumb: Your Guide Through the Rainbow

One: Prefer embedding unless there is a compelling reason not to.

Two: Need independent access to N‑side objects? Use referencing.

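Three: Avoid unbounded array growth. With more than a couple of hundred documents on the N side, do not embed them; with more than a few thousand, do not use an array of ObjectID references either.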

Four: Application‑level joins are cheap when proper indexes and projection specifications are used.

Five: Denormalize only when the read‑to‑write ratio is high for that field.

Six: Model data based on the specific application's access patterns.

Your Guide To The Rainbow

When modeling One‑to‑N relationships in MongoDB, first determine the cardinality (one‑to‑few, one‑to‑many, or one‑to‑squillions), whether N‑side documents need independent access, and the read/write ratio for each field.

For one‑to‑few, embed an array of sub‑documents.

For one‑to‑many or when N‑side documents must exist independently, use a reference array; consider parent‑referencing if it fits the access pattern.

For one‑to‑squillions, store a parent reference in each N‑side document.

After establishing the overall structure, you may denormalize selective fields from the "One" side to the "N" side (or vice‑versa) when those fields are read far more often than they are updated and do not require strong consistency.
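
The structural choice described in the steps above can be written down as a small decision function. This is only a sketch: the function name chooseBaseDesign and the string labels are hypothetical, and real designs should still be checked against the application's access patterns.

```javascript
// Hypothetical helper encoding the base-structure decision.
// cardinality: 'few' | 'many' | 'squillions'
// needsIndependentAccess: true when N-side documents must stand on their own.
function chooseBaseDesign(cardinality, needsIndependentAccess) {
  switch (cardinality) {
    case 'few':
      // Embed unless the N-side documents need independent access.
      return needsIndependentAccess ? 'reference-array' : 'embed';
    case 'many':
      return 'reference-array';
    case 'squillions':
      return 'parent-reference';
    default:
      throw new Error('unknown cardinality: ' + cardinality);
  }
}
```

Denormalization is deliberately left out of the function, since it is a per‑field decision made after the base structure is settled.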

Productivity and Flexibility

MongoDB gives you the flexibility to design schemas that match your application's needs, allowing you to evolve the data model as requirements change while keeping queries and updates efficient.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Aotu Lab

Aotu Lab, founded in October 2015, is a front-end engineering team serving multi-platform products. The articles in this public account are intended to share and discuss technology, reflecting only the personal views of Aotu Lab members and not the official stance of JD.com Technology.
