
MongoDB MMAPv1 Storage Engine: Data Organization and Record Management

This article explains how MongoDB's MMAPv1 storage engine organizes databases, namespaces, data files, extents, and records. It covers the on-disk structures, the write, delete, update, and query paths, and how fragmentation and space reclamation are handled.

Architect

Database

Each MongoDB database consists of a .ns file and a series of data files (mydb.0, mydb.1, …) whose sizes start at 64 MB and double up to a maximum of 2 GB.
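
The doubling rule is easy to check with a short calculation. The following is a minimal sketch in Python, not MongoDB code; the function name `data_file_sizes` is hypothetical, and the 64 MB start and 2 GB cap are taken from the paragraph above.

```python
MB = 1024 * 1024


def data_file_sizes(count, first=64 * MB, cap=2048 * MB):
    """Return the sizes in bytes of the first `count` data files.

    Assumes the rule described in the text: start at 64 MB and
    double for each new file until the 2 GB cap is reached.
    """
    sizes = []
    size = first
    for _ in range(count):
        sizes.append(size)
        size = min(size * 2, cap)  # never grow past the cap
    return sizes


if __name__ == "__main__":
    for i, size in enumerate(data_file_sizes(8)):
        print(f"mydb.{i}: {size // MB} MB")
```

Under these assumptions the sixth file (mydb.5) is the first to reach 2 GB, and every later file stays at that size.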

Namespace

Every database contains multiple namespaces (MongoDB collections). The .ns file is a hash table that maps a namespace name to its metadata. Each entry occupies 628 bytes, so a 16 MB .ns file can hold up to 26,715 namespaces.

The namespace metadata includes a fixed‑length 128‑byte key, a hash value, and a value structure that holds further details such as DiskLoc pointers to data file offsets.
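
The capacity figure follows directly from the entry size, as the sketch below shows. It is an illustration only: the 4-byte hash width is an assumption (the text gives only the 128-byte key and the 628-byte total), and `pack_key` is a hypothetical helper showing how a name might be padded into a fixed-length key.

```python
NS_FILE_SIZE = 16 * 1024 * 1024  # 16 MB .ns file, from the text
ENTRY_SIZE = 628                 # bytes per hash-table entry, from the text
KEY_SIZE = 128                   # fixed-length key, from the text
HASH_SIZE = 4                    # ASSUMPTION: the hash is a 32-bit integer

# How many whole entries fit in the file.
max_namespaces = NS_FILE_SIZE // ENTRY_SIZE

# Bytes left for the value structure under the assumption above.
value_size = ENTRY_SIZE - HASH_SIZE - KEY_SIZE


def pack_key(name: str) -> bytes:
    """Pad a namespace name with NUL bytes to the fixed key length."""
    raw = name.encode("utf-8")
    if len(raw) >= KEY_SIZE:  # keep room for at least one terminating NUL
        raise ValueError("namespace name too long for a 128-byte key")
    return raw.ljust(KEY_SIZE, b"\0")


if __name__ == "__main__":
    print(max_namespaces)               # 26715
    print(value_size)                   # 496, given the assumed hash width
    print(len(pack_key("mydb.users")))  # 128
```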

Data Files

Data files are divided into extents, each belonging to a single namespace and linked together as a doubly‑linked list. The file header stores version, size, free space information, and pointers to the first and last extents.
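
The extent chain can be pictured with a small model. This is a sketch under assumptions, not the engine's actual structures: `DiskLoc` is modeled here as a (file number, offset) pair, and the `Extent` fields, `link_extents`, and `walk` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional


@dataclass(frozen=True)
class DiskLoc:
    """ASSUMED shape of a disk pointer: data file number plus byte offset."""
    file_no: int
    offset: int


@dataclass
class Extent:
    loc: DiskLoc
    namespace: str                  # every extent belongs to one namespace
    prev: Optional[DiskLoc] = None  # previous extent in the chain
    next: Optional[DiskLoc] = None  # next extent in the chain


def link_extents(extents: List[Extent]) -> None:
    """Wire a list of extents into a doubly-linked chain, in list order."""
    for a, b in zip(extents, extents[1:]):
        a.next = b.loc
        b.prev = a.loc


def walk(by_loc: Dict[DiskLoc, Extent], first: DiskLoc) -> Iterator[Extent]:
    """Follow the `next` pointers starting from the first extent."""
    loc: Optional[DiskLoc] = first
    while loc is not None:
        extent = by_loc[loc]
        yield extent
        loc = extent.next


if __name__ == "__main__":
    chain = [Extent(DiskLoc(0, 8192), "mydb.users"),
             Extent(DiskLoc(0, 65536), "mydb.users"),
             Extent(DiskLoc(1, 8192), "mydb.users")]
    link_extents(chain)
    by_loc = {e.loc: e for e in chain}
    for e in walk(by_loc, chain[0].loc):
        print(e.loc)
```

Note that a chain may cross data files, which is why the pointer in this model carries a file number as well as an offset.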

Extent

Each extent contains multiple records (MongoDB documents) organized as a doubly‑linked list.

Record

A record represents a MongoDB document and begins with a fixed 16‑byte descriptor. Deleted records are stored as DeletedRecord structures that share the first two fields with normal records.
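
One way to picture the 16-byte descriptor is as four 32-bit integers. The sketch below assumes that layout and uses illustrative field names (`length_with_headers`, `extent_ofs`, `next_ofs`, `prev_ofs`); the text states only the total size and that the first two fields are shared with deleted records.

```python
import struct

# ASSUMED layout: four little-endian signed 32-bit integers
# (length_with_headers, extent_ofs, next_ofs, prev_ofs) = 16 bytes.
RECORD_HEADER = struct.Struct("<iiii")

# The two leading fields that a deleted record is said to share.
SHARED_PREFIX = struct.Struct("<ii")


def pack_record(doc: bytes, extent_ofs: int, next_ofs: int, prev_ofs: int) -> bytes:
    """Prefix a document payload with the assumed 16-byte header."""
    length = RECORD_HEADER.size + len(doc)
    return RECORD_HEADER.pack(length, extent_ofs, next_ofs, prev_ofs) + doc


def shared_fields(raw: bytes) -> tuple:
    """Read the fields common to live and deleted records."""
    return SHARED_PREFIX.unpack_from(raw, 0)


if __name__ == "__main__":
    rec = pack_record(b"abc", extent_ofs=4096, next_ofs=-1, prev_ofs=-1)
    print(RECORD_HEADER.size)  # 16
    print(shared_fields(rec))  # (19, 4096)
```

Because the leading fields match, code that only needs the slot length and owning extent can read either kind of record the same way.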

Writing a Record

Check the namespace’s deleted‑record list for a suitable free slot.

If no slot fits, look for a free extent in the data file's free list.

If there is still no space, allocate a new extent (creating a new data file if needed) and write the record there.
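
The three steps above can be sketched as a simple fallback chain. This is a toy model, not the engine's allocator: `Namespace`, `DataFile`, and `choose_space` are hypothetical names, free space is tracked as plain lists of sizes, and first-fit matching is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Namespace:
    # Sizes in bytes of slots on the namespace's deleted-record list.
    deleted: List[int] = field(default_factory=list)


@dataclass
class DataFile:
    # Sizes in bytes of extents on the file's free list.
    free_extents: List[int] = field(default_factory=list)


def choose_space(ns: Namespace, data_file: DataFile, needed: int) -> str:
    """Return where a record of `needed` bytes would be placed."""
    # Step 1: reuse a deleted record that is large enough (first fit).
    for i, size in enumerate(ns.deleted):
        if size >= needed:
            ns.deleted.pop(i)
            return "deleted-record"
    # Step 2: take an extent from the file's free list.
    for i, size in enumerate(data_file.free_extents):
        if size >= needed:
            data_file.free_extents.pop(i)
            return "free-extent"
    # Step 3: nothing reusable, so a new extent (or file) is needed.
    return "new-extent"


if __name__ == "__main__":
    ns = Namespace(deleted=[64, 256])
    f = DataFile(free_extents=[8192])
    print(choose_space(ns, f, 200))  # deleted-record
    print(choose_space(ns, f, 200))  # free-extent
    print(choose_space(ns, f, 200))  # new-extent
```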

Deleting a Record

Deleted records are inserted into the namespace’s deleted‑record list, where later writes can reuse them. If no future write fits a freed slot's size class, however, that space stays unused and the data files become fragmented. Running a compact operation reclaims the fragmented space.
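
To see why freed space can go unused, consider a toy free list bucketed by size class. The bucket boundaries below (powers of two from 32 bytes upward) and the `DeletedLists` class are assumptions for illustration; the text says only that reuse depends on matching a size class.

```python
import bisect

# ASSUMED size classes: 32 B, 64 B, ... doubling 19 times (up to 8 MB).
BUCKET_SIZES = [32 << i for i in range(19)]


def bucket_for(size: int) -> int:
    """Index of the size class that a slot of `size` bytes falls into."""
    return min(bisect.bisect_right(BUCKET_SIZES, size), len(BUCKET_SIZES) - 1)


class DeletedLists:
    """Free slots grouped by size class, one list per bucket."""

    def __init__(self):
        self.buckets = [[] for _ in BUCKET_SIZES]

    def free(self, size: int) -> None:
        self.buckets[bucket_for(size)].append(size)

    def allocate(self, needed: int):
        """Return the size of a reused slot, or None if nothing fits."""
        for b in range(bucket_for(needed), len(self.buckets)):
            for i, size in enumerate(self.buckets[b]):
                if size >= needed:
                    return self.buckets[b].pop(i)
        return None

    def free_bytes(self) -> int:
        return sum(sum(bucket) for bucket in self.buckets)


if __name__ == "__main__":
    d = DeletedLists()
    d.free(100)
    d.free(100)
    print(d.allocate(5000))  # None: two 100-byte holes cannot serve 5000 bytes
    print(d.free_bytes())    # 200 bytes remain stranded
```

In this model the two small holes stay on the list indefinitely if every later write is large, which is the fragmentation that compaction is meant to clean up.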

Updating a Record

If the new record is smaller, update in place and possibly add the leftover space to the deleted‑record list.

If larger, treat as delete + insert; the old space becomes a DeletedRecord.

Frequent updates can cause fragmentation; setting appropriate Record Padding can mitigate this.
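
The effect of padding on updates can be illustrated with a small decision function. This is a sketch under assumptions: `padded_size` and `apply_update` are hypothetical helpers, and padding is modeled as a simple multiplier on the document length.

```python
def padded_size(doc_len: int, padding_factor: float = 1.0) -> int:
    """Slot size reserved for a document (ASSUMED: length times a factor)."""
    return int(doc_len * padding_factor)


def apply_update(slot_size: int, new_len: int):
    """Decide how an update is handled for a record in a slot of `slot_size`.

    Returns (action, freed_bytes):
      - ("in-place", leftover) when the new document fits in the slot
      - ("move", slot_size) when it does not, so the whole old slot is
        freed and the document is re-inserted elsewhere
    """
    if new_len <= slot_size:
        return ("in-place", slot_size - new_len)
    return ("move", slot_size)


if __name__ == "__main__":
    # A 100-byte document grows to 120 bytes.
    print(apply_update(padded_size(100, 1.0), 120))  # ('move', 100)
    print(apply_update(padded_size(100, 1.5), 120))  # ('in-place', 30)
```

With no padding the grown document has to move and leaves a 100-byte hole behind; with the extra room it is rewritten in place. The trade-off is that padding reserves space that documents which never grow will not use.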

Querying a Record

Without indexes, a query must scan the entire collection; creating indexes on frequently queried fields improves performance.
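
The difference in work can be shown with an in-memory stand-in. This sketch does not use MongoDB: a Python dict plays the role of the index (MongoDB's real indexes are B-trees), and `collection_scan`, `build_index`, and `index_lookup` are hypothetical helpers that count how many documents each approach examines.

```python
def collection_scan(docs, field, value):
    """Check every document; return (matches, documents examined)."""
    examined = 0
    hits = []
    for doc in docs:
        examined += 1
        if doc.get(field) == value:
            hits.append(doc)
    return hits, examined


def build_index(docs, field):
    """Map each field value to the positions of documents holding it."""
    index = {}
    for pos, doc in enumerate(docs):
        index.setdefault(doc.get(field), []).append(pos)
    return index


def index_lookup(docs, index, value):
    """Fetch only the documents the index points to."""
    positions = index.get(value, [])
    return [docs[p] for p in positions], len(positions)


if __name__ == "__main__":
    docs = [{"_id": i, "user": f"u{i % 100}"} for i in range(1000)]
    hits, examined = collection_scan(docs, "user", "u7")
    print(len(hits), examined)  # 10 1000

    by_user = build_index(docs, "user")
    hits, examined = index_lookup(docs, by_user, "u7")
    print(len(hits), examined)  # 10 10
```

Both approaches return the same ten documents, but the scan touches all 1,000 while the lookup touches only the matches. The index also has to be maintained on every write, which this sketch leaves out.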

Source: Database Kernel Monthly

