
How Do Journaling File Systems Compare? Ext3, ReiserFS, XFS, and JFS Explained

This article explains the design and operation of journaling file systems, compares Ext3, ReiserFS, XFS, and JFS across their logging modes, internal structures, and performance results from Postmark and Bonnie++ tests, and offers practical conclusions for different workload scenarios.

Art of Distributed System Architecture Design

Overview

A journaling file system records modifications in a sequential log before applying them to the main file system structures. After a crash, the log is replayed to restore a consistent state. Linux provides several journaling file systems: Ext3 (derived from Ext2), ReiserFS (object‑oriented, B+‑tree based), XFS (a high‑performance 64‑bit file system), and JFS (IBM’s 64‑bit journaling file system).

Ext3

Ext3 is fully compatible with Ext2 and adds a journaling layer. Each high‑level modification is performed in two steps: the block is first copied to the journal; after the journal write completes, the block is written to its final location. The original copy in the journal is then discarded.
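The two-step write described above can be sketched in a few lines of Python. This is a toy model, not Ext3 code: the class `JournaledDisk` and its methods are hypothetical names, and blocks are held in plain dictionaries and lists rather than on disk.

```python
class JournaledDisk:
    """Toy block store: copy to the journal first, then write in place."""

    def __init__(self):
        self.blocks = {}   # final locations: block number -> contents
        self.journal = []  # journal copies waiting to be applied

    def log(self, block_no, data):
        # Step 1: copy the new block contents into the journal.
        self.journal.append((block_no, data))

    def checkpoint(self):
        # Step 2: write every journaled block to its final location,
        # then discard the journal copies.
        for block_no, data in self.journal:
            self.blocks[block_no] = data
        self.journal.clear()

    def recover(self):
        # After a crash, replaying the journal redoes any write that was
        # logged but had not yet reached its final location.
        self.checkpoint()


disk = JournaledDisk()
disk.log(7, b"new inode table block")
# Simulated crash: the block is in the journal but not yet in place.
disk.recover()
```

Because the journal copy is written before the in-place copy, a crash between the two steps leaves enough information to finish the update during recovery.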

Journaling modes

Journal : both data and metadata are logged. This provides the highest safety but incurs the most I/O.

Ordered (default): only metadata are logged, but data blocks are flushed to disk before the corresponding metadata commit, so committed metadata never point to blocks whose contents have not been written.

Writeback : only metadata are logged; data may be written after the metadata commit. This offers the best performance, but after a crash a recently modified file may contain stale data.
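The difference between the three modes comes down to what is written before the commit. The sketch below lists one possible write order per mode; the step names and both functions are illustrative assumptions, and for writeback it shows the worst case, where the data write happens last.

```python
def write_sequence(mode):
    """Return one possible order of disk writes for a single file update."""
    if mode == "journal":
        return ["log data", "log metadata", "commit",
                "write data in place", "write metadata in place"]
    if mode == "ordered":
        return ["write data in place", "log metadata", "commit",
                "write metadata in place"]
    if mode == "writeback":
        # The data write is not ordered against the commit; worst case shown.
        return ["log metadata", "commit",
                "write metadata in place", "write data in place"]
    raise ValueError(f"unknown mode: {mode}")


def data_safe_at_commit(mode):
    """True if the file data is on disk (log or in place) before the commit."""
    steps = write_sequence(mode)
    before_commit = steps[:steps.index("commit")]
    return any(s in ("log data", "write data in place") for s in before_commit)
```

On Linux the mode is chosen at mount time with the `data=` option (`data=journal`, `data=ordered`, or `data=writeback`).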

Journaling Block Device (JBD)

Ext3 relies on JBD, a generic kernel journaling layer. JBD groups log records into transactions so that each system call takes effect atomically, and it writes all records of a transaction to the log before the transaction commits. Log space is reclaimed only after the logged blocks have reached their final locations.
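Transaction atomicity can be illustrated with a small recovery routine that replays only transactions whose commit record reached the log. The record layout (`"write"` and `"commit"` tuples) is an assumption made for this sketch and does not reflect the on-disk JBD format.

```python
def replay(log_records):
    """Apply only the writes of transactions that have a commit record."""
    committed = {txn for kind, txn, *_ in log_records if kind == "commit"}
    blocks = {}
    for record in log_records:
        if record[0] == "write" and record[1] in committed:
            _, _, block_no, data = record
            blocks[block_no] = data
    return blocks


log = [
    ("write", 1, 10, "bitmap v2"),
    ("write", 1, 11, "inode v2"),
    ("commit", 1),
    ("write", 2, 10, "bitmap v3"),  # crash before transaction 2 committed
]
recovered = replay(log)
```

Transaction 2 never committed, so its partial update to block 10 is ignored and the block keeps the contents written by transaction 1.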

ReiserFS

ReiserFS was designed from scratch using object‑oriented principles. It consists of a semantic layer (namespace and interfaces) and a storage layer (disk management) linked by globally unique keys. Data are stored in a B+‑tree.

Item structure

Each B+‑tree node contains items that act as the basic storage unit. An item comprises:

Item_body   // data payload
Item_key    // unique key for the item
Item_offset // offset of the data within the node
Item_length // length of the data payload
Item_Plugin_id // identifier of the item type

Different item types store stat data (file attributes), directory entries, pointers, or file fragments.
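As a rough sketch of how these fields fit together, the following Python models a node as a byte buffer plus a sorted list of item headers. Integer keys, string plugin identifiers, and the `Node.read` method are assumptions for illustration only; the real on-disk layout differs.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Item:
    key: int        # Item_key: unique key for the item
    offset: int     # Item_offset: where the body starts inside the node
    length: int     # Item_length: size of the body in bytes
    plugin_id: str  # Item_Plugin_id: item type


class Node:
    def __init__(self, items, body):
        self.items = sorted(items, key=lambda item: item.key)
        self.keys = [item.key for item in self.items]
        self.body = body  # raw node bytes holding every Item_body

    def read(self, key):
        """Binary-search the item headers and return (type, body bytes)."""
        i = bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            return None
        item = self.items[i]
        return item.plugin_id, self.body[item.offset:item.offset + item.length]


node = Node(
    [Item(2, 9, 5, "file_fragment"), Item(1, 0, 9, "directory_entry")],
    b"hello.txtHELLO",
)
```

Keeping the headers sorted by key lets a lookup inside a node use binary search, while the offset and length locate the payload without scanning the node.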

Journaling optimizations

Copy‑on‑capture : when two concurrent transactions modify the same block, the block is duplicated so both transactions can proceed independently.

Steal‑on‑capture : if multiple transactions modify the same block, only the latest transaction actually writes the block back, reducing redundant writes.
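The saving from steal-on-capture can be seen in a small planning function: given transactions in commit order, each block is assigned to the last transaction that modified it. The function name and input shape are hypothetical, and the sketch ignores everything except which transaction writes which block.

```python
def plan_writeback(transactions):
    """transactions: list of (txn_id, set of block numbers) in commit order.

    Returns {txn_id: blocks that transaction writes back}, giving each
    block to the latest transaction that modified it.
    """
    owner = {}
    for txn_id, blocks in transactions:
        for block_no in blocks:
            owner[block_no] = txn_id  # a later transaction takes over the block
    plan = {txn_id: set() for txn_id, _ in transactions}
    for block_no, txn_id in owner.items():
        plan[txn_id].add(block_no)
    return plan


plan = plan_writeback([(1, {10, 11}), (2, {11, 12}), (3, {11})])
```

Without the optimization the three transactions would issue five block writes; with it, block 11 is written once, by transaction 3, for three writes in total.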

XFS

XFS is a 64‑bit high‑performance file system originally from SGI. The underlying device is divided into multiple allocation groups; each group manages its own inode tables and free‑space maps, enabling parallel I/O.
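The idea behind allocation groups can be sketched as a device split into equal regions, each with its own free list and its own lock, so that allocations in different groups do not contend. The class names, the `hint` parameter, and the fallback policy are assumptions of this sketch rather than XFS behavior.

```python
import threading


class AllocationGroup:
    def __init__(self, first_block, size):
        self.lock = threading.Lock()  # per-group lock, not a global one
        self.free = list(range(first_block, first_block + size))

    def allocate(self):
        with self.lock:
            return self.free.pop(0) if self.free else None


class Device:
    def __init__(self, total_blocks, group_count):
        size = total_blocks // group_count
        self.groups = [AllocationGroup(i * size, size)
                       for i in range(group_count)]

    def allocate(self, hint):
        # Try the hinted group first, then fall back to the others.
        count = len(self.groups)
        for step in range(count):
            block = self.groups[(hint + step) % count].allocate()
            if block is not None:
                return block
        raise OSError("device full")


device = Device(total_blocks=8, group_count=4)
first = device.allocate(hint=0)   # served by group 0
other = device.allocate(hint=2)   # served by group 2, independent of group 0
```

Two threads allocating with different hints take different locks, which is what allows the groups to operate in parallel.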

Metadata log

XFS uses a logical log that records only metadata changes. The log can be placed on a separate device to reduce contention.

Delayed allocation

Writes are buffered until the file is closed or the buffer is flushed. At that point XFS allocates a contiguous region for the data, improving write throughput and reducing fragmentation. For short‑lived temporary files, allocation may be avoided entirely.
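A minimal sketch of delayed allocation, assuming a hypothetical `ExtentAllocator` that hands out contiguous block ranges: writes only increase a pending count, and space is requested once at flush time.

```python
class ExtentAllocator:
    def __init__(self):
        self.next_free = 0
        self.calls = 0  # how many times the allocator was asked for space

    def allocate_extent(self, nblocks):
        start = self.next_free
        self.next_free += nblocks
        self.calls += 1
        return (start, nblocks)


class DelayedFile:
    def __init__(self, allocator):
        self.allocator = allocator
        self.pending = 0   # blocks buffered in memory, not yet on disk
        self.extents = []

    def write(self, nblocks):
        self.pending += nblocks  # no disk space is allocated here

    def flush(self):
        if self.pending:
            # One request for the whole buffered range yields one extent.
            self.extents.append(self.allocator.allocate_extent(self.pending))
            self.pending = 0

    def unlink(self):
        self.pending = 0  # deleted before flush: nothing was ever allocated


allocator = ExtentAllocator()
data_file = DelayedFile(allocator)
for _ in range(3):
    data_file.write(1)
data_file.flush()

temp_file = DelayedFile(allocator)
temp_file.write(5)
temp_file.unlink()
temp_file.flush()
```

Three separate writes end up in a single three-block extent, and the temporary file that was removed before its flush never touches the allocator.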

JFS

JFS (IBM) provides byte‑level transactional semantics and is fully 64‑bit. Disk space is organized into aggregates (pools) and file sets (sub‑trees). Each aggregate contains an Aggregate Inode Table (AIT) and an allocation map.

Directory organization

Small directories are stored directly in the inode (up to eight entries).

Large directories use a B+‑tree indexed by name, offering faster lookup, insertion, and deletion.
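The switch between the two directory layouts can be sketched as follows. A sorted Python list searched with `bisect` stands in for the name-indexed B+‑tree, and the `Directory` class is a hypothetical model, not JFS code; only the eight-entry limit comes from the text above.

```python
from bisect import bisect_left, insort

INLINE_LIMIT = 8  # entries that fit directly in the inode


class Directory:
    def __init__(self):
        self.inline = []   # small directory: entries kept in the inode
        self.index = None  # large directory: sorted index (B+-tree stand-in)

    def add(self, name):
        if self.index is None:
            if len(self.inline) < INLINE_LIMIT:
                self.inline.append(name)
                return
            # The ninth entry forces a move to the name-indexed structure.
            self.index = sorted(self.inline)
            self.inline = []
        insort(self.index, name)

    def contains(self, name):
        if self.index is None:
            return name in self.inline  # linear scan of at most 8 entries
        i = bisect_left(self.index, name)
        return i < len(self.index) and self.index[i] == name


directory = Directory()
for n in range(8):
    directory.add(f"file{n}")
still_inline = directory.index is None
directory.add("another")
```

Small directories avoid any extra block reads, while large ones pay for the index once and then get logarithmic lookups by name.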

Addressing

File data are described by a three‑tuple <logical offset, length, physical address>. These tuples are stored in a B+‑tree rooted in the inode, allowing efficient mapping from logical file offsets to physical disk locations.
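Translating a logical offset through such tuples can be sketched with a sorted list and a binary search, which here stands in for the B+‑tree rooted in the inode. The `ExtentMap` class and the block numbers in the example are made up for illustration.

```python
from bisect import bisect_right


class ExtentMap:
    def __init__(self, extents):
        # Each extent is (logical offset, length, physical address), in blocks.
        self.extents = sorted(extents)
        self.starts = [extent[0] for extent in self.extents]

    def lookup(self, logical):
        """Return the physical block for a logical block, or None for a hole."""
        i = bisect_right(self.starts, logical) - 1
        if i < 0:
            return None
        start, length, physical = self.extents[i]
        if logical >= start + length:
            return None  # falls in a gap between extents
        return physical + (logical - start)


extent_map = ExtentMap([(0, 4, 100), (4, 2, 500), (10, 3, 900)])
```

One tuple describes an arbitrarily long contiguous run, so a large, unfragmented file needs only a handful of entries instead of one pointer per block.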

Performance test

Test environment

Two benchmark suites were used:

Postmark : simulates small‑file workloads typical of mail or e‑commerce systems.

Bonnie++ : measures sequential and random I/O with large files.
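To give a feel for the kind of load a small-file benchmark generates, the sketch below creates and then deletes a batch of small files in a temporary directory and times the whole run. It is not Postmark and does not reproduce its parameters; the function name, file count, and file size are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def small_file_churn(file_count=200, size=512):
    """Create and delete many small files; return (created, remaining, seconds)."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as root:
        paths = [os.path.join(root, f"f{i}") for i in range(file_count)]
        start = time.perf_counter()
        for path in paths:
            with open(path, "wb") as handle:
                handle.write(payload)
        created = len(os.listdir(root))
        for path in paths:
            os.remove(path)
        elapsed = time.perf_counter() - start
        remaining = len(os.listdir(root))
    return created, remaining, elapsed


created, remaining, elapsed = small_file_churn(file_count=50)
```

Almost all of the work in such a run is metadata activity (directory updates, inode allocation and release), which is why small-file tests are sensitive to directory structure and journaling mode.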

Key results

For the file systems with selectable journaling modes (Ext3 and ReiserFS), writeback mode was fastest, ordered intermediate, and journal slowest.

ReiserFS consistently outperformed Ext3, XFS, and JFS in writeback and ordered modes, especially as the number of files grew.

Ext3 performed well with a small number of files (fewer than 10,000) but degraded sharply beyond that threshold.

XFS and JFS were slower overall but showed a more gradual performance decline with increasing file count.

For large‑file sequential writes, all file systems achieved similar throughput; the only outliers were the journal mode of Ext3 and ReiserFS, which were slower.

File creation and deletion are metadata‑only operations, and all three modes journal metadata, so the choice of mode made a negligible difference for Ext3 and ReiserFS. ReiserFS’s tree‑based directories, however, gave it a clear advantage in overall metadata handling.

Conclusions

For small‑scale systems (mail, small e‑commerce), ReiserFS and Ext3 provide good performance, with ReiserFS having an edge when directories contain many files.

For large‑file I/O, the underlying storage device is the primary bottleneck; file‑system choice matters less.

XFS and JFS are architecturally superior for medium‑to‑large deployments but do not show clear benefits on modest hardware.

The default ordered mode offers a balanced trade‑off between safety and speed and is recommended for most deployments.

ReiserFS B+ tree diagram
ReiserFS item structure
JFS disk structure
Test environment
PostMark small‑file results
PostMark large‑file results
Bonnie++ sequential write rate
Bonnie++ CPU utilization during sequential write
Bonnie++ sequential file creation
Bonnie++ random file creation
Bonnie++ random file deletion
Bonnie++ CPU utilization during random deletion