How Do Journaling File Systems Compare? Ext3, ReiserFS, XFS, and JFS Explained
This article explains the design and operation of journaling file systems, compares Ext3, ReiserFS, XFS, and JFS across their logging modes, internal structures, and performance results from Postmark and Bonnie++ tests, and offers practical conclusions for different workload scenarios.
Overview
A journaling file system records pending modifications (metadata and, in some modes, file data) in a sequential log before applying them to the main file system structures. After a crash, the log can be replayed to restore a consistent state. Linux provides several journaling file systems: Ext3 (derived from Ext2), ReiserFS (object‑oriented, B+‑tree based), XFS (high‑performance 64‑bit), and JFS (IBM’s 64‑bit journaling file system).
Ext3
Ext3 is fully compatible with Ext2 and adds a journaling layer. Each modification is performed in two steps: the changed block is first copied to the journal; once the journal write completes, the block is written to its final location. The journal copy is then discarded.
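The two-step write can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Ext3 code: the class name JournaledDisk and its methods are hypothetical, and a dictionary stands in for the disk.

```python
class JournaledDisk:
    """Toy model: 'disk' maps block numbers to data, 'journal' is an ordered log."""

    def __init__(self):
        self.disk = {}      # final on-disk locations
        self.journal = []   # journal records: (block_no, data)

    def journal_write(self, block_no, data):
        # Step 1: copy the changed block to the journal.
        self.journal.append((block_no, data))

    def checkpoint(self):
        # Step 2: write each journaled block to its final location,
        # then discard the journal copies.
        for block_no, data in self.journal:
            self.disk[block_no] = data
        self.journal.clear()

    def recover(self):
        # Replay after a crash: applying the journal again is harmless,
        # because each record simply rewrites the same block contents.
        self.checkpoint()


fs = JournaledDisk()
fs.journal_write(7, b"new inode")
# Simulated crash here: block 7 never reached its final location.
fs.recover()
```

After recover() the block is present at its final location even though the second step never ran before the simulated crash.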
Journaling modes
Journal : both data and metadata are logged. This provides the highest safety but incurs the most I/O.
Ordered (default): only metadata are logged, but data blocks are flushed to disk before the corresponding metadata commit, so committed metadata never points to blocks that have not been written yet.
Writeback : only metadata are logged; data may be written before or after the metadata commit. This offers the best performance, but after a crash recently written files may contain stale data even though the file system structure stays consistent.
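On Linux these modes are selected with the Ext3 mount option data=journal, data=ordered, or data=writeback. The ordering differences between them can be sketched as follows. The function write_plan and its step names are hypothetical labels for illustration, not kernel terminology.

```python
def write_plan(mode):
    """Return (ordered_steps, unordered_steps) for one file update.

    ordered_steps must happen in the listed sequence; unordered_steps
    may happen at any point relative to them.
    """
    if mode == "journal":
        return (["log data", "log metadata", "commit",
                 "copy data to final location",
                 "copy metadata to final location"], [])
    if mode == "ordered":
        return (["write data to final location", "log metadata", "commit",
                 "copy metadata to final location"], [])
    if mode == "writeback":
        return (["log metadata", "commit",
                 "copy metadata to final location"],
                ["write data to final location"])
    raise ValueError(f"unknown mode: {mode}")


ordered_steps, _ = write_plan("ordered")
journal_steps, _ = write_plan("journal")
_, writeback_unordered = write_plan("writeback")
```

In this model journal mode touches the data twice (once in the log, once in place), which matches the extra I/O described above, while writeback leaves the data write unconstrained.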
Journaling Block Device (JBD)
Ext3 relies on the generic kernel layer JBD. JBD groups log records into transactions, ensures atomicity of system calls, and writes all records of a transaction to contiguous log blocks. Those log blocks can be reclaimed only after the logged changes have been written to their final locations.
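The atomicity that transactions provide can be illustrated with a small replay routine. This is a sketch under assumed record shapes (the tuples "begin", "write", and "commit" are invented for the example and do not reflect the JBD on-disk format): only transactions whose commit record reached the log are applied.

```python
def replay(log):
    """Apply only transactions whose commit record is present in the log."""
    disk = {}
    pending = {}  # transaction id -> list of (block_no, data)
    for record in log:
        kind, tid = record[0], record[1]
        if kind == "begin":
            pending[tid] = []
        elif kind == "write":
            pending[tid].append((record[2], record[3]))
        elif kind == "commit":
            # All writes of the transaction become visible together.
            for block_no, data in pending.pop(tid):
                disk[block_no] = data
    # Anything still in 'pending' never committed and is ignored.
    return disk


log = [
    ("begin", 1), ("write", 1, 10, "inode v2"),
    ("write", 1, 11, "bitmap v2"), ("commit", 1),
    ("begin", 2), ("write", 2, 10, "inode v3"),  # crash before commit
]
disk = replay(log)
```

Transaction 2 leaves no trace after replay, so the inode and bitmap stay mutually consistent.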
ReiserFS
ReiserFS was designed from scratch using object‑oriented principles. It consists of a semantic layer (namespace and interfaces) and a storage layer (disk management) linked by globally unique keys. Data are stored in a B+‑tree.
Item structure
Each B+‑tree node contains items that act as the basic storage unit. An item comprises:
Item_body // data payload
Item_key // unique key for the item
Item_offset // offset of the data within the node
Item_length // length of the data payload
Item_Plugin_id // identifier of the item type
Different item types store static stat data (file attributes), directory entries, pointers, or file fragments.
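The item fields listed above can be modeled roughly as follows. This is a minimal sketch, not the ReiserFS on-disk layout: the Node class, the key tuples, and the plugin id constants STAT_DATA and FILE_BODY are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical plugin ids; real values are not given in the text.
STAT_DATA = 0
FILE_BODY = 1


@dataclass(frozen=True)
class Item:
    key: tuple       # Item_key: unique key for the item
    offset: int      # Item_offset: where the body starts inside the node
    length: int      # Item_length: size of the body in bytes
    plugin_id: int   # Item_Plugin_id: which item type interprets the body


class Node:
    """A tree node that packs item bodies into one byte buffer."""

    def __init__(self, size=4096):
        self.space = bytearray(size)
        self.items = []
        self.used = 0

    def add(self, key, body, plugin_id):
        if self.used + len(body) > len(self.space):
            raise ValueError("node is full")
        item = Item(key, self.used, len(body), plugin_id)
        self.space[self.used:self.used + len(body)] = body
        self.used += len(body)
        self.items.append(item)
        return item

    def body(self, item):
        # Item_body is recovered from the node using offset and length.
        return bytes(self.space[item.offset:item.offset + item.length])


node = Node(size=64)
attrs = node.add((1, 0), b"mode=0644", STAT_DATA)
data = node.add((1, 1), b"hello", FILE_BODY)
```

Keeping the offset and length in the item header is what lets items of different sizes and types share one node.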
Journaling optimizations
Copy‑on‑capture : when two concurrent transactions modify the same block, the block is duplicated so both transactions can proceed independently.
Steal‑on‑capture : if multiple transactions modify the same block, only the latest transaction actually writes the block back, reducing redundant writes.
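The write-saving effect of steal-on-capture, as described above, can be sketched like this. The function flush_plan and its input shape are hypothetical, and the sketch covers only the "latest transaction writes the block" rule, not the real transaction manager.

```python
def flush_plan(transactions):
    """transactions: list of (tid, set_of_dirty_blocks), oldest first.

    Returns {tid: sorted list of blocks that this transaction writes back}.
    """
    last_writer = {}
    for tid, blocks in transactions:
        for block in blocks:
            # A later transaction takes over responsibility for the block.
            last_writer[block] = tid
    plan = {tid: [] for tid, _ in transactions}
    for block, tid in last_writer.items():
        plan[tid].append(block)
    return {tid: sorted(blocks) for tid, blocks in plan.items()}


plan = flush_plan([(1, {10, 11}), (2, {11, 12}), (3, {11})])
total_writes = sum(len(blocks) for blocks in plan.values())
```

In this example the three transactions dirty five blocks in total, but only three block writes are issued because block 11 is written once, by the latest transaction.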
XFS
XFS is a 64‑bit high‑performance file system originally from SGI. The underlying device is divided into multiple allocation groups; each group manages its own inode tables and free‑space maps, enabling parallel I/O.
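A simplified model shows why allocation groups help with parallelism: each group has its own free-space state and its own lock, so allocations in different groups do not contend. The classes below are hypothetical and use a trivial bump allocator in place of the real XFS free-space structures.

```python
import threading


class AllocationGroup:
    def __init__(self, start, size):
        self.next_free = start
        self.end = start + size
        self.lock = threading.Lock()   # per-group lock, not a global one

    def allocate(self, n_blocks):
        with self.lock:
            if self.next_free + n_blocks > self.end:
                return None            # this group is full
            first = self.next_free
            self.next_free += n_blocks
            return first


class Volume:
    def __init__(self, total_blocks, group_count):
        size = total_blocks // group_count
        self.groups = [AllocationGroup(i * size, size)
                       for i in range(group_count)]

    def allocate(self, hint, n_blocks):
        # Try the hinted group first, then fall back to the others.
        for i in range(len(self.groups)):
            group = self.groups[(hint + i) % len(self.groups)]
            first = group.allocate(n_blocks)
            if first is not None:
                return first
        raise OSError("no space left on volume")


vol = Volume(total_blocks=400, group_count=4)
a = vol.allocate(hint=0, n_blocks=10)   # served by group 0
b = vol.allocate(hint=1, n_blocks=10)   # served by group 1
c = vol.allocate(hint=0, n_blocks=90)   # fills group 0
d = vol.allocate(hint=0, n_blocks=5)    # falls back to group 1
```

Two threads allocating with different hints take different locks, which is the property the allocation-group design is meant to provide.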
Metadata log
XFS uses a logical log that records only metadata changes. The log can be placed on a separate device to reduce contention.
Delayed allocation
Writes are buffered until the file is closed or the buffer is flushed. At that point XFS allocates a contiguous region for the data, improving write throughput and reducing fragmentation. For short‑lived temporary files, allocation may be avoided entirely.
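Delayed allocation can be sketched as buffering writes in memory and asking the allocator for one contiguous run only at flush time. The names DelayedAllocFile and make_allocator are hypothetical, and the allocator is a simple counter rather than anything resembling the XFS allocator.

```python
def make_allocator():
    """Return a bump allocator plus a state dict that counts its calls."""
    state = {"next": 0, "calls": 0}

    def allocate(n_blocks):
        start = state["next"]
        state["next"] += n_blocks
        state["calls"] += 1
        return start

    return allocate, state


class DelayedAllocFile:
    def __init__(self, allocate):
        self.allocate = allocate
        self.buffered = []    # dirty blocks held in memory, no disk space yet
        self.extent = None    # (first_block, block_count) once allocated

    def write(self, block_data):
        self.buffered.append(block_data)

    def flush(self):
        if not self.buffered:
            return None
        # One request for the whole run, so the blocks end up contiguous.
        count = len(self.buffered)
        self.extent = (self.allocate(count), count)
        self.buffered.clear()
        return self.extent

    def unlink(self):
        # A temporary file deleted before flushing never needs disk space.
        self.buffered.clear()


allocate, state = make_allocator()

log_file = DelayedAllocFile(allocate)
for chunk in (b"a", b"b", b"c"):
    log_file.write(chunk)
extent = log_file.flush()

tmp_file = DelayedAllocFile(allocate)
tmp_file.write(b"scratch")
tmp_file.unlink()
tmp_extent = tmp_file.flush()
```

Three writes result in a single allocation call, and the short-lived file triggers none at all.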
JFS
JFS (IBM) provides byte‑level transactional semantics and is fully 64‑bit. Disk space is organized into aggregates (pools) and file sets (sub‑trees). Each aggregate contains an Aggregate Inode Table (AIT) and an allocation map.
Directory organization
Small directories are stored directly in the inode (up to eight entries).
Large directories use a B+‑tree indexed by name, offering faster lookup, insertion, and deletion.
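The switch between the two directory layouts can be sketched as follows. The threshold of eight entries comes from the text; everything else is an assumption, and a sorted list searched with bisect stands in for the name-indexed B+-tree.

```python
import bisect

INLINE_LIMIT = 8  # entries that fit in the inode, per the description above


class Directory:
    def __init__(self):
        self.inline = {}    # small directory: entries kept with the inode
        self.names = None   # large directory: sorted names (tree stand-in)
        self.inodes = None  # inode numbers parallel to self.names

    def is_inline(self):
        return self.names is None

    def add(self, name, inode_no):
        if self.is_inline():
            self.inline[name] = inode_no
            if len(self.inline) > INLINE_LIMIT:
                # Outgrew the inode: convert to the name-indexed form.
                self.names = sorted(self.inline)
                self.inodes = [self.inline[n] for n in self.names]
                self.inline = None
            return
        i = bisect.bisect_left(self.names, name)
        if i < len(self.names) and self.names[i] == name:
            self.inodes[i] = inode_no
        else:
            self.names.insert(i, name)
            self.inodes.insert(i, inode_no)

    def lookup(self, name):
        if self.is_inline():
            return self.inline.get(name)
        i = bisect.bisect_left(self.names, name)
        if i < len(self.names) and self.names[i] == name:
            return self.inodes[i]
        return None


d = Directory()
for n in range(8):
    d.add(f"file{n}", 100 + n)
inline_at_eight = d.is_inline()
d.add("file8", 108)
inline_at_nine = d.is_inline()
```

A real B+-tree also keeps insertion and deletion logarithmic, which the sorted list used here does not.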
Addressing
File data are described by a three‑tuple <logical offset, length, physical address>. These tuples are stored in a B+‑tree rooted in the inode, allowing efficient mapping from logical file offsets to physical disk locations.
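Mapping a logical offset through such extent tuples can be sketched with a binary search. This is an illustration only: the extents are kept in a sorted list rather than a B+-tree, offsets are counted in blocks, and the sample values are invented.

```python
import bisect

# (logical offset, length, physical address), sorted by logical offset.
# Logical blocks 12..15 are a hole with no extent.
extents = [(0, 8, 1000), (8, 4, 5000), (16, 16, 2000)]


def map_offset(extents, logical):
    """Return the physical block for a logical block, or None for a hole."""
    starts = [e[0] for e in extents]
    i = bisect.bisect_right(starts, logical) - 1
    if i < 0:
        return None
    offset, length, physical = extents[i]
    if logical < offset + length:
        return physical + (logical - offset)
    return None


first = map_offset(extents, 0)
middle = map_offset(extents, 9)
hole = map_offset(extents, 13)
last = map_offset(extents, 20)
```

One tuple describes a whole contiguous run, so a large, unfragmented file needs only a handful of entries.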
Performance test
Test environment
Two benchmark suites were used:
Postmark : simulates small‑file workloads typical of mail or e‑commerce systems.
Bonnie++ : measures sequential and random I/O with large files.
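To give a feel for what a small-file workload looks like, here is a minimal create, read, and delete loop, loosely in the spirit of Postmark. It is not Postmark and not the harness used for the results below; the function name, file count, and file size are arbitrary assumptions, and without fsync the timing mostly reflects the page cache rather than the disk.

```python
import os
import tempfile
import time


def small_file_churn(directory, n_files=200, size=512):
    """Create, read back, and delete many small files; return basic stats."""
    payload = b"x" * size
    paths = []
    start = time.perf_counter()
    for i in range(n_files):
        path = os.path.join(directory, f"msg{i:06d}")
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)
    for path in paths:
        with open(path, "rb") as f:
            f.read()
    for path in paths:
        os.remove(path)
    elapsed = time.perf_counter() - start
    return {"files": n_files, "seconds": elapsed}


with tempfile.TemporaryDirectory() as scratch:
    result = small_file_churn(scratch)
    leftover = os.listdir(scratch)
```

Pointing the directory argument at mount points backed by different file systems, and raising n_files, would give a rough comparison of how each one scales with file count.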
Key results
For the file systems that offer selectable journaling modes (Ext3 and ReiserFS), writeback mode was fastest, ordered intermediate, and journal slowest.
ReiserFS consistently outperformed Ext3, XFS, and JFS in writeback and ordered modes, especially as the number of files grew.
Ext3 performed well with a small number of files (<10 k) but degraded sharply beyond that threshold.
XFS and JFS were slower overall but showed a more gradual performance decline with increasing file count.
For large‑file sequential writes, all file systems achieved similar throughput; the only outliers were the journal mode of Ext3 and ReiserFS, which were slower.
File creation and deletion (metadata‑only operations) showed negligible differences among Ext3 and ReiserFS modes; however, ReiserFS’s tree‑based directories gave it a clear advantage in overall metadata handling.
Conclusions
For small‑scale systems (mail, small e‑commerce), ReiserFS and Ext3 provide good performance, with ReiserFS having an edge when directories contain many files.
For large‑file I/O, the underlying storage device is the primary bottleneck; file‑system choice matters less.
XFS and JFS are architecturally superior for medium‑to‑large deployments but do not show clear benefits on modest hardware.
The default ordered mode offers a balanced trade‑off between safety and speed and is recommended for most deployments.
