
Unveiling UFS: From MPHY to Data Transfer – A Deep Technical Dive

Over the past decade, mobile storage has evolved from eMMC to high‑speed UFS. This article covers the history of UFS, the MPHY physical layer, the Unipro protocol stack, the UFSHCI host controller, the initialization sequence, and the data‑transfer path, highlighting key registers, lane configurations, and performance figures.

OPPO Kernel Craftsman

Introduction

In the last ten years, explosive demand for mobile storage has driven the transition from eMMC to the high‑performance Universal Flash Storage (UFS) standard. JEDEC released the UFS 1.0 specification in 2011, and starting in 2015 manufacturers such as Samsung, Toshiba and Hynix began shipping UFS 2.0 devices. This article examines the advantages, features, and operation flow of UFS.

MPHY Physical Layer

UFS uses the MIPI Alliance MPHY protocol for its low‑level serial differential signaling. MPHY achieves high speed through serial transmission and stability through differential signaling, which cancels common‑mode noise. Typical UFS devices employ two lanes, where each lane consists of a TX and an RX differential pair. If one lane is lost, the link can continue operating on the remaining lane.

The MPHY state machine defines four power states: POWERED, HIBERN8, STALL/SLEEP, and BURST. During heavy data transfer the link operates in HS‑BURST mode, with gears (HS‑G1 to HS‑G4) that determine the per‑lane data rate. In a 2‑lane configuration at HS‑G4 with the rate B series, each lane runs at 11660.8 Mbit/s, so the theoretical bandwidth is 2915.2 MB/s. Because 8b/10b encoding carries 8 payload bits in every 10 line bits, the usable bandwidth is about 2332 MB/s. Recent UFS 3.0 devices approach this limit, achieving ~2279.8 MB/s, and future revisions may introduce HS‑G5 and more efficient 128b/130b encoding.
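The bandwidth figures above can be reproduced with a few lines of arithmetic. The sketch below assumes the HS‑G1 rate B line rate of 1457.6 Mbit/s per lane, doubling with each gear; the function and constant names are illustrative and do not come from any driver.

```python
# Per-lane line rate for HS-G1, rate B series, in Mbit/s.
# Assumption: each higher gear doubles the previous gear's rate.
HS_G1_RATE_B_MBPS = 1457.6


def raw_bandwidth_mb_s(gear: int, lanes: int) -> float:
    """Raw line bandwidth in MB/s for HS-G<gear> rate B across `lanes` lanes."""
    per_lane_mbps = HS_G1_RATE_B_MBPS * 2 ** (gear - 1)
    return per_lane_mbps * lanes / 8  # convert bits to bytes


def usable_bandwidth_mb_s(gear: int, lanes: int) -> float:
    """Payload bandwidth after 8b/10b encoding (8 data bits per 10 line bits)."""
    return raw_bandwidth_mb_s(gear, lanes) * 8 / 10


raw = raw_bandwidth_mb_s(gear=4, lanes=2)
usable = usable_bandwidth_mb_s(gear=4, lanes=2)
print(f"raw: {raw:.1f} MB/s, usable: {usable:.1f} MB/s")
```

Changing `gear` or `lanes` shows how a single‑lane fallback or a lower gear scales the ceiling proportionally.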

Unipro Protocol Stack

UFS adopts the Unified Protocol (Unipro) version 1.8, whose layering mirrors the OSI model. From the bottom up, the stack includes the Medium layer (differential signals), the PHY Adapter (L1.5), which exchanges PACP frames, and at the top the application layer (LA), which carries UPIUs (UFS Protocol Information Units). The Device Management Entity (DME) controls all layers through a unified UIC interface and routes each register access to its layer using the LayerID bits [14:12] of the address.

Register address ranges identify the layer: MPHY registers use 0x00xx, PA registers 0x15xx, and DME registers 0xDxxx. MPHY register accesses also encode lane selection in the lowest bits of the command argument (the selector field): 0x0, 0x1, 0x4, and 0x5 select TX lane 0, TX lane 1, RX lane 0, and RX lane 1, respectively.
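To make the addressing concrete, here is a minimal sketch that decodes the LayerID from bits [14:12] and packs an attribute address together with a lane selector. The layer names for IDs 2 to 4 and the 16/16‑bit packing (attribute in the upper half, selector in the lower half) are assumptions modeled on how the Linux UFS driver builds its UIC argument; all function names are hypothetical.

```python
# LayerID (address bits [14:12]) to layer name.
# Assumption: IDs 2-4 follow the usual Unipro L2/L3/L4 ordering.
LAYER_NAMES = {
    0: "MPHY (L1)",
    1: "PHY Adapter (L1.5)",
    2: "Data Link (L2)",
    3: "Network (L3)",
    4: "Transport (L4)",
    5: "DME",
}

RX_SELECTOR_BASE = 4  # RX lanes start at selector 4, TX lanes at 0


def layer_of(attr_id: int) -> str:
    """Return the layer that owns a 16-bit attribute address."""
    layer_id = (attr_id >> 12) & 0x7
    return LAYER_NAMES.get(layer_id, "unknown")


def tx_selector(lane: int) -> int:
    """Selector value for TX lane `lane`."""
    return lane


def rx_selector(lane: int) -> int:
    """Selector value for RX lane `lane`."""
    return RX_SELECTOR_BASE + lane


def uic_arg(attr_id: int, selector: int = 0) -> int:
    """Pack attribute address (upper 16 bits) and selector (lower 16 bits)."""
    return ((attr_id & 0xFFFF) << 16) | (selector & 0xFFFF)


for attr in (0x0021, 0x1560, 0xD041):
    print(f"{attr:#06x} -> {layer_of(attr)}")
print(f"RX lane 1 access to 0x0021: {uic_arg(0x0021, rx_selector(1)):#010x}")
```

The attribute addresses used here are only sample values chosen to fall in each range; they are not meant to document specific registers.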

UFS Host Controller Interface (UFSHCI)

The only external interface of a UFS module is the UFSHCI, which the CPU accesses over the APB and AXI buses. The host‑controller registers sit on APB and are mapped into the kernel virtual address space, while AXI carries the DMA traffic for data buffers.

Key UFSHCI structures include:

Six blocks of host‑controller registers (Capability to Vendor‑Specific).

UTP Transfer Request Descriptor (UTRD) list pointers for command submission.

32 UTRDs, providing 32 concurrent slots for the host.

Physical Region Description Table (PRDT) entries that describe up to 128 non‑contiguous data buffers, each up to 64 KB, yielding a maximum per‑request payload of 8 MB.

Eight UTP Task Management Request (UTMR) slots for monitoring and aborting outstanding requests.

Four UIC command registers (UICCMD, UCMDARG1‑3) allow the CPU to read/write any Unipro or MPHY register through the host controller.
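The PRDT limits quoted above (128 entries of up to 64 KB each) can be illustrated with a small sketch that splits scatter‑gather segments into PRDT‑sized entries. The tuple layout and the function name `build_prdt` are hypothetical; a real PRDT entry is a hardware‑defined descriptor, not a Python tuple.

```python
PRDT_MAX_ENTRIES = 128               # entries per request, as stated above
PRDT_MAX_BYTES_PER_ENTRY = 64 * 1024  # bytes per entry, as stated above
MAX_PAYLOAD = PRDT_MAX_ENTRIES * PRDT_MAX_BYTES_PER_ENTRY  # 8 MB


def build_prdt(segments):
    """Split (address, length) segments into (address, byte_count) entries.

    Each entry covers at most PRDT_MAX_BYTES_PER_ENTRY bytes. Raises
    ValueError if the request needs more than PRDT_MAX_ENTRIES entries.
    """
    entries = []
    for addr, length in segments:
        offset = 0
        while offset < length:
            chunk = min(PRDT_MAX_BYTES_PER_ENTRY, length - offset)
            entries.append((addr + offset, chunk))
            offset += chunk
    if len(entries) > PRDT_MAX_ENTRIES:
        raise ValueError(f"request needs {len(entries)} PRDT entries")
    return entries


# A 150 KB buffer followed by a separate 4 KB page.
for addr, count in build_prdt([(0x10000, 150 * 1024), (0x80000, 4096)]):
    print(f"addr={addr:#x} bytes={count}")
```

A request made entirely of full 64 KB entries reaches the 8 MB ceiling; requests built from small, scattered pages run out of entries long before that.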

Initialization Flow

The driver initialization follows the typical Linux storage driver pattern:

ufs_qcom_probe: entry point for Qualcomm platforms.

ufshcd_pltfrm_init: allocate I/O space, obtain IRQ, and reserve HBA resources.

ufshcd_init: set up data structures, register interrupts, allocate DMA memory, enable the controller, and reset the device.

ufshcd_async_scan: asynchronous device discovery.

ufshcd_probe_hba: link establishment, device identification, and SCSI registration.

After the controller is enabled, the base addresses of the DMA buffers are programmed into UFSHCI registers, and each descriptor is then located with a base + index × unit addressing scheme, where unit is the size of one descriptor.
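The base + index × unit scheme amounts to simple pointer arithmetic over the descriptor list. The sketch below assumes a 32‑byte UTRD and a 1 KB‑aligned list base; both values and the function name are assumptions for illustration.

```python
UTRD_SIZE = 32         # bytes per transfer request descriptor (assumption)
NUM_SLOTS = 32         # concurrent slots, as listed for the UTRD list
LIST_ALIGNMENT = 1024  # assumed alignment of the list base address


def utrd_address(list_base: int, slot: int) -> int:
    """Return the DMA address of the descriptor for `slot`."""
    if list_base % LIST_ALIGNMENT:
        raise ValueError("list base must be 1 KB aligned")
    if not 0 <= slot < NUM_SLOTS:
        raise ValueError("slot out of range")
    return list_base + slot * UTRD_SIZE  # base + index * unit


base = 0x8000_0000
print(f"slot 0 -> {utrd_address(base, 0):#x}")
print(f"slot 5 -> {utrd_address(base, 5):#x}")
```

Because the controller and the driver both derive descriptor locations from the same base and slot index, only the base address needs to be written to the hardware.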

Data Transfer Path

Data movement proceeds through three stages:

System call parameters (fd, buf, count) are packaged into kiocb and iov_iter structures, which convey the user buffer address to the kernel. bio structures are then created and placed on the block layer's plug list, where they may be merged to reduce the number of requests.

When a plug list threshold is reached, the bio is converted to a request, then to a SCSI command, and finally encapsulated in a UPIU packet sent through the UFSHCI doorbell register.

The __blkdev_direct_IO function stores the disk sector address in bio->bi_sector (bio->bi_iter.bi_sector in newer kernels), while iov_iter supplies the buffer. The __blk_segment_map_sg routine maps the buffer pages to a scatter‑gather list, which is then used to fill the PRDT for DMA.

Read operations fetch data from the flash into the user buffer; write operations perform the inverse. The write flow mirrors the read flow and is not repeated here.
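The final submission step, ringing the UFSHCI doorbell, can be pictured as setting one bit per occupied slot. The following is a toy model only: it assumes a 32‑bit doorbell where bit n is set when slot n is handed to the controller and cleared on completion. The class and method names are hypothetical and do not correspond to driver code.

```python
class DoorbellModel:
    """Toy model of a doorbell register: one bit per request slot."""

    def __init__(self, slots: int = 32) -> None:
        self.slots = slots
        self.value = 0  # all slots free

    def submit(self) -> int:
        """Claim the lowest free slot, set its bit, and return the slot."""
        for slot in range(self.slots):
            if not self.value & (1 << slot):
                self.value |= 1 << slot
                return slot
        raise RuntimeError("all slots busy")

    def complete(self, slot: int) -> None:
        """Clear the slot's bit, as happens when the request finishes."""
        self.value &= ~(1 << slot)


db = DoorbellModel()
first = db.submit()
second = db.submit()
db.complete(first)
print(f"slots used: {first}, {second}; doorbell = {db.value:#010x}")
```

In this model the slot index doubles as the request tag, which is why the number of slots bounds the number of requests in flight.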

