
InnoDB Buffer Pool Management Mechanism and Implementation Details

This article outlines the ARIES foundations that InnoDB builds on, details the architecture and dynamic sizing of the Buffer Pool, describes its internal data structures and multi‑instance implementation, and provides annotated source code snippets to illustrate how MySQL allocates and manages buffer pages.

Qunar Tech Salon

InnoDB, the transactional storage engine of MySQL, is built on the ARIES recovery algorithm, which defines the fundamentals of logging, rollback, redo, concurrency control, and buffer pool management to ensure ACID properties.

InnoDB Buffer Pool Overview

The Buffer Pool stores frequently accessed data pages in a contiguous memory region, using an LRU algorithm to keep hot pages readily available. Its size is configured via the innodb_buffer_pool_size parameter; prior to MySQL 5.7.5 the size could not be changed without restarting the server, but newer versions allow dynamic resizing.

When the Buffer Pool is 1 GB or larger, it can be split into multiple instances with innodb_buffer_pool_instances to reduce mutex contention and improve concurrency. MySQL ignores that setting for smaller pools.

Internal Data Structures

A Buffer Pool instance is represented by the buf_pool_t structure, which contains four main components:

FREE list – stores all free pages.

flush_list – stores dirty pages that need to be written back.

mutex – protects the instance from concurrent access.

chunks – points to the memory chunks (buf_chunk_t) that hold the instance's control blocks and page frames.

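Each page in the pool has a control block, buf_block_t. Its page member is a buf_page_t descriptor that carries the page's identity, state, and list links, and its frame member points to the page-sized memory that holds the actual data. The lists above link buf_page_t descriptors, which is why the code below passes &block->page to the list macros.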
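
The relationship between the two structures can be sketched in a few lines of C. This is a simplified model, not the real InnoDB declarations: the sketch_ type names, the reduced field sets, and the helper sketch_page_get_block() are assumptions made for illustration, and the real structures carry many more members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for buf_page_t: the page descriptor. */
typedef struct sketch_page {
    unsigned long       space;        /* tablespace id */
    unsigned long       offset;       /* page number within the tablespace */
    int                 in_free_list; /* nonzero while linked on the free list */
    struct sketch_page *prev;         /* list links */
    struct sketch_page *next;
} sketch_page_t;

/* Hypothetical, reduced stand-in for buf_block_t: the control block. */
typedef struct sketch_block {
    sketch_page_t  page;  /* kept as the first member on purpose */
    unsigned char *frame; /* page-sized buffer holding the data */
} sketch_block_t;

/* Recover the control block from its embedded descriptor.
   Valid only because page is the first member of the block. */
static sketch_block_t *sketch_page_get_block(sketch_page_t *page)
{
    assert(page != NULL);
    return (sketch_block_t *) page;
}
```

Keeping the descriptor as the first member means a list that stores descriptor pointers can hand back the enclosing control block with a plain cast, which is the design choice this sketch tries to show.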

Instance Initialization

The function buf_pool_init_instance() initializes a Buffer Pool instance. The following code shows how the memory for a chunk is allocated and how pages are linked to control blocks:

/* Size of one Buffer Pool instance. */
chunk->mem_size = mem_size;
/* Request a memory region of that size. The region is only mapped here
   (via mmap); physical memory is committed gradually, as the pages are
   actually touched later on. */
chunk->mem = os_mem_alloc_large(&chunk->mem_size);
/* Allocate the block descriptors from the start of the memory block. */
chunk->blocks = (buf_block_t*) chunk->mem;
/* frame points to the physical Buffer Pool pages, so it must be aligned
   to UNIV_PAGE_SIZE. */
frame = (byte*) ut_align(chunk->mem, UNIV_PAGE_SIZE);
/* chunk->size is measured in Buffer Pool pages. */
chunk->size = chunk->mem_size / UNIV_PAGE_SIZE - (frame != chunk->mem);
/* Subtract the space needed for the block descriptors. */
ulint size = chunk->size;
while (frame < (byte*) (chunk->blocks + size)) {
    /* frame moves forward one page at a time while chunk->blocks + size
       moves backward; the loop stops as soon as frame is no longer
       below the end of the descriptor array. */
    frame += UNIV_PAGE_SIZE;
    size--;
}
chunk->size = size;
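
The arithmetic in that loop is easier to follow in isolation. The following is a minimal sketch, assuming the descriptors sit at the start of the chunk and the frames follow at the first suitable page boundary; sketch_chunk_frames() and its parameters are hypothetical names, and the descriptor size is passed in because the real sizeof(buf_block_t) depends on the build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* How many page frames fit in a chunk of mem_size bytes starting at
   address mem, when one descriptor of desc_size bytes per frame is stored
   at the start of the chunk and frames are aligned to page_size. */
static size_t sketch_chunk_frames(uintptr_t mem, size_t mem_size,
                                  size_t page_size, size_t desc_size)
{
    /* page_size must be a nonzero power of two for the mask below. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* First page-aligned address at or after mem. */
    uintptr_t frame = (mem + page_size - 1) & ~(uintptr_t)(page_size - 1);

    /* Upper bound: whole pages in the chunk, minus one if the start of
       the chunk was not aligned. */
    size_t size = mem_size / page_size;
    if (frame != mem && size > 0) {
        size--;
    }

    /* Move the first frame past the descriptor array. Every page given
       up also removes one descriptor, so the two bounds converge. */
    while (size > 0 && frame < mem + size * desc_size) {
        frame += page_size;
        size--;
    }
    return size;
}
```

With an aligned 128 MiB chunk, 16 KiB pages, and an assumed 256-byte descriptor, the function gives up 127 pages to descriptors and returns 8065 usable frames.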

After allocating the chunk, each buf_block_t is initialized and added to the free list:

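/* Initialize the control block and bind it to its page frame. */
buf_block_init(buf_pool, block, frame);
/* Tell memory checkers that the frame contents are undefined. */
UNIV_MEM_INVALID(block->frame, UNIV_PAGE_SIZE);
/* Append the block's page descriptor to the instance's free list. */
UT_LIST_ADD_LAST(list, buf_pool->free, (&block->page));
/* Debug builds also record that the page is on the free list. */
ut_d(block->page.in_free_list = TRUE);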
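
What the list macro does can be approximated with a small intrusive doubly linked list. This is a sketch under assumptions, not the UT_LIST implementation: sketch_node_t, sketch_list_t, and sketch_list_add_last() are hypothetical names, and the in_free_list flag is set unconditionally here rather than only in debug builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: the links live inside the element itself. */
typedef struct sketch_node {
    struct sketch_node *prev;
    struct sketch_node *next;
    int                 in_free_list;
} sketch_node_t;

/* Hypothetical list base node: element count plus both ends. */
typedef struct sketch_list {
    size_t         count;
    sketch_node_t *first;
    sketch_node_t *last;
} sketch_list_t;

/* Append node to the tail of list in constant time. */
static void sketch_list_add_last(sketch_list_t *list, sketch_node_t *node)
{
    assert(list != NULL && node != NULL);

    node->next = NULL;
    node->prev = list->last;

    if (list->last != NULL) {
        list->last->next = node;  /* non-empty list: link after old tail */
    } else {
        list->first = node;       /* empty list: node is also the head */
    }
    list->last = node;
    list->count++;
    node->in_free_list = 1;
}
```

Because the links are embedded in the element, putting a page on the free list needs no extra allocation, which matters when every page of a large pool is linked during startup.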

Multi‑Instance Management

The global function buf_pool_init() creates the array of Buffer Pool instances based on the total size and the number of instances, then initializes each instance individually:

dberr_t
buf_pool_init(
    ulint   total_size,   /*!< in: size of the total pool in bytes */
    ibool   populate,     /*!< in: virtual page preallocation */
    ulint   n_instances)  /*!< in: number of instances */
{
    ulint i;
    /* Every instance gets an equal share of the total size. */
    const ulint size = total_size / n_instances;
    buf_pool_ptr = (buf_pool_t*) mem_zalloc(n_instances * sizeof *buf_pool_ptr);
    for (i = 0; i < n_instances; i++) {
        buf_pool_t* ptr = &buf_pool_ptr[i];
        if (buf_pool_init_instance(ptr, size, populate, i) != DB_SUCCESS) {
            /* Free all the instances created so far. */
            buf_pool_free(i);
            return(DB_ERROR);
        }
    }
    buf_pool_set_sizes();
    /* Start with the old sublist at 3/8 of the LRU list. */
    buf_LRU_old_ratio_update(100*3/8, FALSE);
    /* Create the adaptive hash index, sized from the pool size. */
    btr_search_sys_create(buf_pool_get_curr_size() / sizeof(void*) / 64);
    return(DB_SUCCESS);
}
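
Two pieces of integer arithmetic in this function are worth spelling out. The sketch below uses hypothetical helper names (sketch_instance_size, sketch_old_pct) and only reproduces the two expressions from the listing; it does not call any InnoDB code.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes handed to each instance. The division truncates, so any
   remainder of total_size is simply not assigned to an instance. */
static size_t sketch_instance_size(size_t total_size, size_t n_instances)
{
    assert(n_instances > 0);
    return total_size / n_instances;
}

/* The percentage passed to buf_LRU_old_ratio_update() in the listing.
   Integer arithmetic, evaluated left to right: 300 / 8 == 37. */
static unsigned sketch_old_pct(void)
{
    return 100 * 3 / 8;
}
```

The truncated result, 37, matches the documented default of innodb_old_blocks_pct, so a freshly started server keeps roughly three eighths of each LRU list as the old sublist.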

The helper buf_pool_get() maps a page's space ID and offset to the appropriate Buffer Pool instance using a hash‑based fold calculation:

buf_pool_t*
buf_pool_get(
    ulint   space,   /*!< in: space id */
    ulint   offset)  /*!< in: offset of the page within space */
{
    ulint   fold;
    ulint   index;
    ulint   ignored_offset;
    ignored_offset = offset >> 6; /* 2log of BUF_READ_AHEAD_AREA (64) */
    fold = buf_page_address_fold(space, ignored_offset);
    index = fold % srv_buf_pool_instances;
    return(&buf_pool_ptr[index]);
}
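
The effect of dropping the low six bits is that all 64 pages of a read-ahead area land in the same instance. The following sketch illustrates this under one assumption: sketch_fold() is modelled on how buf_page_address_fold() is commonly written in this source generation, and should be read as a stand-in rather than the authoritative definition. The sketch_ names and the explicit n_instances parameter are hypothetical.

```c
#include <assert.h>

/* Assumed fold function, a stand-in for buf_page_address_fold(). */
static unsigned long sketch_fold(unsigned long space, unsigned long offset)
{
    return (space << 20) + space + offset;
}

/* Map a page address to an instance index in [0, n_instances). */
static unsigned long sketch_instance_index(unsigned long space,
                                           unsigned long offset,
                                           unsigned long n_instances)
{
    assert(n_instances > 0);

    /* Drop the low 6 bits: pages 0..63 share one value, 64..127 the next. */
    unsigned long ignored_offset = offset >> 6;

    return sketch_fold(space, ignored_offset) % n_instances;
}
```

With 8 instances and space ID 5, pages 0 through 63 all map to index 5 in this sketch, and page 64 moves on to index 6, so a linear read-ahead of one area only ever touches a single instance's mutex.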

By using multiple independent Buffer Pool instances, MySQL reduces lock contention and improves performance when handling many concurrent page accesses.

In summary, the Buffer Pool caches InnoDB data pages in memory, tracks them through per-instance free and flush lists, supports dynamic resizing from MySQL 5.7.5 onward, and can be split into multiple instances to scale better under concurrent access.
