Unlocking MySQL 8.0 Optimizer: Cost Model Configuration and Histogram Usage
This article explains how MySQL 8.0 improves the optimizer by introducing configurable cost model constants and a histogram feature, showing how to query and update system tables, use ANALYZE TABLE to create and manage histograms, and explore the underlying code structures and future tuning possibilities.
Cost Model
MySQL historically used hard‑coded cost constants. Since MySQL 5.7 the constants are stored in the system tables mysql.server_cost and mysql.engine_cost, which can be inspected and modified at runtime.
SELECT * FROM mysql.server_cost;
UPDATE mysql.server_cost SET cost_value = 40 WHERE cost_name='disk_temptable_create_cost';
FLUSH OPTIMIZER_COSTS;After FLUSH OPTIMIZER_COSTS the new values are used by newly created sessions; existing sessions keep the previous values until their reference count drops to zero.
The optimizer keeps a global cache Cost_constant_cache. During thread start‑up ( lex_start()) each thread copies a pointer to the cached Cost_model_constants (which contains Server_cost_constants and SE_cost_constants) into THD::m_cost_model. The cache is refreshed by reading the system tables or by FLUSH OPTIMIZER_COSTS. Reference counting guarantees that a cached version is not freed while still used by any thread.
Each TABLE object holds a TABLE::m_cost_model of type Cost_model_table. The optimizer stores the result of a cost estimation in a Cost_estimate structure with four dimensions:
double io_cost; // I/O operations
double cpu_cost; // CPU operations
double import_cost; // remote operations
double mem_cost; // memory usage (bytes)Histogram
MySQL 8.0 introduced column histograms, which allow the optimizer to estimate predicate selectivity more accurately. Histograms are created with the ANALYZE TABLE … UPDATE HISTOGRAM statement.
ANALYZE TABLE sbtest1 UPDATE HISTOGRAM ON k WITH 10 BUCKETS;The metadata is stored in the hidden table mysql.column_statistics and exposed through the view INFORMATION_SCHEMA.COLUMN_STATISTICS. Example query:
SELECT JSON_PRETTY(HISTOGRAM)
FROM INFORMATION_SCHEMA.COLUMN_STATISTICS
WHERE SCHEMA_NAME='sb1' AND TABLE_NAME='sbtest1';A bucket contains the lower and upper bounds, the cumulative frequency, and the distinct‑value count. For singleton (equally‑wide) histograms each bucket stores a single value and its cumulative frequency. The default number of buckets is 100; the WITH N BUCKETS clause overrides it. The variable histogram_generation_max_mem_size limits the memory used during histogram construction.
Histograms are not refreshed automatically. After heavy DML you must run ANALYZE TABLE … UPDATE HISTOGRAM again.
The optimizer uses histogram statistics for predicates such as =, <, >, BETWEEN, IN, IS NULL, etc., improving plan choice without requiring indexes.
Implementation details (MySQL 8.0.12)
Histogram code resides in sql/histograms/ (e.g., equi_height.cc, singleton.cc, value_map.cc). The class hierarchy is: histograms::Histogram – abstract base histograms::Equi_height – equi‑height histogram (template per data type) histograms::Singleton – singleton (equally‑wide) histogram
Command handling flow:
Sql_cmd_analyze_table::handle_histogram_command
└─ update_histogram
└─ histograms::update_histogram
├─ field type validation
├─ sampling (controlled by histogram_generation_max_mem_size)
│ └─ ha_sample_init / ha_sample_next / ha_sample_end
├─ build_histogram
│ ├─ Singleton::build_histogram (if distinct values ≤ buckets)
│ └─ Equi_height::build_histogram (otherwise)
└─ Histogram::store_histogram → dd::Column_statisticsDuring query optimization the selectivity is obtained via:
get_histogram_selectivity
└─ Histogram::get_selectivity
├─ get_equal_to_selectivity_dispatcher
├─ get_greater_than_selectivity_dispatcher
└─ get_less_than_selectivity_dispatcherSupported predicate types include equality, inequality, range, IS NULL/NOT NULL, BETWEEN, IN, and their negations.
References (URLs)
MySQL 5.7 Cost Model documentation: https://dev.mysql.com/doc/refman/5.7/en/cost-model.html
MySQL 8.0 ANALYZE TABLE syntax: https://dev.mysql.com/doc/refman/8.0/en/analyze-table.html
Histogram system variable: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_histogram_generation_max_mem_size
Histogram statistics overview: https://mysqlserverteam.com/histogram-statistics-in-mysql/
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
