Resolving Common CBO Issues and Optimizing Inceptor Queries
This article examines typical problems encountered when using Cost‑Based Optimizer (CBO) in Inceptor, offers step‑by‑step troubleshooting methods, and demonstrates a real‑world case where statistical information is collected with Preanalyze, CBO is enabled, and query plans are refined for large‑scale JOIN operations.
Introduction
The previous article introduced the concept of Cost‑Based Optimizer (CBO) and basic usage such as collecting statistics with ANALYZE or the preanalyze.sh script. This continuation analyzes abnormal situations that may arise when CBO optimizes execution plans in production, provides corresponding solutions, and walks through a concrete case study.
CBO Common Issues and Solutions
Statistics collection failure – If CBO appears ineffective after the switch is turned on, first ensure the optimizer switch inceptor.optimizer.on is disabled before collecting statistics. Disable it with: SET inceptor.optimizer.on = FALSE; Then run ANALYZE or preanalyze.sh. Verify success by executing DESC FORMATTED <table_name>; and checking that numRows matches the actual row count (a value of –1 indicates failure).
Execution‑plan generation failure – After confirming correct statistics, enable CBO and compare plans with EXPLAIN <SQL_statement>;. If the plan does not change, CBO may not support the statement. Common unsupported scenarios are:
SQL contains HINTS – remove them and let CBO handle the query.
SQL contains complex WITH‑AS clauses or VIEWs without materialization – convert them to materialized tables and collect statistics on those tables.
Non‑equijoin – currently no solution.
In these cases CBO automatically falls back to the original execution logic without affecting query results.
Case Study
A securities‑trading company employee, Alice, needs to answer two questions for 2015 data: (1) count the number of different stock types purchased by A‑level accounts, and (2) count the number of accounts of each level that bought stock type ‘s’. The relevant tables in database trans_platform are:
transactions : stock_id (stock identifier), acc_num (account number), trans_id (transaction ID).
accounts : id (account ID), acc_level (account level).
stocks : id (stock ID), stock_type (type of stock).
SQL statements:
SELECT stock_type, COUNT(*) num_cnt
FROM transactions a JOIN stocks b ON a.stock_id = b.id
JOIN accounts c ON a.acc_num = c.id
WHERE c.acc_level = 'A'
GROUP BY stock_type; SELECT acc_level, COUNT(*) num_cnt
FROM transactions a JOIN stocks b ON a.stock_id = b.id
JOIN accounts c ON a.acc_num = c.id
WHERE b.stock_type = 's'
GROUP BY acc_level;Alice first runs the Preanalyze tool to collect statistics for the involved tables and columns:
./preanalyze -d /home/Alice/Documents/sqls/getTransInfo \
--database trans_platform -v hive2 -p alice/SE@TDHAfter confirming successful statistics collection with DESC FORMATTED, she enables CBO and checks the plans. For the first query, the filter c.acc_level = 'A' removes about 80% of rows, so CBO reorders the joins to transactions JOIN accounts JOIN stocks, improving performance. For the second query, the filter b.stock_type = 's' already yields an optimal plan, so the join order remains unchanged.
Summary
Key recommendations for using CBO effectively:
Use the Preanalyze tool to gather accurate statistics, paying attention to the Inceptor version and authentication parameters.
If Preanalyze or CBO cannot handle certain constructs, rewrite the SQL (e.g., remove HINTS, materialize WITH‑AS subqueries, avoid non‑equijoins).
For complex or large‑scale queries, verify the need for CBO, ensure statistics are collected, and compare execution plans before running the query.
When CBO appears ineffective, investigate statistics‑collection failures and execution‑plan generation issues as described above.
By following these steps, users can leverage Inceptor CBO to optimize multi‑table JOIN queries and improve data‑analysis efficiency.
StarRing Big Data Open Lab
Focused on big data technology research, exploring the Big Data era | [email protected]
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
