SQL Refactoring Case Study for JDL Routing System: Reducing Complexity and Improving Performance
This article presents a detailed case study of refactoring a complex, performance‑critical SQL query used in a JDL routing system, demonstrating step‑by‑step formatting, hierarchical decomposition, and optimization techniques such as layer merging, predicate push‑down, join simplification, and index tuning to dramatically reduce execution time.
1. Introduction
During continuous software development and maintenance, new features and bug fixes cause code bases to grow rapidly, leading to technical debt and what is colloquially called “code rot”. The same phenomenon can affect SQL statements and database schemas, making them hard to understand, maintain, and performant. This article uses a real‑world slow‑SQL case from the JDL routing system to illustrate how SQL can be refactored in a manner similar to Java code refactoring.
2. JDL Routing System Complex SQL Governance Case
The routing system plans logistics routes to ensure optimal operation times across network nodes. The underlying SQL query aggregates route information and suffers from deep nesting, multiple joins, and redundant fields.
2.1 Problem SQL
select count(*) total_count from (select * from (select a.line_store_goods_id as line_resource_id, a.group_num as group_num, a.approval_erp as approval_erp, a.approval_person as approval_person, a.approval_status as approval_status, a.approval_time as approval_time, a.approval_remark as approval_remark, a.master_slave as master_slave, a.parent_line_code as parent_line_code, a.remarks as remarks, a.operator_time, a.same_stowage as same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM (select * FROM line_store_goods WHERE yn = 1 and master_slave = 1) a join (select start_org_id, start_org_name, start_province_id, start_province_name, start_city_id, start_city_name, start_node_code, start_node_name, end_org_id, end_org_name, end_province_id, end_province_name, end_city_id, end_city_name, end_node_code, end_node_name, depart_time, arrive_time, depart_wave_code, arrive_wave_code, line_code, line_type, transport_type, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource where line_code in (select line_code FROM line_store_goods WHERE yn = 1) and yn=1 group by line_code) b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join (select line_code,start_node_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 group by line_code) c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code) temp WHERE start_node_code = '311F001' and disable_time > '2023-11-15 00:00:00' and enable_time < disable_time) t_total;This production‑level SQL was identified by slow‑SQL metrics; its readability and performance are both poor.
2.2 Governance Steps
Step 1 – Formatting
For engineers, formatting is essential to ensure readability.
select count(*) total_count from (select * FROM (select a.line_store_goods_id as line_resource_id, a.group_num as group_num, a.approval_erp as approval_erp, a.approval_person as approval_person, a.approval_status as approval_status, a.approval_time as approval_time, a.approval_remark as approval_remark, a.master_slave as master_slave, a.parent_line_code as parent_line_code, a.remarks as remarks, a.operator_time, a.same_stowage as same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM (select * FROM line_store_goods WHERE yn = 1 and master_slave = 1) a join (select start_org_id, start_org_name, start_province_id, start_province_name, start_city_id, start_city_name, start_node_code, start_node_name, end_org_id, end_org_name, end_province_id, end_province_name, end_city_id, end_city_name, end_node_code, end_node_name, depart_time, arrive_time, depart_wave_code, arrive_wave_code, line_code, line_type, transport_type, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource where line_code in (select line_code FROM line_store_goods WHERE yn = 1) and yn=1 group by line_code) b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join (select line_code,start_node_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 group by line_code) c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code) temp WHERE start_node_code = '311F001' and disable_time > '2023-11-15 00:00:00' and enable_time < disable_time) t_total;After formatting, it is clear that the query counts routes that satisfy specific time constraints.
Note: Formatting can be applied at any stage.
Step 2 – Hierarchical Decomposition
Level 0
select count(*) total_count from t_total;Level 1 – t_total
select * from temp WHERE start_node_code = '311F001' and disable_time > '2023-11-15 00:00:00' and enable_time < disable_time;Level 2 – temp
select a.line_store_goods_id as line_resource_id, a.group_num as group_num, a.approval_erp as approval_erp, a.approval_person as approval_person, a.approval_status as approval_status, a.approval_time as approval_time, a.approval_remark as approval_remark, a.master_slave as master_slave, a.parent_line_code as parent_line_code, a.remarks as remarks, a.operator_time, a.same_stowage as same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM (select * FROM line_store_goods WHERE yn = 1 and master_slave = 1) a join (select start_org_id, start_org_name, start_province_id, start_province_name, start_city_id, start_city_name, start_node_code, start_node_name, end_org_id, end_org_name, end_province_id, end_province_name, end_city_id, end_city_name, end_node_code, end_node_name, depart_time, arrive_time, depart_wave_code, arrive_wave_code, line_code, line_type, transport_type, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource where line_code in (select line_code FROM line_store_goods WHERE yn = 1) and yn=1 group by line_code) b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join (select line_code,start_node_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 group by line_code) c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code;The hierarchy makes the purpose of each sub‑query explicit.
Step 3 – Refactoring
Java developers are familiar with the book “Refactoring – Improving the Design of Existing Code”. The same principles apply to SQL.
Toolbox:
Layer merging – reduce temporary tables.
Predicate push‑down – filter early to shrink intermediate results.
Join optimization – minimize rows scanned.
Sub‑query elimination – remove unnecessary nesting.
Sub‑query ↔ join conversion – choose the more efficient form.
Refactor 1 – Layer Merging
Combine level0 and level1.
select count(*) total_count from (select * from temp where a = "1") select count(*) from temp where a = "1"The second form avoids an extra sub‑query and temporary table, reducing computation, memory usage, and I/O.
Refactor 2 – Predicate Push‑Down
Push start_node_code = '311F001' down to the lowest level.
By applying filters as early as possible, the size of temporary tables is minimized, leading to faster execution.
Refactor 3 – Join Optimization
Simplify joins and remove redundant fields.
Redundant columns are stripped, unnecessary ON conditions are dropped, and duplicate sub‑queries are eliminated, resulting in a clean two‑table join.
select a.line_store_goods_id as line_resource_id, a.group_num, a.approval_erp, a.approval_person, a.approval_status, a.approval_time, a.approval_remark, a.master_slave, a.parent_line_code, a.remarks, a.operator_time, a.same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM line_store_goods a join line_resource b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join line_resource c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code where a.start_node_code = '311F001' and a.disable_time > '2023-11-15 00:00:00' and a.enable_time < a.disable_time;Step 4 – Theoretical Validation
Run the optimized query and compare results with the original to ensure functional equivalence.
select count(*) from (select line_code FROM line_store_goods WHERE yn = 1 and parent_line_code = line_code and master_slave = 1 and start_node_code = '311F001') a join (select line_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 and start_node_code = '311F001' group by line_code) b on a.line_code = b.line_code where b.disable_time > '2023-11-15 00:00:00' and b.enable_time < b.disable_time;The refactored SQL is far more readable and its performance improves dramatically.
Step 5 – Index Optimization
Further performance gains can be achieved by adding appropriate indexes on columns such as start_node_code , line_code , enable_time , and disable_time . Detailed index tuning is omitted for brevity.
Step 6 – Testing and Verification
As with code refactoring, functional tests and performance benchmarks are essential to validate that the optimized SQL produces correct results and meets latency targets.
3. Effect Comparison
Before Optimization
After Optimization
Nested Levels
4
1
Number of Joins
3
2
Sub‑queries
7
2
Execution Time
4.75s
0.6s
The optimization reduced nesting, eliminated redundant joins and sub‑queries, and cut execution time by more than 80%.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.