Databases 19 min read

SQL Refactoring Case Study for JDL Routing System: Reducing Complexity and Improving Performance

This article presents a detailed case study of refactoring a complex, performance‑critical SQL query used in a JDL routing system, demonstrating step‑by‑step formatting, hierarchical decomposition, and optimization techniques such as layer merging, predicate push‑down, join simplification, and index tuning to dramatically reduce execution time.

JD Tech Talk

May 21, 2024

SQL Refactoring Case Study for JDL Routing System: Reducing Complexity and Improving Performance

1. Introduction

During continuous software development and maintenance, new features and bug fixes cause code bases to grow rapidly, leading to technical debt and what is colloquially called “code rot”. The same phenomenon can affect SQL statements and database schemas, making them hard to understand, maintain, and performant. This article uses a real‑world slow‑SQL case from the JDL routing system to illustrate how SQL can be refactored in a manner similar to Java code refactoring.

2. JDL Routing System Complex SQL Governance Case

The routing system plans logistics routes to ensure optimal operation times across network nodes. The underlying SQL query aggregates route information and suffers from deep nesting, multiple joins, and redundant fields.

2.1 Problem SQL

select count(*) total_count from (select * from (select a.line_store_goods_id as line_resource_id, a.group_num as group_num, a.approval_erp as approval_erp, a.approval_person as approval_person, a.approval_status as approval_status, a.approval_time as approval_time, a.approval_remark as approval_remark, a.master_slave as master_slave, a.parent_line_code as parent_line_code, a.remarks as remarks, a.operator_time, a.same_stowage as same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM (select * FROM line_store_goods WHERE yn = 1 and master_slave = 1) a join (select start_org_id, start_org_name, start_province_id, start_province_name, start_city_id, start_city_name, start_node_code, start_node_name, end_org_id, end_org_name, end_province_id, end_province_name, end_city_id, end_city_name, end_node_code, end_node_name, depart_time, arrive_time, depart_wave_code, arrive_wave_code, line_code, line_type, transport_type, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource where line_code in (select line_code FROM line_store_goods WHERE yn = 1) and yn=1 group by line_code) b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join (select line_code,start_node_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 group by line_code) c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code) temp WHERE start_node_code = '311F001' and disable_time > '2023-11-15 00:00:00' and enable_time < disable_time) t_total;

This production‑level SQL was identified by slow‑SQL metrics; its readability and performance are both poor.

2.2 Governance Steps

Step 1 – Formatting

For engineers, formatting is essential to ensure readability.

select count(*) total_count from (select * FROM (select a.line_store_goods_id as line_resource_id, a.group_num as group_num, a.approval_erp as approval_erp, a.approval_person as approval_person, a.approval_status as approval_status, a.approval_time as approval_time, a.approval_remark as approval_remark, a.master_slave as master_slave, a.parent_line_code as parent_line_code, a.remarks as remarks, a.operator_time, a.same_stowage as same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM (select * FROM line_store_goods WHERE yn = 1 and master_slave = 1) a join (select start_org_id, start_org_name, start_province_id, start_province_name, start_city_id, start_city_name, start_node_code, start_node_name, end_org_id, end_org_name, end_province_id, end_province_name, end_city_id, end_city_name, end_node_code, end_node_name, depart_time, arrive_time, depart_wave_code, arrive_wave_code, line_code, line_type, transport_type, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource where line_code in (select line_code FROM line_store_goods WHERE yn = 1) and yn=1 group by line_code) b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join (select line_code,start_node_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 group by line_code) c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code) temp WHERE start_node_code = '311F001' and disable_time > '2023-11-15 00:00:00' and enable_time < disable_time) t_total;

After formatting, it is clear that the query counts routes that satisfy specific time constraints.

Note: Formatting can be applied at any stage.

Step 2 – Hierarchical Decomposition

Level 0 select count(*) total_count from t_total; Level 1 – t_total

select * from temp WHERE start_node_code = '311F001' and disable_time > '2023-11-15 00:00:00' and enable_time < disable_time;

Level 2 – temp

select a.line_store_goods_id as line_resource_id, a.group_num as group_num, a.approval_erp as approval_erp, a.approval_person as approval_person, a.approval_status as approval_status, a.approval_time as approval_time, a.approval_remark as approval_remark, a.master_slave as master_slave, a.parent_line_code as parent_line_code, a.remarks as remarks, a.operator_time, a.same_stowage as same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM (select * FROM line_store_goods WHERE yn = 1 and master_slave = 1) a join (select start_org_id, start_org_name, start_province_id, start_province_name, start_city_id, start_city_name, start_node_code, start_node_name, end_org_id, end_org_name, end_province_id, end_province_name, end_city_id, end_city_name, end_node_code, end_node_name, depart_time, arrive_time, depart_wave_code, arrive_wave_code, line_code, line_type, transport_type, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource where line_code in (select line_code FROM line_store_goods WHERE yn = 1) and yn=1 group by line_code) b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join (select line_code,start_node_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 group by line_code) c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code;

The hierarchy makes the purpose of each sub‑query explicit.

Step 3 – Refactoring

Java developers are familiar with the book “Refactoring – Improving the Design of Existing Code”. The same principles apply to SQL.

Toolbox:

Layer merging – reduce temporary tables.

Predicate push‑down – filter early to shrink intermediate results.

Join optimization – minimize rows scanned.

Sub‑query elimination – remove unnecessary nesting.

Sub‑query ↔ join conversion – choose the more efficient form.

Refactor 1 – Layer Merging

Combine level0 and level1.

select count(*) total_count from (select * from temp where a = "1")

select count(*) from temp where a = "1"

The second form avoids an extra sub‑query and temporary table, reducing computation, memory usage, and I/O.

Refactor 2 – Predicate Push‑Down

Push start_node_code = '311F001' down to the lowest level.

By applying filters as early as possible, the size of temporary tables is minimized, leading to faster execution.

Refactor 3 – Join Optimization

Simplify joins and remove redundant fields.

Redundant columns are stripped, unnecessary ON conditions are dropped, and duplicate sub‑queries are eliminated, resulting in a clean two‑table join.

select a.line_store_goods_id as line_resource_id, a.group_num, a.approval_erp, a.approval_person, a.approval_status, a.approval_time, a.approval_remark, a.master_slave, a.parent_line_code, a.remarks, a.operator_time, a.same_stowage, b.start_org_id, b.start_org_name, b.start_province_id, b.start_province_name, b.start_city_id, b.start_city_name, b.start_node_code, b.start_node_name, b.end_org_id, b.end_org_name, b.end_province_id, b.end_province_name, b.end_city_id, b.end_city_name, b.end_node_code, b.end_node_name, b.depart_time, b.arrive_time, b.depart_wave_code, b.arrive_wave_code, b.enable_time, b.disable_time, a.store_enable_time, a.store_disable_time, a.update_name operator_name, b.line_code, b.line_type, b.transport_type, IF(a.store_enable_time > b.enable_time, IF(a.store_enable_time > c.enable_time, a.store_enable_time, c.enable_time), IF(b.enable_time > c.enable_time, b.enable_time, c.enable_time)) as insect_start_time, IF(a.store_disable_time < b.disable_time, IF(a.store_disable_time < c.disable_time, a.store_disable_time, c.disable_time), IF(b.disable_time < c.disable_time, b.disable_time, c.disable_time)) as insect_end_time FROM line_store_goods a join line_resource b on a.line_code = b.line_code and a.start_node_code = b.start_node_code join line_resource c on a.parent_line_code = c.line_code and a.start_node_code = c.start_node_code where a.start_node_code = '311F001' and a.disable_time > '2023-11-15 00:00:00' and a.enable_time < a.disable_time;

Step 4 – Theoretical Validation

Run the optimized query and compare results with the original to ensure functional equivalence.

select count(*) from (select line_code FROM line_store_goods WHERE yn = 1 and parent_line_code = line_code and master_slave = 1 and start_node_code = '311F001') a join (select line_code, min(enable_time) as enable_time, max(disable_time) as disable_time FROM line_resource WHERE yn = 1 and start_node_code = '311F001' group by line_code) b on a.line_code = b.line_code where b.disable_time > '2023-11-15 00:00:00' and b.enable_time < b.disable_time;

The refactored SQL is far more readable and its performance improves dramatically.

Step 5 – Index Optimization

Further performance gains can be achieved by adding appropriate indexes on columns such as start_node_code, line_code, enable_time, and disable_time. Detailed index tuning is omitted for brevity.

Step 6 – Testing and Verification

As with code refactoring, functional tests and performance benchmarks are essential to validate that the optimized SQL produces correct results and meets latency targets.

3. Effect Comparison

Before Optimization

After Optimization

Nested Levels

Number of Joins

Sub‑queries

Execution Time

4.75s

0.6s

The optimization reduced nesting, eliminated redundant joins and sub‑queries, and cut execution time by more than 80%.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

SQL refactoring Query Tuning

Written by

JD Tech Talk

Official JD Tech public account delivering best practices and technology innovation.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.