Backend Development 10 min read

Refactoring a Decade‑Old Query Understanding Module: Identifying Code Smells, Motivations, and Remedies

The article details a three‑month refactor of a ten‑year‑old query‑understanding backend, describing the code smells encountered—duplicate code, long functions, bloated classes, oversized parameter lists, confusing temporaries, ignored warnings, and magic numbers—along with their motivations, preventive measures, and the performance improvements achieved after cleanup.

FunTester

The team inherited a legacy query-understanding pipeline that had been in production for over ten years. After taking over, they reduced the codebase by 80%, substantially improved performance, stability, and observability, and enabled deployment on both the self-built cloud and on-premise environments.

Key motivations for the refactor included low iteration efficiency (adding a simple operator required three person‑days), poor stability (frequent P99 spikes), extremely slow startup (18 minutes), excessive memory usage (single process needed 114 GB), lack of monitoring and tracing tools, an outdated GCC 4.8 compiler, and inability to deploy to the company’s cloud platform.

The authors catalogued the code smells they encountered: duplicated code across GBK/UTF-8 conversion functions, excessively long functions (e.g., a 1,380-line function), bloated request-handling classes, long parameter lists (up to 56 parameters), confusing temporary fields, parameters with a far wider scope than needed that were passed through many functions, unnecessary serialization steps, ignored compilation warnings, magic numbers, and long chained if-statements.
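Two of these smells, magic numbers and long chained if-statements, can often be removed together. The following is a minimal sketch, not code from the refactored module: the query categories, the prefix rules, the size limit, and the names `QueryType`, `kMaxQueryBytes`, `kPrefixRules`, and `Classify` are all hypothetical and exist only to illustrate the idea.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical query categories; the article does not list the real ones.
enum class QueryType { kUnknown, kNavigational, kShopping, kQuestion };

// A named constant instead of a bare literal scattered through the code.
// The value 256 is illustrative only.
constexpr std::size_t kMaxQueryBytes = 256;

struct PrefixRule {
    const char* prefix;
    QueryType type;
};

// A rule table replaces a long if / else-if chain:
// adding a new rule is a one-line change in one place.
constexpr std::array<PrefixRule, 3> kPrefixRules = {{
    {"www.", QueryType::kNavigational},
    {"buy ", QueryType::kShopping},
    {"how ", QueryType::kQuestion},
}};

QueryType Classify(const std::string& query) {
    if (query.empty() || query.size() > kMaxQueryBytes) {
        return QueryType::kUnknown;
    }
    for (const PrefixRule& rule : kPrefixRules) {
        // rfind(prefix, 0) == 0 is true only when query starts with prefix.
        if (query.rfind(rule.prefix, 0) == 0) {
            return rule.type;
        }
    }
    return QueryType::kUnknown;
}
```

With this shape, `Classify("buy shoes")` yields `QueryType::kShopping`, and a query that matches no rule falls through to `QueryType::kUnknown`. The limit and the rules each live in exactly one named place, so a reader no longer has to guess what a literal means.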

For each smell they explained the original developer’s motivation (often laziness or shortcut thinking), and proposed preventive and remedial actions such as extracting common logic, increasing unit‑test coverage, applying the Single‑Responsibility Principle, using configuration objects instead of massive parameter lists, replacing commented‑out code with feature switches, adhering to the least‑knowledge principle, parallelizing independent tasks via a DAG scheduler, and treating warnings as errors with -Wall -Werror flags.
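As an illustration of two of these remedies, a configuration object in place of a long parameter list and a feature switch in place of commented-out code, here is a minimal sketch. It is not the team's actual code: `TokenizeOptions`, its fields, and `Tokenize` are hypothetical names chosen for the example, and the real module's 56 parameters are not listed in the article.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical options struct: one object with named, defaulted fields
// replaces a long positional parameter list.
struct TokenizeOptions {
    std::size_t max_terms = 32;    // illustrative default
    bool lowercase = true;         // feature switch, on by default
    bool drop_duplicates = false;  // feature switch, off by default
};

// Splits raw on whitespace and applies the switches set in opts.
std::vector<std::string> Tokenize(const std::string& raw,
                                  const TokenizeOptions& opts) {
    std::vector<std::string> terms;
    std::istringstream in(raw);
    std::string term;
    while (terms.size() < opts.max_terms && in >> term) {
        if (opts.lowercase) {
            for (char& c : term) {
                c = static_cast<char>(
                    std::tolower(static_cast<unsigned char>(c)));
            }
        }
        // Instead of commenting this step in and out, a flag controls it.
        if (opts.drop_duplicates &&
            std::find(terms.begin(), terms.end(), term) != terms.end()) {
            continue;
        }
        terms.push_back(term);
    }
    return terms;
}
```

A caller sets only the fields it cares about, for example `TokenizeOptions opts; opts.max_terms = 2;`, and passes `opts` along. Adding a new option later changes the struct but not every call site, which is the main advantage over a positional parameter list.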

After the three-month cleanup, the module's startup time dropped from 18 minutes to a few seconds, memory consumption fell dramatically, and latency on the main processing path fell from 13.19 ms to 9.71 ms, a reduction of about 26%. Code size and technical debt were also substantially reduced.

The article concludes with a personal pledge to avoid shortcuts, write cleaner code, and continuously apply the learned refactoring practices.

Tags: performance optimization, Backend Development, software engineering, code refactoring, code smells