Databases 22 min read

SQL Query Optimization and Stored Procedure Refactoring for Large‑Scale MES Reporting

The article describes how a massive MES reporting query that took over half an hour to run was analyzed, re‑engineered with proper indexing, partitioning considerations, and a comprehensive stored‑procedure redesign, resulting in sub‑second response times for both backend analysis and front‑end B/S applications.

Top Architect
Top Architect
Top Architect
SQL Query Optimization and Stored Procedure Refactoring for Large‑Scale MES Reporting

Recently I was tasked with optimizing a reporting query for a Siemens China MES project that previously required more than half an hour (or even failed) to return results due to full‑table scans on tables containing billions of rows. After adding missing indexes and evaluating partitioning strategies, I realized that naïve partitioning would not help because the query conditions lacked regular patterns such as dates or sequential identifiers.

Data Background

The source database contains several tables with over a hundred million rows each, and many tables with tens of millions of rows. The original SSRS report used a complex SQL statement that performed multiple left joins, many IN clauses with @var parameters, and LIKE '%'+@var+'%' patterns, leading to full scans.

select distinct b.MaterialID as matl_def_id, c.Descript,
case when right(b.MESOrderID, 12) < '001000000000' then right(b.MESOrderID, 9)
else right(b.MESOrderID, 12) end as pom_order_id,
... (full original query omitted for brevity) ...

Exploration and Pitfalls

Initial attempts to add indexes did not improve performance because the query still scanned large tables. Creating partitions on the original tables caused deadlocks due to ongoing production data ingestion. The key insight was that the query logic itself needed to be restructured rather than relying on table‑level tricks.

Optimization Design

I decided to rewrite the logic as a stored procedure, allowing flexible handling of optional parameters and eliminating repetitive @var = '' OR @var = @value checks. The new design uses temporary tables to collect intermediate results, replaces IN / NOT IN with EXISTS / NOT EXISTS , removes unnecessary * selections, and applies WITH (NOLOCK) where dirty reads are acceptable.

/**
 * 某某跟踪报表
 **/
--exec spName1 '','','','','','','公司代号'
CREATE Procedure spName1
    @MESOrderID nvarchar(320),
    @LotName nvarchar(700),
    @DateCode nvarchar(500),
    @BatchID nvarchar(700),
    @comdef nvarchar(700),
    @SNCust nvarchar(1600),
    @OnPlant nvarchar(20)
AS
BEGIN
    SET NOCOUNT ON;
    -- 1) Define global temp table #FinalLotName
    CREATE TABLE #FinalLotName (
        LotName NVARCHAR(50),
        SourceLotName NVARCHAR(50),
        SNCust NVARCHAR(128)
    );
    IF @LotName <> ''
    BEGIN
        SELECT Val INTO #WorkLot FROM fn_String_To_Table(@LotName,',',1);
        SELECT LotPK,LotName INTO #WorkLotPK FROM MMLots WITH(NOLOCK) WHERE EXISTS (SELECT 1 FROM #WorkLot b WHERE b.Val=MMLots.LotID);
        -- additional logic omitted for brevity
    END
    -- 2) Define #FinalCO_SN temp table
    CREATE TABLE #FinalCO_SN (
        SN NVARCHAR(50),
        SourceSN NVARCHAR(50),
        SNCust NVARCHAR(128),
        matl_def_id NVARCHAR(50),
        ComMaterials NVARCHAR(50),
        MESOrderID NVARCHAR(20),
        OnPlantID NVARCHAR(20),
        VendorID NVARCHAR(20),
        DateCode NVARCHAR(20),
        SNNote NVARCHAR(512)
    );
    IF @MESOrderID <> ''
    BEGIN
        SELECT Val INTO #WorkMESOrderID FROM fn_String_To_Table(@MESOrderID,',',1);
        IF @OnPlant='Comba'
            UPDATE #WorkMESOrderID SET Val='C000'+Val WHERE LEN(Val)=9;
        ELSE
            UPDATE #WorkMESOrderID SET Val='W000'+Val WHERE LEN(Val)=9;
        SELECT SN,MaterialID,MESOrderID,OnPlantID INTO #WorkCO_SN1 FROM CO_SN_GENERATION WITH(NOLOCK)
        WHERE SNType='IntSN' AND SNRuleName='ProductSNRule' AND OnPlantID=@OnPlant
          AND EXISTS (SELECT 1 FROM #WorkMESOrderID b WHERE CO_SN_GENERATION.MESOrderID=b.Val);
        -- further joins and inserts omitted for brevity
    END
    -- 3) Final result assembly
    IF EXISTS (SELECT 1 FROM #FinalLotName)
    BEGIN
        SELECT a.matl_def_id,b.Descript,a.MESOrderID AS pom_order_id,a.SN AS LotName,
               a.ComMaterials,c.Descript AS ComMatDes,a.VendorID,a.DateCode,a.SNNote,
               OnPlantID,SNCust
        FROM #FinalCO_SN a
        JOIN MMDefinitions b WITH(NOLOCK) ON a.matl_def_id=b.DefID
        JOIN MMDefinitions c WITH(NOLOCK) ON a.ComMaterials=c.DefID
        WHERE NOT EXISTS (SELECT DISTINCT SN, SourceSN FROM #FinalCO_SN x WHERE x.SN=a.SourceSN AND x.SourceSN=a.SN);
    END
    ELSE
        PRINT 'There is no queryable condition, please enter at least a query condition.';
END
GO

Conclusion

Writing reliable, high‑performance SQL is not difficult; the challenge lies in habit. The key takeaways are to avoid full‑table scans, use proper indexes, replace IN with EXISTS , consider dirty reads with WITH (NOLOCK) when appropriate, and encapsulate complex logic in stored procedures. After replacing the SSRS report with the new procedure, query time dropped from half an hour to 1‑2 seconds on the front‑end.

PerformanceSQLIndexingDatabase OptimizationStored ProcedureT-SQL
Top Architect
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.