Databases 21 min read

Why Did MongoDB’s Query Planner Suddenly Slow Down? A Deep Dive into Index Cache Failures

The article explains how null or empty query values caused MongoDB to ignore the intended index, leading to massive slow queries and service timeouts, and details the step‑by‑step investigation, plan‑cache invalidation, and the corrective addition of a compound index that restored normal performance.

WeiLi Technology Team
WeiLi Technology Team
WeiLi Technology Team
Why Did MongoDB’s Query Planner Suddenly Slow Down? A Deep Dive into Index Cache Failures

Simple Introduction

When a MongoDB query contains

null

or an empty string, the index may become ineffective, causing the query optimizer to select an incorrect execution plan, resulting in slow queries and service anomalies.

Event Timeline

The user center reported frequent Dubbo RPC timeouts for third‑party login. The stack trace showed a

RpcException

caused by a remote method timeout:

<code>org.apache.dubbo.rpc.RpcException: Failed to invoke the method loginWithThird ... timeout: 5000 ms</code>

Investigation Steps

Checked user‑center CAT monitoring – requests were received, confirming the client was not at fault.

Reviewed Kibana logs – no abnormal entries.

Used Arthas to profile the call chain and identified MongoDB query latency as the bottleneck.

MongoDB Query Analysis

Original query method:

<code>public User getUserByDfId(String dfId, String app) {
    Criteria criteria = Criteria.where("userBindInfo.dfId").is(dfId)
        .and("app").is(app)
        .and("status").is(1);
    return this.mongoTemplate.findOne(Query.query(criteria), User.class);
}</code>

Existing indexes:

<code>{ "key" : { "app" : 1 }, "name" : "app" }
{ "key" : { "userBindInfo.dfId" : 1 }, "name" : "userBindInfo.dfId" }</code>

Slow‑log analysis revealed that MongoDB used the

app

index instead of the

userBindInfo.dfId

index, causing the query to scan millions of documents.

Root Cause

The plan cache had stored a sub‑optimal plan after a previous failure. When the cache refreshed (due to LRU eviction, index changes, or a MongoDB restart), the optimizer chose the wrong index, and the bad plan was cached again, leading to repeated slow queries.

Fix Implemented

A new compound index was created to improve index matching:

<code>db.getCollection("wormhole_user").createIndex({
    "userBindInfo.dfId": 1,
    "app": 1,
    "status": 1
}, { name: "idx_user_app_userBindInfo.dfId", background: true });</code>

After the index was added, query latency returned to normal and the service recovered.

Further Analysis with explain()

Running

explain()

on the query showed the correct plan using the

userBindInfo.dfId

index, confirming that the issue was solely due to plan‑cache selection.

MongoDB Query Optimizer Behavior

The optimizer caches the most efficient plan for a given query shape.

Cache entries are invalidated on MongoDB restart, index modifications, or LRU eviction.

Conclusion & Recommendations

Avoid queries that contain

null

or empty values.

Use

explain()

to verify execution plans under different conditions.

Inspect the plan cache via MongoDB’s

planCache

utilities.

Apply

hint()

to force the use of the appropriate index when necessary.

Author: Zheng Yateng, Senior Backend Engineer, Micro‑Carp Technology Team.

IndexingQuery OptimizationMongoDBPerformance DebuggingPlan Cache
WeiLi Technology Team
Written by

WeiLi Technology Team

Practicing data-driven principles and believing technology can change the world.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.