How I Cut Search API Response Time from 150ms to 32ms in Three Optimizations

This article details a step‑by‑step optimization of a Java search API—covering OpenSearch configuration, batch caching, thread‑pool tuning, and Redis pipelining—that reduced average response time from 150 ms to 32 ms and met a sub‑100 ms SLA.

Java Backend Technology

Business Logic

The API retrieves data from OpenSearch, assembles it, and returns the result. The work was estimated at five days but took ten, because tuning touched many areas: configuration, the database, the cache, OpenSearch, and the application code.

Requirement

The mobile app expects the API to respond within 100 ms.

First Load Test

Average response time was 150 ms, and OpenSearch hit a per‑second query limit, causing backend errors.

Optimizations applied:

Adjusted the OpenSearch configuration and pointed the test environment at the internal network address.

Replaced looped cache queries with a single batch query.

Removed unused code after confirming with teammates.
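The second change in the list above, replacing a loop of single cache lookups with one batch lookup, can be sketched as follows. This is a minimal illustration, not the article's actual code: `CountingCache` is a hypothetical in-memory stand-in for the real cache client, and its `multiGet` method plays the role of a batch command such as Redis `MGET`. The round-trip counter only exists to make the difference visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache client that counts round-trips (stand-in for a remote cache). */
class CountingCache {
    private final Map<String, String> store = new HashMap<>();
    private int roundTrips = 0;

    void put(String key, String value) {
        store.put(key, value);
    }

    /** Single lookup: one round-trip per key. */
    String get(String key) {
        roundTrips++;
        return store.get(key);
    }

    /** Batch lookup: one round-trip for the whole key list. */
    List<String> multiGet(List<String> keys) {
        roundTrips++;
        List<String> values = new ArrayList<>(keys.size());
        for (String key : keys) {
            values.add(store.get(key)); // null marks a missing key
        }
        return values;
    }

    int roundTrips() {
        return roundTrips;
    }
}

class BatchLookup {
    /** Before: a cache call inside the loop, so N keys cost N round-trips. */
    static List<String> loadOneByOne(CountingCache cache, List<String> keys) {
        List<String> values = new ArrayList<>(keys.size());
        for (String key : keys) {
            values.add(cache.get(key));
        }
        return values;
    }

    /** After: one batch call, results in the same order as the keys. */
    static List<String> loadInBatch(CountingCache cache, List<String> keys) {
        return cache.multiGet(keys);
    }
}
```

Both methods return the same values; only the number of calls to the cache differs, which is what matters once each call crosses the network.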

Second Load Test

After code and configuration changes, performance worsened and new issues appeared.

Further actions:

Verified cache query count was minimal and tuned thread‑pool parameters to reasonable values.

When the SLA was still not met, added result‑set caching: the user ID and search keyword form the cache key, and each result is cached for five minutes.
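A rough sketch of the result‑set cache described above, keyed by user ID and search keyword with a five‑minute lifetime. The article does not show its implementation, so everything here is an assumption: the class name `SearchResultCache`, the `search:<userId>:<keyword>` key format, and the use of an in‑process map instead of a shared cache such as Redis. The clock is injected only so that expiry can be exercised without waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical in-process result cache; a real deployment would likely use a shared cache. */
class SearchResultCache {
    private static final long TTL_MILLIS = 5 * 60 * 1000L; // five minutes, as in the article

    private static final class Entry {
        final String result;
        final long expiresAt;

        Entry(String result, long expiresAt) {
            this.result = result;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final LongSupplier clock; // returns the current time in milliseconds

    SearchResultCache(LongSupplier clock) {
        this.clock = clock;
    }

    /** Assumed key layout: user ID plus keyword. */
    static String key(long userId, String keyword) {
        return "search:" + userId + ":" + keyword;
    }

    /** Returns the cached result if it is still fresh, otherwise runs the loader and caches it. */
    String getOrLoad(long userId, String keyword, Supplier<String> loader) {
        String key = key(userId, keyword);
        long now = clock.getAsLong();
        Entry cached = entries.get(key);
        if (cached != null && cached.expiresAt > now) {
            return cached.result;
        }
        // Concurrent misses for the same key may each run the loader; acceptable for a sketch.
        String fresh = loader.get();
        entries.put(key, new Entry(fresh, now + TTL_MILLIS));
        return fresh;
    }
}
```

In use, the loader would be the existing OpenSearch query path, for example `cache.getOrLoad(userId, keyword, () -> runSearch(userId, keyword))`, where `runSearch` is a placeholder name. The trade‑off is that a user may see results up to five minutes old.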

Third Load Test

Performance finally met the requirement: with 60 concurrent requests, the average response time dropped to 32 ms. The run also revealed one more optimization opportunity:

Removed unnecessary database queries from the API and cleaned up unused dependencies.

What I Learned

Learned to use RedisTemplate.executePipelined for batch Redis queries.
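For readers who have not used it, a pipelined batch read with Spring Data Redis might look like the sketch below. It requires the Spring Data Redis dependency on the classpath. The class name `PipelinedLookup` and the method `getAll` are illustrative, and the article does not say which keys or commands it pipelined, so plain `GET` on string keys is an assumption.

```java
import java.util.List;

import org.springframework.dao.DataAccessException;
import org.springframework.data.redis.connection.RedisConnection;
import org.springframework.data.redis.connection.StringRedisConnection;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.StringRedisTemplate;

/** Illustrative helper: reads many string keys through one Redis pipeline. */
class PipelinedLookup {
    private final StringRedisTemplate redisTemplate;

    PipelinedLookup(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    /** Queues one GET per key; replies come back as a list in the order the commands were queued. */
    List<Object> getAll(final List<String> keys) {
        return redisTemplate.executePipelined(new RedisCallback<Object>() {
            @Override
            public Object doInRedis(RedisConnection connection) throws DataAccessException {
                StringRedisConnection stringConn = (StringRedisConnection) connection;
                for (String key : keys) {
                    stringConn.get(key); // the reply is collected by the pipeline, not returned here
                }
                return null; // the callback must return null; results come from executePipelined
            }
        });
    }
}
```

A missing key shows up as a `null` element in the returned list, so callers need to handle that case.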

Summary of Optimizations

Do not put database or cache calls inside loops.

API endpoints should query cache directly, not the database.

Prefer batch queries over single‑row queries to reduce round‑trips.

When using cloud services (e.g., Alibaba Cloud), configure products properly and use internal network addresses in production.

Pay attention to connection‑pool sizes for databases, Redis, and thread pools.

Run load tests on dedicated machines without other services; in production, isolate critical services per machine.

Remove commented‑out or unused code and dependencies promptly.

Cluster configuration is essential.

Monitoring tools like PinPoint help locate bottlenecks.

If there is no room left for technical optimization, use business data to argue for realistic performance targets.

Always perform regression testing after code changes; tools like Postman and Beyond Compare are useful.

Increase logging in critical areas to simplify future troubleshooting.
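As an illustration of the pool‑sizing point in the summary above, the sketch below builds a thread pool with every limit stated explicitly instead of relying on defaults. The article does not give its actual values, so the sizes passed in, the 60‑second keep‑alive, and the choice of `CallerRunsPolicy` are all assumptions to be replaced by numbers from load testing. `SearchExecutors` is a hypothetical name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory for a bounded worker pool. */
class SearchExecutors {
    /**
     * Creates a pool with a bounded queue so that overload is visible
     * instead of silently building an ever-growing backlog.
     */
    static ThreadPoolExecutor newSearchPool(int coreThreads, int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                coreThreads,                                  // threads kept alive when idle
                maxThreads,                                   // upper bound once the queue is full
                60L, TimeUnit.SECONDS,                        // idle time before extra threads exit
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());   // when saturated, run the task on the submitting thread
    }
}
```

A call such as `SearchExecutors.newSearchPool(8, 16, 200)` uses placeholder numbers; the same habit of explicit, measured limits applies to database and Redis connection pools.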

Original link: www.cnblogs.com/cjsblog/p/10573215.html
