18 Proven Strategies to Supercharge Backend API Performance
This article shares eighteen practical techniques—including batch database operations, asynchronous processing, caching, prefetching, pooling, event callbacks, parallel remote calls, lock granularity, file‑based storage, indexing, SQL tuning, transaction management, deep pagination fixes, compression, NoSQL alternatives, thread‑pool design, and machine‑level optimizations—to dramatically reduce API latency from seconds to milliseconds.
Preface
Hello, I am 三友. I once hit a 504 gateway timeout because an interface ran longer than Nginx's 10‑second limit. After performance tuning, its response time dropped from 11.3 s to 170 ms. Below are the common optimization techniques involved.
1. Batch Thinking: Batch Database Operations
Before optimization:
<code>// one insert per loop iteration
for (TransDetail detail : transDetailList) {
    insert(detail);
}
</code>
After optimization:
<code>batchInsert(transDetailList);
</code>
Analogy: moving 10,000 bricks with an elevator that carries 500 at a time; batching is far more efficient than one brick per trip.
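The chunking behind batch inserts can be sketched in plain Java. The class and method names below are illustrative, not from the original service:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchInsertDemo {
    // Split a large list into fixed-size chunks so each DB round trip
    // carries many rows instead of one.
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            batches.add(rows.subList(i, Math.min(i + batchSize, rows.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) rows.add(i);
        // 10,000 rows in batches of 500 -> 20 round trips instead of 10,000
        System.out.println(partition(rows, 500).size());
    }
}
```

With JDBC you would then send each chunk via addBatch()/executeBatch(), or call a MyBatis-style batchInsert per chunk.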
2. Asynchronous Thinking: Offload Time‑Consuming Operations
Use asynchronous processing to cut interface latency. In a transfer interface that matched bank codes synchronously on every request, moving the matching step to asynchronous execution took it off the critical path.
User registration notifications (SMS/email) can also be async.
Implementation can use thread pools or message queues.
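A minimal sketch of the thread-pool variant using CompletableFuture; the matchChannel method and pool size are illustrative assumptions, not the article's actual code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncOffloadDemo {
    static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    // The interface returns quickly; the slow matching runs in the pool.
    static CompletableFuture<Void> transfer(String orderId) {
        // ... fast, synchronous part of the transfer here ...
        return CompletableFuture.runAsync(() -> matchChannel(orderId), POOL);
    }

    static void matchChannel(String orderId) {
        // stand-in for the slow bank-code matching
    }

    public static void main(String[] args) throws Exception {
        transfer("T-1001").join(); // the demo waits; a real caller would not
        POOL.shutdown();
        System.out.println("submitted");
    }
}
```

A message queue achieves the same decoupling across processes, at the cost of operating the broker.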
3. Space‑for‑Time: Caching
Appropriate caching (Redis, JVM local cache, Memcached, Map, etc.) can dramatically improve performance by avoiding repeated DB queries.
In a transfer interface, each request queried the DB for bank-code matching, which was slow; introducing a cache removed those repeated queries from the critical path and reduced latency.
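A cache-aside sketch using an in-process map as a stand-in for Redis; the bank names and the queryDb helper are made up for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheAsideDemo {
    static final Map<String, String> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger DB_HITS = new AtomicInteger();

    // Cache-aside: look in the cache first, fall back to the DB, then populate.
    static String bankCode(String bankName) {
        return CACHE.computeIfAbsent(bankName, CacheAsideDemo::queryDb);
    }

    static String queryDb(String bankName) {
        DB_HITS.incrementAndGet();         // stand-in for the slow DB query
        return "CODE-" + bankName;
    }

    public static void main(String[] args) {
        bankCode("ICBC");
        bankCode("ICBC");                  // second call is served from cache
        System.out.println(DB_HITS.get() + " DB hit(s)");
    }
}
```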
4. Prefetching: Initialize Data into Cache Early
Pre‑compute and store complex query results in cache before they are needed, reducing runtime latency. Example: pre‑loading live‑stream user data.
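One way to sketch prefetching: compute the expensive values once at startup so no request ever pays for them. HOT_USERS and expensiveScore are hypothetical names for this demo:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrefetchDemo {
    static final Map<String, Integer> HOT_USERS = new HashMap<>();

    // Run once at startup (e.g. from an init hook) so requests hit a warm cache.
    static void warmUp(List<String> userIds) {
        for (String id : userIds) {
            HOT_USERS.put(id, expensiveScore(id));  // precompute and cache
        }
    }

    static int expensiveScore(String id) {
        return id.hashCode() & 0x7fffffff;          // stand-in for a heavy aggregation
    }

    public static void main(String[] args) {
        warmUp(List.of("u1", "u2", "u3"));
        System.out.println(HOT_USERS.size() + " users pre-loaded");
    }
}
```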
5. Pooling: Pre‑allocate and Reuse Resources
Thread pools avoid the overhead of repeatedly creating and destroying threads, and they cap thread counts so resources are not wasted. The same pooling idea appears in TCP keep-alive connections, database connection pools, and HttpClient connection pools.
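A small demonstration of reuse: a fixed pool of two threads serves one hundred tasks, so only two threads are ever created. The sizes are chosen arbitrarily for the demo:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolingDemo {
    // Submit many tasks to a small fixed pool and count the distinct
    // worker threads that actually ran them.
    static int distinctThreads(int tasks, int poolSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                names.add(Thread.currentThread().getName()); // reused worker
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return names.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(distinctThreads(100, 2) + " thread(s) handled 100 tasks");
    }
}
```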
6. Event Callback: Avoid Blocking Waits
Instead of blocking on a slow external system (e.g., system B taking >10 s), use an event‑callback model to continue other work and process the result when it arrives.
Reference: IO multiplexing model.
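A sketch of the callback style with CompletableFuture: the caller registers what to do with B's result instead of blocking for it. Here callSystemB simulates the slow system with a short sleep:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CallbackDemo {
    // Instead of blocking until system B answers, return a future the
    // caller can attach a callback to.
    static CompletableFuture<String> callSystemB() {
        return CompletableFuture.supplyAsync(() -> {
            sleep(100);                 // stand-in for B's long processing
            return "B-RESULT";
        });
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        callSystemB().thenAccept(result -> {   // runs only when B's result arrives
            System.out.println("got " + result);
            latch.countDown();
        });
        // ... the main thread is free to do other work here ...
        latch.await(2, TimeUnit.SECONDS);
    }
}
```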
7. Parallel Remote Calls
Convert sequential remote calls (e.g., fetching user, banner, popup data) into parallel calls to cut total latency.
Parallel execution dramatically reduces response time.
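The three calls can be issued concurrently and joined once, as in this sketch. The remote method simulates each downstream call with a 100 ms sleep:

```java
import java.util.concurrent.CompletableFuture;

public class ParallelCallsDemo {
    // Simulated remote call taking ~100 ms.
    static CompletableFuture<String> remote(String name) {
        return CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return name + "-data";
        });
    }

    static String homePage() {
        CompletableFuture<String> user = remote("user");
        CompletableFuture<String> banner = remote("banner");
        CompletableFuture<String> popup = remote("popup");
        CompletableFuture.allOf(user, banner, popup).join(); // wait for all three at once
        return user.join() + "," + banner.join() + "," + popup.join();
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        System.out.println(homePage());
        // with enough pool threads this is roughly the slowest call, not the sum
        System.out.println((System.currentTimeMillis() - begin) + " ms");
    }
}
```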
8. Lock Granularity: Avoid Overly Coarse Locks
Lock only the necessary shared resources; avoid locking large scopes (e.g., whole class) which harms concurrency.
Locking the bathroom door is enough; you don’t need to lock the whole house.
<code>// Wrong: the coarse lock serializes even the non-shared work
public int wrong() {
    IntStream.rangeClosed(1, 10000).parallel().forEach(i -> {
        synchronized (this) {
            slowNotShare();   // touches no shared state, yet runs under the lock
            data.add(i);      // data is a shared list field
        }
    });
    return data.size();
}
</code>
<code>// Right: lock only the shared resource
public int right() {
    IntStream.rangeClosed(1, 10000).parallel().forEach(i -> {
        slowNotShare();       // no lock needed: it touches no shared state
        synchronized (data) {
            data.add(i);      // only the shared list is protected
        }
    });
    return data.size();
}
</code>
9. Switch Storage: Temporary File Persistence
When DB inserts become a bottleneck, write bulk data to a file first, then asynchronously load into the database.
In a transfer service, 1000 detail rows took ~6 s to insert; using file‑based staging improved throughput by over tenfold.
After optimization, details are staged in a file first and loaded into the database asynchronously, optionally signaled via MQ.
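A simplified sketch of the staging idea: write the rows to a local file quickly, then let a later job bulk-load them. The real service would call a batch insert where this sketch only counts the rows:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class FileStagingDemo {
    // Step 1: append the detail rows to a local file, a fast sequential write.
    static Path stage(List<String> rows) throws IOException {
        Path file = Files.createTempFile("trans-detail-", ".txt");
        Files.write(file, rows);
        return file;
    }

    // Step 2 (run later, e.g. by a background job or after an MQ signal):
    // bulk-load the staged rows into the database.
    static int loadIntoDb(Path file) throws IOException {
        List<String> rows = Files.readAllLines(file);
        // batchInsert(rows) would go here; this sketch just counts them
        return rows.size();
    }

    public static void main(String[] args) throws IOException {
        Path staged = stage(List.of("row1", "row2", "row3"));
        System.out.println(loadIntoDb(staged) + " rows staged and loaded");
    }
}
```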
10. Indexing
Adding appropriate indexes is a low‑cost, high‑impact optimization.
Ensure SQL statements have indexes.
Verify indexes are effective.
Design indexes reasonably (avoid redundant, limit to ~5 per table, avoid low‑cardinality columns, use covering indexes, consider FORCE INDEX only when necessary).
10.1 Missing Index
Use EXPLAIN to check whether a query hits an index, and add missing indexes via ALTER TABLE … ADD INDEX.
<code>explain select * from user_info where userId like '%123';
</code>
10.2 Ineffective Index
Common reasons an index stops being used include a leading-wildcard LIKE (as in the '%123' example above), applying a function or arithmetic to the indexed column, implicit type conversion, and violating the leftmost-prefix rule of a composite index.
10.3 Poor Index Design
Remove redundant/duplicate indexes.
Keep index count reasonable (≤5).
Avoid indexes on low‑cardinality columns.
Use covering indexes when possible.
Use FORCE INDEX only after careful consideration.
11. SQL Optimization
Beyond indexing, refine SQL statements (e.g., avoid SELECT *, use proper predicates). See referenced articles for details.
12. Avoid Large Transactions
Long‑running transactions hold DB connections, causing timeouts, deadlocks, and replication lag. Recommendations:
Do not place RPC calls inside transactions.
Keep read‑only operations out of transactions.
Avoid processing massive data within a single transaction.
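To make the "no RPC inside a transaction" rule concrete, this sketch simulates transaction boundaries with log entries; in a Spring service they would come from @Transactional instead:

```java
import java.util.ArrayList;
import java.util.List;

public class SmallTransactionDemo {
    static final List<String> LOG = new ArrayList<>();

    // Anti-pattern: the slow RPC runs while the transaction
    // (and its DB connection) is held open.
    static void wrong() {
        LOG.add("tx-begin");
        LOG.add("update-balance");
        slowRpc();                 // connection held for the whole remote call
        LOG.add("tx-commit");
    }

    // Better: do the RPC first, then keep the transaction short.
    static void right() {
        slowRpc();                 // no transaction held here
        LOG.add("tx-begin");
        LOG.add("update-balance");
        LOG.add("tx-commit");
    }

    static void slowRpc() { LOG.add("rpc"); }

    public static void main(String[] args) {
        right();
        System.out.println(LOG);
    }
}
```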
13. Deep Pagination Issues
Deep pagination scans many rows, hurting performance. Solutions:
Tag‑record method: remember the last seen ID and query with WHERE id > last_id LIMIT 10.
<code>select id, name, balance from account where id > 100000 limit 10;
</code>
Delayed join method: first fetch primary keys via a secondary index, then join back to the main table.
<code>select acct1.id, acct1.name, acct1.balance
from account acct1
inner join (select a.id from account a where a.create_time > '2020-09-19' limit 100000, 10) as acct2
on acct1.id = acct2.id;
</code>
14. Optimize Program Structure
Eliminate unnecessary object creation, redundant DB calls, and inefficient logic ordering.
Example: in the check below, the cheap and usually-false first-login flag is evaluated before the VIP lookup, so short-circuit evaluation skips the more expensive VIP check for most users.
<code>if (isFirstLogin && isUserVip) {
    sendMsg();
}
</code>
15. Compress Transfer Content
Compressing payloads (e.g., using gzip) reduces bandwidth usage and speeds up transmission.
Analogy: a horse carries less weight faster.
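A minimal server-side sketch using the JDK's GZIPOutputStream; repetitive JSON payloads shrink dramatically:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipDemo {
    // Gzip a response body before sending it over the wire.
    static byte[] gzip(String body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String body = "{\"name\":\"sanyou\"}".repeat(1000); // repetitive payload
        byte[] compressed = gzip(body);
        System.out.println(body.length() + " chars -> " + compressed.length + " bytes");
    }
}
```

In practice web servers and gateways (Nginx, Tomcat, Spring Boot) can enable gzip via configuration, so hand-rolled compression is rarely needed.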
16. Massive Data Handling: Consider NoSQL
For very large datasets, use Elasticsearch, HBase, or other NoSQL solutions, or apply sharding if relational storage is required.
17. Reasonable Thread‑Pool Design
Key parameters: core size, max size, work queue. Misconfiguration can cause OOM, starvation, or core business slowdown.
Too few core threads limit parallelism.
An unbounded or oversized work queue can accumulate tasks until the JVM runs out of memory.
Without business isolation, peripheral tasks can starve the core business of threads.
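A sketch of an explicitly configured pool; the specific numbers are placeholders to be tuned per workload:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadPoolConfigDemo {
    // Explicit parameters: a bounded queue prevents OOM, and the rejection
    // policy defines behaviour when both the pool and the queue are full.
    static ThreadPoolExecutor newBizPool() {
        return new ThreadPoolExecutor(
                4,                                   // core threads
                8,                                   // max threads under burst
                60, TimeUnit.SECONDS,                // idle time before extra threads die
                new ArrayBlockingQueue<>(200),       // bounded queue, never unbounded
                new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure on overload
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBizPool();
        System.out.println("core/max = " + pool.getCorePoolSize() + "/" + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

Giving each business domain its own pool (rather than sharing one) provides the isolation mentioned above.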
18. Machine‑Level Issues (Full GC, Thread Saturation, Unclosed IO)
High GC pauses, thread exhaustion, and leaked IO resources also degrade API performance. Apply monitoring, limit concurrency, and ensure proper resource cleanup.
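For the unclosed-IO case, try-with-resources guarantees cleanup. This sketch uses a StringReader stand-in, but the same pattern applies to files, sockets, and DB connections:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ResourceCleanupDemo {
    // try-with-resources calls close() even when an exception is thrown,
    // avoiding the leaked streams and connections that slowly starve a service.
    static String firstLine(String content) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            return reader.readLine();
        } // reader.close() runs here automatically
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine("hello\nworld"));
    }
}
```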
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!