18 Proven Strategies to Supercharge Backend API Performance
This article shares eighteen practical techniques—including batch database operations, asynchronous processing, caching, prefetching, pooling, event callbacks, parallel remote calls, lock granularity, file‑based storage, indexing, SQL tuning, transaction management, deep pagination fixes, compression, NoSQL alternatives, thread‑pool design, and machine‑level optimizations—to dramatically reduce API latency from seconds to milliseconds.
Preface
Hello, I am 三友. I once hit a 504 gateway timeout because an interface ran longer than Nginx's 10‑second limit. After performance tuning, its response time dropped from 11.3 s to 170 ms. Below are the common optimization techniques involved.
1. Batch Thinking: Batch Database Operations
Before optimization:
<code>// one insert per loop iteration
for (TransDetail detail : transDetailList) {
    insert(detail);
}
</code>
After optimization:
<code>batchInsert(transDetailList);
</code>
Analogy: moving 10,000 bricks with an elevator that carries 500 at a time; batching is far more efficient than one brick per trip.
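The chunking behind batch inserts can be sketched in plain Java. The class and method names below are illustrative, not from the original service:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchInsertDemo {
    // Split a large list into fixed-size chunks so each DB round trip
    // carries many rows instead of one.
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            batches.add(rows.subList(i, Math.min(i + batchSize, rows.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) rows.add(i);
        // 10,000 rows in batches of 500 -> 20 round trips instead of 10,000
        System.out.println(partition(rows, 500).size());
    }
}
```

With JDBC you would then send each chunk via addBatch()/executeBatch(), or call a MyBatis-style batchInsert per chunk.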
2. Asynchronous Thinking: Offload Time‑Consuming Operations
Use asynchronous processing to cut interface latency. In a transfer interface that matched bank codes synchronously on every request, moving the matching step to asynchronous execution took it off the critical path.
User registration notifications (SMS/email) can also be async.
Implementation can use thread pools or message queues.
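A minimal sketch of the thread-pool variant using CompletableFuture; the matchChannel method and pool size are illustrative assumptions, not the article's actual code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncOffloadDemo {
    static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    // The interface returns quickly; the slow matching runs in the pool.
    static CompletableFuture<Void> transfer(String orderId) {
        // ... fast, synchronous part of the transfer here ...
        return CompletableFuture.runAsync(() -> matchChannel(orderId), POOL);
    }

    static void matchChannel(String orderId) {
        // stand-in for the slow bank-code matching
    }

    public static void main(String[] args) throws Exception {
        transfer("T-1001").join(); // the demo waits; a real caller would not
        POOL.shutdown();
        System.out.println("submitted");
    }
}
```

A message queue achieves the same decoupling across processes, at the cost of operating the broker.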
3. Space‑for‑Time: Caching
Appropriate caching (Redis, JVM local cache, Memcached, Map, etc.) can dramatically improve performance by avoiding repeated DB queries.
In a transfer interface, each request queried the DB for bank-code matching, which was slow; introducing a cache removed those repeated queries from the critical path and reduced latency.
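A cache-aside sketch using an in-process map as a stand-in for Redis; the bank names and the queryDb helper are made up for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheAsideDemo {
    static final Map<String, String> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger DB_HITS = new AtomicInteger();

    // Cache-aside: look in the cache first, fall back to the DB, then populate.
    static String bankCode(String bankName) {
        return CACHE.computeIfAbsent(bankName, CacheAsideDemo::queryDb);
    }

    static String queryDb(String bankName) {
        DB_HITS.incrementAndGet();         // stand-in for the slow DB query
        return "CODE-" + bankName;
    }

    public static void main(String[] args) {
        bankCode("ICBC");
        bankCode("ICBC");                  // second call is served from cache
        System.out.println(DB_HITS.get() + " DB hit(s)");
    }
}
```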
4. Prefetching: Initialize Data into Cache Early
Pre‑compute and store complex query results in cache before they are needed, reducing runtime latency. Example: pre‑loading live‑stream user data.
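One way to sketch prefetching: compute the expensive values once at startup so no request ever pays for them. HOT_USERS and expensiveScore are hypothetical names for this demo:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrefetchDemo {
    static final Map<String, Integer> HOT_USERS = new HashMap<>();

    // Run once at startup (e.g. from an init hook) so requests hit a warm cache.
    static void warmUp(List<String> userIds) {
        for (String id : userIds) {
            HOT_USERS.put(id, expensiveScore(id));  // precompute and cache
        }
    }

    static int expensiveScore(String id) {
        return id.hashCode() & 0x7fffffff;          // stand-in for a heavy aggregation
    }

    public static void main(String[] args) {
        warmUp(List.of("u1", "u2", "u3"));
        System.out.println(HOT_USERS.size() + " users pre-loaded");
    }
}
```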
5. Pooling: Pre‑allocate and Reuse Resources
Thread pools avoid the overhead of repeatedly creating and destroying threads, and they cap thread counts so resources are not wasted. The same pooling idea appears in TCP keep-alive connections, database connection pools, and HttpClient connection pools.
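A small demonstration of reuse: a fixed pool of two threads serves one hundred tasks, so only two threads are ever created. The sizes are chosen arbitrarily for the demo:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolingDemo {
    // Submit many tasks to a small fixed pool and count the distinct
    // worker threads that actually ran them.
    static int distinctThreads(int tasks, int poolSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                names.add(Thread.currentThread().getName()); // reused worker
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return names.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(distinctThreads(100, 2) + " thread(s) handled 100 tasks");
    }
}
```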
6. Event Callback: Avoid Blocking Waits
Instead of blocking on a slow external system (e.g., system B taking >10 s), use an event‑callback model to continue other work and process the result when it arrives.
Reference: IO multiplexing model.
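A sketch of the callback style with CompletableFuture: the caller registers what to do with B's result instead of blocking for it. Here callSystemB simulates the slow system with a short sleep:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CallbackDemo {
    // Instead of blocking until system B answers, return a future the
    // caller can attach a callback to.
    static CompletableFuture<String> callSystemB() {
        return CompletableFuture.supplyAsync(() -> {
            sleep(100);                 // stand-in for B's long processing
            return "B-RESULT";
        });
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        callSystemB().thenAccept(result -> {   // runs only when B's result arrives
            System.out.println("got " + result);
            latch.countDown();
        });
        // ... the main thread is free to do other work here ...
        latch.await(2, TimeUnit.SECONDS);
    }
}
```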
7. Parallel Remote Calls
Convert sequential remote calls (e.g., fetching user, banner, popup data) into parallel calls to cut total latency.
Parallel execution dramatically reduces response time.
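The three calls can be issued concurrently and joined once, as in this sketch. The remote method simulates each downstream call with a 100 ms sleep:

```java
import java.util.concurrent.CompletableFuture;

public class ParallelCallsDemo {
    // Simulated remote call taking ~100 ms.
    static CompletableFuture<String> remote(String name) {
        return CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return name + "-data";
        });
    }

    static String homePage() {
        CompletableFuture<String> user = remote("user");
        CompletableFuture<String> banner = remote("banner");
        CompletableFuture<String> popup = remote("popup");
        CompletableFuture.allOf(user, banner, popup).join(); // wait for all three at once
        return user.join() + "," + banner.join() + "," + popup.join();
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        System.out.println(homePage());
        // with enough pool threads this is roughly the slowest call, not the sum
        System.out.println((System.currentTimeMillis() - begin) + " ms");
    }
}
```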
8. Lock Granularity: Avoid Overly Coarse Locks
Lock only the necessary shared resources; avoid locking large scopes (e.g., whole class) which harms concurrency.
Locking the bathroom door is enough; you don’t need to lock the whole house.
<code>// Wrong: the coarse lock serializes even the non-shared work
public int wrong() {
    IntStream.rangeClosed(1, 10000).parallel().forEach(i -> {
        synchronized (this) {
            slowNotShare();   // touches no shared state, yet runs under the lock
            data.add(i);      // data is a shared list field
        }
    });
    return data.size();
}
</code>
<code>// Right: lock only the shared resource
public int right() {
    IntStream.rangeClosed(1, 10000).parallel().forEach(i -> {
        slowNotShare();       // no lock needed: it touches no shared state
        synchronized (data) {
            data.add(i);      // only the shared list is protected
        }
    });
    return data.size();
}
</code>
9. Switch Storage: Temporary File Persistence
When DB inserts become a bottleneck, write bulk data to a file first, then asynchronously load into the database.
In a transfer service, 1000 detail rows took ~6 s to insert; using file‑based staging improved throughput by over tenfold.
After optimization, details are staged in a file first and loaded into the database asynchronously, optionally signaled via MQ.
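A simplified sketch of the staging idea: write the rows to a local file quickly, then let a later job bulk-load them. The real service would call a batch insert where this sketch only counts the rows:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class FileStagingDemo {
    // Step 1: append the detail rows to a local file, a fast sequential write.
    static Path stage(List<String> rows) throws IOException {
        Path file = Files.createTempFile("trans-detail-", ".txt");
        Files.write(file, rows);
        return file;
    }

    // Step 2 (run later, e.g. by a background job or after an MQ signal):
    // bulk-load the staged rows into the database.
    static int loadIntoDb(Path file) throws IOException {
        List<String> rows = Files.readAllLines(file);
        // batchInsert(rows) would go here; this sketch just counts them
        return rows.size();
    }

    public static void main(String[] args) throws IOException {
        Path staged = stage(List.of("row1", "row2", "row3"));
        System.out.println(loadIntoDb(staged) + " rows staged and loaded");
    }
}
```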
10. Indexing
Adding appropriate indexes is a low‑cost, high‑impact optimization.
Ensure SQL statements have indexes.
Verify indexes are effective.
Design indexes reasonably (avoid redundant, limit to ~5 per table, avoid low‑cardinality columns, use covering indexes, consider FORCE INDEX only when necessary).
10.1 Missing Index
Use EXPLAIN to check whether a query hits an index, and add missing indexes via ALTER TABLE … ADD INDEX.
<code>explain select * from user_info where userId like '%123';
</code>
10.2 Ineffective Index
Common reasons an index stops being used include a leading-wildcard LIKE (as in the '%123' example above), applying a function or arithmetic to the indexed column, implicit type conversion, and violating the leftmost-prefix rule of a composite index.
10.3 Poor Index Design
Remove redundant/duplicate indexes.
Keep index count reasonable (≤5).
Avoid indexes on low‑cardinality columns.
Use covering indexes when possible.
Use FORCE INDEX only after careful consideration.
11. SQL Optimization
Beyond indexing, refine SQL statements (e.g., avoid SELECT *, use proper predicates). See referenced articles for details.
12. Avoid Large Transactions
Long‑running transactions hold DB connections, causing timeouts, deadlocks, and replication lag. Recommendations:
Do not place RPC calls inside transactions.
Keep read‑only operations out of transactions.
Avoid processing massive data within a single transaction.
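To make the "no RPC inside a transaction" rule concrete, this sketch simulates transaction boundaries with log entries; in a Spring service they would come from @Transactional instead:

```java
import java.util.ArrayList;
import java.util.List;

public class SmallTransactionDemo {
    static final List<String> LOG = new ArrayList<>();

    // Anti-pattern: the slow RPC runs while the transaction
    // (and its DB connection) is held open.
    static void wrong() {
        LOG.add("tx-begin");
        LOG.add("update-balance");
        slowRpc();                 // connection held for the whole remote call
        LOG.add("tx-commit");
    }

    // Better: do the RPC first, then keep the transaction short.
    static void right() {
        slowRpc();                 // no transaction held here
        LOG.add("tx-begin");
        LOG.add("update-balance");
        LOG.add("tx-commit");
    }

    static void slowRpc() { LOG.add("rpc"); }

    public static void main(String[] args) {
        right();
        System.out.println(LOG);
    }
}
```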
13. Deep Pagination Issues
Deep pagination scans many rows, hurting performance. Solutions:
Tag‑record method: remember the last seen ID and query with WHERE id > last_id LIMIT 10.
<code>select id, name, balance from account where id > 100000 limit 10;
</code>
Delayed join method: first fetch primary keys via a secondary index, then join back to the main table.
<code>select acct1.id, acct1.name, acct1.balance
from account acct1
inner join (select a.id from account a where a.create_time > '2020-09-19' limit 100000, 10) as acct2
on acct1.id = acct2.id;
</code>
14. Optimize Program Structure
Eliminate unnecessary object creation, redundant DB calls, and inefficient logic ordering.
Example: in the check below, the cheap and usually-false first-login flag is evaluated before the VIP lookup, so short-circuit evaluation skips the more expensive VIP check for most users.
<code>if (isFirstLogin && isUserVip) {
    sendMsg();
}
</code>
15. Compress Transfer Content
Compressing payloads (e.g., using gzip) reduces bandwidth usage and speeds up transmission.
Analogy: a horse carries less weight faster.
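A minimal server-side sketch using the JDK's GZIPOutputStream; repetitive JSON payloads shrink dramatically:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipDemo {
    // Gzip a response body before sending it over the wire.
    static byte[] gzip(String body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String body = "{\"name\":\"sanyou\"}".repeat(1000); // repetitive payload
        byte[] compressed = gzip(body);
        System.out.println(body.length() + " chars -> " + compressed.length + " bytes");
    }
}
```

In practice web servers and gateways (Nginx, Tomcat, Spring Boot) can enable gzip via configuration, so hand-rolled compression is rarely needed.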
16. Massive Data Handling: Consider NoSQL
For very large datasets, use Elasticsearch, HBase, or other NoSQL solutions, or apply sharding if relational storage is required.
17. Reasonable Thread‑Pool Design
Key parameters: core size, max size, work queue. Misconfiguration can cause OOM, starvation, or core business slowdown.
Too few core threads limit parallelism.
An unbounded or oversized work queue can accumulate tasks until the JVM runs out of memory.
Without business isolation, peripheral tasks can starve the core business of threads.
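A sketch of an explicitly configured pool; the specific numbers are placeholders to be tuned per workload:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadPoolConfigDemo {
    // Explicit parameters: a bounded queue prevents OOM, and the rejection
    // policy defines behaviour when both the pool and the queue are full.
    static ThreadPoolExecutor newBizPool() {
        return new ThreadPoolExecutor(
                4,                                   // core threads
                8,                                   // max threads under burst
                60, TimeUnit.SECONDS,                // idle time before extra threads die
                new ArrayBlockingQueue<>(200),       // bounded queue, never unbounded
                new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure on overload
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBizPool();
        System.out.println("core/max = " + pool.getCorePoolSize() + "/" + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

Giving each business domain its own pool (rather than sharing one) provides the isolation mentioned above.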
18. Machine‑Level Issues (Full GC, Thread Saturation, Unclosed IO)
High GC pauses, thread exhaustion, and leaked IO resources also degrade API performance. Apply monitoring, limit concurrency, and ensure proper resource cleanup.
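For the unclosed-IO case, try-with-resources guarantees cleanup. This sketch uses a StringReader stand-in, but the same pattern applies to files, sockets, and DB connections:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ResourceCleanupDemo {
    // try-with-resources calls close() even when an exception is thrown,
    // avoiding the leaked streams and connections that slowly starve a service.
    static String firstLine(String content) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            return reader.readLine();
        } // reader.close() runs here automatically
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine("hello\nworld"));
    }
}
```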
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!