10 Proven Techniques to Supercharge Backend API Performance
This article is a practical guide for backend developers on dramatically improving API performance. It covers local and distributed caching, parallel execution, asynchronous processing, pooling, sharding, SQL tuning, pre-computation, batch operations, lock granularity, and context propagation, plus a few smaller tips on collection sizing and query optimization.
1. Local Cache
Local cache runs in the same process as the application, providing ultra‑fast access without network overhead. It is suitable for single‑instance deployments or clusters where nodes do not need to share cache state. The downside is memory waste and lack of sharing across services.
Common local cache libraries include Guava and Caffeine, both distributed as simple JAR packages.
Typical scenarios for local cache:
Data with low freshness requirements where a short TTL is acceptable.
Immutable mappings such as order‑id to user‑id.
Local cache is easy to adopt: both libraries need only a dependency and a few lines of configuration.
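To make the idea concrete, here is a minimal sketch of a TTL-based local cache using only the JDK; in practice you would reach for Caffeine or Guava, which add size-based eviction and refresh policies. The `TtlCache` name and API are illustrative, not from any library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal local cache: entries expire a fixed interval after being written,
// which suits data with low freshness requirements.
class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void put(K key, V value) {
        store.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() > e.expiresAtMillis) {
            store.remove(key);   // lazily evict stale entries on read
            return null;
        }
        return e.value;
    }
}
```

With Caffeine, the equivalent is roughly `Caffeine.newBuilder().expireAfterWrite(...).maximumSize(...).build()`, which also bounds memory so hot data is not crowded out.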
2. Distributed Cache
Distributed caches break the stateful nature of local caches by providing a clustered, independently managed service with virtually unlimited capacity. Network latency (1‑2 ms) is negligible compared with the benefits.
Popular distributed cache systems include Memcached and Redis. They can handle tens of thousands of QPS per node, offloading read/write pressure from relational databases.
Key operational concerns when adopting a distributed cache:
Ensure a high hit rate to achieve effective load reduction.
Size the cache according to business needs to avoid eviction of hot data.
Maintain data consistency.
Plan for rapid scaling.
Monitor average, max, and min response times, QPS, network traffic, and client connections.
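The usual read path with a distributed cache is the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. The sketch below uses a `ConcurrentHashMap` to stand in for a real Redis client, and `loadFromDb` is a placeholder for the actual database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside: read from cache first, fall back to the database on a miss,
// then populate the cache so subsequent reads skip the DB entirely.
class CacheAside {
    // A ConcurrentHashMap stands in for a real Redis client here.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loadFromDb;

    CacheAside(Function<String, String> loadFromDb) { this.loadFromDb = loadFromDb; }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) return cached;          // cache hit
        String fromDb = loadFromDb.apply(key);      // cache miss: query the DB
        if (fromDb != null) cache.put(key, fromDb); // with a real Redis, set a TTL here
        return fromDb;
    }
}
```

A high hit rate means `loadFromDb` runs rarely, which is exactly the load reduction the checklist above asks for.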
3. Parallelization
Analyze business flows, draw sequence diagrams, and identify which steps can run in parallel. Leverage multi‑core CPUs to execute independent tasks concurrently.
Java's CompletableFuture offers around 50 APIs for serial, parallel, composition, and error handling.
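A small sketch of the idea: two independent lookups run in parallel and their results are combined, so total latency is roughly the slower of the two calls rather than their sum. The `fetchUser`/`fetchOrder` methods are placeholders for remote calls.

```java
import java.util.concurrent.CompletableFuture;

// Two independent lookups run in parallel, then their results are combined.
class ParallelFetch {
    static String fetchUser(long id)  { return "user-" + id; }   // stands in for a remote call
    static String fetchOrder(long id) { return "order-" + id; }  // stands in for a remote call

    static String userWithOrder(long userId, long orderId) {
        CompletableFuture<String> user  = CompletableFuture.supplyAsync(() -> fetchUser(userId));
        CompletableFuture<String> order = CompletableFuture.supplyAsync(() -> fetchOrder(orderId));
        // thenCombine waits for both futures and merges their results.
        return user.thenCombine(order, (u, o) -> u + "/" + o).join();
    }
}
```

In production, pass a dedicated executor to `supplyAsync` rather than relying on the common fork-join pool.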
4. Asynchronization
Separate non‑core logic from the main request path and execute it asynchronously. For example, after creating an order, send a message to an MQ (message queue) for downstream notifications (SMS, email) without blocking the primary flow.
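A minimal sketch of that order flow, with an `ExecutorService` standing in for an MQ producer; the class and method names are illustrative. The point is that `createOrder` returns as soon as the core work is done, while notifications run in the background.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// After the core order-creation work, notifications are handed off to an
// executor (standing in for an MQ producer) so the caller is not blocked.
class OrderService {
    private final ExecutorService notifier = Executors.newSingleThreadExecutor();
    final List<String> sent = new CopyOnWriteArrayList<>();

    String createOrder(String item) {
        String orderId = "ord-" + item;                        // core flow: persist the order
        notifier.submit(() -> sent.add("sms:" + orderId));     // non-core: fire and forget
        notifier.submit(() -> sent.add("email:" + orderId));
        return orderId;                                        // returns immediately
    }

    void shutdown() throws InterruptedException {
        notifier.shutdown();
        notifier.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

A real MQ additionally gives you persistence and retries, which an in-process executor does not.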
5. Pooling
Keep reusable objects (threads, connections, buffers) in a pool to avoid the cost of repeated creation and destruction. Typical pool parameters are minimum, idle, and maximum size; for a connection pool these translate to min, idle, and max connections, and for a thread pool to core and maximum thread counts.
<code>// Core 3 threads, max 15; idle non-core threads are retired after 5 minutes.
new ThreadPoolExecutor(3, 15, 5, TimeUnit.MINUTES,
        new ArrayBlockingQueue<>(10),   // bounded work queue
        new ThreadFactoryBuilder().setNameFormat("data-thread-%d").build(),  // Guava: named threads
        (r, executor) -> {
            // Custom rejection policy; BaseRunnable is an application-specific interface
            // that lets a task react when the pool refuses it.
            if (r instanceof BaseRunnable) {
                ((BaseRunnable) r).rejectedExecute();
            }
        });</code>
6. Sharding (Database Partitioning)
MySQL InnoDB uses B+‑tree storage; a single table can hold millions of rows, but massive user bases often require horizontal partitioning into multiple identical physical tables to relieve storage and access pressure.
Sharding introduces challenges such as data skew, global unique ID generation, and routing logic.
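The routing logic is typically a hash of the sharding key modulo the shard count. Below is a minimal sketch; the `t_order_` table-name prefix and the `ShardRouter` class are illustrative, and real systems also need a strategy for resharding when the shard count changes (e.g. consistent hashing).

```java
// Route a record to one of N physical tables by hashing its sharding key.
class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) { this.shardCount = shardCount; }

    // Math.floorMod keeps the result non-negative even for negative hashes.
    int shardFor(long userId) {
        return Math.floorMod(Long.hashCode(userId), shardCount);
    }

    String tableFor(long userId) {
        return "t_order_" + shardFor(userId);   // e.g. t_order_0 .. t_order_3 for 4 shards
    }
}
```

Routing by user id keeps all of one user's rows in a single shard, which avoids cross-shard queries for per-user reads but can contribute to data skew if a few users dominate traffic.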
7. SQL Optimization
Poorly written SQL dramatically degrades API latency. Common pitfalls include deep pagination, missing indexes, and full‑table scans.
Avoid SELECT *; select only required columns.
Use LIMIT 1 when only one row is needed.
Keep the number of indexes reasonable (typically ≤ 5).
Prefer AND over OR in WHERE clauses to preserve index usage.
Do not index low‑cardinality columns (e.g., gender).
Index columns used in WHERE and ORDER BY to prevent full scans.
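As an illustration of the deep-pagination pitfall above, compare offset-based and keyset-based pagination; the table and column names here are hypothetical.

```sql
-- Deep pagination: the server must walk and discard 100 000 rows first.
SELECT id, user_id, status FROM t_order ORDER BY id LIMIT 100000, 20;

-- Keyset pagination: seek directly via the primary-key index,
-- using the last id returned by the previous page.
SELECT id, user_id, status FROM t_order WHERE id > :last_seen_id ORDER BY id LIMIT 20;
```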
8. Pre‑computation
Complex business calculations (e.g., site PV, red‑packet statistics) are performed ahead of time and cached, allowing API calls to read from cache instantly.
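A sketch of the pattern: a background job aggregates raw events into totals, and the API read path becomes a single map lookup. The `PvStats` class is illustrative; in a real system the precomputed totals would live in Redis or a summary table rather than in-process memory.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Page views are aggregated ahead of time (e.g. by a scheduled job), so the
// read path is a plain lookup instead of an expensive aggregation query.
class PvStats {
    private final Map<String, Long> precomputed = new ConcurrentHashMap<>();

    // Background job: aggregate raw view events into per-page totals.
    void recompute(List<String> viewEvents) {
        Map<String, Long> totals = new ConcurrentHashMap<>();
        for (String page : viewEvents) totals.merge(page, 1L, Long::sum);
        precomputed.clear();
        precomputed.putAll(totals);
    }

    // API read path: a single O(1) lookup, no aggregation per request.
    long pageViews(String page) {
        return precomputed.getOrDefault(page, 0L);
    }
}
```

The trade-off is staleness: the figures are only as fresh as the last `recompute` run, which is acceptable for statistics like PV counts.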
9. Batch Read/Write
IO is often the bottleneck. Instead of issuing 100 individual queries, expose a batch endpoint that retrieves all data in a single call, and use batch updates for writes.
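A sketch of the batching idea on the read side: split a large id list into chunks and issue one query per chunk (e.g. `SELECT ... WHERE id IN (...)`) instead of one query per id. The `Repo` interface and `BATCH_SIZE` are illustrative; the chunk size should respect your database's limits on IN-list length.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Instead of one query per id, split the ids into chunks and fetch each
// chunk in a single round trip.
class BatchFetcher {
    static final int BATCH_SIZE = 100;

    interface Repo {
        Map<Long, String> findByIds(List<Long> ids);   // one round trip per chunk
    }

    static Map<Long, String> fetchAll(List<Long> ids, Repo repo) {
        Map<Long, String> result = new HashMap<>();
        for (int i = 0; i < ids.size(); i += BATCH_SIZE) {
            List<Long> chunk = ids.subList(i, Math.min(i + BATCH_SIZE, ids.size()));
            result.putAll(repo.findByIds(chunk));
        }
        return result;
    }
}
```

For 250 ids this makes 3 round trips instead of 250; the same chunking applies to writes via batch INSERT/UPDATE statements.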
10. Lock Granularity
Use locks only around truly contended resources. Over‑locking reduces concurrency; keep the lock scope minimal.
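One way to shrink lock scope is to lock per key rather than per structure. The sketch below (illustrative `Accounts` class) uses `ConcurrentHashMap.compute`, which locks only the bin holding the given key, so updates to different accounts proceed in parallel while updates to the same account serialize.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Per-account locking: operations on different accounts run concurrently;
// only operations on the same account contend.
class Accounts {
    private final Map<String, long[]> balances = new ConcurrentHashMap<>();

    void deposit(String account, long amount) {
        // compute() performs the update atomically for this key only,
        // instead of synchronizing on the whole map.
        balances.compute(account, (k, v) -> {
            if (v == null) v = new long[]{0};
            v[0] += amount;
            return v;
        });
    }

    long balance(String account) {
        long[] v = balances.get(account);
        return v == null ? 0 : v[0];
    }
}
```

Contrast this with wrapping every operation in one `synchronized` block, which would force all accounts through a single lock.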
11. Context Propagation
Pass a reusable context object through the call chain to avoid repeated remote lookups for the same data.
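A sketch of the pattern, with hypothetical `RequestContext` and `CheckoutFlow` names: the user record is loaded once at the entry point and every downstream step reads it from the context instead of repeating the remote lookup.

```java
// A request-scoped context created once at the entry point and passed down,
// so downstream steps reuse the loaded user instead of re-fetching it.
class RequestContext {
    final long userId;
    final String user;   // loaded once, shared by every step

    RequestContext(long userId, String user) {
        this.userId = userId;
        this.user = user;
    }
}

class CheckoutFlow {
    static int lookups = 0;

    static String loadUser(long id) { lookups++; return "user-" + id; }  // stands in for a remote call

    static String handle(long userId) {
        RequestContext ctx = new RequestContext(userId, loadUser(userId));
        return validate(ctx) + "," + price(ctx);   // both steps share ctx, no second lookup
    }

    static String validate(RequestContext ctx) { return "ok:" + ctx.user; }
    static String price(RequestContext ctx)    { return "priced:" + ctx.user; }
}
```

Passing the context explicitly keeps the dependency visible in method signatures; a `ThreadLocal` achieves the same sharing implicitly but is harder to reason about across thread pools.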
12. Collection Size Planning
When the expected number of elements is known, pre-size collections to avoid costly resizing. For example, an ArrayList grows to 1.5× its capacity when it fills up, copying every existing element into a new backing array.
<code>// Pre-size instead of Lists.newArrayList() when the element count is known:
List<String> lists = Lists.newArrayListWithCapacity(expectedSize);</code>
13. Query Optimization
Avoid loading massive result sets into memory; prefer pagination or batch fetching to keep memory usage under control.
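A sketch of batch fetching for large result sets: process the data page by page so only one page is ever in memory. The `PagedReader` class is illustrative; `fetchPage(lastId, limit)` stands in for a keyset-paginated query returning rows with id greater than `lastId`.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Process a large result set page by page so only one page is in memory at a time.
class PagedReader {
    static final int PAGE_SIZE = 500;

    // fetchPage(lastId, limit) returns up to `limit` rows with id > lastId, ordered by id.
    static long processAll(BiFunction<Long, Integer, List<long[]>> fetchPage,
                           Consumer<long[]> handler) {
        long lastId = 0, processed = 0;
        while (true) {
            List<long[]> page = fetchPage.apply(lastId, PAGE_SIZE);
            if (page.isEmpty()) break;
            for (long[] row : page) handler.accept(row);
            lastId = page.get(page.size() - 1)[0];   // keyset: remember the last id seen
            processed += page.size();
        }
        return processed;
    }
}
```

Because each page is discarded before the next is fetched, memory stays bounded by `PAGE_SIZE` no matter how large the table is.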
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!