Performance Optimization Journey of a No‑Card Payment System: Database Deadlock, Transaction Length, Thread‑Pool, and Logging Improvements
This article examines the performance bottlenecks of a no‑card payment platform—including database deadlocks, overly long transactions, CPU saturation, unbounded thread pools, and excessive logging—and presents concrete backend‑level refactorings, configuration changes, and code examples that dramatically improve scalability and reliability.
Introduction
The author shares a detailed case study of a no‑card payment project, focusing on code‑level performance optimizations rather than high‑level architecture.
Server Environment
Four servers (4‑core CPU, 8 GB RAM each) running RabbitMQ, DB2, an internal Dubbo‑based SOA framework, Redis, Memcached, and a custom configuration management system.
Problem Description
Key issues observed in production included low scalability (40 TPS per node, 60 TPS total), frequent DB deadlocks, long‑running transactions, memory/CPU exhaustion, poor fault tolerance, missing or useless logs, unnecessary DB reads, multiple WARs in a single Tomcat, platform bugs, lack of rate‑limiting, absent fallback strategies, and insufficient monitoring.
Optimization Solutions
1. Database deadlock mitigation
Instead of using pessimistic FOR UPDATE locks, the team adopted three alternatives: (a) Redis‑based distributed locks, (b) primary‑key duplicate‑insert checks, and (c) version‑number based optimistic locking. All approaches enforce expiration to avoid stale locks.
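Of the three alternatives, version-number optimistic locking is the easiest to illustrate without external services. The following is a minimal sketch, not the team's code: `OrderStore`, `VersionedOrder`, and the `orders` table named in the comment are hypothetical, and an in-memory map stands in for the DAO and DB2 so the example runs on its own.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Immutable snapshot of a row: its payload plus the version it was read at. */
final class VersionedOrder {
    final String status;
    final long version;

    VersionedOrder(String status, long version) {
        this.status = status;
        this.version = version;
    }
}

/** Hypothetical in-memory stand-in for an orders table keyed by order id. */
final class OrderStore {
    private final ConcurrentMap<String, VersionedOrder> rows = new ConcurrentHashMap<>();

    void insert(String id, String status) {
        rows.put(id, new VersionedOrder(status, 0L));
    }

    VersionedOrder read(String id) {
        return rows.get(id);
    }

    /**
     * Mirrors the SQL (assumed schema):
     *   UPDATE orders SET status = ?, version = version + 1
     *   WHERE id = ? AND version = ?
     * Returns true when the row was updated, false when the version was stale.
     */
    boolean updateIfVersionMatches(String id, long expectedVersion, String newStatus) {
        boolean[] updated = {false};
        // computeIfPresent is atomic per key, like a single-row UPDATE.
        rows.computeIfPresent(id, (key, current) -> {
            if (current.version != expectedVersion) {
                return current; // another writer got there first; leave the row as is
            }
            updated[0] = true;
            return new VersionedOrder(newStatus, current.version + 1);
        });
        return updated[0];
    }
}
```

A caller that gets `false` back re-reads the row and decides whether to retry or give up. No row lock is held between the read and the write, so two writers cannot block each other the way they can with FOR UPDATE.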
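2. Transaction duration reduction
A sample problematic transaction was shown. The remote call sits between the insert and the update, so the transaction (and the database connection and locks it holds) stays open for as long as the HTTP request takes:
public void test() {
    Transaction.begin();
    try {
        dao.insert();
        httpClient.queryRemoteResult(); // long-running I/O inside the transaction
        dao.update();
        Transaction.commit();
    } catch (Exception e) {
        Transaction.rollback();
    }
}
The guideline is to keep transactions short and move time‑consuming calls (e.g., HTTP requests) outside the transactional scope.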
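One way to apply the guideline is to split the work into two short transactions with the remote call between them. The sketch below is an illustration under assumptions: `Tx` and `PaymentFlow` are hypothetical names, the `Supplier` stands in for `httpClient.queryRemoteResult()`, and a trace list replaces the real DAO so the ordering is visible.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical transaction helper; a real project would use its own transaction manager. */
final class Tx {
    private final List<String> trace;

    Tx(List<String> trace) {
        this.trace = trace;
    }

    /** Runs work between begin and commit; rolls back and rethrows on failure. */
    void inTransaction(Runnable work) {
        trace.add("begin");
        try {
            work.run();
            trace.add("commit");
        } catch (RuntimeException e) {
            trace.add("rollback");
            throw e;
        }
    }
}

final class PaymentFlow {
    private final Tx tx;
    private final List<String> trace;
    private final Supplier<String> remoteQuery; // stands in for httpClient.queryRemoteResult()

    PaymentFlow(List<String> trace, Supplier<String> remoteQuery) {
        this.trace = trace;
        this.tx = new Tx(trace);
        this.remoteQuery = remoteQuery;
    }

    void pay() {
        // Transaction 1: record the request, then release the connection and locks.
        tx.inTransaction(() -> trace.add("insert"));

        // Slow remote I/O runs with no transaction open.
        String result = remoteQuery.get();
        trace.add("remote:" + result);

        // Transaction 2: persist the outcome.
        tx.inTransaction(() -> trace.add("update:" + result));
    }
}
```

The trade-off is that the insert is already committed if the remote call fails, so the record needs a status that a retry or reconciliation job can pick up later.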
3. CPU saturation analysis
High CPU usage was traced to an unbounded thread pool created with Executors.newCachedThreadPool(), which can spawn up to Integer.MAX_VALUE threads. Switching to a fixed pool (Executors.newFixedThreadPool(50)) capped thread creation, but a fixed pool queues excess tasks in an unbounded LinkedBlockingQueue, so under heavy load the backlog grew without limit and caused memory pressure.
4. Thread‑pool redesign
Two solutions were proposed: (a) a custom bounded thread‑pool architecture that caps both threads and queue size, and (b) migration to the Akka framework for actor‑based concurrency. Additionally, asynchronous tasks were off‑loaded to a dedicated task‑processing service with retry and callback mechanisms.
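Option (a) can be approximated with the JDK's own ThreadPoolExecutor. This is a minimal sketch rather than the team's custom architecture: `BoundedPools` is a hypothetical helper name, and the choice of CallerRunsPolicy as the rejection behaviour is an assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPools {
    private BoundedPools() {}

    /**
     * Builds a pool with a hard cap on both threads and queued tasks.
     * When every thread is busy and the queue is full, CallerRunsPolicy makes
     * the submitting thread run the task itself, which slows producers down
     * instead of letting the backlog grow in memory.
     */
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,                        // core size == max size
                60L, TimeUnit.SECONDS,                   // idle keep-alive
                new ArrayBlockingQueue<>(queueCapacity), // bounded work queue
                new ThreadPoolExecutor.CallerRunsPolicy());
        pool.allowCoreThreadTimeOut(true); // let idle threads exit after the keep-alive
        return pool;
    }
}
```

A call such as `BoundedPools.newBoundedPool(50, 200)` keeps at most 50 worker threads and 200 waiting tasks. Where running work on the caller's thread is not acceptable, `ThreadPoolExecutor.AbortPolicy` rejects the task with a RejectedExecutionException so the caller can shed load explicitly.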
5. Logging improvements
The original code logged exceptions with logger.info and produced noisy, low‑value entries. The recommended pattern uses logger.warn or logger.error with a structured format:
logger.warn("[innersys] - [${exceptionType}] - [${methodName}] - errorCode:[${code}], errorMsg:[${msg}]", e);
Input and output parameters are logged through a utility that encrypts sensitive data. The Log4j pattern was also simplified from %d %-5p %c:%L [%t] - %m%n to %d %-5p %c [%t] - %m%n. Dropping %L matters because Log4j has to capture a stack trace on every call to resolve the line number, so removing it reduces synchronization overhead and thread blocking.
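The bracketed layout can be produced by a small helper so that every call site formats failures the same way. This is a sketch under assumptions: `LogMessages` and both method names are hypothetical, and the masking shown here is a simpler stand-in for the encrypting utility the article mentions, whose implementation is not given.

```java
final class LogMessages {
    private LogMessages() {}

    /** Builds the bracketed failure message; the caller passes it to logger.warn or logger.error. */
    static String failure(String exceptionType, String methodName, String code, String msg) {
        return String.format("[innersys] - [%s] - [%s] - errorCode:[%s], errorMsg:[%s]",
                exceptionType, methodName, code, msg);
    }

    /** Masks everything except the last four characters, e.g. for a card number. */
    static String maskAllButLast4(String value) {
        if (value == null || value.length() <= 4) {
            return "****"; // too short to reveal any part safely
        }
        StringBuilder masked = new StringBuilder(value.length());
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');
        }
        return masked.append(value, value.length() - 4, value.length()).toString();
    }
}
```

A call site would then read `logger.warn(LogMessages.failure("TimeoutException", "pay", "E504", "upstream timed out"), e);`, keeping the exception as the last argument so the stack trace is still logged.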
Conclusion
By addressing database locking, transaction scope, thread‑pool sizing, and logging practices, the system’s throughput rose from ~40 TPS to over 100 TPS, with markedly lower latency and fewer crashes. The author promises a follow‑up part covering further code‑level performance evolution.