
Optimizing High‑Concurrency List Service for 58 Used‑Car Platform: Data Query, Transformation, and Thread‑Pool Tuning

This article analyzes the performance bottlenecks of the 58 used‑car list service under high concurrency, breaks the request down into its data‑query and data‑transformation stages, and presents three optimizations (query redesign, concurrent data transformation, and thread‑pool parameter tuning) that together reduce latency by more than 80 ms and improve resource utilization.

58 Tech

May 8, 2020

Background – The 58 used‑car platform experiences rapid traffic growth, causing the list‑service API to exceed 200 ms latency. The service must fetch up to ten different information types, some independent and some dependent on count queries.

Scenario Description – Independent data (A, B, C) can be fetched concurrently in three threads, averaging ~40 ms. Dependent data (D‑J) requires a count‑then‑result workflow, averaging ~150 ms in total. Because the two groups run in parallel, the slower dependent path determines the overall data‑query latency of around 150 ms.
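The independent fetch described above can be sketched as follows. This is a minimal illustration, not the service's actual code: the class name IndependentFetch, the fetchAll helper, and the use of String payloads are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical helper: runs independent fetchers (e.g. A, B, C) concurrently.
final class IndependentFetch {

    /** Submits every fetcher at once and returns results in submission order. */
    static List<String> fetchAll(List<Callable<String>> fetchers, ExecutorService pool) {
        List<String> results = new ArrayList<>();
        try {
            // invokeAll blocks until all tasks finish, so the wall-clock cost
            // is roughly that of the slowest fetcher, not the sum of all three.
            for (Future<String> future : pool.invokeAll(fetchers)) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a fetcher failed", e.getCause());
        }
        return results;
    }
}
```

Since invokeAll returns its futures in the same order as the submitted tasks, the caller can rely on position to tell the A, B, and C results apart.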

Problem Analysis – Profiling with Alibaba’s Arthas diagnostic tool identified two costly operations: data query (≈150 ms) and data transformation (≈50 ms). The original implementation already fetched the dependent and independent lists on separate threads, but the transformation step ran on a single thread.

Optimization 1 – Data Query Redesign – The count (C) and result (R) queries are fully separated. Count queries run in parallel (max 50 ms) followed by a unified supplement strategy, then result queries run in parallel (max 60 ms), yielding an average of 110 ms and a 40 ms latency reduction in production.
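A rough sketch of the count‑then‑result split is shown below. The article does not spell out the supplement strategy, so the allocate method here uses an assumed rule (fill the page from sources in priority order); the class name TwoPhaseQuery, the method names, and the String row type are likewise hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.BiFunction;
import java.util.function.ToIntFunction;

// Hypothetical sketch: all counts in parallel, one allocation step, all results in parallel.
final class TwoPhaseQuery {

    static Map<String, List<String>> query(List<String> sources,
                                           int pageSize,
                                           ToIntFunction<String> countQuery,
                                           BiFunction<String, Integer, List<String>> resultQuery,
                                           ExecutorService pool) {
        // Phase 1: fire every count query at once; this phase costs about
        // as much as the slowest single count query.
        Map<String, CompletableFuture<Integer>> countFutures = new LinkedHashMap<>();
        for (String source : sources) {
            countFutures.put(source,
                    CompletableFuture.supplyAsync(() -> countQuery.applyAsInt(source), pool));
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        countFutures.forEach((source, future) -> counts.put(source, future.join()));

        // Phase 2: decide how many rows each source should contribute.
        Map<String, Integer> limits = allocate(counts, pageSize);

        // Phase 3: fire every needed result query at once.
        Map<String, CompletableFuture<List<String>>> resultFutures = new LinkedHashMap<>();
        limits.forEach((source, limit) -> {
            if (limit > 0) {
                resultFutures.put(source,
                        CompletableFuture.supplyAsync(() -> resultQuery.apply(source, limit), pool));
            }
        });
        Map<String, List<String>> results = new LinkedHashMap<>();
        resultFutures.forEach((source, future) -> results.put(source, future.join()));
        return results;
    }

    /** Assumed supplement rule: earlier sources fill the page first, later ones top it up. */
    static Map<String, Integer> allocate(Map<String, Integer> counts, int pageSize) {
        Map<String, Integer> limits = new LinkedHashMap<>();
        int remaining = pageSize;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int take = Math.min(entry.getValue(), remaining);
            limits.put(entry.getKey(), take);
            remaining -= take;
        }
        return limits;
    }
}
```

With this shape, total query time is roughly the slowest count query plus the slowest result query, which matches the 50 ms + 60 ms = 110 ms figure quoted above.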

Optimization 2 – Concurrent Data Transformation – After merging, sorting, and deduplication, the list is split into sub‑collections that are transformed concurrently while preserving original order. Each thread processes five items (≈5 ms), providing another ~40 ms latency gain.
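One possible way to transform sub‑collections concurrently while keeping the original order is sketched below. The class name OrderedParallelTransform and the generic converter parameter are assumptions for illustration; the chunk size of five comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Function;

// Hypothetical sketch: chunked, order-preserving concurrent transformation.
final class OrderedParallelTransform {

    static <T, R> List<R> transform(List<T> items,
                                    int chunkSize,
                                    Function<T, R> converter,
                                    ExecutorService pool) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // One future per chunk, stored in the same order as the chunks appear.
        List<CompletableFuture<List<R>>> futures = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            List<T> chunk = items.subList(start, Math.min(start + chunkSize, items.size()));
            futures.add(CompletableFuture.supplyAsync(() -> {
                List<R> converted = new ArrayList<>(chunk.size());
                for (T item : chunk) {
                    converted.add(converter.apply(item));
                }
                return converted;
            }, pool));
        }
        // Joining in submission order restores the original ordering,
        // regardless of which chunk finishes first.
        List<R> result = new ArrayList<>(items.size());
        for (CompletableFuture<List<R>> future : futures) {
            result.addAll(future.join());
        }
        return result;
    }
}
```

A call such as transform(mergedList, 5, converter, pool) would convert each group of five items on its own pool thread. The sketch assumes the input list is not modified while the transformation is running, because subList returns views rather than copies.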

Optimization 3 – Thread‑Pool Parameter Tuning – Based on Doug Lea’s guidelines, core pool size, max pool size, and queue capacity are calculated from expected QPS (15‑20), task counts, and acceptable timeout (300 ms). For the given workload, corePoolSize = 20, maxPoolSize = 34, and workQueue ≈ 150 tasks were chosen.
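The chosen values map onto java.util.concurrent.ThreadPoolExecutor roughly as follows. Only the three sizes come from the article; the 60‑second keep‑alive, the bounded ArrayBlockingQueue, the CallerRunsPolicy rejection handler, and the class name ListServicePool are assumptions made for this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical factory using the sizes quoted in the article.
final class ListServicePool {

    static ThreadPoolExecutor create() {
        return new ThreadPoolExecutor(
                20,                                          // corePoolSize (from the article)
                34,                                          // maximumPoolSize (from the article)
                60L, TimeUnit.SECONDS,                       // keep-alive for extra threads (assumed)
                new ArrayBlockingQueue<>(150),               // bounded work queue, ~150 tasks (from the article)
                new ThreadPoolExecutor.CallerRunsPolicy());  // rejection policy (assumed)
    }
}
```

Note that ThreadPoolExecutor only starts threads beyond the core size once the queue is full, so with these settings threads 21 through 34 appear only after 150 tasks are already waiting.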

Performance Evaluation – Load tests with different maximum thread counts gave the following CPU usage and average latency: 300 threads (88 % CPU, 183 ms), 180 threads (54 % CPU, 128 ms), 50 threads (28 % CPU, 125 ms), 34 threads (21 % CPU, 120 ms), 24 threads (18 % CPU, 135 ms, including queue wait), 12 threads (18 % CPU, 148 ms, including queue wait). Latency was lowest at 34 threads: larger pools cost more CPU without reducing latency, while smaller pools left tasks waiting in the queue.

Conclusion & Planning – After the three optimizations, overall latency improved by more than 80 ms. However, increased concurrency raises debugging complexity, and thread‑pool tuning remains critical. Future work includes dynamic thread‑pool adjustment to maximize resource utilization.

References

Arthas – Alibaba Java diagnostic tool: https://alibaba.github.io/arthas/

Java Concurrency in Practice – Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, and Doug Lea (Chinese edition translated by Tong Yunlan)
