
Optimizing Qunar's Serverless Platform: Parallel Computation, Thread‑Pool Design, and Cache Governance

This article details how Qunar improved its serverless FaaS platform: introducing Node.js worker_threads for parallel computation, designing a thread pool that combines a shared buffer with postMessage communication, and implementing comprehensive cache pre‑heating and governance. The changes cut P99 latency by 72% and made service performance markedly more stable.


Qunar's Serverless platform was built to accelerate front‑end and back‑end data aggregation and service rollout by letting developers write cloud functions. After launch, the platform dramatically reduced code size and iteration time, but performance issues emerged, most notably Node.js's single‑threaded execution model and fragmented caching.

To address these, the team explored parallel computation using the worker_threads module, creating multiple independent V8 isolates that run in parallel. They designed a thread‑pool where the number of threads matches the container's CPU cores minus one, keeping threads resident to avoid creation overhead.

Two inter‑thread communication methods were evaluated: a SharedArrayBuffer with atomic operations for fine‑grained synchronization, and postMessage for message‑based coordination. Benchmarks showed the SharedArrayBuffer offers lower scheduling latency, while postMessage transfers complex objects faster.

The final solution combines both: the SharedArrayBuffer tracks thread states, and postMessage delivers task payloads, enabling efficient task dispatch, result aggregation, and fault handling.

Cache governance was added to mitigate cold‑start latency. Function code, configuration data, and enumerated results are cached in a three‑tier hierarchy (local, Redis, MySQL). A pre‑heat process loads cold data before service start, and a unified cache layer periodically refreshes data, reducing the initial execution time variance from ~30 ms to under 10 ms.
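The three-tier lookup with a pre-heat pass might look like the following. This is a sketch under stated assumptions: the Redis and MySQL tiers are stubbed with in-memory Maps, and all key names and helpers (`getWithFallback`, `preheat`) are hypothetical.

```javascript
// Three-tier read path (local -> Redis -> MySQL) with backfill, plus a
// pre-heat pass that warms all tiers before the service accepts traffic.
const localCache = new Map();

// Stand-ins for the remote tiers; a real system would use actual clients.
const redisTier = new Map();
const mysqlTier = new Map([['fn:tag-config', { value: 'neeko' }]]);

async function getWithFallback(key) {
  if (localCache.has(key)) return localCache.get(key); // tier 1: in-process
  if (redisTier.has(key)) {                            // tier 2: Redis
    const v = redisTier.get(key);
    localCache.set(key, v);                            // backfill local
    return v;
  }
  const v = mysqlTier.get(key);                        // tier 3: MySQL
  if (v !== undefined) {
    redisTier.set(key, v);                             // backfill Redis
    localCache.set(key, v);                            // backfill local
  }
  return v;
}

// Pre-heat: pull known-cold keys through all tiers before serving traffic,
// so the first real request never pays the MySQL round trip.
async function preheat(keys) {
  await Promise.all(keys.map(getWithFallback));
}
```

A unified refresh loop (periodically re-running the MySQL read and rewriting the upper tiers) would keep the pre-heated data from going stale, which is what bounds the first-execution variance.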

Production results during high‑traffic periods show the P99 latency of the Neeko tag function dropped from 400 ms to 152 ms (‑72%), with stable response times and higher request throughput.

Future plans include broader Serverless adoption across business scenarios, service separation to isolate function execution from platform management, automated scaling based on monitoring metrics, and data disaster‑recovery mechanisms for function storage.

serverless · Node.js · caching · Cloud Functions · Worker Threads
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
