Can Groovy Match Java’s Performance? 110k QPS with FunTester
The author shares a deep‑dive performance analysis of the FunTester framework, comparing Java and Groovy implementations, detailing JVM tuning, hardware usage, and practical tips that enabled a single Groovy process to sustain 110,000 QPS while outlining when distributed testing and connection‑pool sizing become necessary.
Background
The author evaluated the FunTester performance‑testing framework after changing jobs and needed to verify whether it could sustain the company’s production load. Earlier single‑machine experiments showed that a single FunTester process could reach 60 kQPS and, after further tuning, up to 120 kQPS.
JVM and Groovy Tuning Strategy
Because Groovy scripts run on the JVM, the author initially expected a large performance gap compared with pure Java. To eliminate the uncertainty, the Groovy test suite was launched from a Java‑started JVM, allowing explicit control of JVM start‑up options (heap size, GC settings, thread stack size, etc.). This approach sacrifices some of Groovy’s dynamic loading convenience but enables the same resource limits as a native Java process.
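As a rough illustration of starting the Groovy suite in a JVM with explicit options, the sketch below assembles a `java` command line and hands it to `ProcessBuilder`. It is not FunTester's actual launcher. The class name `GroovyLauncher`, the classpath `lib/*`, the script name `LoadTest.groovy`, and the 512k thread stack size are hypothetical. The 16 GB heap is the only value taken from this article. The sketch also assumes the Groovy jars are on the classpath, so that Groovy's command-line entry point `groovy.ui.GroovyMain` can run the script.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the command line for a child JVM that runs a Groovy script.
class GroovyLauncher {

    static List<String> buildCommand(String heap, String threadStack,
                                     String classpath, String script) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xms" + heap);          // initial heap size
        cmd.add("-Xmx" + heap);          // maximum heap size (same value avoids resizing)
        cmd.add("-Xss" + threadStack);   // per-thread stack size (assumed value)
        cmd.add("-cp");
        cmd.add(classpath);              // must contain the Groovy jars (assumption)
        cmd.add("groovy.ui.GroovyMain"); // Groovy's command-line entry point
        cmd.add(script);                 // hypothetical test script
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("16g", "512k", "lib/*", "LoadTest.groovy");
        System.out.println(String.join(" ", cmd));
        // To actually start the child JVM and wait for it:
        //   Process p = new ProcessBuilder(cmd).inheritIO().start();
        //   int exitCode = p.waitFor();
    }
}
```

Building the command as a list keeps each option a separate argument, so heap size, stack size, and any GC flags can be varied per run without touching the Groovy script.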
Key JVM configuration observed in production:
Heap memory: 16 GB
CPU usage: ≈1200 % (≈12 logical cores)
Garbage collection: Young GC triggered every 3 seconds, with no Full GC observed during the test run.
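Figures like these can be checked from inside the running process with the standard `java.lang.management` API. The sketch below is a minimal example, not the author's monitoring setup. The class name `JvmSnapshot` is hypothetical, and the collector names it prints depend on which garbage collector the JVM was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports heap limit, core count, and GC activity of the current JVM.
class JvmSnapshot {

    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    static int logicalCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Sum of collections across all collectors; getCollectionCount() returns -1 when undefined.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("logical cores: " + logicalCores());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // One line per collector, e.g. a young-generation and an old-generation collector.
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Sampling the per-collector counts at a fixed interval during a run gives the collection frequency. An old-generation count that stays flat is consistent with no Full GC occurring.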
Observed Performance
With the tuned JVM, a single Groovy process sustained 110 kQPS. Because additional Groovy processes can be started on demand, a manual multi‑process deployment scaled the total load to roughly 500 kQPS without hitting resource limits.
Practical Guidelines
Avoid custom distributed testing unless required. Mature distributed load‑testing solutions provide better stability; the performance gain from a home‑grown distribution is usually marginal.
Single‑process capacity. A properly provisioned Java (or Groovy) process can comfortably handle 100 kQPS for HTTP, MySQL, Redis, and RPC workloads when sufficient CPU and memory are available.
Groovy performance parity. When the JVM is tuned, Groovy’s throughput is comparable to Java, and its ability to spawn multiple processes offers flexible scaling.
HTTP connection‑pool sizing. The number of concurrent HTTP connections should match the number of active threads. For an estimated response latency of 50 ms, achieving 100 kQPS requires about 5 000 threads (100 000 × 0.05 s). Setting the pool’s maximum size to 8 000 provides a safety margin for traffic spikes.
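The sizing arithmetic above is an application of Little's Law (concurrency = throughput × latency). The sketch below reproduces it with integer math. The class name `PoolSizing` and the 60 % headroom parameter are illustrative. The 60 % figure is simply what a maximum of 8 000 against a requirement of 5 000 implies, not a recommendation from the original author.

```java
// Hypothetical helper: estimates thread and connection-pool sizes from target QPS and latency.
class PoolSizing {

    // Little's Law: in-flight requests = QPS * latency. Rounded up, using integer math.
    static long requiredConcurrency(long targetQps, long latencyMillis) {
        return (targetQps * latencyMillis + 999) / 1000;
    }

    // Adds a percentage of headroom on top of the required concurrency, rounded up.
    static long withHeadroom(long concurrency, long headroomPercent) {
        return (concurrency * (100 + headroomPercent) + 99) / 100;
    }

    public static void main(String[] args) {
        long threads = requiredConcurrency(100_000, 50); // 100 kQPS at 50 ms latency
        long poolMax = withHeadroom(threads, 60);        // assumed 60 % safety margin
        System.out.println("threads needed: " + threads);
        System.out.println("pool max size:  " + poolMax);
    }
}
```

With the article's numbers this yields 5 000 threads and a pool maximum of 8 000. If the real latency turns out to be 80 ms rather than 50 ms, the same 100 kQPS target needs 8 000 in-flight requests, which is why the latency estimate matters as much as the QPS target.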
Caveats and Observations
Even with 5 000–8 000 active threads, context‑switch overhead remained modest; CPU consumption did not become a bottleneck. The test showed that a high thread count does not necessarily translate into excessive CPU usage.
Brief Comparison with Go
The author experimented with a Go implementation as a fallback (Plan C). In practice, Go’s memory footprint is smaller, but its CPU performance was similar to Java/Groovy for the same workload. Go’s script‑style execution offers flexibility comparable to Groovy, yet the author concluded that deep expertise in a single ecosystem (Java/Groovy) is more valuable for long‑term development.
