How I Boosted FunTester QPS by 14% and Halved Memory Usage
After a weekend of code refactoring, asynchronous processing, and removing unnecessary statistics, the author increased FunTester's QPS from 104,375 to 118,904 (≈13.9% gain), reduced memory consumption by over 57%, and documented detailed performance impacts of various optimizations with code samples and benchmark tables.
Goal
The aim was to close the performance gap of the FunTester load‑testing framework after an initial benchmark showed it lagging behind K6 and Gatling in both memory consumption and queries per second (QPS).
Initial Benchmark
Baseline measurements (CPU usage, memory usage, QPS, response time):
K6 – CPU 718.74, Memory 370 MB, QPS 75 980, RT 1 ms
Gatling – CPU 585.97, Memory 350 MB, QPS 113 355, RT 1 ms
FunTester – CPU 528.03, Memory 1 770 MB, QPS 104 375, RT 1 ms
Three improvement actions were planned:
Convert non‑essential processing to asynchronous execution.
Replace the test‑metadata storage mechanism.
Gradually remove business‑related compatibility code.
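The first action, moving non-essential work off the request path, could look roughly like the sketch below. It is an illustration under assumptions, not FunTester's implementation: the class `AsyncRecorder` and its methods are hypothetical names, and the "bookkeeping" is reduced to adding a number.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (not part of FunTester): hands non-essential bookkeeping
// to a background thread so the load-generating thread only pays for queuing.
final class AsyncRecorder {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicLong recorded = new AtomicLong();

    // Called from the hot path; returns as soon as the task is queued.
    void record(long costMillis) {
        worker.execute(() -> recorded.addAndGet(costMillis));
    }

    // Stops accepting work, waits for queued tasks, and returns the total.
    long shutdownAndGetTotal() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("background recorder did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for recorder", e);
        }
        return recorded.get();
    }

    public static void main(String[] args) {
        AsyncRecorder recorder = new AsyncRecorder();
        for (int i = 0; i < 1000; i++) {
            recorder.record(1); // stand-in for one request's bookkeeping
        }
        System.out.println("total recorded ms: " + recorder.shutdownAndGetTotal());
    }
}
```

The executor's queue is unbounded in this sketch, so at very high request rates the queued tasks would themselves cost memory; batching submissions is one way to limit that.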
Optimizations Applied
Removed all heavy statistics and calculation code, keeping only the total run count and total execution time.
Disabled response body parsing (which previously kept the full HTTP response as a String) while preserving the method stub for future verification.
Changed the data type used for response‑time statistics from int to short to reduce memory footprint.
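A short holds values up to 32 767, so a response time stored in milliseconds overflows once it exceeds about 32.7 seconds. The source does not say how FunTester handles that case; the sketch below shows one possible guard, a clamping conversion. `ResponseTimes` and `toShortMillis` are hypothetical names.

```java
// Hypothetical helper illustrating the int-to-short change for response times.
final class ResponseTimes {
    // Converts an elapsed time in milliseconds to a short, clamping instead of
    // letting the narrowing cast wrap around for values above Short.MAX_VALUE.
    static short toShortMillis(long elapsedMillis) {
        if (elapsedMillis < 0) return 0;
        if (elapsedMillis > Short.MAX_VALUE) return Short.MAX_VALUE;
        return (short) elapsedMillis;
    }

    public static void main(String[] args) {
        System.out.println(toShortMillis(1));      // typical local response
        System.out.println(toShortMillis(40_000)); // clamped to Short.MAX_VALUE (32767)
        System.out.println((short) 40_000);        // plain cast wraps to -25536
    }
}
```

For a local test with 1 ms responses the plain cast is safe; the clamp only matters if slow or timed-out requests are recorded.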
Test Code
The simplified test uses the com.funtester.base.constaint.FixedThread template instead of the older ThreadLimitTimesCount implementation:
import com.funtester.base.constaint.FixedThread;
import com.funtester.base.constaint.ThreadBase;
import com.funtester.config.Constant;
import com.funtester.frame.execute.Concurrent;
import com.funtester.httpclient.ClientManage;
import com.funtester.httpclient.FunLibrary;
import org.apache.http.client.methods.HttpRequestBase;

class Share2 extends FunLibrary {

    public static void main(String[] args) {
        ClientManage.init(10, 5, 0, "", 0);
        String url = "http://localhost:12345/tps";
        Constant.RUNUP_TIME = 0;
        def get = getHttpGet(url);
        def task = new FunTester(get);
        new Concurrent(task, 30, "Local fixed‑QPS test").start();
        testOver();
    }

    private static class FunTester extends FixedThread<HttpRequestBase> {

        FunTester(HttpRequestBase request) {
            super(request, 50000, true);
        }

        // Sends the request via executeOnly; the response body is not parsed
        @Override
        protected void doing() throws Exception {
            FunLibrary.executeOnly(request);
        }

        // Returns a new task instance wrapping the same request
        @Override
        ThreadBase clone() {
            return new FunTester(request);
        }
    }
}
Modified Load‑Test Template
The run method of FixedThread was stripped of all commented‑out statistics. The essential loop now looks like this:
public abstract class FixedThread<F> extends ThreadBase<F> {

    @Override
    public void run() {
        try {
            before();
            long start = Time.getTimeStamp();
            while (true) {
                executeNum++;
                doing();
                // Statistics collection is omitted here for the experiment
                if (executeNum >= limit) break;
            }
            long end = Time.getTimeStamp();
            // Log the per-thread summary once, after the loop has finished
            if ((end - start) / 1000 > RUNUP_TIME + 3) {
                logger.info("Thread:{}, exec:{}, errors:{}, elapsed:{} s",
                        threadName, executeNum, errorNum, (end - start) / 1000.0);
            }
        } catch (Exception e) {
            logger.warn("Task execution failed!", e);
        } finally {
            after();
        }
    }

    // other methods omitted for brevity
}
Real‑World Measurements
Monitoring Overhead
Measurements were taken with macOS Activity Monitor and jvisualvm. The monitoring tools themselves consumed more than 1 GB of heap, while the actual test kept memory below 300 MB. After the optimizations, FunTester measured:
CPU 558.71
Memory 741.9 MB
QPS 117 123
Exception Handling Impact
Re‑enabling the try‑catch block in FixedThread#run produced the following results (no exceptions were thrown):
CPU 532.26
Memory 684.8 MB
QPS 118 400
Conclusion: exception handling overhead is negligible when the error rate is low.
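This matches how the JVM generally treats try blocks: a try that never throws adds essentially nothing to the normal path, and the cost appears only when an exception is actually created and caught. The sketch below shows a per-iteration variant of the pattern, in which a failed request is counted and the loop continues. It is an assumption for illustration, not the code of FixedThread#run; `GuardedLoop`, `Task`, and `runAll` are hypothetical names.

```java
// Hypothetical sketch: per-iteration exception handling in a load loop.
final class GuardedLoop {
    // Minimal task type that is allowed to throw checked exceptions.
    interface Task {
        void run() throws Exception;
    }

    // Runs the task `limit` times and returns {executions, errors}.
    // A failing iteration is counted and does not stop the loop.
    static int[] runAll(Task task, int limit) {
        int executeNum = 0;
        int errorNum = 0;
        while (executeNum < limit) {
            executeNum++;
            try {
                task.run();
            } catch (Exception e) {
                errorNum++;
            }
        }
        return new int[] {executeNum, errorNum};
    }

    public static void main(String[] args) {
        int[] counter = {0};
        // Every tenth call fails, standing in for an occasional request error.
        int[] result = runAll(() -> {
            if (++counter[0] % 10 == 0) throw new IllegalStateException("simulated failure");
        }, 100);
        System.out.println("executed=" + result[0] + ", errors=" + result[1]);
    }
}
```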
Statistics Collection Overhead
When full per‑request timing statistics were restored, memory rose sharply while QPS remained similar:
CPU 558.71
Memory 961.6 MB (≈+120 MB)
QPS 117 054
The additional heap usage is caused by the objects that store each request’s response time.
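The source does not show how FunTester stores these timings. If they are kept in a collection of boxed values, each sample costs an object and a reference on top of its two bytes of data. One way to keep the per-sample cost close to two bytes is a primitive-backed buffer, sketched below with the hypothetical class `ShortSamples`.

```java
import java.util.Arrays;

// Hypothetical growable buffer that stores response times as primitive shorts,
// avoiding one boxed object per sample.
final class ShortSamples {
    private short[] data;
    private int size;

    ShortSamples(int initialCapacity) {
        data = new short[Math.max(1, initialCapacity)];
    }

    // Appends one sample, doubling the backing array when it is full.
    void add(short value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int size() {
        return size;
    }

    // Sum of all samples, accumulated in a long to avoid overflow.
    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        ShortSamples costs = new ShortSamples(16);
        for (int i = 0; i < 50_000; i++) {
            costs.add((short) 1); // one 1 ms response per iteration
        }
        System.out.println(costs.size() + " samples, " + costs.sum() + " ms total");
    }
}
```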
Response Parsing Overhead
Re‑introducing the original response‑parsing method, which converts the HTTP entity to a String, yielded:
CPU 562.79
Memory 1.25 GB
QPS 110 501
Keeping the full response in memory significantly increases heap consumption, while the impact on QPS is modest.
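The difference between the two approaches can be sketched with plain java.io streams. This is not the code behind FunLibrary.executeOnly, whose internals the source does not show; `BodyHandling`, `drain`, and `readFully` are hypothetical names. Draining reuses one small buffer, while reading fully allocates a byte array and a String per response.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of two ways to handle a response body.
final class BodyHandling {
    // Reads and discards the body through one small buffer; returns the byte count.
    static long drain(InputStream body) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = body.read(buffer)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Materializes the whole body as a String (readAllBytes requires Java 9+).
    static String readFully(InputStream body) {
        try {
            return new String(body.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "{\"code\":0}".getBytes(StandardCharsets.UTF_8);
        System.out.println("drained bytes: " + drain(new ByteArrayInputStream(payload)));
        System.out.println("parsed body: " + readFully(new ByteArrayInputStream(payload)));
    }
}
```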
Summary of Results
QPS improved from 104 375 to approximately 118 904, a gain of ~13.9 %.
Peak memory usage dropped from 1 770 MB to the 700–800 MB range (≈57 % reduction).
Exception handling adds negligible overhead when errors are rare.
Full statistics collection adds ~120 MB of heap usage without affecting QPS.
Parsing the full HTTP response inflates memory to >1 GB and slightly reduces QPS.
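The headline percentages can be rechecked from the figures quoted in this article. The snippet below uses the baseline QPS and memory from the initial benchmark, the final QPS of 118 904, and the 741.9 MB memory reading taken after the optimizations; `Gains` and `percentChange` are hypothetical names.

```java
// Recomputes the reported gains from the numbers quoted in the article.
final class Gains {
    // Relative change from `before` to `after`, in percent.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double qpsGain = percentChange(104_375, 118_904);
        double memoryChange = percentChange(1_770, 741.9);
        System.out.printf("QPS change: %.2f%%%n", qpsGain);
        System.out.printf("Memory change: %.2f%%%n", memoryChange);
    }
}
```

The QPS change works out to 13.92 % and the memory change to about -58.08 %, in line with the ≈13.9 % gain and the reduction of over 57 % reported above.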
