Efficiently Deleting Tens of Thousands of Records via API: Serial, Async, and Multithreaded Solutions
Faced with tens of thousands of dirty records and no bulk‑delete API, the author iteratively optimized a deletion script: first simple serial requests, then asynchronous HTTP calls, and finally multithreading. The result was rapid, reliable removal of hundreds of entries per second.
A list query revealed tens of thousands of dirty records accumulated from previous load testing. Because the project did not provide a bulk‑delete API, a custom solution was built to delete the records one by one using the existing verification endpoint.
Version 1: Serial Requests
The simplest approach fetches a paginated list and calls the delete endpoint for each item sequentially.
public static void main(String[] args) {
    def base = getBase();
    def manager = new TeacherManager(base);
    // iterate pages 3‑1000
    3.upto(1000) {
        def list = manager.verifyList(it);
        list.getJSONObject("data")?.getJSONArray("list").each { x ->
            manager.verify(x.id, x.tel);
        }
    }
    allOver();
}
Helper methods used by the loop:
public JSONObject verifyList(int page = 3) {
    String url = TeacherManagerApi.VERIFY_LIST;
    JSONObject params = getParams();
    params.put("page", page);
    params.put("page_size", 50); // 50 records per page
    JSONObject response = getPostResponse(url, params);
    output(response);
    return response;
}
public JSONObject verify(int id = 0, String tel = "") {
    String url = TeacherManagerApi.VERIFY;
    JSONObject params = getParams();
    params.put("action", 2); // 2 = reject / delete
    params.put("id", id);
    params.put("tel", tel);
    params.put("refused_result", "清空脏数据"); // rejection reason: "clear dirty data"
    JSONObject response = getPostResponse(url, params);
    return response;
}
Version 2: HTTP Asynchronous Optimization
To reduce the overall runtime, the verify method was changed to submit the delete request asynchronously via Apache HttpAsyncClient. The list is still retrieved synchronously, but each delete call returns immediately.
public JSONObject verify(int id = 0, String tel = "") {
    String url = TeacherManagerApi.VERIFY;
    JSONObject params = getParams();
    params.put("action", 2); // 2 = reject / delete
    params.put("id", id);
    params.put("tel", tel);
    params.put("refused_result", "清空脏数据"); // rejection reason: "clear dirty data"
    output(params.toString());
    HttpPost post = getPost(url, params);
    setHeaders(post);
    // fire‑and‑forget asynchronous request (despite its name, excuteSync does not block)
    FanLibrary.excuteSync(post);
    return null; // no response is available to the caller
}
Utility methods that manage the asynchronous client:
public static void excuteSync(HttpRequestBase request) {
    // start the shared client lazily on first use
    if (!ClientManage.httpAsyncClient.isRunning()) {
        ClientManage.httpAsyncClient.start();
    }
    // null callback: the response is ignored
    ClientManage.httpAsyncClient.execute(request, null);
}
public static JSONObject excuteSyncWithResponse(HttpRequestBase request) {
    if (!ClientManage.httpAsyncClient.isRunning()) {
        ClientManage.httpAsyncClient.start();
    }
    Future<HttpResponse> future = ClientManage.httpAsyncClient.execute(request, null);
    try {
        // block until the response arrives
        HttpResponse httpResponse = future.get();
        String content = getContent(httpResponse);
        return getJsonResponse(content, null);
    } catch (Exception e) {
        logger.error("Failed to get the response of the asynchronous request!", e);
    }
    return new JSONObject();
}
private static CloseableHttpAsyncClient getCloseableHttpAsyncClient() {
    return HttpAsyncClients.custom()
        .setConnectionManager(NconnManager)
        .setSSLHostnameVerifier(SSLConnectionSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER)
        .setSSLContext(sslContext)
        .build();
}
Testing showed a noticeable speed improvement, but the asynchronous client must remain running until all pending requests finish; closing it prematurely causes request failures.
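The requirement to keep the client alive until every pending request has finished can be illustrated with a small sketch. This is not the article's code: `sendDelete` is a hypothetical stand-in that uses a JDK `CompletableFuture` and a thread pool in place of Apache HttpAsyncClient, so the example runs without external dependencies. The idea it shows is to keep a handle on every in-flight request and shut the shared resource down only after all of them have completed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncDrainSketch {

    // Hypothetical stand-in for an asynchronous delete request.
    // It does no I/O; it only records that the "request" completed.
    static CompletableFuture<Integer> sendDelete(int id, ExecutorService pool, AtomicInteger done) {
        return CompletableFuture.supplyAsync(() -> {
            done.incrementAndGet();
            return id;
        }, pool);
    }

    // Submits one delete per id, waits for all of them, then releases the pool.
    static int deleteAll(List<Integer> ids) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // plays the role of the shared client
        AtomicInteger done = new AtomicInteger();
        try {
            List<CompletableFuture<Integer>> pending = new ArrayList<>();
            for (int id : ids) {
                pending.add(sendDelete(id, pool, done));
            }
            // Block until every pending request has completed.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            return done.get();
        } finally {
            pool.shutdown(); // nothing is in flight any more when the normal path reaches this point
        }
    }

    public static void main(String[] args) {
        System.out.println(deleteAll(Arrays.asList(1, 2, 3, 4, 5))); // prints 5
    }
}
```

With the real client, a comparable effect could presumably be obtained by keeping the `Future<HttpResponse>` that `execute` returns for each request and calling `get()` on all of them before closing the client.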
Version 3: Multithreading
Further speed gains were achieved by processing each page in its own thread. This allows hundreds of deletions per second and avoids the complexity of handling asynchronous responses.
public static void main(String[] args) {
    def base = getBase();
    def manager = new TeacherManager(base);
    // launch a thread for each page (3‑100)
    3.upto(100) {
        new Thread({ ->
            def list = manager.verifyList(it);
            list.getJSONObject("data")?.getJSONArray("list").each { x ->
                manager.verify(x.id, x.tel);
            }
        }).start();
    }
    allOver();
}
Benchmarks indicated that the multithreaded version outperformed the asynchronous approach, consistently deleting over a hundred records per second and completing the cleanup in a matter of seconds.
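One possible variation on the thread-per-page idea is a bounded thread pool, sketched below in plain Java. It is an illustration under assumptions, not the author's implementation: `fetchPage` and `deleteRecord` are hypothetical stand-ins for `manager.verifyList` and `manager.verify`, the pool size of 8 is arbitrary, and no network calls are made. The sketch caps the number of concurrent workers and waits for all pages to finish before returning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PagedDeleteSketch {
    static final int PAGE_SIZE = 50; // same page size as the article's verifyList

    // Hypothetical stand-in for manager.verifyList(page): returns fake record ids.
    static List<Integer> fetchPage(int page) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < PAGE_SIZE; i++) {
            ids.add(page * PAGE_SIZE + i);
        }
        return ids;
    }

    // Hypothetical stand-in for manager.verify(id, tel): only counts the call.
    static void deleteRecord(int id, AtomicInteger deleted) {
        deleted.incrementAndGet();
    }

    // Processes pages firstPage..lastPage on a fixed-size pool and waits for completion.
    static int deletePages(int firstPage, int lastPage, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger deleted = new AtomicInteger();
        for (int page = firstPage; page <= lastPage; page++) {
            final int p = page; // lambdas may only capture effectively final variables
            pool.submit(() -> {
                for (int id : fetchPage(p)) {
                    deleteRecord(id, deleted);
                }
            });
        }
        pool.shutdown(); // accept no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow(); // give up on stragglers after the timeout
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return deleted.get();
    }

    public static void main(String[] args) {
        // Pages 3..100 inclusive are 98 pages of 50 records each.
        System.out.println(deletePages(3, 100, 8)); // prints 4900
    }
}
```

Waiting on `awaitTermination` gives a definite point at which all workers are done, which is where any final cleanup could safely run.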
