Understanding ForkJoinPool and the Fork/Join Framework in Java
This article explains the limitations of ThreadPoolExecutor, introduces the Fork/Join model and ForkJoinPool, demonstrates how to implement divide‑and‑conquer tasks with RecursiveTask, analyzes the pool’s design, task submission methods, work‑stealing mechanism, common pool pitfalls, and presents performance evaluation results.
Previously we studied ThreadPoolExecutor , which efficiently manages a task queue and a pool of threads but has two major drawbacks: it cannot split large tasks and its workers compete for tasks from the queue, both of which hurt performance in high‑concurrency scenarios.
To address these issues, the article introduces ForkJoinPool , an implementation of the Fork/Join model based on the divide‑and‑conquer algorithm. The model recursively splits a large problem into independent sub‑problems, solves them in parallel, and then merges the results.
1. Divide‑and‑Conquer – The basic idea is to decompose a problem of size N into K smaller, independent sub‑problems, solve each, and combine the solutions. The steps are:
Decompose the problem.
Solve each sub‑problem (often sequentially when small enough).
Merge the sub‑results to obtain the final answer.
2. Fork/Join Application Example – A RecursiveTask<Long> named TheKingRecursiveSumTask computes the sum of numbers in a range. If the range exceeds a threshold, it splits into two subtasks, forks them, and joins the results:
public class TheKingRecursiveSumTask extends RecursiveTask<Long> {
private final int sumBegin;
private final int sumEnd;
private final int threshold;
// constructor omitted for brevity
@Override
protected Long compute() {
if ((sumEnd - sumBegin) > threshold) {
TheKingRecursiveSumTask subTask1 = new TheKingRecursiveSumTask(sumBegin, (sumBegin + sumEnd) / 2, threshold);
TheKingRecursiveSumTask subTask2 = new TheKingRecursiveSumTask((sumBegin + sumEnd) / 2, sumEnd, threshold);
subTask1.fork();
subTask2.fork();
return subTask1.join() + subTask2.join();
}
long result = 0L;
for (int i = sumBegin; i < sumEnd; i++) {
result += i;
}
return result;
}
}A driver program creates a ForkJoinPool with parallelism 16, invokes the task, and compares it with a single‑threaded loop. The initial experiment (threshold = 100) shows many task splits (131 071) and slower overall time (207 ms vs 40 ms single‑thread), illustrating that overly fine granularity hurts performance.
3. ForkJoinPool Design and Constructors – Four core parameters control parallelism, worker‑thread factory, exception handling, and queue mode (FIFO vs LIFO). Three construction styles are provided: default (no‑arg), parallelism‑only, and full‑parameter constructor.
4. Task Submission Methods – ForkJoinPool supports three families of operations:
Asynchronous execution: execute(ForkJoinTask) or execute(Runnable) .
Synchronous execution with result: invoke(ForkJoinTask) or ForkJoinTask.fork() followed by join() .
Submission returning a Future : submit(...) for ForkJoinTask , Callable , or Runnable .
5. ForkJoinTask Types – Two main subclasses exist: RecursiveAction – performs work without returning a value (e.g., parallel sort). RecursiveTask<V> – returns a result (e.g., the sum example or Fibonacci).
6. Work‑Stealing Queues – Each worker thread has a double‑ended queue; it pushes and pops tasks from the head, while idle workers steal from the tail of other queues, reducing contention and improving cache locality.
7. CommonPool Caveats – The static ForkJoinPool.commonPool() is shared by CompletableFuture and parallel streams. Submitting blocking tasks to this pool can starve other computations, so for I/O‑bound work a dedicated pool is recommended.
8. Performance Evaluation – Experiments on macOS JDK 8 show that with a very low split threshold the Fork/Join approach is slower than a plain loop due to excessive task creation. Raising the threshold to 100 000 reduces splits (16 383) and makes ForkJoinPool faster (143 ms vs 410 ms single‑thread). The article concludes that task count, per‑task cost, and parallelism level must be tuned for optimal results.
Conclusion – Fork/Join provides powerful parallelism for pure computational tasks through divide‑and‑conquer and work‑stealing, but developers must avoid blocking operations, choose appropriate thresholds, and monitor the common pool to prevent performance regressions.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.