Backend Development 27 min read

Understanding ForkJoinPool: Divide‑and‑Conquer, Implementation, and Performance Evaluation

This article explains the Fork/Join model and Java's ForkJoinPool, covering the divide‑and‑conquer algorithm, custom RecursiveTask examples, core source‑code analysis, common pool pitfalls, and performance testing to help developers decide when and how to use ForkJoinPool effectively.

IT Architects Alliance

Jun 27, 2024

Understanding ForkJoinPool: Divide‑and‑Conquer, Implementation, and Performance Evaluation

1. Divide‑and‑Conquer Algorithm and Fork/Join Model

The Fork/Join model applies the classic divide‑and‑conquer strategy: a large problem of size N is recursively split into K smaller independent sub‑problems, each solved separately, and the partial results are merged to obtain the final answer.

The basic steps of divide‑and‑conquer are:

Divide : partition the problem into smaller sub‑problems.

Solve : solve each sub‑problem directly when it becomes sufficiently small.

Merge : combine the sub‑problem solutions to form the solution of the original problem.

2. Fork/Join Application Scenario and Experience

Scenario: compute the sum of all integers from 1 to n. A custom TheKingRecursiveSumTask class extending RecursiveTask is used to demonstrate task splitting.

public class TheKingRecursiveSumTask extends RecursiveTask {
    private static final AtomicInteger taskCount = new AtomicInteger();
    private final int sumBegin;
    private final int sumEnd;
    /**
     * Task splitting threshold; tasks larger than this will be split.
     */
    private final int threshold;

    public TheKingRecursiveSumTask(int sumBegin, int sumEnd, int threshold) {
        this.sumBegin = sumBegin;
        this.sumEnd = sumEnd;
        this.threshold = threshold;
    }

    @Override
    protected Long compute() {
        if ((sumEnd - sumBegin) > threshold) {
            // Split the task
            TheKingRecursiveSumTask subTask1 = new TheKingRecursiveSumTask(sumBegin, (sumBegin + sumEnd) / 2, threshold);
            TheKingRecursiveSumTask subTask2 = new TheKingRecursiveSumTask((sumBegin + sumEnd) / 2, sumEnd, threshold);
            subTask1.fork();
            subTask2.fork();
            taskCount.incrementAndGet();
            return subTask1.join() + subTask2.join();
        }
        // Direct computation
        long result = 0L;
        for (int i = sumBegin; i < sumEnd; i++) {
            result += i;
        }
        return result;
    }

    public static AtomicInteger getTaskCount() {
        return taskCount;
    }
}

The driver code creates a pool with parallelism 16, sets the threshold to 100, and compares ForkJoin execution with a single‑threaded loop.

public static void main(String[] args) {
    int sumBegin = 0, sumEnd = 10000000;
    computeByForkJoin(sumBegin, sumEnd);
    computeBySingleThread(sumBegin, sumEnd);
}

private static void computeByForkJoin(int sumBegin, int sumEnd) {
    ForkJoinPool forkJoinPool = new ForkJoinPool(16);
    long forkJoinStartTime = System.nanoTime();
    TheKingRecursiveSumTask task = new TheKingRecursiveSumTask(sumBegin, sumEnd, 100);
    long forkJoinResult = forkJoinPool.invoke(task);
    System.out.println("======");
    System.out.println("ForkJoin task splits: " + TheKingRecursiveSumTask.getTaskCount());
    System.out.println("ForkJoin result: " + forkJoinResult);
    System.out.println("ForkJoin time (ms): " + (System.nanoTime() - forkJoinStartTime) / 1000000);
}

private static void computeBySingleThread(int sumBegin, int sumEnd) {
    long computeResult = 0L;
    long startTime = System.nanoTime();
    for (int i = sumBegin; i < sumEnd; i++) {
        computeResult += i;
    }
    System.out.println("======");
    System.out.println("Single‑thread result: " + computeResult);
    System.out.println("Single‑thread time (ms): " + (System.nanoTime() - startTime) / 1000000);
}

=====
ForkJoin task splits: 131071
ForkJoin result: 49999995000000
ForkJoin time (ms): 207
=====
Single‑thread result: 49999995000000
Single‑thread time (ms): 40

The result shows that with a very low split threshold the ForkJoinPool incurs far more overhead than a simple loop, making it slower.

3. ForkJoinPool Design and Source‑Code Analysis

ForkJoinPool, introduced in Java 7 and widely used since Java 8, implements the Executor and ExecutorService interfaces and extends AbstractExecutorService. It supports two main task types: RecursiveAction (no result) and RecursiveTask (returns a result), both subclasses of ForkJoinTask.

Key constructor parameters:

parallelism : number of worker threads (defaults to the number of available processors).

ForkJoinWorkerThreadFactory : factory for creating worker threads.

UncaughtExceptionHandler : handler for uncaught exceptions.

asyncMode : determines whether the work queue uses FIFO (true) or LIFO (false).

Three ways to construct a pool:

// 1. Default constructor
public ForkJoinPool() {
    this(Math.min(MAX_CAP, Runtime.getRuntime().availableProcessors()),
         defaultForkJoinWorkerThreadFactory, null, false);
}

// 2. Specify parallelism only
public ForkJoinPool(int parallelism) {
    this(parallelism, defaultForkJoinWorkerThreadFactory, null, false);
}

// 3. Full parameter constructor
public ForkJoinPool(int parallelism, ForkJoinWorkerThreadFactory factory,
                    UncaughtExceptionHandler handler, boolean asyncMode) {
    this(checkParallelism(parallelism), checkFactory(factory), handler,
         asyncMode ? FIFO_QUEUE : LIFO_QUEUE,
         "ForkJoinPool-" + nextPoolId() + "-worker-");
    checkPermission();
}

Task submission methods include execute (asynchronous, no result), invoke (synchronous, returns result), and submit (returns a ForkJoinTask future). The table below shows which API is used from a non‑fork/join thread versus a fork/join thread.

From non‑fork/join thread

From fork/join thread

Asynchronous execution

execute(ForkJoinTask)

ForkJoinTask.fork()

Wait and get result

invoke(ForkJoinTask)

ForkJoinTask.invoke()

Submit for Future

submit(ForkJoinTask)

ForkJoinTask.fork() (tasks are Futures)

Core methods of ForkJoinTask:

public final ForkJoinTask fork() {
    Thread t;
    if ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread)
        ((ForkJoinWorkerThread) t).workQueue.push(this);
    else
        ForkJoinPool.common.externalPush(this);
    return this;
}

public final V join() {
    int s;
    if ((s = doJoin() & DONE_MASK) != NORMAL)
        reportException(s);
    return getRawResult();
}

RecursiveAction

and RecursiveTask are the two concrete subclasses. Example of a sorting action:

static class SortTask extends RecursiveAction {
    final long[] array;
    final int lo, hi;
    SortTask(long[] array, int lo, int hi) { this.array = array; this.lo = lo; this.hi = hi; }
    SortTask(long[] array) { this(array, 0, array.length); }
    protected void compute() {
        if (hi - lo < THRESHOLD) {
            sortSequentially(lo, hi);
        } else {
            int mid = (lo + hi) >>> 1;
            invokeAll(new SortTask(array, lo, mid), new SortTask(array, mid, hi));
            merge(lo, mid, hi);
        }
    }
    static final int THRESHOLD = 1000;
    void sortSequentially(int lo, int hi) { Arrays.sort(array, lo, hi); }
    void merge(int lo, int mid, int hi) {
        long[] buf = Arrays.copyOfRange(array, lo, mid);
        for (int i = 0, j = lo, k = mid; i < buf.length; j++)
            array[j] = (k == hi || buf[i] < array[k]) ? buf[i++] : array[k++];
    }
}

And a classic Fibonacci RecursiveTask example:

class Fibonacci extends RecursiveTask<Integer> {
    final int n;
    Fibonacci(int n) { this.n = n; }
    Integer compute() {
        if (n <= 1) return n;
        Fibonacci f1 = new Fibonacci(n - 1);
        f1.fork();
        Fibonacci f2 = new Fibonacci(n - 2);
        return f2.compute() + f1.join();
    }
}

4. Caution About ForkJoinPool#commonPool

The static commonPool is a shared pool used internally by CompletableFuture and parallel streams. While it reduces the cost of creating multiple pools, it can become a bottleneck if other components submit blocking tasks, potentially starving compute‑heavy tasks and degrading overall application performance.

5. ForkJoinPool Performance Evaluation

Experiments on macOS with JDK 8 compared the custom sum task (threshold = 100) against a single‑thread loop. The ForkJoin version performed ~5× slower because the split granularity was too fine, creating 131 071 subtasks.

Increasing the threshold to 100 000 reduced the number of splits to 16 383 and made ForkJoin roughly three times faster than the single‑thread version.

=====
ForkJoin task splits: 16383
ForkJoin result: 499999999500000000
ForkJoin time (ms): 143
=====
Single‑thread result: 499999999500000000
Single‑thread time (ms): 410

Key factors influencing performance are total task count, per‑task execution time, and parallelism level. Properly tuning the split threshold and avoiding blocking operations are essential for achieving the expected speed‑up.

Conclusion

ForkJoinPool implements the divide‑and‑conquer model for parallel computation. Its advantages stem from task splitting and work‑stealing, but developers must ensure tasks are pure computations, choose an appropriate split granularity, and be cautious with the shared commonPool to avoid performance regressions.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

java concurrency ThreadPool ForkJoinPool DivideAndConquer ParallelComputing

Written by

IT Architects Alliance

Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.