Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ForkJoinPool vs Plain Recursion

After reading about ForkJoinPool, I tried an experiment to test how fast actually ForkJoinPool is, when compared to plain recursion.

I calculated the number of files in a folder recursively, and to my surprize, plain recursion performed way better than ForkJoinPool

Here's my code.

Recursive Task

class DirectoryTask extends RecursiveTask<Long> {

    private Directory directory;

    @Override
    protected Long compute() {
        List<RecursiveTask<Long>> forks = new ArrayList<>();
        List<Directory> directories = directory.getDirectories();
        for (Directory directory : directories) {
            DirectoryTask directoryTask = new DirectoryTask(directory);
            forks.add(directoryTask);
            directoryTask.fork();
        }
        Long count = directory.getDoumentCount();
        for (RecursiveTask<Long> task : forks) {
            count += task.join();
        }
        return count;
    }
}

Plain Recursion

private static Long getFileCount(Directory directory) {
        Long recursiveCount = 0L;
        List<Directory> directories = directory.getDirectories();
        if (null != directories) {
            for (Directory d : directories) {
                recursiveCount += getFileCount(d);
            }
        }
        return recursiveCount + directory.getDoumentCount();
    }

Directory Object

class Directory {

    private List<Directory> directories;
    private Long doumentCount = 0L;

    static Directory fromFolder(File file) {
        List<Directory> children = new ArrayList<>();
        Long documentCount = 0L;
        if (!file.isDirectory()) {
            throw new IllegalArgumentException("Only directories are allowed");
        }
        String[] files = file.list();
        if (null != files) {
            for (String path : files) {
                File f = new File(file.getPath() + File.separator + path);
                if (f.isHidden()) {
                    continue;
                }
                if (f.isDirectory()) {
                    children.add(Directory.fromFolder(f));
                } else {
                    documentCount++;
                }
            }
        }
        return new Directory(children, documentCount);
    }
}

Results

  • Plain Recursion: 3ms
  • ForkJoinPool: 25ms

Where's the mistake here?

I am just trying to understand whether there is a particular threshold, below which plain recursion is faster than a ForkJoinPool.

like image 563
Rachit Sachdeva Avatar asked Oct 26 '25 08:10

Rachit Sachdeva


1 Answers

Nothing in life comes for free. If you had to move one beer crate from your car to your apartment - what is quicker: carrying it there manually, or first going to the shed, to get the wheelbarrow to use that to move that one crate?

Creating thread objects is a "native" operation that goes down into the underlying Operating System to acquire resources there. That can be a rather costly operation.

Meaning: just throwing "more threads" at a problem doesn't automatically speed things up. To the contrary. When your task is mainly CPU-intensive, there might be small gain from doing things in parallel. When you are doing lots of IO, then having multiple threads allows you to do "less" waiting overall; thus improving your throughput.

In other words: Fork/Join requires considerable activity before it does the real job. Using it for computations that only require a few ms is simply overkill. Thus: you would be looking to "fork/join" operations that work on larger data sets.

For further reading, you might look at parallel streams. Those are using the fork/join framework under the covers; and surprise, it is a misconception to expect arbitrary parallelStream to be "faster" than ordinary streams, too.

like image 141
GhostCat Avatar answered Oct 28 '25 21:10

GhostCat



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!