
Where does official documentation say that Java's parallel stream operations use fork/join?

Here's my understanding of the Stream framework of Java 8:

  1. Something creates a source Stream
  2. The implementation is responsible for providing a BaseStream#parallel() method, which in turn returns a Stream that can run its operations in parallel.

While someone has already found a way to use a custom thread pool with the Stream framework's parallel execution, I cannot for the life of me find any mention in the Java 8 API that the default Java 8 parallel Stream implementations use ForkJoinPool#commonPool(). (By default implementations I mean Collection#parallelStream(), the methods in the StreamSupport class, and any other sources of parallel-enabled streams in the API that I don't know about.)

The only tidbits I could glean from search results were these:

  • State of the Lambda: Libraries Edition ("Parallelism under the hood")
    Vaguely mentions the Stream framework and the Fork/Join machinery.

    The Fork/Join machinery is designed to automate this process.

  • JEP 107: Bulk Data Operations for Collections
    Almost directly states that the Collection interface's default method #parallelStream() is implemented using Fork/Join. But still nothing about the common pool.

    The parallel implementation builds upon the java.util.concurrency Fork/Join implementation introduced in Java 7.

    and hence: Collection#parallelStream().

  • Class Arrays (Javadoc)
    Directly states multiple times that the common pool is used.

    The ForkJoin common pool is used to execute any parallel tasks.


So my question is:

Where is it said that the ForkJoinPool#commonPool() is used for parallel operations on streams that are obtained from the Java 8 API?

Gima asked Sep 07 '25 08:09


1 Answer

W.r.t. where is it documented that Java 8 parallel streams use FJ Framework?

AFAIK (as of Java 1.8u5), the Javadoc of parallel streams does not mention that the common ForkJoinPool is used.

But it is mentioned in the ForkJoin documentation at the bottom of http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
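Since the Javadoc is silent, the behaviour can at least be observed empirically. The following is a minimal sketch, not a statement of the specification: it runs a parallel stream and records, for each element, whether the executing thread belongs to ForkJoinPool.commonPool() or is the calling thread. The class name CommonPoolProbe and the method observeExecutors are made up for illustration; ForkJoinTask.getPool() and ForkJoinPool.commonPool() are the real java.util.concurrent calls.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.stream.IntStream;

// Hypothetical probe class, for illustration only.
class CommonPoolProbe {

    /** Runs a parallel stream and records where each element was processed. */
    public static Set<String> observeExecutors() {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        Thread caller = Thread.currentThread();
        IntStream.range(0, 10_000).parallel().forEach(i -> {
            // getPool() returns the pool hosting the current thread,
            // or null if the thread is not a fork/join worker.
            ForkJoinPool pool = ForkJoinTask.getPool();
            if (pool == ForkJoinPool.commonPool()) {
                seen.add("commonPool");
            } else if (Thread.currentThread() == caller) {
                seen.add("caller");
            } else {
                seen.add("other");
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        // Which labels appear depends on the machine and the JDK.
        System.out.println(observeExecutors());
    }
}
```

When started from an ordinary (non fork/join) thread, I would expect only "commonPool" and/or "caller" to show up, never "other". That is an observation about current implementations rather than a documented guarantee.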

W.r.t. replacing the Thread pool

My understanding is that you can use a custom ForkJoinPool instead of the common one (see "Custom thread pool in Java 8 parallel stream"), but not a custom thread pool that differs from the ForkJoin implementation. I have an open question about this: "How to (globally) replace the common thread pool backend of Java parallel streams?"
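The custom ForkJoinPool approach can be sketched roughly as follows. It relies on the observed (and, as far as I know, undocumented) behaviour that a parallel stream whose terminal operation is invoked from a fork/join worker thread forks its subtasks into that worker's pool. The class and method names are hypothetical; only the java.util.concurrent and java.util.stream calls are real API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.stream.IntStream;

// Hypothetical sketch class, for illustration only.
class CustomPoolSketch {

    /** Sums 1..n with a parallel stream started from inside a dedicated pool. */
    public static long sumInCustomPool(int n, int parallelism) {
        Callable<Long> task =
                () -> IntStream.rangeClosed(1, n).asLongStream().parallel().sum();
        return runIn(new ForkJoinPool(parallelism), task);
    }

    /** Checks that every element was processed by a worker of the custom pool. */
    public static boolean staysInCustomPool(int parallelism) {
        ForkJoinPool customPool = new ForkJoinPool(parallelism);
        Callable<Boolean> task = () -> IntStream.range(0, 10_000).parallel()
                .allMatch(i -> ForkJoinTask.getPool() == customPool);
        return runIn(customPool, task);
    }

    // Submits the task so that the terminal operation runs on a pool worker,
    // waits for the result, and shuts the pool down afterwards.
    private static <T> T runIn(ForkJoinPool pool, Callable<T> task) {
        try {
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInCustomPool(1000, 4));
        System.out.println(staysInCustomPool(4));
    }
}
```

Note that this only swaps one ForkJoinPool for another; it does not let you plug in an arbitrary ExecutorService.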

W.r.t. replacing the Streams api

You may want to check out https://github.com/nurkiewicz/LazySeq, which is a more Scala-like streams implementation. Very nice, very interesting.

PS (w.r.t. ForkJoin and Streams)

If you are interested: I stumbled across some issues with the use of the FJ pool. See, e.g.:

  • Nested Java 8 parallel forEach loop perform poor. Is this behavior expected?
  • Using a semaphore inside a nested Java 8 parallel stream action may DEADLOCK. Is this a bug?
Christian Fries answered Sep 10 '25 12:09