I have about 5 documents and need to do some processing on each of them. Processing here means opening the document/file, reading the data, and doing some document manipulation (editing text etc.). For the manipulation I will probably be using docx4j or Apache POI. My use case is this: I want to process these 4-5 documents in parallel, utilizing the multiple cores available on my CPU. The processing of each document is independent of the others.
What would be the best way to achieve this parallel processing in Java? I have used ExecutorService and the Thread class before, but I don't know much about newer concepts like Streams or RxJava. Can this task be achieved with parallel streams, as introduced in Java 8? Which would be better to use: Executors, Streams, or the Thread class? If Streams can be used, please provide a link to a tutorial on how to do that. Thanks for your help!
You can process the files in parallel with Java Streams using the following pattern.
List<File> files = ...
files.parallelStream().forEach(f -> process(f));
or
File[] files = dir.listFiles();
Stream.of(files).parallel().forEach(f -> process(f));
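Note: process cannot throw a checked exception in this example, because the lambda passed to forEach is a Consumer, which declares none. I suggest you either catch and log the exception inside process, or return a result object.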
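To illustrate the "return a result object" suggestion, here is a minimal sketch. The names ParallelDocs, Result, safeProcess and processAll are made up for this example, and process just reads the file's bytes as a stand-in for the real docx4j or Apache POI work. It uses Path instead of File, but the idea is the same: catch the checked exception inside a wrapper so the stream lambda never throws one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelDocs {

    /** Outcome of processing one file: a byte count, or the error that occurred. */
    static final class Result {
        final Path file;
        final long bytes;        // -1 when processing failed
        final Exception error;   // null when processing succeeded

        Result(Path file, long bytes, Exception error) {
            this.file = file;
            this.bytes = bytes;
            this.error = error;
        }

        boolean ok() { return error == null; }
    }

    /** Hypothetical stand-in for the real document work; throws a checked exception. */
    static long process(Path file) throws IOException {
        return Files.readAllBytes(file).length;
    }

    /** Wraps process() so the stream lambda never throws a checked exception. */
    static Result safeProcess(Path file) {
        try {
            return new Result(file, process(file), null);
        } catch (IOException e) {
            return new Result(file, -1, e);
        }
    }

    /** Processes all files in parallel; results keep the order of the input list. */
    static List<Result> processAll(List<Path> files) {
        return files.parallelStream()
                .map(ParallelDocs::safeProcess)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Example file names only; replace with your own documents.
        List<Path> files = Arrays.asList(Paths.get("a.docx"), Paths.get("b.docx"));
        for (Result r : processAll(files)) {
            System.out.println(r.file + " -> "
                    + (r.ok() ? r.bytes + " bytes" : "failed: " + r.error));
        }
    }
}
```

Parallel streams run on the common ForkJoinPool by default, so you don't manage threads yourself. If you need control over the number of threads, or the work spends most of its time blocked on I/O, the ExecutorService you already know may suit you better.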