This is probably a pretty easy question, but as I never worked with threads before I figured it would be best to ask instead of trying to find the optimal solution completely on my own.
I have a giant for
loop that runs literally billions of times. On each on loop run, according to the current index
, the program calculates a final result in the form of a number. I am only interested in storing the top result
(or top x results), and its corresponding index.
My question is simple, what would be the right way running this loop in threads so it uses all the available CPUs/cores.
int topResultIndex;
double topResult = 0;
for (i=1; i < 1000000000; ++i) {
double result = // some complicated calculation based on the current index
if (result > topResult) {
topResult = result;
topResultIndex = i;
}
}
The calculation is completely independent for each index, no resources are shared. topResultIndex
and topResult
will be obviously accessed by each thread though.
* Update: Both Giulio's and rolfl's solution are good, also very similar. Could only accept one of them as my answer.
Let's assume that the result is computed by a calculateResult(long)
method, which is private and static, and does not access any static field, (it can also be non-static, but still it must be thread-safe and concurrently-executable, hopefully thread-confined).
Then, I think this will do the dirty work:
public static class Response {
int index;
double result;
}
private static class MyTask implements Callable<Response> {
private long from;
private long to;
public MyTask(long fromIndexInclusive, long toIndexExclusive) {
this.from = fromIndexInclusive;
this.to = toIndexExclusive;
}
public Response call() {
int topResultIndex;
double topResult = 0;
for (long i = from; i < to; ++i) {
double result = calculateResult(i);
if (result > topResult) {
topResult = result;
topResultIndex = i;
}
}
Response res = new Response();
res.index = topResultIndex;
res.result = topResult;
return res;
}
};
private static calculateResult(long index) { ... }
public Response interfaceMethod() {
//You might want to make this static/shared/global
ExecutorService svc = Executors.newCachedThreadPool();
int chunks = Runtime.getRuntime().availableProcessors();
long iterations = 1000000000;
MyTask[] tasks = new MyTask[chunks];
for (int i = 0; i < chunks; ++i) {
//You'd better cast to double and round here
tasks[i] = new MyTask(iterations / chunks * i, iterations / chunks * (i + 1));
}
List<Future<Response>> resp = svc.invokeAll(Arrays.asList(tasks));
Iterator<Future<Response>> respIt = resp.iterator();
//You'll have to handle exceptions here
Response bestResponse = respIt.next().get();
while (respIt.hasNext()) {
Response r = respIt.next().get();
if (r.result > bestResponse.result) {
bestResponse = r;
}
}
return bestResponse;
}
From my experience, this division in chunks is much faster that having a task for each index (especially if the computational load for each single index is small, like it probably is. By small, I mean less than half a second). It's a bit harder to code, though, because you need to make a 2-step maximization (first at chunk-level, then at a global level). With this, if the computation is purely cpu-based (does not push the ram too much) you should get a speedup almost equal to 80% the number of physical cores.
Apart from the observation that a C program with OpenMP or some other parallel computing extensions would be a better idea, the Java way to do it would be to create a 'Future' Task that calculates a subset of the problem:
private static final class Result {
final int index;
final double result;
public Result (int index, double result) {
this.result = result;
this.index = index;
}
}
// Calculate 10,000 values in each thead
int steps = 10000;
int cpucount = Runtime.getRuntime().availableProcessors();
ExecutorService service = Executors.newFixedThreadPool(cpucount);
ArrayList<Future<Result>> results = new ArrayList<>();
for (int i = 0; i < 1000000000; i+= steps) {
final int from = i;
final int to = from + steps;
results.add(service.submit(new Callable<Result>() {
public Result call() {
int topResultIndex = -1;
double topResult = 0;
for (int j = from; j < to; j++) {
// do complicated things with 'j'
double result = // some complicated calculation based on the current index
if (result > topResult) {
topResult = result;
topResultIndex = j;
}
}
return new Result(topResultIndex, topResult);
}
});
}
service.shutdown();
while (!service.isTerminated()) {
System.out.println("Waiting for threads to complete");
service.awaitTermination(10, TimeUnit.SECONDS);
}
Result best = null;
for (Future<Result> fut : results) {
if (best == null || fut.result > best.result) {
best = fut;
}
}
System.out.printf("Best result is %f at index %d\n", best.result, best.index);
Future<Result>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With