With a basic Minimax search, it seems easy to use an OMP For to split up the work between multiple threads. For instance -
#pragma omp parallel for
for (each child node)
{
val = minimax(child, depth - 1, FALSE);
bestValue = max(bestValue, val);
}
However, it seems like this will be impossible with Alpha-Beta pruning, at least with my understanding.
#pragma omp parallel for
for (each child node)
{
α = max(α, alphabeta(child, depth - 1, α, β, FALSE));
if (β ≤ α)
break;
}
In OpenMP, it is a requirement that a For loop may only have a single entry/exit point if the loop is to be made parallel. However, Alpha-Beta pruning breaks this rule, as it is possible to break out of the loop whenever pruning needs to be accomplished (in the pseudo code above, this will happen when β is less than or equal to α).
So my question is, is there some way to work around this limitation of OpenMP? I would like to run my Alpha-Beta search in parallel using OpenMP, but this limitation has me stumped at the moment.
As a first idea you can calculate the root moves in parallel (for example with #pragma omp parallel for schedule(dynamic, 1), so that every thread gets its own root move and after that, like a task pool, choose the next that no thread has touched. Because you have to calculate every root move, there is no problem with a break where your threads have to leave the loop. If a thread is ready with it's root move you can update the "best value" and use it for the next root moves. It has to be a shared value and you have to guard it with an #pragma omp critical for example.
If you have only 2-4 working threads that's an satisfying approach. But it has disadvantages when there are only a few root-moves in a position or you have a very good sorting function for the moves. If the best move is calculate at first, in the sequential case the other moves will be calculate very fast because of the cut-offs. So it is a waste of calculate power, if the other threads search parallel other root moves before the value of the probably "best move" is known. It is possible to ordering the moves so efficient, that in over 99% of the cases the best move will be calculate at fist. This works with helpers called "history table", "killer moves" and "null move prunging".
As a state of the art parallelization you can use the Young Brothers Wait Concept (YBWC). There, like in the Principal Variation Splitting (PVS), in every node the first move is calculate sequential (!) and only if there is no cut-off the alternative moves (brothers) will be calculate in parallel. The brothers have the already calculated value of the first node to make fast cut-offs and only try to beat it. Hardly every open source engine in the top 10 use its own variation of that, but its not easy to implement, because you can spawn tasks in every node in different depths so its not easy to formulate with the standard OpenMP constructs. The new OpenMP 3.1 tasks-construct can help, but I'm not sure about that. As an first place to go you can visit https://www.chessprogramming.org/Parallel_Search to read in the topic.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With