I am trying to use OpenMP tasks to schedule a tiled execution of a basic jacobi2d computation. In jacobi2d, the new value of A(i,j) depends on
A(i, j)
A(i-1, j)
A(i+1, j)
A(i, j-1)
A(i, j+1).
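For reference, the untiled update behind these dependences can be sketched roughly as below. The function name jacobi2d_step and the 0.2 weight are assumptions for illustration (the usual 5-point average), not taken from my actual code; B receives the new values computed from A.

```c
/* Illustrative 5-point Jacobi step: B(i,j) is computed from A(i,j)
   and its four direct neighbours. Border elements are left untouched.
   The name and the 0.2 weight are assumptions for this sketch. */
void jacobi2d_step(int n, double A[n][n], double B[n][n])
{
    for (int i = 1; i < n - 1; ++i)
        for (int j = 1; j < n - 1; ++j)
            B[i][j] = 0.2 * (A[i][j] + A[i - 1][j] + A[i + 1][j]
                             + A[i][j - 1] + A[i][j + 1]);
}
```

A tile of B therefore needs its own tile of A plus one row or column from each neighbouring tile, which is what the depend clauses further down try to express.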
As far as I understand the depend clause, I am declaring the dependences correctly, but they are not being respected when the code executes. I have copied a simplified piece of the code below. Initially my guess was that the out-of-bounds ranges for some tiles might be causing this, so I corrected that, but the issue persists. (I have not copied the longer code with corrected tile ranges, as that part is just a bunch of ifs + max.)
int n=8,tsteps=2,b=4; //n - size of matrix, tsteps - time iterations, b - tile size or block size
int t,i,j;            //loop counters (A and B are the n x n arrays, declared elsewhere)
#pragma omp parallel
{
    #pragma omp master
    for (t=0; t<tsteps; ++t)
    {
        //First sweep: read tiles of A (plus halo), write tiles of B
        for (i=0; i<n; i+=b)
            for (j=0; j<n; j+=b)
            {
                #pragma omp task firstprivate(t,i,j) depend(in:A[i-1:b+2][j-1:b+2]) depend(out:B[i:b][j:b])
                {
                    #pragma omp critical
                    printf("t-%d i-%d j-%d --A",t,i,j); //Prints out time loop, i,j
                }
            }
        //Second sweep: read tiles of B (plus halo), write tiles of A
        for (i=0; i<n; i+=b)
            for (j=0; j<n; j+=b)
            {
                #pragma omp task firstprivate(t,i,j) depend(in:B[i-1:b+2][j-1:b+2]) depend(out:A[i:b][j:b])
                {
                    #pragma omp critical
                    printf("t-%d i-%d j-%d --B",t,i,j); //Prints out time loop, i,j
                }
            }
    }
}
The idea behind declaring the dependence starting from i-1 and j-1 with a range of b+2 is that the neighbouring tiles also affect the current tile's calculations. Similarly, in the second set of loops, values in A should only be overwritten once the neighbouring tiles have used them.
The code is compiled with gcc 5.3, which supports OpenMP 4.0.
PS: in the array section syntax used above, the first number is the starting position and the second is the number of indices to be considered when building the dependence graph.
Edit (based on Zulan's comment): I changed the inner code to a simple print statement, as this suffices to check the order of task execution. Ideally, for the above values (since there are only 4 tiles), all tiles should complete the first printf before any of them executes the second. But if you run the code, the order is mixed.
So I finally figured out the issue. Even though the OpenMP spec describes an array section in the depend clause as a starting point plus a range, this has not been implemented in gcc yet. Currently it only compares the starting point of each item in the depend clause, which for depend(in:A[i-1:b+2][j-1:b+2]) is A[i-1][j-1].
Initially I was comparing elements at different relative tile positions, e.g. the (0,0) element against the last element of a tile. Those starting points never matched, so no conflicting dependence was found, hence the random order of execution of the tasks.
The current gcc implementation does not take the range given in the clause into account at all.
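One possible workaround, given that only starting points are matched, is to make every dependence an exact single-element match: one token per tile, with each task listing its own tile and its four neighbouring tiles explicitly. The sketch below is only an illustration under that assumption. The token arrays tokA/tokB, the padding by one tile on each side (so border tiles do not reference out-of-range elements), and the helper names record, run_tiles and order_respected are all made up for this example and are not part of the code in the question. As far as I understand OpenMP 4.0, list items in depend clauses of sibling tasks are expected to refer to identical or disjoint storage anyway, which this layout satisfies.

```c
#include <stdio.h>

#define N      8            /* matrix size, as in the question */
#define TILE   4            /* tile size (b in the question) */
#define NT     (N / TILE)   /* tiles per dimension */
#define TSTEPS 2

/* Hypothetical dependence tokens: one per tile, padded by one tile on
   every side so the neighbours of border tiles are valid elements. */
static char tokA[NT + 2][NT + 2], tokB[NT + 2][NT + 2];

/* seq[t][phase][ti][tj]: position at which that task actually ran. */
static int seq[TSTEPS][2][NT][NT];
static int counter;

static void record(int t, int phase, int ti, int tj)
{
    #pragma omp critical
    seq[t][phase][ti][tj] = counter++;
}

void run_tiles(void)
{
    counter = 0;
    #pragma omp parallel
    #pragma omp master
    for (int t = 0; t < TSTEPS; ++t) {
        /* Phase 0: reads A (own tile + 4 neighbours), writes B. */
        for (int ti = 1; ti <= NT; ++ti)
            for (int tj = 1; tj <= NT; ++tj) {
                #pragma omp task firstprivate(t, ti, tj) \
                    depend(in: tokA[ti][tj], tokA[ti-1][tj], tokA[ti+1][tj], \
                               tokA[ti][tj-1], tokA[ti][tj+1]) \
                    depend(out: tokB[ti][tj])
                record(t, 0, ti - 1, tj - 1);
            }
        /* Phase 1: reads B (own tile + 4 neighbours), writes A. */
        for (int ti = 1; ti <= NT; ++ti)
            for (int tj = 1; tj <= NT; ++tj) {
                #pragma omp task firstprivate(t, ti, tj) \
                    depend(in: tokB[ti][tj], tokB[ti-1][tj], tokB[ti+1][tj], \
                               tokB[ti][tj-1], tokB[ti][tj+1]) \
                    depend(out: tokA[ti][tj])
                record(t, 1, ti - 1, tj - 1);
            }
    }
}

/* Returns 1 if all tasks ran and every task ran after the tasks that
   produce the tiles it reads (own tile and direct neighbours). */
int order_respected(void)
{
    static const int d[5][2] = {{0,0}, {-1,0}, {1,0}, {0,-1}, {0,1}};
    if (counter != TSTEPS * 2 * NT * NT) return 0;
    for (int t = 0; t < TSTEPS; ++t)
        for (int ti = 0; ti < NT; ++ti)
            for (int tj = 0; tj < NT; ++tj)
                for (int k = 0; k < 5; ++k) {
                    int ni = ti + d[k][0], nj = tj + d[k][1];
                    if (ni < 0 || ni >= NT || nj < 0 || nj >= NT) continue;
                    if (seq[t][1][ti][tj] < seq[t][0][ni][nj]) return 0;
                    if (t + 1 < TSTEPS &&
                        seq[t + 1][0][ti][tj] < seq[t][1][ni][nj]) return 0;
                }
    return 1;
}
```

Calling run_tiles() and then order_respected() from main (compiled with -fopenmp) is one way to check that the ordering holds. Note that with a 5-point stencil only the direct neighbours are listed, so diagonal tiles may still overlap between the two phases; add the four corner tokens to the in lists if the real kernel reads them.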