Does OpenMP atomic apply to line, to variable name or to actual memory address?

As I understand the explanations on the web, the OpenMP atomic directive in C++ applies to a specific memory location, designated by some variable (or a pointer to it?). So when I access this location on different lines of code within a parallelized for-loop, are all of those accesses protected, or does atomic only protect the one line it precedes and ignore other lines that access the same memory location?

For example, consider the following piece of code:

const int N = 10000;  // just some big number (const, so the array size is a constant expression)
float a[N] = {};      // a big array, zero-initialized
#pragma omp parallel for
for(int i = 1; i < N-1; i++) {
    #pragma omp atomic
    a[i-1] += 0.5f;
    #pragma omp atomic
    a[i]   += 1.0f;
    #pragma omp atomic
    a[i+1] += 0.5f;
}

In every loop iteration, the same array is accessed at three points: index i-1, index i and index i+1. In different threads, however, the i-1 line may evaluate to the same index as either the i or the i+1 line. For instance, when i==1 in thread 1 and i==3 in thread 2, the third access line in thread 1 and the first access line in thread 2 both touch a[2], possibly at the same time.

Will atomic protect these different lines if they happen to access the same memory location? Or does it only apply to one line and would the only solution be to incorporate all three accesses into one line (e.g. by putting i-1, i and i+1 in a second array and making a second for-loop that loops over them)?

asked Dec 13 '25 05:12 by egpbos


2 Answers

From the OpenMP standard 3.1 (section 2.8.5):

The atomic construct ensures that a specific storage location is accessed atomically, rather than exposing it to the possibility of multiple, simultaneous reading and writing threads that may result in indeterminate values.

So, to give you a brief answer to:

Will atomic protect these different lines if they happen to access the same memory location?

Yes, it will.

But let me elaborate a bit more. According to the standard, the syntax of the construct is the following:

#pragma omp atomic new-line 
   expression-stmt

where expression-stmt has the form:

x binop= expr
x++
++x
x--
--x

Of course you have a few restrictions:

  1. x must be an lvalue expression with scalar type
  2. expr is an expression with scalar type, and it does not reference the object designated by x
  3. binop is not an overloaded operator and is one of +, *, -, /, &, ^, |, <<, or >>
  4. all atomic references to the storage location x throughout the program are required to have a compatible type
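As a rough illustration of the statement forms listed above, the following sketch applies each of them to one shared integer. The function name atomic_forms_demo and its parameter are hypothetical and chosen only for this example. Each iteration adds 2, then increments twice and decrements twice, so the net change per iteration should be 2.

```cpp
// Hypothetical demo (not from the original answer): one shared scalar,
// updated with each of the statement forms that atomic accepts.
int atomic_forms_demo(int iterations) {
    int x = 0;  // scalar lvalue, shared between threads
    #pragma omp parallel for
    for (int i = 0; i < iterations; i++) {
        #pragma omp atomic
        x += 2;  // x binop= expr, where expr does not reference x
        #pragma omp atomic
        x++;     // x++
        #pragma omp atomic
        ++x;     // ++x
        #pragma omp atomic
        x--;     // x--
        #pragma omp atomic
        --x;     // --x
    }
    return x;    // net effect per iteration is +2
}
```

If the pragmas are ignored (for example when compiling without OpenMP support), the loop simply runs serially and yields the same value.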

All these constraints are fulfilled by your snippet. In particular, point 4 is of no concern, because x is always a float in your code (in other words, a[i] always yields a float). The usual example given to show a violation of point 4 is a union, as in the link posted in the other answer.
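One way to convince yourself is to run the loop from the question on a zero-initialized array and inspect the result. The sketch below does that; the name atomic_stencil is hypothetical, and std::vector is used only for convenience, with a raw pointer so that each atomic statement operates on a plain float lvalue. For n >= 4 the expected totals are 0.5 at both ends, 1.5 next to the ends, and 2.0 everywhere in between. These sums are exact in float, so the order in which threads apply them does not change the result.

```cpp
#include <vector>

// Hypothetical helper: runs the question's loop on an array of n zeros
// and returns the result so it can be checked element by element.
std::vector<float> atomic_stencil(int n) {
    std::vector<float> a(n, 0.0f);
    float *p = a.data();  // plain pointer: p[k] is a scalar float lvalue
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        #pragma omp atomic
        p[i - 1] += 0.5f;  // may hit the same element as another thread's p[i] or p[i + 1]
        #pragma omp atomic
        p[i] += 1.0f;
        #pragma omp atomic
        p[i + 1] += 0.5f;
    }
    return a;
}
```

Removing the three atomic pragmas while keeping the parallel for turns the overlapping updates into a data race, and the totals may then come out wrong on some runs.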

answered Dec 15 '25 19:12 by Massimiliano


Edit: My first try at this failed in Visual Studio 2012, although I could not think of any reason it should not work. I then changed the constants to float instead of double (0.5f instead of 0.5), and now it works. So, to answer your question: you can use atomic the way you did, as long as you use the same data type throughout (don't mix float and double). I learned this by reading http://msdn.microsoft.com/en-us/library/5fhhcxk3.aspx ("All atomic references to the storage location x throughout the program are required to have a compatible type.")

void foo_v1(float *a, const int N) {
    #pragma omp parallel for
    for(int i = 1; i < N-1; i++) {
        #pragma omp atomic
        a[i-1] += 0.5f;
        #pragma omp atomic
        a[i] += 1.0f;
        #pragma omp atomic
        a[i+1] += 0.5f;
    }
}

Below is my original answer before I realized your code was mixing types. It's a better solution anyway :-)

No, that's not going to work (see my correction above). You can check these things yourself by generating results with and without OpenMP and comparing them, and you will see it fails (see my correction above). It helps to make a table to see what's going on:

          a[0]   a[1]   a[2]   a[3]   a[4]   a[5] ....
i=1      +=0.5  +=1.0  +=0.5
i=2             +=0.5  +=1.0  +=0.5
i=3                    +=0.5  +=1.0  +=0.5
i=4                           +=0.5  +=1.0  +=0.5
  .

Notice that every element from a[2] through a[N-3] receives the same total update, a[i] += 0.5 + 1.0 + 0.5, while the two elements at each end receive only part of it. You can replace the constants (0.5, 1.0, 0.5) with an array of values if you want. Use this code.

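void foo_v2(float *a, const int N) {
    // Boundary elements receive only part of the stencil (requires N >= 4).
    a[0]   += 0.5f;
    a[1]   += 0.5f + 1.0f;
    a[N-1] += 0.5f;
    a[N-2] += 1.0f + 0.5f;

    // Each interior element is written by exactly one iteration, so no atomic is needed.
    #pragma omp parallel for
    for(int i = 2; i < N-2; i++) {
        a[i] += 0.5f + 1.0f + 0.5f;
    }
}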
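The advice above about comparing results can be sketched as a small check. The names stencil_atomic, stencil_direct and versions_agree are hypothetical; the first two are meant to mirror foo_v1 and foo_v2, and the direct version assumes n >= 4. Both start from the same input, and the increments are exactly representable in float, so exact equality is a reasonable test here.

```cpp
#include <vector>

// Mirrors foo_v1: overlapping updates, each protected by atomic.
void stencil_atomic(float *a, int n) {
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        #pragma omp atomic
        a[i - 1] += 0.5f;
        #pragma omp atomic
        a[i] += 1.0f;
        #pragma omp atomic
        a[i + 1] += 0.5f;
    }
}

// Mirrors foo_v2: boundaries handled serially, one write per interior element.
// Assumes n >= 4.
void stencil_direct(float *a, int n) {
    a[0] += 0.5f;
    a[1] += 0.5f + 1.0f;
    a[n - 1] += 0.5f;
    a[n - 2] += 1.0f + 0.5f;
    #pragma omp parallel for
    for (int i = 2; i < n - 2; i++) {
        a[i] += 0.5f + 1.0f + 0.5f;
    }
}

// Runs both versions on identical input and compares the results.
bool versions_agree(int n) {
    std::vector<float> x(n, 1.0f), y(n, 1.0f);
    stencil_atomic(x.data(), n);
    stencil_direct(y.data(), n);
    return x == y;
}
```

The direct version avoids atomic entirely, which is why it tends to be the cheaper option when the update pattern is this regular.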
