
OpenMP reduction on a large heap array causes a segmentation fault

When I try to reduce a large heap-allocated array with an OpenMP reduction, the program crashes with a segmentation fault:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
  double *test = NULL;
  size_t size = (size_t)1024 * 1024 * 16; // large enough to overflow openmp threads' stack

  test = malloc(size * sizeof(double));

#pragma omp parallel reduction(+ : test[0 : size]) num_threads(2) 
  {
    test[0] = 0;
#pragma omp critical
    {
      printf("frame address: %p\n", __builtin_frame_address(0));
      printf("test: %p\n", test);
    }
  }
  free(test);
  printf("Allocated %zu doubles\n\n", size);
}

Please note that double *test is allocated on the heap, so this question is not a duplicate of this and this.

The example works with a small array but crashes with a segmentation fault for a large one. The array is allocated on the heap, and the system has enough memory.

Those are similar issues, but here the segmentation fault still happens even though the array is allocated on the heap.

The same issue has been reported in other communities:

https://community.intel.com/t5/Intel-Fortran-Compiler/Segmentation-fault-when-using-large-array-with-OpenMP/m-p/758829

https://forums.oracle.com/ords/apexds/post/segmentation-fault-with-large-arrays-and-openmp-1728

but all the solutions I found are about increasing the OpenMP stack size.

asked Dec 02 '25 18:12 by LXYan


1 Answer

I thought there should be a real solution, so I filed a bug in GCC's Bugzilla:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118909

Thanks to Jakub Jelinek, the cause is now clear: most compilers allocate privatized data on the stack, which is good for performance. In the example above, each thread's private copy of the array takes 128 MiB (16 Mi doubles of 8 bytes each), which does not fit in the thread's stack. If you do need large privatized data, you can either increase the OpenMP stack size by setting the OMP_STACKSIZE environment variable, or use the allocate clause to specify that the data should be allocated on the heap.

So the solution is to add allocate(test), which places the privatized copy of the test array on the heap:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
  double *test = NULL;
  size_t size = (size_t)1024 * 1024 * 16;

  // calloc zero-initializes the array: the reduction adds into the original values
  test = calloc(size, sizeof(double));

#pragma omp parallel reduction(+ : test[0 : size]) num_threads(2) allocate(test)
  {
    test[0] = 0;
#pragma omp critical
    {
      printf("frame address: %p\n", __builtin_frame_address(0));
      printf("test: %p\n", test);
    }
  }
  free(test);
  printf("Allocated %zu doubles\n\n", size);
}
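
If the compiler does not support the allocate clause, the reduction can also be done by hand with per-thread heap buffers. The following is only a minimal sketch of that alternative and does not come from the bug report: the function name heap_array_reduce and the placeholder work (each thread adds 1.0 to every element) are hypothetical. It gives each thread a calloc'ed buffer and merges the buffers into the shared array inside a critical section, so nothing large is placed on a thread stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: manual element-wise sum reduction into out[0..n).
 * Each thread accumulates into its own heap buffer, then merges it into
 * `out` under a critical section.
 * Returns the number of threads that contributed, or -1 if any allocation
 * failed (in which case `out` may hold only a partial result). */
int heap_array_reduce(double *out, size_t n, int nthreads) {
  assert(out != NULL && nthreads > 0);
  int failed = 0;
  int contributors = 0;

#pragma omp parallel num_threads(nthreads)
  {
    /* Private accumulator on the heap, zero-initialized like a "+" reduction. */
    double *local = calloc(n, sizeof *local);
    if (local == NULL) {
#pragma omp atomic write
      failed = 1;
    } else {
      /* Placeholder work: each thread contributes 1.0 to every element. */
      for (size_t i = 0; i < n; ++i)
        local[i] += 1.0;

      /* Merge one thread at a time into the shared array. */
#pragma omp critical
      {
        for (size_t i = 0; i < n; ++i)
          out[i] += local[i];
        ++contributors;
      }
    }
    free(local); /* free(NULL) is a no-op */
  }
  return failed ? -1 : contributors;
}
```

The merge is serialized, so this is likely slower than the built-in reduction for many threads, but memory use is bounded only by the heap. The code avoids omp.h on purpose, so it also compiles (and runs serially) without -fopenmp.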

answered Dec 04 '25 09:12 by LXYan


