I have the following code snippet in C that declares multiple one-dimensional and two-dimensional arrays of type double.
double func(double alphas[], double betas[], double rhos[], double **X, unsigned int n) {
double result = 0;
double Z[3][n], inv[6];
double ZtZ[n][6];
for (unsigned int i = 0;i<n;i++){
memset(ZtZ[i],0,(size_t)6 * sizeof(double));
}
double invxZtZ[n];
memset(invxZtZ,0,(size_t)n*sizeof(double));
double arr[n];
memset(arr,0,(size_t)n * sizeof(double));
// Perform a series of steps to compute the value of the variable result.
return result;
}
So, when I run the code with the function having the above array declarations at the beginning of the function, I get the error of Segmentation Fault (Core dumped) in the following line memset(invxZtZ,0,(size_t)n*sizeof(double));
, as shown by gdb.
I believe all the arrays, except for inv[6]
, declared in the above code snippet are VLA (Variable-length arrays) because of the inclusion of n
as one of the dimensions. And, here the value of n
is 112420
. So, taking into account all the sizes of arrays declared in the above code, it actually exceeds the stack size on my WSL2 system, which is `8192 kilobytes'.
Now, it is possible to dynamically allocate memory for the above arrays using malloc/calloc, where they will be allocated in the Heap Memory.
But the problem, the above function will be called repeatedly by another function written in C, which is an implementation of a numerical optimization algorithm. So, that function will repeatedly call the above function until convergence of that optimization algorithm. In doing so, allocating memory using malloc/calloc and then freeing up memory within the function call adds extra overhead of running time during consecutive function calls. So, I wanted to avoid that extra overhead of running time by not dynamically allocating memory using malloc/calloc when the above function is repeatedly called by the optimizing function.
So, could you please suggest what the alternative is in declaring the above arrays without the need to call malloc/calloc (dynamic memory allocation), to avoid additional overhead of such allocation, and instead declare arrays in such a way that they also do not CROSS the limit of stack size? Please provide a resolution to this issue.
To avoid allocating memory in all cases for the VLAs, especially for smaller values of n
, you can define the VLAs as pointers to a local array up to a given threshold and into allocated memory beyond:
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#define ALLOC_THRESHOLD 1024 // adjust for your default stack size
double func(double alphas[], double betas[], double rhos[], double **X, unsigned int n) {
double result = 0;
double inv[6];
//double Z[3][n], ZtZ[n][6], invxZtZ[n], arr[n];
double local_array[ALLOC_THRESHOLD * (3 + 6 + 1 + 1)];
double (*Z)[n], (*ZtZ)[6], *invxZtZ, *arr;
double *array_ptr = local_array;
if (n <= ALLOC_THRESHOLD) {
memset(array_ptr, 0, sizeof(double) * n * (3 + 6 + 1 + 1));
} else {
// calloc is more efficient than malloc+memset
array_ptr = calloc(sizeof(double), n * (3 + 6 + 1 + 1));
assert(array_ptr);
}
Z = (double(*)[n])array_ptr;
ZtZ = (double(*)[6])array_ptr + (3 * n);
invxZtZ = array_ptr + ((3 + 6) * n);
arr = array_ptr + ((3 + 6 + 1) * n);
// Perform a series of steps to compute the value of the variable result.
if (n > ALLOC_THRESHOLD) free(array_ptr);
return result;
}
I would suggest having a fixed limit on n
and then use statically allocated arrays:
#define MAX_N 200000u
static double invxZtZ[MAX_N];
static double arr[MAX_N];
static double Z[3][MAX_N];
static double ZtZ[MAX_N][6];
double func(double alphas[], double betas[], double rhos[], double **X, unsigned int n) {
if(n > MAX_N) EXIT_WITH_ERROR();
double result = 0;
double inv[6];
for (unsigned int i = 0;i<n;i++){
memset(ZtZ[i],0,(size_t)6 * sizeof(double));
}
memset(invxZtZ,0,(size_t)n*sizeof(double));
memset(arr,0,(size_t)n * sizeof(double));
// Perform a series of steps to compute the value of the variable result.
return result;
}
Note that the use of statically allocated arrays means that the function becomes non-reentrant (it cannot be called recursively or from multiple threads at the same time).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With