Accessing an array out of bounds, but returning earlier - UB?

I have code that calculates an array index, and if it is valid accesses that array item. Something like:

int b = rowCount() - 1;
if (b == -1) return;
const BlockInfo& bi = blockInfo[b];

I am worried that this might be triggering undefined behavior. For example, the compiler might assume that b is always non-negative, since I use it to index the array, so it will optimize the if clause away.

Under which circumstances is it safe to "access" an array out-of-bounds, when you do nothing with the invalid result? Does it change if blockInfo is not an actual array, but a container like a vector? If this is unsafe, could I fix it by putting the access in an else clause?

if (b == -1) {
    return;
} else {
    const BlockInfo& bi = blockInfo[b];
}

Lastly, are there compiler flags in the spirit of -fno-strict-aliasing or -fno-delete-null-pointer-checks that make the compiler "do the obvious thing" and prevent any unwanted behavior?

For clarification: My concern is specifically because of a different issue, where you intend to test whether a pointer is non-null before accessing it. The compiler turns this around and reasons that, since you are dereferencing it, it cannot have been null! Something like this (untested):

void someFunc(struct MyStruct *s) {
    if (s != NULL) {
        std::cout << s->someField << std::endl;
        delete s;
    }
}

I recall hearing that simply forming an out-of-bounds array access is UB in C++. Thus the compiler could legally assume the array index is not out of bounds, and remove checks to the contrary.

jdm asked Sep 03 '25 14:09


1 Answer

There is no access to blockInfo[-1] in your program. Your code specifically prohibits that.
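
To make that concrete, here is a minimal sketch of the guarded pattern. The BlockInfo struct, its id field, and the lastBlock function are hypothetical stand-ins for the code in the question. The guard returns before the subscript expression is ever evaluated, so an empty container never leads to an out-of-bounds access.

```cpp
#include <vector>

// Hypothetical stand-in for the BlockInfo type in the question.
struct BlockInfo {
    int id;
};

// Returns a pointer to the last block, or nullptr when there are no blocks.
const BlockInfo* lastBlock(const std::vector<BlockInfo>& blockInfo) {
    int b = static_cast<int>(blockInfo.size()) - 1;
    if (b == -1) {
        return nullptr;  // the subscript below is never evaluated
    }
    const BlockInfo& bi = blockInfo[b];  // reached only when b >= 0
    return &bi;
}
```

Calling lastBlock on an empty vector yields nullptr; on a non-empty vector it yields the address of the final element.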


For example, the compiler might assume that b is always non-negative, since I use it to index the array, so it will optimize the if clause away.

No, it cannot do that, precisely because -1 may or may not be a valid index. The language does let you pass -1 as a subscript. For a container such as std::vector, it is first converted to the container's size_type (usually std::size_t) with the well-defined unsigned wrap-around logic that comes with doing so. For a built-in array or pointer, it is ordinary pointer arithmetic, which is valid whenever the result still points into the same array. So there is not, and cannot be, any rule whereby the compiler is permitted to assume that you will never pass int -1 as an index.
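
A small sketch of why -1 is not an inherently invalid subscript. The function names below are made up for illustration. The first shows the well-defined conversion of -1 to std::size_t; the second shows that, for a raw pointer, a negative subscript is plain pointer arithmetic and is fine as long as the result stays inside the same array.

```cpp
#include <cstddef>
#include <limits>

// Converting -1 to an unsigned type is well defined: it wraps to the maximum value.
bool minusOneWrapsToMax() {
    int b = -1;
    return static_cast<std::size_t>(b) == std::numeric_limits<std::size_t>::max();
}

// With a built-in pointer, p[-1] means *(p - 1). This is valid only when
// p points at least one element past the start of an array.
int elementBefore(const int* p) {
    return p[-1];
}
```

For instance, given int data[3] = {10, 20, 30}, the call elementBefore(&data[1]) reads data[0]. Passing &data[0] instead would be an out-of-bounds access.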

Even if there were, it'd still make no sense to let the compiler completely ignore the if statement. If it could, if our if statements were not reliable, every program in the world would be unsafe! There'd be no way to enforce any of your operations' preconditions.


The compiler may only skip or re-order things when it can prove that doing so preserves the observable behaviour of your original program, for every input on which that program is well-defined.

In fact, this is where UB comes from: where proving correctness is really difficult, that's usually where the standard throws compilers a bone and says something is "undefined" and the compiler can just do whatever it likes.

One interesting example of this is kind of the opposite of your case, where a check is [erroneously] placed after the access, and the compiler therefore assumes the check passes, whether it actually did or not:

void foo(char* ptr)
{
   char x = *ptr;
   if (ptr)
      bar();
   else
      baz();
}

The function foo may call bar() even if ptr is null! That might sound unlikely to you, but it actually does happen (e.g. this crash in a widely-used library).
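
For contrast, a sketch of the same function with the check placed before the dereference. Here bar() and baz() are hypothetical stand-ins that just record which branch ran, and fooChecked is a made-up name. With this ordering the dereference is reached only for a non-null pointer, so the compiler has no grounds to discard either branch.

```cpp
// Hypothetical stand-ins for bar() and baz(); they record which branch ran.
int lastCalled = 0;
void bar() { lastCalled = 1; }
void baz() { lastCalled = 2; }

// The null check comes first, so *ptr is evaluated only when ptr is non-null.
void fooChecked(char* ptr)
{
   if (ptr) {
      char x = *ptr;
      (void)x;  // silence the unused-variable warning
      bar();
   } else {
      baz();
   }
}
```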


could I fix it by putting the access in an else clause?

Those two pieces of code are semantically equivalent; it's the same program.
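
As an illustration, here are the two forms side by side, using a std::vector<int> and made-up function names in place of the question's blockInfo code. In both, the subscript is reached only when b is not -1, so they behave identically for every input.

```cpp
#include <vector>

// Form 1: early return, with the access on the following line.
int lastOrZeroEarlyReturn(const std::vector<int>& values) {
    int b = static_cast<int>(values.size()) - 1;
    if (b == -1) return 0;
    return values[b];
}

// Form 2: the same access moved into an else clause.
int lastOrZeroWithElse(const std::vector<int>& values) {
    int b = static_cast<int>(values.size()) - 1;
    if (b == -1) {
        return 0;
    } else {
        return values[b];
    }
}
```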


Lastly, are there compiler flags in the spirit of -fno-strict-aliasing or -fno-delete-null-pointer-checks that make the compiler "do the obvious thing" and prevent any unwanted behavior?

The compiler already does the obvious thing, as long as "obvious" is "according to the C++ standard".

Asteroids With Wings answered Sep 05 '25 03:09