
C union type punning arrays

Tags: c, c99, c11

Given the following code, I have some questions related to type punning. I do not see any way that this isn't violating strict aliasing rules, but I cannot point to the specific violation. My best guess is that passing the union members into the function violates strict aliasing.

The following code is on Compiler Explorer.

#include <stdint.h>

union my_type
{
    uint8_t m8[8];
    uint16_t m16[4];
    uint32_t m32[2];
    uint64_t m64;
};

int func(uint16_t *x, uint32_t *y)
{
    return *y += *x;
}

int main(int argc, char *argv[])
{
    union my_type mine = {.m64 = 1234567890};
    return func(mine.m16, mine.m32);
}

My observations:

  • Assuming the arguments to func do not alias each other, func does not violate strict aliasing.
  • In C, it is permissible to use a union for type punning.
  • Passing m16 and m32 into func must violate something.

My questions:

  • Is type punning with arrays like this valid?
  • What exactly am I violating by passing the pointers into func?
  • What other gotchas am I missing in this example?
Graznarak asked Oct 29 '25 14:10


2 Answers

The rule violated is C 2018 6.5.16.1 3:

If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined.

Specifically, in *y += *x, the value being stored in the object pointed to by y, mine.m32[0], is read from another object, mine.m16[0], that overlaps the storage of mine.m32[0]. The overlap is not exact, and the two objects do not have compatible types, regardless of qualifiers.

Note that this rule is for simple assignment, as in E1 = E2, whereas the code has a compound assignment, E1 += E2. However, the compound assignment E1 += E2 is defined in 6.5.16.2 3 to be equivalent to E1 = E1 + E2 except that the lvalue E1 is evaluated only once.

Is type punning with arrays like this valid?

Yes, the C standard allows aliasing via union members; reading a member other than the last one stored will reinterpret the bytes in the new type. However, this does not absolve a program of conforming to other rules if its behavior is to be defined by the C standard, notably the rule quoted above.
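As a minimal sketch of what that reinterpretation looks like, the block below stores through m64 and reads the same bytes back through m32. The helper name first_word_of is hypothetical (it is not in the question), and the static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stdint.h>

union my_type
{
    uint8_t m8[8];
    uint16_t m16[4];
    uint32_t m32[2];
    uint64_t m64;
};

/* All four members are expected to cover the same 8 bytes. */
static_assert(sizeof(union my_type) == 8, "unexpected union size");

/* Hypothetical helper: store through m64, then read the first four
   bytes of the same storage through m32. Which half of the 64-bit
   value those bytes hold depends on the platform's byte order. */
uint32_t first_word_of(uint64_t value)
{
    union my_type u = {.m64 = value};
    return u.m32[0]; /* reinterprets the bytes as uint32_t */
}
```

With a value whose bytes are all equal, such as 0x0101010101010101, the result is 0x01010101 on any byte order. With the question's 1234567890, a little-endian machine would yield 1234567890 and a big-endian one would yield 0.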

What exactly am I violating by passing the pointers into func?

No rule is violated by passing the pointers. The assignment using the pointers violates a rule, as answered above.
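One possible way to sidestep the overlapping assignment, under the reading of 6.5.16.1 3 given above, is to pass the union itself and copy the 16-bit value into a separate object before storing. This is only a sketch: func_via_union and addend are hypothetical names, not part of the question.

```c
#include <stdint.h>

union my_type
{
    uint8_t m8[8];
    uint16_t m16[4];
    uint32_t m32[2];
    uint64_t m64;
};

/* Hypothetical variant of func: it receives the union, so the overlap
   between m16 and m32 is visible inside the function. */
uint32_t func_via_union(union my_type *u)
{
    /* Copy the 16-bit value into a distinct object first, so the value
       later stored in m32[0] is not read from overlapping storage. */
    uint16_t addend = u->m16[0];

    u->m32[0] += addend;
    return u->m32[0];
}
```

Returning uint32_t instead of int is a deliberate choice here, since it avoids a narrowing conversion on the result.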

What other gotchas am I missing in this example?

If we change func:

int func(uint16_t *x, uint32_t *y)
{
    *y += 1;
    *x += 1;
    return *y;
}

then the rule in 6.5.16.1 3 does not apply, as there is no assignment involving overlapping objects. And the aliasing rules in 6.5 7 are not violated, as *y is an object defined as the type used to access it, uint32_t, and *x is an object defined as the type used to access it, uint16_t. Yet, if this function is translated in isolation (without the union definition visible), the compiler is permitted to assume *x and *y do not overlap, so it may cache the value of *y produced by *y += 1; and return that cached value, in ignorance of the fact that *x += 1; changes *y. This is a defect in the C standard.

Eric Postpischil answered Oct 31 '25 05:10


Passing m16 and m32 into func must violate something.

func(uint16_t *x, uint32_t *y) is free to assume *x and *y do not overlap, as x and y point to incompatible types (uint16_t and uint32_t). Since the referenced data does overlap in OP's code, we have a problem.

The special issues about unions and aliasing do not apply here in the body of func() as the union-ness of the calling code is lost.

Alternate "safe" code could have been:

int func(uint16_t *x, uint32_t *y)
{
    // Use volatile to prevent folding these 2 lines of code.
    // The key is that even with optimized code,
    // the sum must be done before the *y assignment.
    volatile uint32_t sum = *y + *x;
    *y = sum;

    return (int) (*y);
}

What exactly am I violating by passing the pointers into func?

Passing pointers to overlapping data that the function func() is not obliged to account for.


Is type punning with arrays like this valid?

I do not see this as an array or union issue, just one of passing pointers to overlapping data that the function func() is not obliged to account for.

What other gotchas am I missing in this example?

Minor: int may be 16-bit, potentially causing implementation-defined behavior in the conversion of uint32_t to int.
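A small sketch of one way to keep that conversion well-defined is to check the range first. The helper to_int_saturating is hypothetical, and clamping to INT_MAX is just one possible policy; returning uint32_t from func would avoid the question entirely.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: convert only when the value is representable
   as int; otherwise clamp to INT_MAX instead of relying on the
   implementation-defined out-of-range conversion. */
int to_int_saturating(uint32_t value)
{
    return value > INT_MAX ? INT_MAX : (int)value;
}
```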


Consider the difference between

uint32_t fun1(uint32_t *a, uint32_t *b);
uint32_t fun2(uint32_t * restrict a, uint32_t * restrict b);

fun1() must allow for a and b referring to overlapping data. fun2() need not.
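To make the difference concrete, here is a sketch with assumed function bodies (the declarations above give none). Because both parameters of fun1() have the same type, calling it with a == b is well-defined and the second increment must be visible in the returned value. The assert in fun2() only documents the caller's promise and catches exact aliasing, not partial overlap.

```c
#include <assert.h>
#include <stdint.h>

/* Same pointer types, no restrict: a and b may refer to the same
   object, so *a must be reloaded after the store through b. */
uint32_t fun1(uint32_t *a, uint32_t *b)
{
    *a += 1;
    *b += 1;
    return *a;
}

/* restrict: the caller promises that a and b do not overlap, so the
   compiler may reuse the value of *a computed by the first line. */
uint32_t fun2(uint32_t *restrict a, uint32_t *restrict b)
{
    assert(a != b); /* documents the promise; checks exact aliasing only */
    *a += 1;
    *b += 1;
    return *a;
}
```

Calling fun1(&x, &x) with x equal to 5 returns 7, since both increments hit the same object. Calling fun2(&x, &x) would break the restrict promise, and its behavior would be undefined.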

chux - Reinstate Monica answered Oct 31 '25 04:10