I am trying to better understand a rather surprising discovery regarding unions and the common initial sequence rule. The common initial sequence rule says (class.mem 23):
In a standard-layout union with an active member of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated.
So, given:
struct A {
  int a;
  double x;
};
struct B {
  int b;
};
union U {
  A a;
  B b;
};
U u;
u.a = A{};
int i = u.b.b;
This is defined behavior and i should have the value 0 (because A and B have a CIS of their first member, an int). So far, so good. The confusing thing is that if B is replaced by simply an int:
union U {
  A a;
  int b;
};
...
int i = u.b;
According to the definition of common initial sequence:
The common initial sequence of two standard-layout struct types is...
So CISs can only apply between two standard-layout structs. And in turn:
A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.
So a primitive type very definitely does not qualify; that is it cannot have a CIS with anything, so A has no CIS with an int. Therefore the standard says that the first example is defined behavior, but the second is UB. This simply does not make any sense to me at all; the compiler intuitively is at least as restricted with a primitive type as with a class. If this is intentional, is there any rhyme or reason (perhaps alignment related) as to why this makes sense? Is it possibly a defect?
ninjalj has it right: the purpose of this rule is to support tagged unions (with tags at the front). While one of the types supported in such a union could be stateless (aside from the tag), this case can be trivially addressed by making a struct containing just the tag. Thus, there is no need for extending the rule beyond structs, and by default such exceptions to undefined behavior (akin to strict aliasing in this case) should be kept to a minimum for the usual reasons of optimization and flexibility of future standardization.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With