Here is a rather contrived series of types. A2 is just a non-POD version of A:
template <size_t N>
struct A { 
    char data[N];
} __attribute__((packed));
template <size_t N>
struct A2 {
    A2() { // actual body not significant
        memset(data, 0, N); 
    }   
    char data[N];
} __attribute__((packed));
template <template <size_t> class T>
struct C { 
    T<9> a;
    int32_t i;
    T<11> t2; 
} __attribute__((packed));
//}; // oops, forgot to pack
template <template <size_t> class T>
struct B : C<T> {
    char footer;
} __attribute__((packed));
As-is, sizeof(B<A>) == sizeof(B<A2>) == 25. I get no warnings compiling with -Wall -pedantic (this is gcc 4.9.0). 
But now, let's say I forgot to pack C. Now I get:
sizeof(B<A>) == 32
sizeof(B<A2>) == 28
Still no warnings. What happened here? How is B<A2> smaller than B<A>? Is this just undefined behavior due to A2 not being POD? 
If I reorganize B to have a C<T> member instead of inheriting from it, then and only then do I get a warning:
ignoring packed attribute because of unpacked non-POD field ‘C B::c’
[Update] In response to IdeaHat's comments, here are the member offsets:
        B<A>  B<A2>
a       0     0
i       12    12
t2      16    16
footer  28    27
And
sizeof(C<A>) == 28
sizeof(C<A2>) == 28
[Update 2] I see the same behavior on clang with respect to the offsets, except that sizeof(B<A>) is 29 instead of 32. 
That behaviour arises because the compiler is allowed to reuse the tail padding of a non-POD base class (e.g. C<A2>) for the members of a derived class.
It does not do this for POD types (e.g. C<A>).
I've found this related question with a very helpful answer by Kerrek SB:
When extending a padded struct, why can't extra fields be placed in the tail padding?
On the other hand, you can force this packing regardless of POD-ness with GCC's -fpack-struct option. It is not recommended, but it is useful for this example.
#include <stdint.h>
#include <stdio.h>
struct C {
    int16_t i;
    char t[1];
};
struct C2 {
    C2() {}
    int16_t i;
    char t[1];
};
template <class T>
struct B : T {
    char footer;
};
int main(void) {
    printf("%zu\n", sizeof(B<C>));   // %zu is the portable format for size_t
    printf("%zu\n", sizeof(B<C2>));
    return 0;
}
If you compile it with -fpack-struct (gcc-4.7):
sizeof(B<C>) == sizeof(B<C2>) == 4
If not:
sizeof(B<C>) == 6
sizeof(B<C2>) == 4
From man gcc (4.7):
-fpack-struct[=n]
       Without a value specified, pack all structure members together
       without holes. 
       When a value is specified (which must be a small power of two),
       pack structure members according to this value, representing
       the maximum alignment (that is, objects with default alignment
       requirements larger than this will be output potentially
       unaligned at the next fitting location.
       Warning: the -fpack-struct switch causes GCC to generate code
       that is not binary compatible with code generated without that
       switch. 
       Additionally, it makes the code suboptimal. Use it to conform
       to a non-default application binary interface.
As you can see, when a POD class acts as the base class of another class, the derived class's members are not packed into the base unless you force it; i.e. the tail padding of the base is not reused.
In the particular case of the C++ ABI which GCC uses, there's a discussion about this:
It looks like the ABI document meant to require the reuse of tail-padding in non PODs, but it doesn't actually say that.
Consider this case, as the canonical example:
struct S1 {
    virtual void f();
    int i;
    char c1;
};
struct S2 : public S1 {
    char c2;
};
I think the ABI meant to say that you put "c2" in the tail padding for S1. (That is what G++ implements, FWIW.)
Take a look at what the Itanium C++ ABI (the one GCC uses) specifies about tail padding:
This ABI uses the definition of POD only to decide whether to allocate objects in the tail-padding of a base-class subobject. While the standards have broadened the definition of POD over time, they have also forbidden the programmer from directly reading or writing the underlying bytes of a base-class subobject with, say, memcpy. Therefore, even in the most conservative interpretation, implementations may freely allocate objects in the tail padding of any class which would not have been POD in C++98. This ABI is in compliance with that.
In addition, here's the reason why the C++ ABI doesn't use the tail padding of POD objects:
We ignore tail padding for PODs because an early version of the standard did not allow us to use it for anything else and because it sometimes permits faster copying of the type.
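In your example, C<A> is POD, so the ABI does not use its tail padding when the type acts as the base class of B<A>.
C<A> therefore keeps its padding (occupying the full 28 bytes), footer is placed after it at offset 28, and GCC rounds sizeof(B<A>) up to 32 to respect the alignment. C<A2> is not POD, so footer goes into its tail padding at offset 27 and sizeof(B<A2>) stays at 28.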
Finally, here is the code I used for testing before I found a proper answer. You may find it useful for seeing what the compiler's ABI does with objects in C++11 (GCC).
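#include <iostream>
#include <cstddef>      // offsetof
#include <cstdint>      // int16_t
#include <type_traits>  // is_pod, is_trivial, is_standard_layout, alignment_of
#include <utility>      // declval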
struct C {
    int16_t i;
    char t[1];
};
struct C2 {
    C2() {}
    int16_t i;
    char t[1];
};
template <class T>
struct B : T {
    char footer;
};
int main(void) {
    std::cout << std::boolalpha;
    std::cout << "standard_layout:" << std::endl;
    std::cout << "C: " << std::is_standard_layout<C>::value << std::endl;
    std::cout << "C2: " << std::is_standard_layout<C2>::value << std::endl;
    std::cout << "B<C>: " << std::is_standard_layout<B<C>>::value << std::endl;
    std::cout << "B<C2>: " << std::is_standard_layout<B<C2>>::value << std::endl;
    std::cout << std::endl;
    std::cout << "is_trivial:" << std::endl;
    std::cout << "C: " << std::is_trivial<C>::value << std::endl;
    std::cout << "C2: " << std::is_trivial<C2>::value << std::endl;
    std::cout << "B<C>: " << std::is_trivial<B<C>>::value << std::endl;
    std::cout << "B<C2>: " << std::is_trivial<B<C2>>::value << std::endl;
    std::cout << std::endl;
    std::cout << "is_pod:" << std::endl;
    std::cout << "C: " << std::is_pod<C>::value << std::endl;
    std::cout << "C2: " << std::is_pod<C2>::value << std::endl;
    std::cout << "B<C>: " << std::is_pod<B<C>>::value << std::endl;
    std::cout << "B<C2>: " << std::is_pod<B<C2>>::value << std::endl;
    std::cout << std::endl;
    std::cout << "offset:" << std::endl;
    std::cout << "C::i offset " << offsetof(C, i) << std::endl;
    std::cout << "C::t offset " << offsetof(C, t) << std::endl;
    std::cout << "C2::i offset " << offsetof(C2, i) << std::endl;
    std::cout << "C2::t offset " << offsetof(C2, t) << std::endl;
    B<C> bc;
    std::cout << "B<C>.i: " << (int)(reinterpret_cast<char*>(&bc.i) - reinterpret_cast<char*>(&bc)) << std::endl;
    std::cout << "B<C>.t: " << (int)(reinterpret_cast<char*>(&bc.t) - reinterpret_cast<char*>(&bc)) << std::endl;
    std::cout << "B<C>.footer: " << (int)(reinterpret_cast<char*>(&bc.footer) - reinterpret_cast<char*>(&bc)) << std::endl;
    B<C2> bc2;
    std::cout << "B<C2>.i: " << (int)(reinterpret_cast<char*>(&bc2.i) - reinterpret_cast<char*>(&bc2)) << std::endl;
    std::cout << "B<C2>.t: " << (int)(reinterpret_cast<char*>(&bc2.t) - reinterpret_cast<char*>(&bc2)) << std::endl;
    std::cout << "B<C2>.footer: " << (int)(reinterpret_cast<char*>(&bc2.footer) - reinterpret_cast<char*>(&bc2)) << std::endl;
    std::cout << std::endl;
    std::cout << "sizeof:" << std::endl;
    std::cout << "C: " << sizeof(C) << std::endl;
    std::cout << "C2: " << sizeof(C2) << std::endl;
    std::cout << "DIFFERENCE:\n";
    std::cout << "B<C>: " << sizeof(B<C>) << std::endl;
    std::cout << "B<C2>: " << sizeof(B<C2>) << std::endl;
    std::cout << "B<C>::C: " << sizeof(B<C>::C) << std::endl;
    std::cout << "B<C2>::C: " << sizeof(B<C2>::C2) << std::endl;
    std::cout << std::endl;
    std::cout << "alignment:" << std::endl;
    std::cout << "C: " << std::alignment_of<C>::value << std::endl;
    std::cout << "C2: " << std::alignment_of<C2>::value << std::endl;
    std::cout << "B<C>: " << std::alignment_of<B<C>>::value << std::endl;
    std::cout << "B<C2>: " << std::alignment_of<B<C2>>::value << std::endl;
    std::cout << "B<C>::C: " << std::alignment_of<B<C>::C>::value << std::endl;
    std::cout << "B<C2>::C2: " << std::alignment_of<B<C2>::C2>::value << std::endl;
    std::cout << "B<C>.i: " << std::alignment_of<decltype(std::declval<B<C>>().i)>::value << std::endl;
    std::cout << "B<C>.t: " << std::alignment_of<decltype(std::declval<B<C>>().t)>::value << std::endl;
    std::cout << "B<C>.footer: " << std::alignment_of<decltype(std::declval<B<C>>().footer)>::value << std::endl;
    std::cout << "B<C2>.i: " << std::alignment_of<decltype(std::declval<B<C2>>().i)>::value << std::endl;
    std::cout << "B<C2>.t: " << std::alignment_of<decltype(std::declval<B<C2>>().t)>::value << std::endl;
    std::cout << "B<C2>.footer: " << std::alignment_of<decltype(std::declval<B<C2>>().footer)>::value << std::endl;
    return 0;
}
Output:
standard_layout:
C: true
C2: true
B<C>: false
B<C2>: false
is_trivial:
C: true
C2: false
B<C>: true
B<C2>: false
is_pod:
C: true
C2: false
B<C>: false
B<C2>: false
offset:
C::i offset 0
C::t offset 2
C2::i offset 0
C2::t offset 2
B<C>.i: 0
B<C>.t: 2
B<C>.footer: 4
B<C2>.i: 0
B<C2>.t: 2
B<C2>.footer: 3
sizeof:
C: 4
C2: 4
DIFFERENCE:
B<C>: 6
B<C2>: 4
B<C>::C: 4
B<C2>::C: 4
alignment:
C: 2
C2: 2
B<C>: 2
B<C2>: 2
B<C>::C: 2
B<C2>::C2: 2
B<C>.i: 2
B<C>.t: 1
B<C>.footer: 1
B<C2>.i: 2
B<C2>.t: 1
B<C2>.footer: 1