I am working on reducing binary size of an embedded project which heavily (ab)uses macro-generated functions. Not just function-like macros, but wannabe templates like:
#define DEFINE_SWAP(type) \
void swap_##type(struct type* lhs, struct type* rhs) { \
struct type temp; \
memcpy(&temp, lhs, sizeof(struct type)); \
memcpy(lhs, rhs, sizeof(struct type)); \
memcpy(rhs, &temp, sizeof(struct type)); \
}
I already have various techniques like using function-like macros and/or generic functions with type erasure (above becomes #define SWAP_TYPE(lhs, rhs) memcpy_swap(lhs, rhs, sizeof(lhs)). This discussion can also be applied to C++ with actual templates.
I was wondering, for the functions that end up with identical assembly (e.g. SWAP_FOO and SWAP_BAR with sizeof(foo) == sizeof(bar)), can the optimizer deduplicate them? I have seen Do C compilers de-duplicate (merge) code? but that asks about more general case more akin to reversing loop unrolling, whereas seeing two functions have identical assembly should be way easier.
AFAIK, two functions with different names are required to have different addresses (unless there exists something akin to [[no_unique_address]]), but that could be overcome by having one function jump to another, or point at a distinct point in a NOP slide preceding it.
The general rule is that the compiler can do just about anything to a program with defined behavior as long as the outcome is according to spec. (your code).
So, sure, if a compiler is capable of recognizing that your functions do exactly the same thing and that no other part of the code is taking the addresses of these functions and uses those addresses in a way that requires them to be equal or different, it can discard one of the instances.
If that can't be proven, they must be separate instances.
Separate instances can also be simulated by the compiler if it recognizes the functions to do the same thing:
foo:
lots of code
fooA:
call foo
fooB:
call foo
The compiler could even set up a lie:
foo:
code
fooA: // just a tag
fooB: // just a tag
fooa_ptr = fooA; // one address
foob_ptr = fooB; // another address
if (fooa_ptr != foob_ptr) {
puts("separate instances");
}
fooa_ptr(...); // does a call to foo without indirection
foob_ptr(...); // also does a call to foo without indirection
Just as long as the outcome as observed is according to what's defined by the language, anything goes.
Final words: While being allowed, I don't know of any implementation where you can rely on any of these optimizations being used consistently.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With