Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can optimizing compiler deduplicate functions with identical assembly?

I am working on reducing binary size of an embedded project which heavily (ab)uses macro-generated functions. Not just function-like macros, but wannabe templates like:

#define DEFINE_SWAP(type)  \
void swap_##type(struct type* lhs, struct type* rhs) {  \
    struct type temp;  \
    memcpy(&temp, lhs, sizeof(struct type));  \
    memcpy(lhs, rhs, sizeof(struct type));  \
    memcpy(rhs, &temp, sizeof(struct type));  \
}

I already have various techniques like using function-like macros and/or generic functions with type erasure (above becomes #define SWAP_TYPE(lhs, rhs) memcpy_swap(lhs, rhs, sizeof(lhs)). This discussion can also be applied to C++ with actual templates.

I was wondering, for the functions that end up with identical assembly (e.g. SWAP_FOO and SWAP_BAR with sizeof(foo) == sizeof(bar)), can the optimizer deduplicate them? I have seen Do C compilers de-duplicate (merge) code? but that asks about more general case more akin to reversing loop unrolling, whereas seeing two functions have identical assembly should be way easier.

AFAIK, two functions with different names are required to have different addresses (unless there exists something akin to [[no_unique_address]]), but that could be overcome by having one function jump to another, or point at a distinct point in a NOP slide preceding it.

like image 978
Dominik Kaszewski Avatar asked Oct 31 '25 02:10

Dominik Kaszewski


1 Answers

The general rule is that the compiler can do just about anything to a program with defined behavior as long as the outcome is according to spec. (your code).

So, sure, if a compiler is capable of recognizing that your functions do exactly the same thing and that no other part of the code is taking the addresses of these functions and uses those addresses in a way that requires them to be equal or different, it can discard one of the instances.

If that can't be proven, they must be separate instances.


Separate instances can also be simulated by the compiler if it recognizes the functions to do the same thing:

foo:
    lots of code

fooA:
    call foo

fooB:
    call foo

The compiler could even set up a lie:

foo:
    code
fooA: // just a tag
fooB: // just a tag
fooa_ptr = fooA; // one address
foob_ptr = fooB; // another address

if (fooa_ptr != foob_ptr) {
    puts("separate instances");
}

fooa_ptr(...); // does a call to foo without indirection
foob_ptr(...); // also does a call to foo without indirection

Just as long as the outcome as observed is according to what's defined by the language, anything goes.


Final words: While being allowed, I don't know of any implementation where you can rely on any of these optimizations being used consistently.

like image 121
Ted Lyngmo Avatar answered Nov 03 '25 01:11

Ted Lyngmo