We have been developing a large financial application at a bank. It started out as 150k lines of really bad code. A month ago it was down to a little more than half that, but the executable was still huge. I expected that: we were mostly making the code more readable, and the templated code was still generating plenty of object code. We were just being more efficient with our effort.
The application is broken into about 5 shared objects and a main. One of the bigger shared objects was 40 MB and grew to 50 MB even while the code shrank.
I wasn't entirely surprised that the size started to grow, because after all we are adding some functionality. But I was surprised that it grew by 20%. Certainly no one came close to writing 20% of the code, so it's hard for me to imagine how it grew that much. That module is kind of hard for me to analyze, but on Friday I got a new data point that sheds some light.
There are perhaps 10 feeds to SOAP servers. The code is autogenerated, badly. Each service had one parser class with exactly the same code, something like:
#include <boost/shared_ptr.hpp>
#include <xercesstuff...>
class ParserService1 {
public:
  void parse() {
    try {
      Service1ContentHandler* p = new Service1ContentHandler( ... );
      parser->setContentHandler(p);
      parser->parse();
    } catch (SAX ...) {
      ...
    }
  }
};
These classes were completely unnecessary; a single function does the job. Each ContentHandler class had been autogenerated with the same 7 or 8 variables, which I was able to share through inheritance.
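A minimal sketch of what "a single function" could look like, assuming the handlers share a common base class. Parser, ContentHandler and CommonHandler here are hypothetical stand-ins so that the snippet compiles on its own; they are not the Xerces interfaces, and the exception handling from the original classes is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the Xerces types; the real interfaces differ.
struct ContentHandler {
    virtual ~ContentHandler() = default;
    virtual void startElement(const std::string& name) = 0;
};

struct Parser {
    ContentHandler* handler = nullptr;
    void setContentHandler(ContentHandler* h) { handler = h; }
    // Pretends to parse a document that contains a single element.
    void parse() { if (handler) handler->startElement("root"); }
};

// The variables every autogenerated handler repeated, hoisted into one base.
struct CommonHandler : ContentHandler {
    std::vector<std::string> seen;
    void startElement(const std::string& name) override { seen.push_back(name); }
};

struct Service1ContentHandler : CommonHandler {};
struct Service2ContentHandler : CommonHandler {};

// One ordinary function replaces ParserService1, ParserService2, ...
// The caller owns the handler, so nothing is allocated with a bare new.
void parseWith(Parser& parser, ContentHandler& handler) {
    parser.setContentHandler(&handler);
    parser.parse();
    parser.setContentHandler(nullptr);  // do not leave a pointer to the caller's object
}
```

Each service then only needs its own handler type; a call would look like parseWith(parser, handler) with a Service1ContentHandler or Service2ContentHandler instance.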
So I expected the binary to shrink when I removed the parser classes and the related code. But with only 10 services, I wasn't expecting it to drop from 38 MB to 36 MB. That's an outrageous number of symbols.
The only explanation I can think of is that each parser file included boost::shared_ptr and some Xerces parser headers, and that the compiler and linker somehow store all those symbols again for each file. I'm curious to find out in any case.
So, can anyone suggest how I would go about tracking down why a simple modification like this should have so much impact? I can use nm on a module to look at the symbols inside, but that's going to generate a painful, huge amount of semi-readable stuff.
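One way to make the nm output less painful is to aggregate it instead of reading it. The sketch below assumes lines shaped like the output of nm -S -C (hex value, hex size, one-letter type, demangled name) and sums the sizes of all symbols whose name contains a given substring. bytesMatching and isHexField are made-up helper names. Note that this only covers the symbol table; on ELF, debug information lives in separate .debug_* sections and will not show up here.

```cpp
#include <cctype>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// True if s looks like a hex address or size field. A single character is
// rejected because type codes such as B or D are also valid hex digits.
bool isHexField(const std::string& s) {
    if (s.size() < 2) return false;
    for (unsigned char c : s) {
        if (!std::isxdigit(c)) return false;
    }
    return true;
}

// Sum the sizes of all symbols whose name contains needle.
// Lines without both a value and a size (undefined symbols, symbols of
// unknown size) are skipped.
std::uint64_t bytesMatching(std::istream& nmOutput, const std::string& needle) {
    std::uint64_t total = 0;
    std::string line;
    while (std::getline(nmOutput, line)) {
        std::istringstream fields(line);
        std::string value, size, type;
        if (!(fields >> value >> size >> type)) continue;
        if (!isHexField(value) || !isHexField(size) || type.size() != 1) continue;
        std::string name;
        std::getline(fields, name);  // rest of the line; demangled names may contain spaces
        if (name.find(needle) != std::string::npos) {
            total += std::stoull(size, nullptr, 16);
        }
    }
    return total;
}
```

Running this over the listing of the old and the new library, with needles such as "boost::" or "xercesc", would show how many bytes each family of symbols accounts for.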
Also, when a colleague ran her code with my new library, the user time went from 1m55s to 1m25s. The real time is highly variable, because we are waiting on slow SOAP servers (IMHO, SOAP is an incredibly poor replacement for CORBA...), but the CPU time is quite stable. I would have expected a slight boost from reducing the code size that much. But on a server with massive memory, I was really surprised that the speed changed so much, considering I didn't change the architecture of the XML processing itself.
I'm going to take it much further on Tuesday, and hopefully will get more information, but if anyone has some idea of how I could get this much improvement, I'd love to know.
Update: I verified that having debugging symbols in the binary does not appear to change the run time at all. I did this by creating a header file that included lots of stuff, including the two that had the effect here: boost shared pointers and some of the Xerces XML parser. There appears to be no runtime performance hit (I checked because there were differences of opinion between two answers). However, I also verified that including header files creates debugging symbols in every object file that includes them, even though the stripped binary size is unchanged. So if you include a given file, even if you never use it, a fixed number of symbols is injected into that object file, and they are not folded together at link time even though they are presumably identical.
My code looks like:
#include "includetorture.h"
void f1()
{
    f2(); // call the function in the next file
}
The size increase with my particular include files was about 100k per source file. Presumably, if I had included more, it would be higher. The total executable was ~600k with the includes and about 9k without. I verified that the growth is linear in the number of files doing the including, but the stripped binary is the same size regardless, as it should be.
Clearly I was mistaken in thinking this was the reason for the performance gain. I think I have accounted for that now: even though I didn't remove much code, I did streamline a lot of big XML string processing and considerably shortened the path through the code, and that is presumably the reason.
You can use the readelf utility on Linux, or dumpbin on Windows, to find the exact amount of space used by various kinds of data in the executable file. That said, I don't see why the executable size is worrying you: debugging symbols use ABSOLUTELY NO memory at run-time!
It seems you are using a lot of C++ classes with inline methods. If these classes are widely visible (their headers are included in many places), this inline code will bloat the whole application. I bet your link times have increased as well. Try reducing the number of inline methods and moving the code to the .cpp files. This will reduce the size of your object files and of the executable, and it will shorten link times.
The trade-off in this case is, of course, smaller compilation units versus execution time, since calls that are no longer inlined may run slightly slower.
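To illustrate moving method bodies out of the header, here is a small sketch. Price and the two file names are invented for the example and are not taken from the question; both "files" are shown in one block so that it compiles as is.

```cpp
// ---- price.h (hypothetical header, included by many source files) ----
#include <string>

class Price {
public:
    explicit Price(unsigned long cents);
    std::string format() const;                      // declared only, no body here
    unsigned long cents() const { return cents_; }   // trivial accessor, cheap to keep inline
private:
    unsigned long cents_;
};

// ---- price.cpp (compiled once) ----
#include <sstream>

Price::Price(unsigned long cents) : cents_(cents) {}

// The body, and the <sstream> include it needs, now live in a single
// translation unit instead of in every file that includes price.h.
std::string Price::format() const {
    std::ostringstream out;
    const unsigned long rem = cents_ % 100;
    out << cents_ / 100 << '.' << (rem < 10 ? "0" : "") << rem;
    return out.str();
}
```

A side benefit in this layout is that the header no longer drags in <sstream>, so every file that includes it also has less to compile.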
I don't have the exact answer you are looking for, but let me share my experience.
It is pretty common for executables built with and without debugging symbols to differ hugely in size. I cannot explain why in detail, but just think of all the crazy things that modern debuggers let you do with your code. That is all thanks to debugging symbols.
The difference in size is so big that if you are, say, dynamically loading some shared libraries, then the sheer loading time of the file could explain the difference in performance you found.
Indeed, this is a pretty "internal" aspect of compilers. Just to give you an example, years back I was quite unhappy with the huge executable files that GCC-4 produced in comparison to GCC-3; then I simply got used to it (and my hard disk grew, too).
All in all, I would not worry, because you are supposed to use builds with debugging symbols only during development, where it should not be an issue. In deployment, no debugging symbols should be there, and you will see how much the files shrink.