
Why does 64-bit VC++ compiler add nop instruction after function calls?


I compiled the following with the Visual Studio 2008 SP1 x64 C++ compiler:

[Screenshot: x64 disassembly of the compiled code]

I'm curious: why did the compiler add those nop instructions after those calls?

PS1. I would understand if the 2nd and 3rd nops were there to align the code on a 4-byte boundary, but the 1st nop breaks that assumption.

PS2. The C++ code that was compiled had no loops or special optimization stuff in it:

CTestDlg::CTestDlg(CWnd* pParent /*=NULL*/)
    : CDialog(CTestDlg::IDD, pParent)
{
    m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);

    //This makes no sense. I used it to set a debugger breakpoint
    ::GdiFlush();
    srand(::GetTickCount());
}

PS3. Additional Info: First off, thank you everyone for your input.

Here are some additional observations:

1. My first guess was that incremental linking might have had something to do with it. But the project's Release build settings in Visual Studio have incremental linking turned off.

2. This seems to affect x64 builds only. The same code built as x86 (or Win32) does not have those nops, even though the instructions used are very similar:

[Screenshot: x86 disassembly of the same code]

3. I tried to build it with a newer linker, and even though the x64 code produced by VS 2013 looks somewhat different, it still adds those nops after some calls:

[Screenshot: x64 disassembly produced by VS 2013]

4. Dynamic vs. static linking to MFC made no difference to the presence of those nops either. This one was built with VS 2013, linking dynamically to the MFC DLLs:

[Screenshot: x64 disassembly, VS 2013, dynamic linking to MFC]

5. Also note that those nops can appear after both near and far calls, and they have nothing to do with alignment. Here's a part of the code that I got from IDA if I step a little bit further on:

[Screenshot: IDA disassembly]

As you can see, the nop is inserted after a far call and happens to "align" the next lea instruction on an address ending in B! That makes no sense if the nops were added for alignment only.

6. I was originally inclined to believe that, since near relative calls (i.e. those that start with E8) are somewhat faster than far calls (the ones that start with FF 15 in this case), the linker may try to go with near calls first. Since those are one byte shorter than far calls, if it succeeds, it may pad the remaining space with a nop at the end. But example (5) above more or less defeats this hypothesis.

[Screenshot: disassembly]
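The size argument in the last observation can be checked with a small sketch. It assumes only the two encodings mentioned there: `E8` followed by a 32-bit relative offset (5 bytes), and `FF 15` followed by a 32-bit displacement (6 bytes), which in 64-bit mode is an indirect call through a RIP-relative memory operand (what the question calls a "far" call). The function name `callLength` is made up for illustration, and this is not a general x86-64 decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the length in bytes of the call instruction at the start of `code`,
// or 0 if the bytes do not begin with one of the two encodings handled here.
// Hypothetical helper for illustration only.
std::size_t callLength(const std::uint8_t* code, std::size_t size) {
    // E8 rel32: direct call relative to the next instruction.
    if (size >= 5 && code[0] == 0xE8) return 5;
    // FF 15 disp32: indirect call through memory, e.g. an import table slot.
    if (size >= 6 && code[0] == 0xFF && code[1] == 0x15) return 6;
    return 0;
}

// Number of single-byte nops (0x90) immediately following the first `offset` bytes.
std::size_t nopsAfter(const std::uint8_t* code, std::size_t size, std::size_t offset) {
    std::size_t count = 0;
    while (offset + count < size && code[offset + count] == 0x90) ++count;
    return count;
}
```

If the padding hypothesis were right, a nop would only ever follow the 5-byte form, where one byte is left over from a 6-byte slot. A nop directly after an `FF 15` call, as in observation (5), does not fit that pattern.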

So I still don't have a clear answer to this.


asked Jun 30 '17 20:06

c00000fd

2 Answers

This is purely a guess, but it might be some kind of SEH optimization. I say optimization because SEH seems to work fine without the NOPs too. The NOP might help speed up unwinding.

In the following example (live demo with VC2017), a NOP is inserted after the call to basic_string::assign in test1 but not in test2 (which is identical but declared as non-throwing¹).

#include <stdio.h>
#include <string>

int test1() {
  std::string s = "a";  // NOP inserted here
  s += getchar();
  return (int)s.length();
}

int test2() throw() {
  std::string s = "a";
  s += getchar();
  return (int)s.length();
}

int main()
{
  return test1() + test2();
}

Assembly:

test1:
    . . .
    call     std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
    npad     1         ; nop
    call     getchar
    . . .
test2:
    . . .
    call     std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
    call     getchar

Note that MSVS compiles by default with the /EHsc flag (synchronous exception handling). Without that flag the NOPs disappear, and with /EHa (synchronous and asynchronous exception handling), throw() no longer makes a difference because SEH is always on.

¹ For some reason only throw() seems to reduce the code size; using noexcept makes the generated code even bigger and summons even more NOPs. MSVC...
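To repeat the footnote's comparison, a third variant marked `noexcept` could be added next to `test1` and `test2`. The name `test3` is hypothetical and does not appear in the original example. The sketch only provides the function to compile and inspect, and makes no claim about what any particular compiler version emits for it.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical third variant: same body as test1 and test2, but marked noexcept.
int test3() noexcept {
  std::string s = "a";
  s += getchar();
  return (int)s.length();
}

// Compile-time check that the function really is declared non-throwing.
static_assert(noexcept(test3()), "test3 should be non-throwing");
```

Building with something like `cl /O2 /EHsc /FAs` writes an assembly listing next to the object file, where the `npad 1` lines after calls can be compared across the three functions.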


answered Nov 09 '22 23:11

rustyx

This is a special filler that lets the exception handler/unwinding function correctly detect whether an address is in the prologue, epilogue, or body of the function.
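A toy model may make this concrete. It assumes the unwinder classifies a frame by the return address a call pushed, which is the address of the instruction after the call. The addresses, the `Part` and `Range` types, and the function names are all invented for illustration. The real Windows x64 unwinder works from the function's unwind data and is considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Part { Body, Epilogue, Outside };

// Half-open address range [begin, end) belonging to one part of a function.
struct Range {
    std::uint32_t begin;
    std::uint32_t end;
    Part part;
};

// Finds which part of the function contains `address`.
Part classify(const std::vector<Range>& layout, std::uint32_t address) {
    for (const Range& r : layout)
        if (address >= r.begin && address < r.end) return r.part;
    return Part::Outside;
}

// A call pushes the address of the instruction that follows it.
std::uint32_t returnAddress(std::uint32_t callAddress, std::uint32_t callLength) {
    return callAddress + callLength;
}

// A 5-byte call at 0x10 is the last body instruction; the epilogue starts at 0x15.
std::vector<Range> layoutWithoutNop() {
    return {{0x00, 0x15, Part::Body}, {0x15, 0x1A, Part::Epilogue}};
}

// Same code with a 1-byte nop at 0x15, which pushes the epilogue to 0x16.
std::vector<Range> layoutWithNop() {
    return {{0x00, 0x16, Part::Body}, {0x16, 0x1B, Part::Epilogue}};
}
```

In this model, `classify(layoutWithoutNop(), returnAddress(0x10, 5))` yields `Part::Epilogue` even though the call was made from the body, while the same lookup against `layoutWithNop()` yields `Part::Body`. The single nop keeps the return address inside the region the call actually belongs to.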

answered Nov 09 '22 22:11

Anatoly Mikhailov
