Calling a standard-library-function in MASM

I want to get started with MASM in a mixed C++/assembly project. I am currently trying to call a standard library function (e.g. printf) from a PROC in assembly, which I then call from C++.

The code works once I declare printf's signature in my cpp file, but I do not understand why I have to do this or whether I can avoid it.

My cpp-file:

#include <stdio.h>

extern "C" {
    extern int __stdcall foo(int, int);
}

extern int __stdcall printf(const char*, ...); // When I remove this line I get Linker-Error "LNK2019: unresolved external symbol"

int main()
{
    foo(5, 5);
}

My asm-file:

.model flat, stdcall

EXTERN printf :PROC ; declare printf

.data

tstStr db "Mult: %i",0Ah,"Add: %i",0 ; 0Ah is the line feed (\n) - C-style escapes are not supported

.code

foo PROC x:DWORD, y:DWORD

mov eax, x
mov ebx, y
add eax, ebx
push eax
mov eax, x
mul ebx
push eax
push OFFSET tstStr
call printf
ret

foo ENDP

END

Some Updates

In response to the comments I tried to rework the code to use the cdecl calling convention. Unfortunately this did not solve the problem (the code runs fine with the extern declaration, but fails to link without it).

By trial and error I found out that the extern seems to force external linkage, even though the keyword should not be needed, because external linkage should be the default for function declarations.

I can also omit the declaration by using the function in my cpp code (i.e. if I add a printf("\0"); somewhere in the source file, the linker is fine with it and everything works correctly).

The new (but not really better) cpp-file:

#include <stdio.h>

extern "C" {
    extern int __cdecl foo(int, int);
}
extern int __cdecl printf(const char*, ...); // omitting the extern results in a linker error

int main()
{
    //printf("\0"); // this would replace the declaration
    foo(5, 5);
    return 0;
}

The asm-file:

.model flat, c

EXTERN printf :PROC

.data

tstStr db "Mult: %i",0Ah,"Add: %i",0Ah,0 ; 0Ah is the line feed (\n) - C-style escapes are not supported

.code

foo PROC

push ebp
mov ebp, esp
mov eax, [ebp+8]
mov ebx, [ebp+12]
add eax, ebx
push eax
mov eax, [ebp+8]
mul ebx
push eax
push OFFSET tstStr
call printf
add esp, 12
pop ebp
ret

foo ENDP

END
Anonymous asked Oct 20 '25 11:10


2 Answers

My best guess is that this is because Microsoft refactored the C library starting with VS 2015: parts of it (including printf) are now defined inline in the headers and are no longer in the default .lib files.

My guess is that in this declaration:

extern int __cdecl printf(const char*, ...);

extern forces the old legacy libraries to be included in the link process. Those libraries contain the non-inlined function printf. If the C++ code doesn't force the MS linker to include the legacy C library, then the reference to printf in the MASM code will remain unresolved.
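To illustrate the idea, here is a minimal sketch of what a header-defined printf could look like. This is not Microsoft's actual header code: the name sketch_printf is made up, and forwarding to vfprintf is an assumption chosen for illustration.

```cpp
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-in for a header-defined printf (the name sketch_printf
// is made up). Because it is inline, a compiler typically emits code for it
// only in translation units that actually use it, so nothing guarantees that
// a linkable symbol with this name exists for an assembly module.
inline int sketch_printf(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    // Forward to a formatting routine that really lives in the C runtime.
    int written = std::vfprintf(stdout, format, args);
    va_end(args);
    return written;
}
```

If no C++ source file calls such a function, no object file has to contain it, which would be consistent with the unresolved symbol seen from the MASM side.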

I believe this is related to this Stack Overflow question and my answer from 2015. If you want to remove extern int __cdecl printf(const char*, ...); from the C++ code, you may wish to consider adding this line to your MASM code:

includelib legacy_stdio_definitions.lib

Your MASM code would look like this if you are using CDECL calling convention and mixing C/C++ with assembly:

.model flat, C      ; Default to C language
includelib legacy_stdio_definitions.lib

EXTERN printf :PROC ; declare printf

.data

tstStr db "Mult: %i",0Ah,"Add: %i",0 ; 0Ah is the line feed (\n) - C-style escapes are not supported

.code

foo PROC x:DWORD, y:DWORD   
    mov eax, x
    mov ebx, y
    add eax, ebx
    push eax
    mov eax, x
    mul ebx
    push eax
    push OFFSET tstStr
    call printf
    ret
foo ENDP

END

Your C++ code would be:

#include <stdio.h>

extern "C" {
    extern int foo(int, int); /* __cdecl removed since it is the default */
}

int main()
{
    //printf("\0"); // this would replace the declaration
    foo(5, 5);
    return 0;
}

The alternative to passing the includelib line in the assembly code is to add legacy_stdio_definitions.lib to the dependency list in the linker options of your Visual Studio project or the command line options if you invoke the linker manually.
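If you would rather keep the request in the C++ source, MSVC's #pragma comment(lib, ...) is another route. A minimal sketch, assuming the Microsoft compiler (the _MSC_VER guard only keeps other compilers from seeing the pragma):

```cpp
#include <cstdio>

#ifdef _MSC_VER
// MSVC-specific: records a request in the object file asking the linker to
// search this library, much like includelib does on the MASM side.
#pragma comment(lib, "legacy_stdio_definitions.lib")
#endif

// foo is implemented in the MASM file; only its declaration lives here.
extern "C" int foo(int, int);
```

With this in place, neither the includelib line nor a project-level linker setting should be needed, though I have only described the mechanism here, not tested it against your project.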


Calling Convention Bug in your MASM Code

You can read about the CDECL calling convention for 32-bit Windows code in the Microsoft documentation as well as this Wiki article. Microsoft summarizes the CDECL calling convention as:

On x86 platforms, all arguments are widened to 32 bits when they are passed. Return values are also widened to 32 bits and returned in the EAX register, except for 8-byte structures, which are returned in the EDX:EAX register pair. Larger structures are returned in the EAX register as pointers to hidden return structures. Parameters are pushed onto the stack from right to left. Structures that are not PODs will not be returned in registers.

The compiler generates prologue and epilogue code to save and restore the ESI, EDI, EBX, and EBP registers, if they are used in the function.

The last paragraph is important in relation to your code. The ESI, EDI, EBX, and EBP registers are non-volatile and must be saved and restored by the called function if they are modified. Your code clobbers EBX, so you must save and restore it. You can get MASM to do that by using the USES directive in a PROC statement:

foo PROC uses EBX x:DWORD, y:DWORD    
    mov eax, x
    mov ebx, y
    add eax, ebx
    push eax
    mov eax, x
    mul ebx
    push eax
    push OFFSET tstStr
    call printf
    add esp, 12               ; Remove the parameters pushed on the stack for
                              ;     the printf call. The stack needs to be
                              ;     properly restored. If not done, the function
                              ;     epilogue can't properly restore EBX
                              ;     (and any registers listed by USES)
    ret
foo ENDP

uses EBX tells MASM to generate extra prologue and epilogue code that saves EBX at the start of the function and restores it just before the ret instruction. The generated instructions would look something like:

0000                    _foo:
0000  55                        push            ebp
0001  8B EC                     mov             ebp,esp
0003  53                        push            ebx
0004  8B 45 08                  mov             eax,0x8[ebp]
0007  8B 5D 0C                  mov             ebx,0xc[ebp]
000A  03 C3                     add             eax,ebx
000C  50                        push            eax
000D  8B 45 08                  mov             eax,0x8[ebp]
0010  F7 E3                     mul             ebx
0012  50                        push            eax
0013  68 00 00 00 00            push            tstStr
0018  E8 00 00 00 00            call            _printf
001D  83 C4 0C                  add             esp,0x0000000c
0020  5B                        pop             ebx
0021  C9                        leave
0022  C3                        ret
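For comparison, this is roughly what foo does when written in C++. It is only a sketch: foo_text and foo_reference are hypothetical names that do not appear in your code, and formatting into a string first is just a convenience so the result can be checked.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not in the original code): formats the same text that
// foo prints, but into a std::string so the result can be inspected.
std::string foo_text(int x, int y)
{
    char buffer[64];
    // The assembly pushes right to left: the sum first, then the product,
    // then the format string. Read left to right, the call looks like this.
    std::snprintf(buffer, sizeof buffer, "Mult: %i\nAdd: %i", x * y, x + y);
    return buffer;
}

// C++ counterpart of foo: same output, and the compiler takes care of saving
// registers and cleaning up the stack after the call.
extern "C" int foo_reference(int x, int y)
{
    // Like the assembly version, return whatever printf returned.
    return std::printf("%s", foo_text(x, y).c_str());
}
```

Calling foo_reference(5, 5) prints the product 25 on the first line and the sum 10 on the second, which is what the assembly version prints for foo(5, 5).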
Michael Petch answered Oct 22 '25 23:10


That's indeed a bit pointless, isn't it?

Linkers are often pretty dumb things. They need to be told that an object file requires printf. Linkers can't figure that out from a missing printf symbol, stupidly enough.

The C++ compiler will tell the linker that it needs printf when you write extern int __stdcall printf(const char*, ...);. Or, and that's the normal way, the compiler will tell the linker so when you actually call printf. But your C++ code doesn't call it!

Assemblers are also pretty dumb. Your assembler clearly fails to tell the linker that it needs printf from C++.

The general solution is not to do complex things in assembly. That's just not what assembly is good for. Calls from C to assembly generally work well; calls the other way are problematic.

MSalters answered Oct 23 '25 00:10