I'm trying to tweak the rules a little bit here, and malloc a buffer,
then copy a function to the buffer.
Calling the buffered function works, but the function throws a Segmentation fault when i'm trying to call another function within.
Any thoughts why?
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdlib.h>
int foo(int x)
{
    printf("%d\n", x);
}
int bar(int x)
{
}
int main()
{
    int foo_size = bar - foo;
    void* buf_ptr;
    buf_ptr = malloc(1024);
    memcpy(buf_ptr, foo, foo_size);
    mprotect((void*)(((int)buf_ptr) & ~(sysconf(_SC_PAGE_SIZE) - 1)),
             sysconf(_SC_PAGE_SIZE),
             PROT_READ|PROT_WRITE|PROT_EXEC);
    int (*ptr)(int) = buf_ptr;
    printf("%d\n", ptr(3));
    return 0;
}
This code will throw a segfault, unless i'll change the foo function to:
int foo(int x)
{
    //Anything but calling another function.
    x = 4;
    return x;
}
NOTE:
The code successfully copies foo into the buffer, i know i made some assumptions, but on my platform they're ok.
See if your compiler or library can be set to check bounds on [i] , at least in debug mode. Segmentation faults can be caused by buffer overruns that write garbage over perfectly good pointers. Doing those things will considerably reduce the likelihood of segmentation faults and other memory problems.
Use debuggers to diagnose segfaults Start your debugger with the command gdb core , and then use the backtrace command to see where the program was when it crashed. This simple trick will allow you to focus on that part of the code.
Rarely, a segmentation fault can be caused by a hardware error, but for our purposes we will be taking a look at software-based memory access errors. Although, segmentation faults are the result of invalid memory access, incorrect access to memory will not always result in a segmentation fault.
Your code is not position independent and even if it were, you don't have the correct relocations to move it to an arbitrary position. Your call to printf (or any other function) will be done with pc-relative addressing (through the PLT, but that's besides the point here). This means that the instruction generated to call printf isn't a call to a static address but rather "call the function X bytes from the current instruction pointer". Since you moved the code the call is done to a bad address. (I'm assuming i386 or amd64 here, but generally it's a safe assumption, people who are on weird platforms usually mention that).
More specifically, x86 has two different instructions for function calls. One is a call relative to the instruction pointer which determines the destination of the function call by adding a value to the current instruction pointer. This is the most commonly used function call. The second instruction is a call to a pointer inside a register or memory location. This is much less commonly used by compilers because it requires more memory indirections and stalls the pipeline. The way shared libraries are implemented (your call to printf will actually go to a shared library) is that for every function call you make outside of your own code the compiler will insert fake functions near your code (this is the PLT I mentioned above). Your code does a normal pc-relative call to this fake function and the fake function will find the real address to printf and call that. It doesn't really matter though. Almost any normal function call you make will be pc-relative and will fail. Your only hope in code like this are function pointers.
You might also run into some restrictions on executable mprotect. Check the return value of mprotect, on my system your code doesn't work for one more reason: mprotect doesn't allow me to do this. Probably because the backend memory allocator of malloc has additional restrictions that prevents executable protections of its memory. Which leads me to the next point:
You will break things by calling mprotect on memory that isn't managed by you. That includes memory you got from malloc. You should only mprotect things you've gotten from the kernel yourself through mmap.
Here's a version that demonstrates how to make this work (on my system):
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <string.h>
#include <err.h>
int
foo(int x, int (*fn)(const char *, ...))
{
        fn("%d\n", x);
        return 42;
}
int
bar(int x)
{
        return 0;
}
int
main(int argc, char **argv)
{
        size_t foo_size = (char *)bar - (char *)foo;
        int ps = getpagesize();
        void *buf_ptr = mmap(NULL, ps, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANON|MAP_PRIVATE, -1, 0);
        if (buf_ptr == MAP_FAILED)
                err(1, "mmap");
        memcpy(buf_ptr, foo, foo_size);
        int (*ptr)(int, int (*)(const char *, ...)) = buf_ptr;
        printf("%d\n", ptr(3, printf));
        return 0;
}
Here, I abuse the knowledge of how the compiler will generate the code for the function call. By using a function pointer I force it to generate a call instruction that isn't pc-relative. Also, I manage the memory allocation myself so that we get the right permissions from start and not run into any restrictions that brk might have. As a bonus we do error handling that actually helped me find a bug in the first version of this experiment and I also corrected other minor bugs (like missing includes) which allowed me to enable warnings in the compiler and catch another potential problem.
If you want to dig deeper into this you can do something like this. I added two versions of the function:
int
oldfoo(int x)
{
        printf("%d\n", x);
        return 42;
}
int
foo(int x, int (*fn)(const char *, ...))
{
        fn("%d\n", x);
        return 42;
}
Compile the whole thing and disassemble it:
$ cc -Wall -o foo foo.c
$ objdump -S foo | less
We can now look at the two generated functions:
0000000000400680 <oldfoo>:
  400680:       55                      push   %rbp
  400681:       48 89 e5                mov    %rsp,%rbp
  400684:       48 83 ec 10             sub    $0x10,%rsp
  400688:       89 7d fc                mov    %edi,-0x4(%rbp)
  40068b:       8b 45 fc                mov    -0x4(%rbp),%eax
  40068e:       89 c6                   mov    %eax,%esi
  400690:       bf 30 08 40 00          mov    $0x400830,%edi
  400695:       b8 00 00 00 00          mov    $0x0,%eax
  40069a:       e8 91 fe ff ff          callq  400530 <printf@plt>
  40069f:       b8 2a 00 00 00          mov    $0x2a,%eax
  4006a4:       c9                      leaveq
  4006a5:       c3                      retq
00000000004006a6 <foo>:
  4006a6:       55                      push   %rbp
  4006a7:       48 89 e5                mov    %rsp,%rbp
  4006aa:       48 83 ec 10             sub    $0x10,%rsp
  4006ae:       89 7d fc                mov    %edi,-0x4(%rbp)
  4006b1:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
  4006b5:       8b 45 fc                mov    -0x4(%rbp),%eax
  4006b8:       48 8b 55 f0             mov    -0x10(%rbp),%rdx
  4006bc:       89 c6                   mov    %eax,%esi
  4006be:       bf 30 08 40 00          mov    $0x400830,%edi
  4006c3:       b8 00 00 00 00          mov    $0x0,%eax
  4006c8:       ff d2                   callq  *%rdx
  4006ca:       b8 2a 00 00 00          mov    $0x2a,%eax
  4006cf:       c9                      leaveq
  4006d0:       c3                      retq
The instruction for the function call in the printf case is "e8 91 fe ff ff". This is a pc-relative function call. 0xfffffe91 bytes in front of our instruction pointer. It's treated as a signed 32 bit value, and the instruction pointer used in the calculation is the address of the next instruction. So 0x40069f (next instruction) - 0x16f (0xfffffe91 in front is 0x16f bytes behind with signed math) gives us the address 0x400530, and looking at the disassembled code I find this at the address:
0000000000400530 <printf@plt>:
  400530:       ff 25 ea 0a 20 00       jmpq   *0x200aea(%rip)        # 601020 <_GLOBAL_OFFSET_TABLE_+0x20>
  400536:       68 01 00 00 00          pushq  $0x1
  40053b:       e9 d0 ff ff ff          jmpq   400510 <_init+0x28>
This is the magic "fake function" I mentioned earlier. Let's not get into how this works. It's necessary for shared libraries to work and that's all we need to know for now.
The second function generates the function call instruction "ff d2". This means "call the function at the address stored inside the rdx register". No pc-relative addressing and that's why it works.
The compiler is free to generate the code the way it wants provided the observable results are correct (as if rule). So what you do is just an undefined behaviour invocation.
Visual Studio sometimes uses relays. That means that the address of a function just points to a relative jump. That's perfectly allowed per standard because of the as is rule but it would definitely break that kind of construction. Another possibility is to have local internal functions called with relative jumps but outside of the function itself. In that case, your code would not copy them, and the relative calls will just point to random memory. That means that with different compilers (or even different compilation options on same compiler) it could give expected result, crash, or directly end the program without error which is exactly UB.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With