
How can it be done that functions in .so file are automatically exported?

In Windows, to call a function in a DLL, the function must have an explicit export declaration. For example, __declspec(dllexport) or .def file.

On platforms other than Windows, we can call a function in a .so (shared object) file even if the function has no export declaration. In this respect, it is much easier for me to build a .so than a .dll.

Meanwhile, I am curious about how non-Windows systems allow functions defined in a .so to be called by other programs without an explicit export declaration. I roughly guess that all of the functions in .so file are automatically exported, but I am not sure of it.

Hyunjik Bae asked Sep 06 '25 03:09


1 Answer

An .so file is conventionally a DSO (Dynamic Shared Object, a.k.a shared library) in unix-like OSes. You want to know how symbols defined in such a file are made visible to the runtime loader for dynamic linkage of the DSO into the process of some program when it's executed. That's what you mean by "exported". "Exported" is a somewhat Windows/DLL-ish term, and is also apt to be confused with "external" or "global", so we'll say dynamically visible instead.

I'll explain how dynamic visibility of symbols can be controlled in the context of DSOs built with the GNU toolchain - i.e. compiled with a GCC compiler (gcc, g++, gfortran, etc.) and linked with the binutils linker ld (or a compatible alternative compiler and linker). I'll illustrate with C code. The mechanics are the same for other languages.

The symbols defined in an object file correspond to the file-scope definitions in the C source code, i.e. the functions, and the variables that are not defined within any block. Ordinary block-scope variables:

{ int i; ... }

exist only while the enclosing block is being executed and, unless they are declared static, have no permanent place in an object file.
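To make the distinction concrete, here is a minimal sketch of the different kinds of definition. The file and all of its names are hypothetical and not part of the example that follows; the comments state what one would expect to see in the object file's symbol table when compiling with GCC.

```c
/* scopes.c: hypothetical file, for illustration only. */

int counter = 0;          /* file scope, external linkage: a global symbol */
static int calls = 0;     /* file scope, static: a local symbol */

int next_id(void)         /* function, external linkage: a global symbol */
{
    int step = 1;         /* block scope, automatic storage: no symbol at all */
    static int last = 0;  /* block scope but static storage: it has a permanent
                             place, typically as a local symbol under a
                             compiler-generated name */

    calls += 1;
    last += step;
    counter = last;
    return last;
}

int call_count(void)      /* global symbol; reports how often next_id ran */
{
    return calls;
}
```

Running readelf -s on the resulting object file is the way to check this for your own compiler version.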

The symbols defined in an object file generated by GCC are either local or global.

A local symbol can be referenced within the object file where it's defined but the object file does not reveal it for linkage at all. Not for static linkage. Not for dynamic linkage. In C, a file-scope definition of a function or variable is global by default and local if it is qualified with the static storage class. So in this source file:

foobar.c (1)

static int foo(void)
{
    return 42;
}

int bar(void)
{
    return foo();
}

foo is a local symbol and bar is a global one. If we compile this file with -save-temps:

$ gcc -save-temps -c -fPIC foobar.c

then GCC will save the assembly listing in foobar.s, and there we can see how the generated assembly code registers the fact that bar is global and foo is not:

foobar.s (1)

    .file   "foobar.c"
    .text
    .type   foo, @function
foo:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $42, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   foo, .-foo
    .globl  bar
    .type   bar, @function
bar:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    call    foo
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   bar, .-bar
    .ident  "GCC: (Ubuntu 8.2.0-7ubuntu1) 8.2.0"
    .section    .note.GNU-stack,"",@progbits

The assembler directive .globl bar means that bar is a global symbol. There is no .globl foo; so foo is local.

And if we inspect the symbols in the object file itself, with

$ readelf -s foobar.o

Symbol table '.symtab' contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000    11 FUNC    LOCAL  DEFAULT    1 foo
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     9: 000000000000000b    11 FUNC    GLOBAL DEFAULT    1 bar

the message is the same:

     5: 0000000000000000    11 FUNC    LOCAL  DEFAULT    1 foo
     ...
     9: 000000000000000b    11 FUNC    GLOBAL DEFAULT    1 bar

The global symbols defined in the object file, and only the global symbols, are available to the static linker for resolving references in other object files. Indeed the local symbols only appear in the symbol table of the file at all for possible use by a debugger or some other object-file probing tool. If we redo the compilation with even minimal optimisation:

$ gcc -save-temps -O1 -c -fPIC foobar.c
$ readelf -s foobar.o

Symbol table '.symtab' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     8: 0000000000000000     6 FUNC    GLOBAL DEFAULT    1 bar

then foo disappears from the symbol table.

Since global symbols are available to the static linker, we can link a program with foobar.o that calls bar from another object file:

main.c

#include <stdio.h>

extern int bar(void);

int main(void)
{
    printf("%d\n",bar());
    return 0;
}

Like so:

$ gcc -c main.c
$ gcc -o prog main.o foobar.o
$ ./prog
42

But as you've noticed, we do not need to change foobar.o in any way to make bar dynamically visible to the loader. We can just link it as it is into a shared library:

$ gcc -shared -o libbar.so foobar.o

then dynamically link the same program with that shared library:

$ gcc -o prog main.o libbar.so

and it's fine:

$ ./prog
./prog: error while loading shared libraries: libbar.so: cannot open shared object file: No such file or directory

...Oops. It's fine as long as we let the loader know where libbar.so is, since my working directory here isn't one of the search directories that it caches by default:

$ export LD_LIBRARY_PATH=.
$ ./prog
42

The object file foobar.o has a table of symbols as we've seen, in the .symtab section, including (at least) the global symbols that are available to the static linker. The DSO libbar.so has a symbol table in its .symtab section too. But it also has a dynamic symbol table, in its .dynsym section:

$ readelf -s libbar.so

    Symbol table '.dynsym' contains 6 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __cxa_finalize
         2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
         3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
         4: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
         5: 00000000000010f5     6 FUNC    GLOBAL DEFAULT    9 bar

    Symbol table '.symtab' contains 45 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         ...
         ...
        21: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
        22: 0000000000001040     0 FUNC    LOCAL  DEFAULT    9 deregister_tm_clones
        23: 0000000000001070     0 FUNC    LOCAL  DEFAULT    9 register_tm_clones
        24: 00000000000010b0     0 FUNC    LOCAL  DEFAULT    9 __do_global_dtors_aux
        25: 0000000000004020     1 OBJECT  LOCAL  DEFAULT   19 completed.7930
        26: 0000000000003e88     0 OBJECT  LOCAL  DEFAULT   14 __do_global_dtors_aux_fin
        27: 00000000000010f0     0 FUNC    LOCAL  DEFAULT    9 frame_dummy
        28: 0000000000003e80     0 OBJECT  LOCAL  DEFAULT   13 __frame_dummy_init_array_
        29: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
        30: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
        31: 0000000000002094     0 OBJECT  LOCAL  DEFAULT   12 __FRAME_END__
        32: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
        33: 0000000000003e90     0 OBJECT  LOCAL  DEFAULT   15 _DYNAMIC
        34: 0000000000004020     0 OBJECT  LOCAL  DEFAULT   18 __TMC_END__
        35: 0000000000004018     0 OBJECT  LOCAL  DEFAULT   18 __dso_handle
        36: 0000000000001000     0 FUNC    LOCAL  DEFAULT    6 _init
        37: 0000000000002000     0 NOTYPE  LOCAL  DEFAULT   11 __GNU_EH_FRAME_HDR
        38: 00000000000010fc     0 FUNC    LOCAL  DEFAULT   10 _fini
        39: 0000000000004000     0 OBJECT  LOCAL  DEFAULT   17 _GLOBAL_OFFSET_TABLE_
        40: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __cxa_finalize
        41: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
        42: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
        43: 00000000000010f5     6 FUNC    GLOBAL DEFAULT    9 bar

The symbols in the dynamic symbol table are the ones that are dynamically visible - available to the runtime loader. You can see that bar appears both in the .symtab and in the .dynsym of libbar.so. In both cases, the symbol has GLOBAL in the bind ( = binding) column and DEFAULT in the vis ( = visibility) column.
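One way to observe dynamic visibility from inside a program is to ask the loader directly with the POSIX dlopen and dlsym calls. The following is only a sketch: find_dynamic_symbol is a hypothetical helper name, and on older glibc versions the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the address of `name` as the runtime loader
 * sees it, or NULL if the library cannot be loaded or the symbol is not
 * dynamically visible. A NULL `path` makes dlopen return a handle for the
 * main program together with the libraries loaded along with it.
 */
void *find_dynamic_symbol(const char *path, const char *name)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        return NULL;   /* dlerror() would describe the failure */
    }
    /* The handle is deliberately left open so the returned address stays valid. */
    return dlsym(handle, name);
}
```

For the libbar.so built above, find_dynamic_symbol("./libbar.so", "bar") would be expected to return a non-NULL address, while asking for "foo" would be expected to return NULL, because foo never reaches the .dynsym table.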

If you want readelf to show you just the dynamic symbol table, then:

$ readelf --dyn-syms libbar.so

will do it, but not for foobar.o, because an object file has no dynamic symbol table:

$ readelf --dyn-syms foobar.o; echo Done
Done

So the linkage:

$ gcc -shared -o libbar.so foobar.o

creates the dynamic symbol table of libbar.so, and populates it with symbols from the global symbol table of foobar.o (and various GCC boilerplate files that GCC adds to the linkage by default).

This makes it look like your guess:

I roughly guess that all of the functions in .so file are automatically exported

is right. In fact it's close, but not correct.

See what happens if I recompile foobar.c like this:

$ gcc -save-temps -fvisibility=hidden -c -fPIC foobar.c

Let's take another look at the assembly listing:

foobar.s (2)

...
...
    .globl  bar
    .hidden bar
    .type   bar, @function
bar:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    call    foo
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
...
...

Notice the assembler directive:

    .hidden bar

that wasn't there before. .globl bar is still there; bar is still a global symbol. I can still statically link foobar.o in this program:

$ gcc -o prog main.o foobar.o
$ ./prog
42

And I can still link this shared library:

$ gcc -shared -o libbar.so foobar.o

But I can no longer dynamically link this program:

$ gcc -o prog main.o libbar.so
/usr/bin/ld: main.o: in function `main':
main.c:(.text+0x5): undefined reference to `bar'
collect2: error: ld returned 1 exit status

In foobar.o, bar is still in the symbol table:

$ readelf -s foobar.o | grep bar
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
     9: 000000000000000b    11 FUNC    GLOBAL HIDDEN     1 bar

but it is now marked HIDDEN in the vis ( = visibility) column of the output.

And bar is still in the symbol table of libbar.so:

$ readelf -s libbar.so | grep bar
    29: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
    41: 0000000000001100    11 FUNC    LOCAL  DEFAULT    9 bar

But this time, it is a LOCAL symbol. It will not be available to the static linker from libbar.so - as we saw just now when our linkage failed. And it is no longer in the dynamic symbol table at all:

$ readelf --dyn-syms libbar.so | grep bar; echo done
done

So the effect of -fvisibility=hidden, when compiling foobar.c, is to make the compiler annotate .globl symbols as .hidden in foobar.o. Then, when foobar.o is linked into libbar.so, the linker converts every global hidden symbol to a local symbol in libbar.so, so that it cannot be used to resolve references whenever libbar.so is linked with something else. And it does not add the hidden symbols to the dynamic symbol table of libbar.so, so the runtime loader cannot see them to resolve references dynamically.

The story so far: When the linker creates a shared library, it adds to the dynamic symbol table all of the global symbols that are defined in the input object files and are not marked hidden by the compiler. These become the dynamically visible symbols of the shared library. Global symbols are not hidden by default, but we can hide them with the compiler option -fvisibility=hidden. The visibility that this option refers to is dynamic visibility.

Now the ability to remove global symbols from dynamic visibility with -fvisibility=hidden doesn't look very useful yet, because it seems that any object file we compile with that option can contribute no dynamically visible symbols to a shared library.

But actually, we can control individually which global symbols defined in an object file will be dynamically visible and which will not. Let's change foobar.c as follows:

foobar.c (2)

static int foo(void)
{
    return 42;
}

int __attribute__((visibility("default"))) bar(void)
{
    return foo();
}

The __attribute__ syntax you see here is a GCC language extension that is used to specify properties of symbols that are not expressible in the standard language - such as dynamic visibility. Microsoft's __declspec(dllexport) is a Microsoft language extension with the same effect as GCC's __attribute__((visibility("default"))). But for GCC, global symbols defined in an object file will possess __attribute__((visibility("default"))) by default, and you have to compile with -fvisibility=hidden to override that.
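Because the two extensions play the same role, code that has to build on both platforms commonly wraps them in a macro. The following is a minimal sketch of that idea, not something the toolchain requires; BAR_API is a hypothetical macro name chosen for this example.

```c
/* BAR_API is a hypothetical name; any project-specific macro would do. */
#if defined(_WIN32)
#  define BAR_API __declspec(dllexport)
#elif defined(__GNUC__)
#  define BAR_API __attribute__((visibility("default")))
#else
#  define BAR_API   /* unknown compiler: rely on its default behaviour */
#endif

/* Local symbol: never visible outside this object file. */
static int foo(void)
{
    return 42;
}

/* Explicitly marked as part of the library's dynamic API on both platforms. */
BAR_API int bar(void)
{
    return foo();
}
```

On Windows, code that consumes the DLL would normally see __declspec(dllimport) rather than dllexport in the header, so real projects tend to switch the macro on a "building the library" define; that detail is left out here.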

Recompile like last time:

$ gcc -fvisibility=hidden -c -fPIC foobar.c

And now the symbol table of foobar.o:

$ readelf -s foobar.o | grep bar
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
     9: 000000000000000b    11 FUNC    GLOBAL DEFAULT    1 bar

shows bar with DEFAULT visibility once again, despite -fvisibility=hidden. And if we relink libbar.so:

$ gcc -shared -o libbar.so foobar.o

we see that bar is back in the dynamic symbol table:

$ readelf --dyn-syms libbar.so | grep bar
     5: 0000000000001100    11 FUNC    GLOBAL DEFAULT    9 bar

So, -fvisibility=hidden tells the compiler to mark a global symbol as hidden unless, in the source code, we explicitly specify a countervailing dynamic visibility for that symbol.

That's one way to select precisely the symbols from an object file that we wish to make dynamically visible: pass -fvisibility=hidden to the compiler, and individually specify __attribute__((visibility("default"))), in the source code, for just the symbols we want to be dynamically visible.

Another way is not to pass -fvisibility=hidden to the compiler, and individually specify __attribute__((visibility("hidden"))), in the source code, for just the symbols that we don't want to be dynamically visible. So if we change foobar.c again like so:

foobar.c (3)

static int foo(void)
{
    return 42;
}

int __attribute__((visibility("hidden"))) bar(void)
{
    return foo();
}

then recompile with default visibility:

$ gcc -c -fPIC foobar.c

bar reverts to hidden in the object file:

$ readelf -s foobar.o | grep bar
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS foobar.c
     9: 000000000000000b    11 FUNC    GLOBAL HIDDEN     1 bar

And after relinking libbar.so, bar is again absent from its dynamic symbol table:

$ gcc -shared -o libbar.so foobar.o
$ readelf --dyn-syms libbar.so | grep bar; echo Done
Done
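Note that hidden is not the same thing as static. A hidden symbol is still global in the object file, so other object files linked into the same DSO can call it; it only disappears from view once the DSO has been linked. Here is a sketch of that arrangement, with hypothetical function names, shown as a single listing although the two functions would normally live in separate source files (say helper.c and api.c):

```c
/* helper.c (hypothetical): global, so api.o can reference it when the
 * DSO is linked, but hidden, so it stays out of the dynamic symbol table. */
__attribute__((visibility("hidden"))) int helper_scale(int x)
{
    return 2 * x;
}

/* api.c (hypothetical): default visibility, meant to be the one
 * dynamically visible entry point of the library. */
__attribute__((visibility("default"))) int api_double_plus_one(int x)
{
    return helper_scale(x) + 1;
}
```

Had helper_scale been declared static instead, it could not have been called from a different source file at all.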

The professional approach is to minimize the dynamic API of a DSO to exactly what is specified. With the apparatus we've discussed, that means compiling with -fvisibility=hidden and using __attribute__((visibility("default"))) to expose the specified API. A dynamic API can also be controlled - and versioned - with the GNU linker using a type of linker script called a version-script: that is a yet more professional approach.

Further reading:

  • GCC Wiki: Visibility

  • GCC Manual: Common Function Attributes -> visibility ("visibility_type")

Mike Kinghan answered Sep 08 '25 00:09