How does the compiler differentiate identically named items?

In the following example:

int main(void) {
    int a=7;
    {
        int a=8;
    }
}

Without optimizations, the generated assembly looks something like this (from Compiler Explorer):

main:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $7, -4(%rbp)   # outer scope: int a=7
        movl    $8, -8(%rbp)   # inner scope: int a=8
        movl    $0, %eax
        popq    %rbp
        ret

How does the compiler know which variable is meant when several variables share the same name? That is, how does it know that the variable lives at %rbp-8 in the inner scope and at %rbp-4 in the outer scope?

samuelbrody1249 asked Nov 24 '25 15:11



1 Answer

There are many ways to implement the local scoping rule. Here is a simple example:

  • The compiler can keep a list of nested scopes, each with its own list of symbol definitions.
  • This list initially has a single element for the global scope.
  • When it parses a function definition, it adds a new scope element at the front of the scope list for the function argument names, and records each argument name, with the corresponding information, in the identifier list of this scope element.
  • For each new block, it adds a new scope element at the front of the scope list. A for statement also introduces a new scope for the definitions in its first clause.
  • Upon leaving the scope (at the end of the block), it pops the scope element from the scope list.
  • When it parses a declaration or a definition, if the corresponding symbol is already in the current scope's list, it is a local redefinition, which is forbidden (except for extern forward declarations). Otherwise the symbol is added to the current scope's list.
  • When it encounters a symbol in an expression, it looks it up in the current scope's list of symbols, then in each successive enclosing scope in the scope list, until it finds it. If the symbol cannot be found, it is undefined, which is an error according to the latest C Standard. Otherwise the symbol information is used for further parsing and code generation.
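
The steps above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not how any particular compiler does it: the names scope_stack, scope_push, scope_pop, declare and lookup are hypothetical, the fixed-size arrays are arbitrary limits, and every variable is assumed to be a 4-byte int that gets the next free slot below %rbp. Real compilers use richer symbol records and may lay out the frame differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS  16   /* symbols per scope (arbitrary limit for this sketch) */
#define MAX_DEPTH 16   /* maximum nesting depth (arbitrary limit) */

/* What the compiler remembers about one identifier. */
struct symbol {
    const char *name;
    int offset;        /* frame offset relative to %rbp, e.g. -4 */
};

/* One scope: the identifiers declared directly in it. */
struct scope {
    struct symbol syms[MAX_SYMS];
    int count;
};

/* The scope list; scopes[depth - 1] is the innermost scope. */
struct scope_stack {
    struct scope scopes[MAX_DEPTH];
    int depth;
    int next_offset;   /* offset handed to the next declared int */
};

void scope_init(struct scope_stack *st) {
    st->depth = 0;
    st->next_offset = -4;
}

/* Entering a block: push an empty scope. */
void scope_push(struct scope_stack *st) {
    assert(st->depth < MAX_DEPTH);
    st->scopes[st->depth++].count = 0;
}

/* Leaving a block: its symbols are no longer visible. */
void scope_pop(struct scope_stack *st) {
    assert(st->depth > 0);
    st->depth--;
}

/* Search one scope only; NULL if the name is not declared there. */
struct symbol *find_in_scope(struct scope *sc, const char *name) {
    for (int i = 0; i < sc->count; i++)
        if (strcmp(sc->syms[i].name, name) == 0)
            return &sc->syms[i];
    return NULL;
}

/* Declare an int in the innermost scope.
   Returns NULL if the name already exists in that same scope. */
struct symbol *declare(struct scope_stack *st, const char *name) {
    assert(st->depth > 0);
    struct scope *cur = &st->scopes[st->depth - 1];
    if (find_in_scope(cur, name) != NULL)
        return NULL;                    /* local redefinition */
    assert(cur->count < MAX_SYMS);
    struct symbol *s = &cur->syms[cur->count++];
    s->name = name;
    s->offset = st->next_offset;
    st->next_offset -= 4;               /* assumes every object is a 4-byte int */
    return s;
}

/* Resolve a use of a name: innermost scope first, then outward. */
struct symbol *lookup(struct scope_stack *st, const char *name) {
    for (int d = st->depth - 1; d >= 0; d--) {
        struct symbol *s = find_in_scope(&st->scopes[d], name);
        if (s != NULL)
            return s;
    }
    return NULL;                        /* undeclared identifier */
}
```

Applied to the program in the question, the sketch pushes a scope for the function body and declares a at offset -4, then pushes a scope for the inner block and declares a second a at offset -8. While the inner scope is on the list, lookup finds the -8 entry first; after scope_pop, the same lookup finds the -4 entry again.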

The steps above apply to ordinary identifiers such as type (typedef) and object names; a separate list of symbols is maintained for struct, union and enum tags.
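
As a small illustration of the separate tag list, the following C fragment uses the same name as a struct tag and as an object name in the same scope. The function name tag_and_object is made up for this example.

```c
#include <assert.h>

struct a { int x; };           /* "a" as a struct tag */

int tag_and_object(void) {
    int a = 7;                 /* "a" as an object name: no clash with the tag */
    struct a s = { a + 1 };    /* the tag and the object are both usable here */
    return s.x;
}
```

Because a tag is always introduced by the struct, union or enum keyword, the compiler can tell from the syntax which list to search.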

Preprocessing is performed before all of this occurs, in a separate phase of program translation.

chqrlie answered Nov 27 '25 05:11
