
How does the Python executable parse and execute scripts? [closed]

Let's say I have the following script, test.py:

import my_library

bar = 12

def foo():
    nested_bar = 21

    my_library.do_things()

    def nested_foo():
        nested_bar += 11
        not_a_variable += 1
            {$ invalid_syntax

bar = 13
foo()
bar = 14

I'm curious as to what exactly happens when I run python test.py. Obviously Python doesn't just read programs line-by-line - otherwise it wouldn't catch syntax errors before actually executing the program. But this makes the workings of the interpreter seem somewhat nebulous. I was wondering if someone would help clear things up for me. In particular, I would like to know:

  1. At what point does Python realize there is a syntax error on line 13?

  2. At what point does Python read the nested functions and add them to the scope of foo?

  3. Similarly, how does Python add the function foo to its namespace when it encounters it, without executing it?

  4. Suppose my_library were an invalid import. Would Python necessarily raise an ImportError before executing any other commands?

  5. Suppose my_library were a valid module, but it has no function do_things. At what point would Python realize this, during execution of foo() or before?

If anyone could point me to documentation on how Python parses and executes scripts it would be very much appreciated.

Jamie asked Jan 20 '26 17:01


2 Answers

There's some information in the tutorial's section on modules, but I don't think the documentation has a complete reference for this. So, here's what happens.

When you first run a script or import a module, Python parses the syntax into an AST and then compiles that into bytecode. It hasn't executed anything yet; it's just compiled your code into instructions for a little stack-based machine. This is where syntax errors are caught. (You can see the guts of all this in the ast module, the token module, the compile builtin, the grammar reference, and sprinkled around various other places.)
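
To make the compile phase concrete, here is a minimal sketch using the compile builtin and the ast module mentioned above. The source strings and the "<example>" filename are made up for illustration; the point is only that compiling rejects bad syntax and never runs anything.

```python
import ast

# Hypothetical source: a valid first line, then a line that cannot be parsed.
bad_source = "print('side effect')\nx = = 1\n"

try:
    compile(bad_source, "<example>", "exec")
    compiled = True
except SyntaxError as exc:
    compiled = False
    error_line = exc.lineno  # line on which the parser gave up

# Valid source compiles to a code object without being run.
executed = []
code = compile("executed.append('ran')", "<example>", "exec")
ran_before_exec = list(executed)      # still empty: compiling executed nothing
exec(code, {"executed": executed})    # only now does the statement run

# The AST is ordinary data you can inspect.
tree = ast.parse("bar = 12")
first_node = type(tree.body[0]).__name__
```

Note that the print in bad_source never runs: the whole string is rejected before any of it is executed.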

You can actually compile a module independently of running the generated code; that's what the standard library's compileall module does.
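
As a rough illustration of compiling without running, the sketch below writes a throwaway script whose only job is to create a marker file, then byte-compiles it with compileall.compile_file. The file names (demo_script.py, marker.txt) are assumptions invented for this example.

```python
import compileall
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "marker.txt")
    script = os.path.join(tmp, "demo_script.py")

    # If this script ever ran, it would create the marker file.
    with open(script, "w") as fh:
        fh.write(f"open({marker!r}, 'w').close()\n")

    # Byte-compile only; quiet=1 suppresses the progress message.
    compiled_ok = bool(compileall.compile_file(script, quiet=1))

    # Where the interpreter caches the bytecode for this source file.
    pyc_path = importlib.util.cache_from_source(script)
    pyc_exists = os.path.exists(pyc_path)

    # The script's side effect never happened.
    marker_created = os.path.exists(marker)
```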

So that's the first phase: compiling. Python only has one other phase, which is actually running the code. Every statement in your module, except those contained within def or lambda, is executed in order. That means that imports happen at runtime, wherever you happen to put them in your module. Which is part of the reason it's good hygiene to put them all at the top. Same for def and class: these are just statements that create a specific type of object, and they're executed as they're encountered, like anything else.
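
A small sketch of that ordering, with made-up names (events, greet, lazy_import): the def statement runs where it appears, but the function body, including an import inside it, runs only when the function is called.

```python
events = []

events.append("before def")

def greet():
    # This body runs only when greet() is called, not when def is executed.
    events.append("inside greet")

events.append("after def")  # greet now exists, but its body has not run yet

def lazy_import():
    # The import statement executes at call time, like any other statement.
    import json
    return json.dumps({"ok": True})

greet()
result = lazy_import()
```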

The only tricky bit here is that the phases can happen more than once — for example, an import is only executed at runtime, but if you've never imported that module before, then it has to be compiled, and now you're back in compile time. But "outside" the import it's still runtime, which is why you can catch a SyntaxError thrown by an import.
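
Here is one way to see that in action. The sketch writes a deliberately broken module to a temporary directory and imports it inside a try block; broken_demo_module is a hypothetical name chosen for the example.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module with a syntax error in its def line.
    with open(os.path.join(tmp, "broken_demo_module.py"), "w") as fh:
        fh.write("def oops(:\n    pass\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # This file compiles fine; the broken module is compiled only when
        # this statement executes, so its SyntaxError surfaces here.
        import broken_demo_module
        outcome = "imported"
    except SyntaxError as exc:
        outcome = f"caught SyntaxError from {os.path.basename(exc.filename)}"
    finally:
        sys.path.remove(tmp)
```

The outer code is already running when the inner module fails to compile, which is why an ordinary except clause can handle it.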

Anyway, to answer your specific questions:

  1. At compile time. When you run this as a script, or when you import it as a module, or when you compile it with compileall, or otherwise ask Python to make any sense of it. In practical terms, this can happen at any time: if you tried to import this module within a function, you'd only get a SyntaxError when calling that function, which might be halfway through your program.

  2. During the execution of foo, because def and class just create a new object and assign it to a name. But Python still knows how to create the nested function, because it's already compiled all the code within it.

  3. The same way it would add foo = lambda: 1 + 2 to a namespace without executing it. A function is just an object that contains a "code" attribute — literally just a block of Python bytecode. You can manipulate the code type as data, because it is data, independently of executing it. Try looking at a function's .__code__, read the "code objects" section of the data model, or even play around with the disassembler. (You can even execute a code object directly with custom locals and globals using exec, or change the code object a function uses!)

  4. Yes, because import is a plain old statement like any other, executed in order. But if there were other code before the import, that would run first. And if it were in a function, you wouldn't get an error until that function ran. Note that import, just like def and class, is just a fancy form of assignment.

  5. Only during the execution of foo(). Python has no way of knowing whether other code will add a do_things to your module before that point, or even change my_library to some other object entirely. Attribute lookups are always done just-in-time, when you ask for them, never in advance.
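
A few of these points can be poked at directly. The following is only a sketch: my_library here is a types.SimpleNamespace standing in for the question's module, and one, two, and call_it are names invented for the demo.

```python
import dis
import types

def foo():
    def nested_foo():
        return 11
    return nested_foo

# foo has never been called, yet the compiled body of nested_foo already
# sits among foo's constants as a code object.
nested_names = [
    const.co_name
    for const in foo.__code__.co_consts
    if isinstance(const, types.CodeType)
]

# A code object can be executed directly with a namespace of your choosing.
namespace = {}
exec(compile("answer = 1 + 2", "<example>", "exec"), namespace)

# The code object a function uses can even be swapped out.
def one():
    return 1

def two():
    return 2

one.__code__ = two.__code__
swapped = one()

# Attribute lookup happens only when the line actually runs.
my_library = types.SimpleNamespace()  # stand-in with no do_things yet

def call_it():
    return my_library.do_things()

try:
    call_it()
    first_call = "ok"
except AttributeError:
    first_call = "AttributeError"

my_library.do_things = lambda: "done"  # added later, before the second call
second_call = call_it()

# The disassembler lists the bytecode instructions behind call_it.
opnames = [ins.opname for ins in dis.get_instructions(call_it)]
```

The same call_it function fails the first time and succeeds the second, without being recompiled, because the lookup of do_things is repeated on every call.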

Eevee answered Jan 22 '26 05:01


As a general rule, Python first parses the file, compiles the abstract syntax tree to bytecode, and then attempts to execute it sequentially, so top-level statements run in order, one after another. This means:

  1. Syntax errors are caught at parse time, before anything is executed. If you add some side effect to the script, e.g. create a file, you will see that it never gets executed.
  2. A function is bound in its enclosing scope only once its def statement has executed. If you try to call nested_foo right before def nested_foo(), the call fails because nested_foo has not been defined at that point.
  3. Same as 2.
  4. If Python cannot import a library (and importing a module for the first time means executing it), the import statement fails with an ImportError.
  5. Since you don't try to access do_things at import time (i.e. you are not doing from my_library import do_things), an error only occurs when you attempt to call foo().
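
Points 2, 4, and 5 can be checked with a short sketch. It uses the real math module as a stand-in for my_library, and a_module_that_does_not_exist_xyz is a made-up name assumed not to be installed.

```python
import math

def foo():
    try:
        nested_foo()  # the name is local to foo but not bound yet
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        early = type(exc).__name__

    def nested_foo():
        return "defined"

    return early, nested_foo()

result = foo()

# A missing module fails when the import statement executes.
try:
    import a_module_that_does_not_exist_xyz  # hypothetical, assumed absent
    missing_module = "imported"
except ImportError as exc:
    missing_module = type(exc).__name__

# "from ... import name" looks the name up at import time.
try:
    from math import do_things
    from_import = "imported"
except ImportError:
    from_import = "ImportError"

# Plain attribute access is checked only when the line runs.
def use_later():
    return math.do_things()

try:
    use_later()
    late = "ok"
except AttributeError:
    late = "AttributeError"
```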

univerio answered Jan 22 '26 07:01


