Type annotating instance attributes: in init or body?

Consider the following two syntax variations:

class Foo:

    x: int

    def __init__(self, an_int: int):
        self.x = an_int

And

class Foo:

    def __init__(self, an_int: int):
        self.x = an_int

Apparently the following code raises a mypy error in both cases (which is expected):

obj = Foo(3)
obj.x.title()  # this is a str operation

But I really want to enforce the contract: I want to make it clear that x is an instance variable of every Foo object. So which syntax should be preferred, and why?

JPFrancoia asked Dec 28 '25 12:12


2 Answers

This is ultimately a matter of personal preference. To use the example in the other answer, doing both:

class Foo:
    x: Union[int, str]

    def __init__(self, an_int: int) -> None:
        self.x = an_int

...and doing:

class Foo:
    def __init__(self, an_int: int) -> None:
        self.x: Union[int, str] = an_int

...will be treated in the exact same way by type checkers.

The main advantage of the former is that it keeps the types of your attributes obvious when your constructor is complex enough that it's difficult to trace what type inference is being performed.

This style is also consistent with how you declare and use things like dataclasses:

from dataclasses import dataclass
from typing import Union

@dataclass
class Foo:
    x: int
    y: Union[int, str]
    z: str

# You get an `__init__` for free. Mypy will check to make sure the types match.
# So this type checks:
a = Foo(1, "b", "c")

# ...but this doesn't:
b = Foo("bad", 3.14, 0)

This isn't really a pro or a con, just more of an observation that the standard library has, in some specific cases, embraced the former style.
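
As a rough sketch of what the dataclass example relies on: the generated `__init__` is built from the annotated fields, and the type mismatch is caught by mypy rather than at runtime. The names `field_names` and `b` below are only illustrative.

```python
from dataclasses import dataclass, fields
from typing import Union


@dataclass
class Foo:
    x: int
    y: Union[int, str]
    z: str


# The generated __init__ takes one parameter per annotated field, in order.
field_names = [f.name for f in fields(Foo)]

# mypy flags this call, but nothing validates the arguments at runtime,
# so the object is created with the "wrong" values anyway.
b = Foo("bad", 3.14, 0)
```

In other words, the annotations are a contract for the type checker; if you also need runtime validation, that has to be added separately.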

The main disadvantage is that this style is somewhat verbose: you're forced into repeating the variable name twice (three times, if you include the __init__ parameter), and often forced into repeating the type hint twice (once in your variable annotation and once in the __init__ signature).

It also opens up a possible correctness issue in your code: mypy will never actually check that you've assigned anything to your attribute! For example, the following code will happily type check even though it crashes at runtime:

class Foo:
    x: int

    def __init__(self, x: int) -> None:
        # Whoops, I forgot to do 'self.x = x'
        pass

f = Foo(1)

# Type checks, but crashes at runtime!
print(f.x)

The latter style dodges these issues: if you forget to assign an attribute, mypy will complain that it doesn't exist when you try using it later.

The other main advantage of the latter style is that you can also get away with not adding an explicit type hint a lot of the time, especially if you're just assigning a parameter directly to a field. The type checker will infer the exact same type in those cases.
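
A small sketch of when the hint can be dropped and when it can't, assuming mypy's usual inference from the first assignment. The class names `Inferred` and `Explicit` and the method `set_x` are made up for illustration.

```python
from typing import Union


class Inferred:
    def __init__(self, an_int: int) -> None:
        # No annotation needed: mypy infers `int` from the parameter's type.
        self.x = an_int


class Explicit:
    def __init__(self, an_int: int) -> None:
        # Annotation needed: the attribute is meant to be wider than `int`.
        self.x: Union[int, str] = an_int

    def set_x(self, a_str: str) -> None:
        # Accepted only because x was declared as Union[int, str] above.
        self.x = a_str


a = Inferred(3)
b = Explicit(3)
b.set_x("three")
```

Without the explicit annotation in `Explicit`, mypy would infer `int` and reject the assignment in `set_x`.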


So given these factors, my personal preference is to:

  1. Use dataclasses (and by proxy, the former style) if I just want a simple, record-like object with an automatically generated __init__.
  2. Use the latter style if I either feel dataclasses are overkill or need to write a custom __init__, to decrease both verbosity and the odds of running into the "forgot-to-assign-an-attribute" bug.
  3. Switch back to the former style if I have a sufficiently large and complex __init__ that's somewhat difficult to read. (Or better yet, just refactor my code so I can keep the __init__ simple!)

You may end up weighing these factors differently and come up with a different set of tradeoffs, of course.


One final tangent -- when you do:

class Foo:
    x: int

...you are not actually annotating a class variable. At this point, x has no value, so doesn't actually exist as a variable.

The only thing you're creating is an annotation, which is just pure metadata and distinct from the variable itself.
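
You can see this at runtime with a quick check (a minimal sketch; the names `annotations`, `has_class_attr` and `raised` are just illustrative):

```python
class Foo:
    x: int  # annotation only: no value is assigned


# The annotation is recorded as metadata on the class...
annotations = Foo.__annotations__

# ...but no attribute named x exists on the class.
has_class_attr = hasattr(Foo, "x")

# Instances don't have it either, so reading it fails.
try:
    Foo().x
    raised = False
except AttributeError:
    raised = True
```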

But if you do:

class Foo:
    x: int = 3

...then you are creating both a class variable and an annotation. Somewhat confusingly, while you may be creating a class variable/attribute (as opposed to an instance variable/attribute), mypy and other type checkers will continue assuming that the annotation is meant to annotate specifically an instance attribute.

This inconsistency usually doesn't matter in practice, especially if you follow the general best practice of avoiding mutable default values for anything. But this may cause some surprises if you're trying to do something fancy.

If you want mypy/other type checkers to understand your annotation is a class variable annotation, you need to use the ClassVar type:

# ClassVar has been available in 'typing' since Python 3.5.3
from typing import ClassVar

class Foo:
    x: ClassVar[int] = 3
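
To sketch what the ClassVar declaration changes in practice (the names `f` and `shared` are illustrative, and the commented-out line describes mypy's behaviour rather than anything enforced at runtime):

```python
from typing import ClassVar


class Foo:
    # Declared as a class variable: one value, shared through the class.
    x: ClassVar[int] = 3


Foo.x = 4      # assigning through the class is accepted by mypy
f = Foo()
shared = f.x   # reading through an instance is fine and sees the class value

# f.x = 5      # mypy rejects assigning to a ClassVar through an instance
```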

Michael0x2a answered Dec 31 '25 00:12


If you ever want to use Any, Union, or Optional for an instance variable, you should annotate it explicitly:

from typing import Union

class Foo:

    x: Union[int, str]

    def __init__(self, an_int: int):
        self.x = an_int

    def setx(self, a_str: str):
        self.x = a_str

Otherwise you can use whichever you think is easier to read. mypy will infer the type from __init__.

Anonymous answered Dec 31 '25 00:12


