Assume x has type list[int] | list[str], that is, it is either a list containing only ints or a list containing only strings.
Then set(x) is either a set containing only ints, or a set containing only strings, so I would expect it to have the type set[int] | set[str]. However, Pylance infers the type set[int | str], which in my understanding describes a set that may contain both ints and strings, so clearly a different type. (Indeed, Pylance then agrees that these two types are not compatible.)
Is this behavior a bug or am I missing something?
def my_set(x: list[int] | list[str]) -> set[int] | set[str]:
    return set(x)
Now Pylance reports at set(x):
Expression of type "set[int | str]" cannot be assigned to return type "set[int] | set[str]"
  Type "set[int | str]" cannot be assigned to type "set[int] | set[str]"
    "set[int | str]" is incompatible with "set[int]"
      TypeVar "_T@set" is invariant
        Type "int | str" cannot be assigned to type "int"
          "str" is incompatible with "int"
    "set[int | str]" is incompatible with "set[str]"
      TypeVar "_T@set" is invariant
        Type "int | str" cannot be assigned to type "str"
...Pylance(reportGeneralTypeIssues)
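The "invariant" lines are the core of that message. As a minimal sketch of why a set[int] cannot stand in for a set[int | str] (the names add_label and ints are hypothetical, chosen only for this illustration):

```python
from __future__ import annotations


def add_label(s: set[int | str]) -> None:
    # Perfectly legal for the declared type: a set[int | str] may hold strings.
    s.add("label")


ints: set[int] = {1, 2, 3}

# A type checker rejects this call because set is invariant in its element type.
# At runtime nothing prevents it, and the "set[int]" ends up holding a str.
add_label(ints)  # type: ignore[arg-type]

print(ints)
```

If the call were allowed, any code that relies on ints containing only integers would break, which is why the checker treats the two set types as unrelated.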
Use a constrained type variable (see PEP 484):
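from typing import TypeVar
# Constrained type variable: T is solved to exactly str or exactly int.
T = TypeVar("T", str, int)
def my_set(x: list[T]) -> set[T]:
    return set(x)
a = my_set([1, 2, 3])
b = my_set(["x", "y", "z"])
# reveal_type is understood by type checkers; it is not a builtin at runtime.
reveal_type(a)  # inferred by `mypy` as `set[int]`
reveal_type(b)  # inferred by `mypy` as `set[str]`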
I agree it is strange that the type is not inferred the way you expected, and I am not sure why. Even though there is no way to know whether list[int] or list[str] is passed to set(), it is certainly one or the other. So I would also expect the return type to be inferred as set[int] | set[str].
This type inference works with one concrete type:
reveal_type(set([1, 2, 3])) # inferred as `set[int]`
I wonder why it doesn't work with the union.
Interestingly, mypy erases the element type entirely rather than producing the set[int | str] you got: with mypy I get set[object].
Either way, the generic route is the "more correct" one. The return type is then a single set[int] or set[str], determined by the argument, rather than a union the caller has to untangle.
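A rough sketch of what that difference means for the caller. The names my_set_generic and my_set_union are hypothetical, and the `# type: ignore` only silences the return-type error discussed in the question so the focus stays on the call sites:

```python
from __future__ import annotations

from typing import TypeVar

T = TypeVar("T", str, int)


def my_set_generic(x: list[T]) -> set[T]:
    return set(x)


def my_set_union(x: list[int] | list[str]) -> set[int] | set[str]:
    return set(x)  # type: ignore[return-value]


# Generic version: the element type follows the argument, so the result
# can be used as a set[int] directly.
total = sum(my_set_generic([1, 2, 2, 3]))

# Union version: the caller only knows "set[int] or set[str]" and has to
# narrow the elements before doing anything int-specific.
result = my_set_union([1, 2, 2, 3])
total_from_union = sum(e for e in result if isinstance(e, int))

print(total, total_from_union)
```

Both calls produce the same value at runtime; the difference is only in how much the type checker knows about the result.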
UPDATE: I opened an issue for mypy. In mypy's case the behavior seems to stem from its use of type joins. We'll see if and how that gets resolved.
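For intuition about what a "join" does, here is a toy sketch. It is not mypy's implementation: toy_join is a hypothetical helper that only walks runtime class hierarchies, whereas mypy joins its own internal type representations and handles many more cases. It simply picks the nearest common base class of two classes:

```python
def toy_join(a: type, b: type) -> type:
    """Return the first class in a's MRO that b also derives from."""
    for base in a.__mro__:
        if issubclass(b, base):
            return base
    return object  # unreachable for ordinary classes, kept as a fallback


# int and str share nothing but object, which mirrors the set[object] result.
print(toy_join(int, str))

# bool is a subclass of int, so their join is int.
print(toy_join(bool, int))
```

Under this simplified view, solving the single type variable of set() against both int and str yields object, whereas a checker that forms a union instead yields int | str.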