In a pandas DataFrame, you can get/set existing column data by simply using:
df.column_name
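Understandably, it has limitations: the column name has to be a valid Python identifier that doesn't clash with an existing DataFrame attribute or method, and it doesn't work if the column doesn't exist yet.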
How do they actually implement this? I was expecting them to use __getattr__ and __setattr__, but I don't see those in pandas/core/frame.py. It seems to pass through the query method, but beyond that I can't figure it out.
I'd like to do something similar in my own application, and I'd really rather avoid __getattr__ and __setattr__.
I did a bit more digging and found the answer in pandas/core/generic.py, which defines the base class that DataFrame inherits from. It does indeed use __getattr__ and __setattr__:
    def __getattr__(self, name: str):
        """After regular attribute access, try looking up the name
        This allows simpler access to columns for interactive use.
        """

    def __setattr__(self, name: str, value) -> None:
        """After regular attribute access, try setting the name
        This allows simpler access to columns for interactive use.
        """
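It works as you might expect. Python only calls __getattr__ after normal attribute lookup has failed, and at that point pandas checks whether the name matches a column label and returns that column. __setattr__ does the mirror image: if the name matches an existing column, the value is assigned to the column; otherwise it is set as an ordinary attribute, which is why df.new_name = value does not create a new column.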
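For anyone who wants the same behaviour in their own class, here is a minimal sketch of the pattern. It is not the pandas source: ColumnTable, its _columns dict and the sample data are hypothetical names made up for illustration, and the rule for name clashes is simplified compared with what pandas does.

```python
class ColumnTable:
    """Toy container with attribute-style access to existing columns.

    Hypothetical example class, not part of pandas.
    """

    def __init__(self, columns):
        # Bypass our own __setattr__ so that _columns is stored
        # as a regular instance attribute.
        object.__setattr__(self, "_columns", dict(columns))

    def __getattr__(self, name):
        # Python calls this only after normal attribute lookup has failed.
        columns = object.__getattribute__(self, "_columns")
        if name in columns:
            return columns[name]
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )

    def __setattr__(self, name, value):
        # Names matching an existing column update that column;
        # anything else becomes an ordinary attribute (no new column).
        if name in self._columns:
            self._columns[name] = value
        else:
            object.__setattr__(self, name, value)


table = ColumnTable({"price": [1.0, 2.5], "qty": [3, 4]})
prices = table.price       # read an existing column
table.qty = [10, 20]       # update an existing column
table.note = "draft"       # plain attribute, not a column
```

Using object.__getattribute__ and object.__setattr__ for the internal dict is what keeps the two hooks from calling each other recursively. If you want new columns to be creatable by assignment, that would be a change to the else branch of __setattr__, but it makes typos silently create columns, which is presumably why pandas doesn't do it.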