
How is dot notation implemented in pandas DataFrame?

In a pandas DataFrame, you can get/set existing column data by simply using:

df.column_name

Understandably, it has limitations: the column name has to be a valid Python identifier that doesn't clash with an existing attribute, and it doesn't work if the column doesn't exist.
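To make those limitations concrete, here is a small sketch. The frame and its column names (price, count, unit cost, total) are made up for illustration, and the warning that pandas emits on the last assignment is silenced only to keep the output clean.

```python
import warnings

import pandas as pd

# Illustrative data; the column names are assumptions chosen to hit each case.
df = pd.DataFrame({"price": [1.0, 2.0], "count": [3, 4], "unit cost": [5, 6]})

# A valid identifier that clashes with nothing resolves to the column.
same_column = df.price.equals(df["price"])

# "count" is also a DataFrame method, so dot access returns the method.
shadowed = callable(df.count)

# "unit cost" is not a valid identifier; only df["unit cost"] can reach it.
unit_cost = df["unit cost"]

# Setting through the dot works for a column that already exists.
df.price = [10.0, 20.0]

# Setting a name that is not a column creates a plain attribute instead
# (pandas warns about this), and the columns are left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.total = [7, 8]
created = "total" in df.columns
```

Bracket access, df["name"], has none of these restrictions.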

How is this actually implemented? I was expecting __getattr__ and __setattr__, but I don't see them in pandas/core/frame.py. It seems to pass through the query method, but beyond that I can't figure it out.

I'm interested in doing something like this in my own application, and I really hate using __getattr__ and __setattr__.

KCharlie asked Oct 25 '25 15:10


1 Answer

I did a bit more digging and found the answer in pandas/core/generic.py, which defines NDFrame, the base class of DataFrame. It does indeed use __getattr__ and __setattr__:

def __getattr__(self, name: str):
    """After regular attribute access, try looking up the name
    This allows simpler access to columns for interactive use.
    """
    ...  # body omitted in this excerpt

def __setattr__(self, name: str, value) -> None:
    """After regular attribute access, try setting the name
    This allows simpler access to columns for interactive use.
    """
    ...  # body omitted in this excerpt

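It works as you might expect. __getattr__ is only called after regular attribute lookup fails, and it returns the column when the name matches a column label. __setattr__ assigns to the column when the name matches an existing one and otherwise sets an ordinary attribute, which is why df.new_name = values does not create a column.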
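For your own application, the same pattern can be sketched in plain Python. This is not the pandas source: Table and its _columns dict are hypothetical names, and the real implementation handles more cases (internal names, metadata, accessors).

```python
class Table:
    """Hypothetical dict-backed container with dot access to columns."""

    def __init__(self, columns):
        # Bypass our own __setattr__ while creating the backing store.
        object.__setattr__(self, "_columns", dict(columns))

    def __getitem__(self, name):
        return self._columns[name]

    def __setitem__(self, name, value):
        self._columns[name] = value

    def __getattr__(self, name):
        # Python calls this only after regular attribute lookup has failed.
        columns = object.__getattribute__(self, "_columns")
        if name in columns:
            return columns[name]
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )

    def __setattr__(self, name, value):
        # A real attribute of that name takes priority over a column.
        try:
            object.__getattribute__(self, name)
        except AttributeError:
            pass
        else:
            object.__setattr__(self, name, value)
            return
        if name in self._columns:
            self._columns[name] = value  # update the existing column
        else:
            object.__setattr__(self, name, value)  # plain attribute


t = Table({"price": [1.0, 2.0], "count": [3, 4]})
t.price = [10.0, 20.0]  # name matches a column, so the column is updated
t.note = "scratch"      # not a column, so it becomes a plain attribute
```

Because __getattr__ is a fallback rather than an interceptor, methods and real attributes always win over columns, which is where the naming restrictions come from.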

KCharlie answered Oct 27 '25 04:10