
When does pandas pass by reference vs. pass by value when a DataFrame is passed to a function?

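import pandas as pd

def dropdf_copy(df):
    df = df.drop('y', axis=1)

def dropdf_inplace(df):
    df.drop('y', axis=1, inplace=True)

def changecell(df):
    # .loc avoids chained indexing (df['y'][0] = ...), which pandas warns about
    df.loc[0, 'y'] = 99


x = pd.DataFrame({'x': [1, 2], 'y': [20, 31]})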

x
Out[204]: 
   x   y
0  1  20
1  2  31

dropdf_copy(x)

x
Out[206]: 
   x   y
0  1  20
1  2  31

changecell(x)

x
Out[208]: 
   x   y
0  1  99
1  2  31

In the above example, dropdf_copy() doesn't modify the original DataFrame x, while changecell() does. I know that if I make the following small change to changecell(), it won't change x.

def changecell(df):
    df = df.copy()
    df.loc[0, 'y'] = 99

I don't think it's very elegant to include df = df.copy() in every function I write.

Questions

1) Under what circumstances does pandas change the original DataFrame, and when does it not? Can someone give me a clear, generalizable rule? I know it may have something to do with mutability vs. immutability, but it's not clearly explained on Stack Overflow.

2) Does NumPy behave similarly, or is it different? What about other Python objects?

PS: I have searched Stack Overflow but couldn't find a clear, generalizable rule for this problem.

GeorgeOfTheRF asked Dec 04 '25 05:12


1 Answer

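Python always passes a reference to the object itself (often described as pass-by-object-reference or call by sharing), so the function and the caller start out looking at the same DataFrame. The original object is left unchanged only if the function ends up working on a different object: either the parameter name is reassigned to the result of an operation that returns a new object, or copy() is called explicitly.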

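Examples where the original is left unchanged:

# 1. Assignment: drop() returns a new DataFrame and df is rebound to it
def dropdf_copy1(df):
    df = df.drop('y', axis=1)

# 2. copy(): the in-place drop is applied to the copy only
def dropdf_copy2(df):
    df = df.copy()
    df.drop('y', axis=1, inplace=True)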
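
One way to confirm that both variants leave the caller's DataFrame alone is to run them and look at the columns afterwards. The sketch below reuses the two functions above, with a return statement added for illustration (it is not part of the original functions) so the result of the drop is not thrown away.

```python
import pandas as pd

def dropdf_copy1(df):
    # drop() without inplace=True builds a new DataFrame; the local name
    # df is rebound to it, so the caller's object is never touched.
    df = df.drop('y', axis=1)
    return df

def dropdf_copy2(df):
    # Work on an explicit copy, then mutate only that copy in place.
    df = df.copy()
    df.drop('y', axis=1, inplace=True)
    return df

x = pd.DataFrame({'x': [1, 2], 'y': [20, 31]})
r1 = dropdf_copy1(x)
r2 = dropdf_copy2(x)

print(list(x.columns))                       # ['x', 'y']
print(list(r1.columns), list(r2.columns))    # ['x'] ['x']
```

Returning the new DataFrame and letting the caller decide what to do with it tends to be easier to reason about than relying on side effects.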

If the function does not create a new object and instead modifies its argument in place, the original object that was passed in is changed.

def dropdf_inplace(df):
    df.drop('y', axis=1, inplace=True)
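
The same rule appears to carry over to NumPy arrays and to built-in mutable objects such as lists, which the question also asks about: mutating the object that was passed in is visible to the caller, while rebinding the local name is not. The sketch below is a minimal illustration; the helper names mutate and rebind are hypothetical and chosen only for this example.

```python
import numpy as np
import pandas as pd

def mutate(obj):
    # Item assignment changes the object the caller also refers to.
    obj[0] = 99

def rebind(obj):
    # obj * 2 creates a new object; only the local name points to it.
    obj = obj * 2
    return obj

def dropdf_inplace(df):
    # inplace=True modifies the DataFrame that was passed in.
    df.drop('y', axis=1, inplace=True)

lst = [1, 2]
arr = np.array([1, 2])
mutate(lst)
mutate(arr)
print(lst, arr.tolist())     # [99, 2] [99, 2]

arr2 = np.array([1, 2])
rebind(arr2)
print(arr2.tolist())         # [1, 2]

x = pd.DataFrame({'x': [1, 2], 'y': [20, 31]})
dropdf_inplace(x)
print(list(x.columns))       # ['x']
```

Immutable objects such as ints, strings, and tuples cannot be mutated at all, so a function can only ever rebind its local name for them, which is why they look as if they were passed by value.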

GeorgeOfTheRF answered Dec 06 '25 23:12